pax_global_header00006660000000000000000000000064132420053270014510gustar00rootroot0000000000000052 comment=a6c94dc4a4e57abb138821714b34f59856b21cca python3-antlr3-3.5.2/000077500000000000000000000000001324200532700143245ustar00rootroot00000000000000python3-antlr3-3.5.2/.gitignore000066400000000000000000000000251324200532700163110ustar00rootroot00000000000000.*.swp *~ *.pyc *.gz python3-antlr3-3.5.2/AUTHORS000066400000000000000000000004041324200532700153720ustar00rootroot00000000000000Python target: Benjamin Niemann : Main developer of Python target. Clinton Roy : AST templates and runtime. Python3 target: Benjamin S Wolf (http://github.com/Zannick): Converted Python target to Python3. python3-antlr3-3.5.2/ChangeLog000066400000000000000000000040661324200532700161040ustar00rootroot000000000000002012-06-26 Benjamin S Wolf Initial Python3 target, branched from the Python target by Benjamin Niemann, with lots of code cleanup and minor refactoring. * CodeGenerator.java, Python3.stg: Generated code now uses set notation for setTest, rather than long conditionals like "a == FOO or a == BAR or 10 <= a <= 12". This is a (slight) performance improvement. * tokens.py: Token objects no longer have get/set methods for their attributes as I switched them to use @property instead. The attributes should be accessed directly. * tokens.py, Python3.stg: Fix a circular dependency in generated parsers, and give Token objects the ability to return their typeName when asked for it. (The generated recognizer gives Token the mapping from token type to type name.) 2007-11-03 Benjamin Niemann * PythonTarget.java, dfa.py, exceptions.py, recognizer.py, streams.py: ANTLRStringStream.LA() now returns the character's ordinal and generated lexers operate on integers. Also made various performance tunings. 2007-10-07 Benjamin Niemann * main.py, Python.stg (outputFile): Added simple __main__ section to generated code, so (simple) grammars can be executed as standalone script. * tree.py (RecognitionException.extractInformationFromTreeNodeStream), exceptions.py (CommonTree): Small bugfixes. 2007-09-30 Benjamin Niemann * recognizers.py (TokenSource): Added iterator interface to TokenSource class - and thus to Lexer. 2007-06-27 Benjamin Niemann * Python.stg (genericParser, parser, treeParser): Use correct @init action block for tree parsers. 2007-05-24 Benjamin Niemann * Python.stg (rule): Added support for @decorate {...} action for parser rules to add decorators to the rule method. 2007-05-18 Benjamin Niemann * Python.stg (isolatedLookaheadRangeTest, lookaheadRangeTest): Minor improvement of generated code (use ' <= <= ' instead of ' >= and <= '). python3-antlr3-3.5.2/LICENSE000066400000000000000000000026201324200532700153310ustar00rootroot00000000000000[The "BSD licence"] Copyright (c) 2003-2012 Terence Parr All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission. 
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. python3-antlr3-3.5.2/README000066400000000000000000000050661324200532700152130ustar00rootroot000000000000001) ABOUT ======== This is the Python3 package 'antlr3', which is required to use parsers created by the ANTLR3 tool. See for more information about ANTLR3. 2) STATUS ========= The Python3 target for ANTLR3 is still in beta. Documentation is lacking, some bits of the code is not yet done, some functionality has not been tested yet. Also the API might change a bit - it currently mimics the Java implementation, but it may be made a bit more pythonic here and there. WARNING: The runtime library is not compatible with recognizers generated by ANTLR versions preceding V3.4.x. If you are an application developer, then the suggested way to solve this is to package the correct runtime with your application. Installing the runtime in the global site-packages directory may not be a good idea. Sorry for the inconvenience. 3) DOWNLOAD =========== This runtime is part of the ANTLR distribution. The latest version can be found at . If you are interested in the latest, most bleeding edge version, have a look at the git repository at . 4) INSTALLATION =============== Just like any other Python package: $ python3 setup.py install See for more information. 5) DOCUMENTATION ================ Documentation (as far as it exists) can be found in the wiki 6) REPORTING BUGS ================= Please file bug reports on github: . 7) HACKING ========== Only the runtime package can be found here. There are also some StringTemplate files in 'src/org/antlr/codegen/templates/Python3/' and some Java code in 'src/org/antlr/codegen/Python3Target.java' (of the main ANTLR3 source distribution). If there are no directories 'tests' and 'unittests' in 'runtime/Python3', you should fetch the latest ANTLR3 version from the perforce depot. See section DOWNLOAD. You'll need java and ant in order to compile and use the tool. Be sure to properly setup your CLASSPATH. (FIXME: is there some generic information, how to build it yourself? I should point to it to avoid duplication.) You can then use the commands $ python3 setup.py unittest $ python3 setup.py functest to ensure that changes do not break existing behaviour. Please send patches as pull requests on github. For larger code contributions you'll have to sign the "Developer's Certificate of Origin", which can be found on or use the feedback form at . python3-antlr3-3.5.2/antlr3/000077500000000000000000000000001324200532700155275ustar00rootroot00000000000000python3-antlr3-3.5.2/antlr3/__init__.py000066400000000000000000000122651324200532700176460ustar00rootroot00000000000000""" @package antlr3 @brief ANTLR3 runtime package This module contains all support classes, which are needed to use recognizers generated by ANTLR3. 
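A minimal usage sketch (TLexer, TParser and the start rule 'prog' are
placeholder names for whatever ANTLR generated from your grammar):

    import antlr3
    from TLexer import TLexer
    from TParser import TParser

    char_stream = antlr3.ANTLRStringStream('some input text')
    lexer = TLexer(char_stream)
    tokens = antlr3.CommonTokenStream(lexer)
    parser = TParser(tokens)
    parser.prog()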
@mainpage \\note Please be warned that the line numbers in the API documentation do not match the real locations in the source code of the package. This is an unintended artifact of doxygen, which I could only convince to use the correct module names by concatenating all files from the package into a single module file... Here is a little overview over the most commonly used classes provided by this runtime: @section recognizers Recognizers These recognizers are baseclasses for the code which is generated by ANTLR3. - BaseRecognizer: Base class with common recognizer functionality. - Lexer: Base class for lexers. - Parser: Base class for parsers. - tree.TreeParser: Base class for %tree parser. @section streams Streams Each recognizer pulls its input from one of the stream classes below. Streams handle stuff like buffering, look-ahead and seeking. A character stream is usually the first element in the pipeline of a typical ANTLR3 application. It is used as the input for a Lexer. - ANTLRStringStream: Reads from a string objects. The input should be a unicode object, or ANTLR3 will have trouble decoding non-ascii data. - ANTLRFileStream: Opens a file and read the contents, with optional character decoding. - ANTLRInputStream: Reads the date from a file-like object, with optional character decoding. A Parser needs a TokenStream as input (which in turn is usually fed by a Lexer): - CommonTokenStream: A basic and most commonly used TokenStream implementation. - TokenRewriteStream: A modification of CommonTokenStream that allows the stream to be altered (by the Parser). See the 'tweak' example for a usecase. And tree.TreeParser finally fetches its input from a tree.TreeNodeStream: - tree.CommonTreeNodeStream: A basic and most commonly used tree.TreeNodeStream implementation. @section tokenstrees Tokens and Trees A Lexer emits Token objects which are usually buffered by a TokenStream. A Parser can build a Tree, if the output=AST option has been set in the grammar. The runtime provides these Token implementations: - CommonToken: A basic and most commonly used Token implementation. - ClassicToken: A Token object as used in ANTLR 2.x, used to %tree construction. Tree objects are wrapper for Token objects. - tree.CommonTree: A basic and most commonly used Tree implementation. A tree.TreeAdaptor is used by the parser to create tree.Tree objects for the input Token objects. - tree.CommonTreeAdaptor: A basic and most commonly used tree.TreeAdaptor implementation. @section Exceptions RecognitionException are generated, when a recognizer encounters incorrect or unexpected input. - RecognitionException - MismatchedRangeException - MismatchedSetException - MismatchedNotSetException . - MismatchedTokenException - MismatchedTreeNodeException - NoViableAltException - EarlyExitException - FailedPredicateException . . A tree.RewriteCardinalityException is raised, when the parsers hits a cardinality mismatch during AST construction. Although this is basically a bug in your grammar, it can only be detected at runtime. - tree.RewriteCardinalityException - tree.RewriteEarlyExitException - tree.RewriteEmptyStreamException . . """ # tree.RewriteRuleElementStream # tree.RewriteRuleSubtreeStream # tree.RewriteRuleTokenStream # CharStream # DFA # TokenSource # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. 
Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. __version__ = '3.4' # This runtime is compatible with generated parsers using the # API versions listed in constants.compatible_api_versions. # 'HEAD' is only used by unittests. from .constants import * from .dfa import * from .exceptions import * from .recognizers import * from .streams import * from .tokens import * python3-antlr3-3.5.2/antlr3/constants.py000066400000000000000000000041501324200532700201150ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] compatible_api_versions = ['HEAD', 1] EOF = -1 ## All tokens go to the parser (unless skip() is called in that rule) # on a particular "channel". The parser tunes to a particular channel # so that whitespace etc... can go to the parser on a "hidden" channel. DEFAULT_CHANNEL = 0 ## Anything on different channel than DEFAULT_CHANNEL is not parsed # by parser. 
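# For example, a lexer rule typically routes whitespace off the default
# channel with an action like the following (a sketch; 'WS' is just an
# example rule name):
#
#   WS : (' '|'\t'|'\r'|'\n')+ { $channel = HIDDEN; } ;
#
# Such tokens are still buffered by CommonTokenStream but are invisible
# to the parser.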
HIDDEN_CHANNEL = 99 # Predefined token types EOR_TOKEN_TYPE = 1 ## # imaginary tree navigation type; traverse "get child" link DOWN = 2 ## #imaginary tree navigation type; finish with a child list UP = 3 MIN_TOKEN_TYPE = UP + 1 INVALID_TOKEN_TYPE = 0 python3-antlr3-3.5.2/antlr3/debug.py000066400000000000000000001030551324200532700171730ustar00rootroot00000000000000# begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] import socket import sys from .constants import INVALID_TOKEN_TYPE from .exceptions import RecognitionException from .recognizers import Parser from .streams import TokenStream from .tokens import Token from .tree import CommonTreeAdaptor, TreeAdaptor, Tree class DebugParser(Parser): def __init__(self, stream, state=None, dbg=None, *args, **kwargs): # wrap token stream in DebugTokenStream (unless user already did so). if not isinstance(stream, DebugTokenStream): stream = DebugTokenStream(stream, dbg) super().__init__(stream, state, *args, **kwargs) # Who to notify when events in the parser occur. self._dbg = None self.setDebugListener(dbg) def setDebugListener(self, dbg): """Provide a new debug event listener for this parser. Notify the input stream too that it should send events to this listener. """ if hasattr(self.input, 'dbg'): self.input.dbg = dbg self._dbg = dbg def getDebugListener(self): return self._dbg dbg = property(getDebugListener, setDebugListener) def beginResync(self): self._dbg.beginResync() def endResync(self): self._dbg.endResync() def beginBacktrack(self, level): self._dbg.beginBacktrack(level) def endBacktrack(self, level, successful): self._dbg.endBacktrack(level, successful) def reportError(self, exc): Parser.reportError(self, exc) if isinstance(exc, RecognitionException): self._dbg.recognitionException(exc) class DebugTokenStream(TokenStream): def __init__(self, input, dbg=None): super().__init__() self.input = input self.initialStreamState = True # Track the last mark() call result value for use in rewind(). 
self.lastMarker = None self._dbg = None self.setDebugListener(dbg) # force TokenStream to get at least first valid token # so we know if there are any hidden tokens first in the stream self.input.LT(1) def getDebugListener(self): return self._dbg def setDebugListener(self, dbg): self._dbg = dbg dbg = property(getDebugListener, setDebugListener) def consume(self): if self.initialStreamState: self.consumeInitialHiddenTokens() a = self.input.index() t = self.input.LT(1) self.input.consume() b = self.input.index() self._dbg.consumeToken(t) if b > a + 1: # then we consumed more than one token; must be off channel tokens for idx in range(a + 1, b): self._dbg.consumeHiddenToken(self.input.get(idx)) def consumeInitialHiddenTokens(self): """consume all initial off-channel tokens""" firstOnChannelTokenIndex = self.input.index() for idx in range(firstOnChannelTokenIndex): self._dbg.consumeHiddenToken(self.input.get(idx)) self.initialStreamState = False def LT(self, i): if self.initialStreamState: self.consumeInitialHiddenTokens() t = self.input.LT(i) self._dbg.LT(i, t) return t def LA(self, i): if self.initialStreamState: self.consumeInitialHiddenTokens() t = self.input.LT(i) self._dbg.LT(i, t) return t.type def get(self, i): return self.input.get(i) def index(self): return self.input.index() def mark(self): self.lastMarker = self.input.mark() self._dbg.mark(self.lastMarker) return self.lastMarker def rewind(self, marker=None): self._dbg.rewind(marker) self.input.rewind(marker) def release(self, marker): pass def seek(self, index): # TODO: implement seek in dbg interface # self._dbg.seek(index); self.input.seek(index) def size(self): return self.input.size() def getTokenSource(self): return self.input.getTokenSource() def getSourceName(self): return self.getTokenSource().getSourceName() def toString(self, start=None, stop=None): return self.input.toString(start, stop) class DebugTreeAdaptor(TreeAdaptor): """A TreeAdaptor proxy that fires debugging events to a DebugEventListener delegate and uses the TreeAdaptor delegate to do the actual work. All AST events are triggered by this adaptor; no code gen changes are needed in generated rules. Debugging events are triggered *after* invoking tree adaptor routines. Trees created with actions in rewrite actions like "-> ^(ADD {foo} {bar})" cannot be tracked as they might not use the adaptor to create foo, bar. The debug listener has to deal with tree node IDs for which it did not see a createNode event. A single node is sufficient even if it represents a whole tree. """ def __init__(self, dbg, adaptor): super().__init__() self.dbg = dbg self.adaptor = adaptor def createWithPayload(self, payload): if payload.index < 0: # could be token conjured up during error recovery return self.createFromType(payload.type, payload.text) node = self.adaptor.createWithPayload(payload) self.dbg.createNode(node, payload) return node def createFromToken(self, tokenType, fromToken, text=None): node = self.adaptor.createFromToken(tokenType, fromToken, text) self.dbg.createNode(node) return node def createFromType(self, tokenType, text): node = self.adaptor.createFromType(tokenType, text) self.dbg.createNode(node) return node def errorNode(self, input, start, stop, exc): node = self.adaptor.errorNode(input, start, stop, exc) if node is not None: self.dbg.errorNode(node) return node def dupTree(self, tree): t = self.adaptor.dupTree(tree) # walk the tree and emit create and add child events # to simulate what dupTree has done. 
dupTree does not call this debug # adapter so I must simulate. self.simulateTreeConstruction(t) return t def simulateTreeConstruction(self, t): """^(A B C): emit create A, create B, add child, ...""" self.dbg.createNode(t) for i in range(self.adaptor.getChildCount(t)): child = self.adaptor.getChild(t, i) self.simulateTreeConstruction(child) self.dbg.addChild(t, child) def dupNode(self, treeNode): d = self.adaptor.dupNode(treeNode) self.dbg.createNode(d) return d def nil(self): node = self.adaptor.nil() self.dbg.nilNode(node) return node def isNil(self, tree): return self.adaptor.isNil(tree) def addChild(self, t, child): if isinstance(child, Token): n = self.createWithPayload(child) self.addChild(t, n) else: if t is None or child is None: return self.adaptor.addChild(t, child) self.dbg.addChild(t, child) def becomeRoot(self, newRoot, oldRoot): if isinstance(newRoot, Token): n = self.createWithPayload(newRoot) self.adaptor.becomeRoot(n, oldRoot) else: n = self.adaptor.becomeRoot(newRoot, oldRoot) self.dbg.becomeRoot(newRoot, oldRoot) return n def rulePostProcessing(self, root): return self.adaptor.rulePostProcessing(root) def getType(self, t): return self.adaptor.getType(t) def setType(self, t, type): self.adaptor.setType(t, type) def getText(self, t): return self.adaptor.getText(t) def setText(self, t, text): self.adaptor.setText(t, text) def getToken(self, t): return self.adaptor.getToken(t) def setTokenBoundaries(self, t, startToken, stopToken): self.adaptor.setTokenBoundaries(t, startToken, stopToken) if t and startToken and stopToken: self.dbg.setTokenBoundaries( t, startToken.index, stopToken.index) def getTokenStartIndex(self, t): return self.adaptor.getTokenStartIndex(t) def getTokenStopIndex(self, t): return self.adaptor.getTokenStopIndex(t) def getChild(self, t, i): return self.adaptor.getChild(t, i) def setChild(self, t, i, child): self.adaptor.setChild(t, i, child) def deleteChild(self, t, i): return self.adaptor.deleteChild(t, i) def getChildCount(self, t): return self.adaptor.getChildCount(t) def getUniqueID(self, node): return self.adaptor.getUniqueID(node) def getParent(self, t): return self.adaptor.getParent(t) def getChildIndex(self, t): return self.adaptor.getChildIndex(t) def setParent(self, t, parent): self.adaptor.setParent(t, parent) def setChildIndex(self, t, index): self.adaptor.setChildIndex(t, index) def replaceChildren(self, parent, startChildIndex, stopChildIndex, t): self.adaptor.replaceChildren(parent, startChildIndex, stopChildIndex, t) ## support def getDebugListener(self): return self.dbg def setDebugListener(self, dbg): self.dbg = dbg def getTreeAdaptor(self): return self.adaptor class DebugEventListener(object): """All debugging events that a recognizer can trigger. I did not create a separate AST debugging interface as it would create lots of extra classes and DebugParser has a dbg var defined, which makes it hard to change to ASTDebugEventListener. I looked hard at this issue and it is easier to understand as one monolithic event interface for all possible events. Hopefully, adding ST debugging stuff won't be bad. Leave for future. 4/26/2006. """ # Moved to version 2 for v3.1: added grammar name to enter/exit Rule PROTOCOL_VERSION = "2" def enterRule(self, grammarFileName, ruleName): """The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc... 
The grammarFileName allows composite grammars to jump around among multiple grammar files. """ pass def enterAlt(self, alt): """Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts. """ pass def exitRule(self, grammarFileName, ruleName): """This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an "exitAlt" event. The grammarFileName allows composite grammars to jump around among multiple grammar files. """ pass def enterSubRule(self, decisionNumber): """Track entry into any (...) subrule other EBNF construct""" pass def exitSubRule(self, decisionNumber): pass def enterDecision(self, decisionNumber, couldBacktrack): """Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration. """ pass def exitDecision(self, decisionNumber): pass def consumeToken(self, t): """An input token was consumed; matched by any kind of element. Trigger after the token was matched by things like match(), matchAny(). """ pass def consumeHiddenToken(self, t): """An off-channel input token was consumed. Trigger after the token was matched by things like match(), matchAny(). (unless of course the hidden token is first stuff in the input stream). """ pass def LT(self, i, t): """Somebody (anybody) looked ahead. Note that this actually gets triggered by both LA and LT calls. The debugger will want to know which Token object was examined. Like consumeToken, this indicates what token was seen at that depth. A remote debugger cannot look ahead into a file it doesn't have so LT events must pass the token even if the info is redundant. For tree parsers, if the type is UP or DOWN, then the ID is not really meaningful as it's fixed--there is just one UP node and one DOWN navigation node. """ pass def mark(self, marker): """The parser is going to look arbitrarily ahead; mark this location, the token stream's marker is sent in case you need it. """ pass def rewind(self, marker=None): """After an arbitrairly long lookahead as with a cyclic DFA (or with any backtrack), this informs the debugger that stream should be rewound to the position associated with marker. """ pass def beginBacktrack(self, level): pass def endBacktrack(self, level, successful): pass def location(self, line, pos): """To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc... This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos. """ pass def recognitionException(self, e): """A recognition exception occurred such as NoViableAltException. I made this a generic event so that I can alter the exception hierachy later without having to alter all the debug objects. Upon error, the stack of enter rule/subrule must be properly unwound. If no viable alt occurs it is within an enter/exit decision, which also must be rewound. Even the rewind for each mark must be unwount. In the Java target this is pretty easy using try/finally, if a bit ugly in the generated code. 
The rewind is generated in DFA.predict() actually so no code needs to be generated for that. For languages w/o this "finally" feature (C++?), the target implementor will have to build an event stack or something. Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char. Here is a sample event trace for grammar: b : C ({;}A|B) // {;} is there to prevent A|B becoming a set | D ; The sequence for this rule (with no viable alt in the subrule) for input 'c c' (there are 3 tokens) is: commence LT(1) enterRule b location 7 1 enter decision 3 LT(1) exit decision 3 enterAlt1 location 7 5 LT(1) consumeToken [c/<4>,1:0] location 7 7 enterSubRule 2 enter decision 2 LT(1) LT(1) recognitionException NoViableAltException 2 1 2 exit decision 2 exitSubRule 2 beginResync LT(1) consumeToken [c/<4>,1:1] LT(1) endResync LT(-1) exitRule b terminate """ pass def beginResync(self): """Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse--they are dead tokens. """ pass def endResync(self): """Indicates that the recognizer has finished consuming tokens in order to resychronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This will be useful in a gui where you want to probably grey out tokens that are consumed but not matched to anything in grammar. Anything between a beginResync/endResync pair was tossed out by the parser. """ pass def semanticPredicate(self, result, predicate): """A semantic predicate was evaluate with this result and action text""" pass def commence(self): """Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out). """ pass def terminate(self): """Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it's time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing. """ pass ## T r e e P a r s i n g def consumeNode(self, t): """Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is the hashCode usually of the node so it only works if hashCode is not implemented. If the type is UP or DOWN, then the ID is not really meaningful as it's fixed--there is just one UP node and one DOWN navigation node. """ pass ## A S T E v e n t s def nilNode(self, t): """A nil was created (even nil nodes have a unique ID... they are not "null" per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode. If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set. """ pass def errorNode(self, t): """Upon syntax error, recognizers bracket the error with an error node if they are building ASTs. 
""" pass def createNode(self, node, token=None): """Announce a new node built from token elements such as type etc... If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set. """ pass def becomeRoot(self, newRoot, oldRoot): """Make a node the new root of an existing root. Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever. The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID. If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set. @see antlr3.tree.TreeAdaptor.becomeRoot() """ pass def addChild(self, root, child): """Make childID a child of rootID. If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set. @see antlr3.tree.TreeAdaptor.addChild() """ pass def setTokenBoundaries(self, t, tokenStartIndex, tokenStopIndex): """Set the token start/stop token index for a subtree root or node. If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set. """ pass class BlankDebugEventListener(DebugEventListener): """A blank listener that does nothing; useful for real classes so they don't have to have lots of blank methods and are less sensitive to updates to debug interface. Note: this class is identical to DebugEventListener and exists purely for compatibility with Java. """ pass class TraceDebugEventListener(DebugEventListener): """A listener that simply records text representations of the events. Useful for debugging the debugging facility ;) Subclasses can override the record() method (which defaults to printing to stdout) to record the events in a different way. 
""" def __init__(self, adaptor=None): super().__init__() if adaptor is None: adaptor = CommonTreeAdaptor() self.adaptor = adaptor def record(self, event): sys.stdout.write(event + '\n') def enterRule(self, grammarFileName, ruleName): self.record("enterRule " + ruleName) def exitRule(self, grammarFileName, ruleName): self.record("exitRule " + ruleName) def enterSubRule(self, decisionNumber): self.record("enterSubRule") def exitSubRule(self, decisionNumber): self.record("exitSubRule") def location(self, line, pos): self.record("location {}:{}".format(line, pos)) ## Tree parsing stuff def consumeNode(self, t): self.record("consumeNode {} {} {}".format( self.adaptor.getUniqueID(t), self.adaptor.getText(t), self.adaptor.getType(t))) def LT(self, i, t): self.record("LT {} {} {} {}".format( i, self.adaptor.getUniqueID(t), self.adaptor.getText(t), self.adaptor.getType(t))) ## AST stuff def nilNode(self, t): self.record("nilNode {}".format(self.adaptor.getUniqueID(t))) def createNode(self, t, token=None): if token is None: self.record("create {}: {}, {}".format( self.adaptor.getUniqueID(t), self.adaptor.getText(t), self.adaptor.getType(t))) else: self.record("create {}: {}".format( self.adaptor.getUniqueID(t), token.index)) def becomeRoot(self, newRoot, oldRoot): self.record("becomeRoot {}, {}".format( self.adaptor.getUniqueID(newRoot), self.adaptor.getUniqueID(oldRoot))) def addChild(self, root, child): self.record("addChild {}, {}".format( self.adaptor.getUniqueID(root), self.adaptor.getUniqueID(child))) def setTokenBoundaries(self, t, tokenStartIndex, tokenStopIndex): self.record("setTokenBoundaries {}, {}, {}".format( self.adaptor.getUniqueID(t), tokenStartIndex, tokenStopIndex)) class RecordDebugEventListener(TraceDebugEventListener): """A listener that records events as strings in an array.""" def __init__(self, adaptor=None): super().__init__(adaptor) self.events = [] def record(self, event): self.events.append(event) class DebugEventSocketProxy(DebugEventListener): """A proxy debug event listener that forwards events over a socket to a debugger (or any other listener) using a simple text-based protocol; one event per line. ANTLRWorks listens on server socket with a RemoteDebugEventSocketListener instance. These two objects must therefore be kept in sync. New events must be handled on both sides of socket. """ DEFAULT_DEBUGGER_PORT = 49100 def __init__(self, recognizer, adaptor=None, port=None, debug=None): super().__init__() self.grammarFileName = recognizer.getGrammarFileName() # Almost certainly the recognizer will have adaptor set, but # we don't know how to cast it (Parser or TreeParser) to get # the adaptor field. Must be set with a constructor. 
:( self.adaptor = adaptor self.port = port or self.DEFAULT_DEBUGGER_PORT self.debug = debug self.socket = None self.connection = None self.input = None self.output = None def log(self, msg): if self.debug: self.debug.write(msg + '\n') def handshake(self): if self.socket is None: # create listening socket self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.socket.bind(('', self.port)) self.socket.listen(1) self.log("Waiting for incoming connection on port {}".format(self.port)) # wait for an incoming connection self.connection, addr = self.socket.accept() self.log("Accepted connection from {}:{}".format(addr[0], addr[1])) self.connection.setblocking(1) self.connection.setsockopt(socket.SOL_TCP, socket.TCP_NODELAY, 1) self.output = self.connection.makefile('w', 1) self.input = self.connection.makefile('r', 1) self.write("ANTLR {}".format(self.PROTOCOL_VERSION)) self.write('grammar "{}"'.format(self.grammarFileName)) self.ack() def write(self, msg): self.log("> {}".format(msg)) self.output.write("{}\n".format(msg)) self.output.flush() def ack(self): t = self.input.readline() self.log("< {}".format(t.rstrip())) def transmit(self, event): self.write(event) self.ack() def commence(self): # don't bother sending event; listener will trigger upon connection pass def terminate(self): self.transmit("terminate") self.output.close() self.input.close() self.connection.close() self.socket.close() def enterRule(self, grammarFileName, ruleName): self.transmit("enterRule\t{}\t{}".format(grammarFileName, ruleName)) def enterAlt(self, alt): self.transmit("enterAlt\t{}".format(alt)) def exitRule(self, grammarFileName, ruleName): self.transmit("exitRule\t{}\t{}".format(grammarFileName, ruleName)) def enterSubRule(self, decisionNumber): self.transmit("enterSubRule\t{}".format(decisionNumber)) def exitSubRule(self, decisionNumber): self.transmit("exitSubRule\t{}".format(decisionNumber)) def enterDecision(self, decisionNumber, couldBacktrack): self.transmit( "enterDecision\t{}\t{:d}".format(decisionNumber, couldBacktrack)) def exitDecision(self, decisionNumber): self.transmit("exitDecision\t{}".format(decisionNumber)) def consumeToken(self, t): self.transmit("consumeToken\t{}".format(self.serializeToken(t))) def consumeHiddenToken(self, t): self.transmit("consumeHiddenToken\t{}".format(self.serializeToken(t))) def LT(self, i, o): if isinstance(o, Tree): return self.LT_tree(i, o) return self.LT_token(i, o) def LT_token(self, i, t): if t is not None: self.transmit("LT\t{}\t{}".format(i, self.serializeToken(t))) def mark(self, i): self.transmit("mark\t{}".format(i)) def rewind(self, i=None): if i is not None: self.transmit("rewind\t{}".format(i)) else: self.transmit("rewind") def beginBacktrack(self, level): self.transmit("beginBacktrack\t{}".format(level)) def endBacktrack(self, level, successful): self.transmit("endBacktrack\t{}\t{}".format( level, '1' if successful else '0')) def location(self, line, pos): self.transmit("location\t{}\t{}".format(line, pos)) def recognitionException(self, exc): self.transmit('\t'.join([ "exception", exc.__class__.__name__, str(int(exc.index)), str(int(exc.line)), str(int(exc.charPositionInLine))])) def beginResync(self): self.transmit("beginResync") def endResync(self): self.transmit("endResync") def semanticPredicate(self, result, predicate): self.transmit('\t'.join([ "semanticPredicate", str(int(result)), self.escapeNewlines(predicate)])) ## A S T P a r s i n g E v e n t s def consumeNode(self, t): 
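# NOTE: tree-node serialization is not implemented in this runtime;
# FIXME is not defined anywhere in this module, so calling this method
# raises NameError. The intended logic is kept below as commented-out
# Java code.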
FIXME(31) # StringBuffer buf = new StringBuffer(50); # buf.append("consumeNode"); # serializeNode(buf, t); # transmit(buf.toString()); def LT_tree(self, i, t): FIXME(34) # int ID = adaptor.getUniqueID(t); # String text = adaptor.getText(t); # int type = adaptor.getType(t); # StringBuffer buf = new StringBuffer(50); # buf.append("LN\t"); // lookahead node; distinguish from LT in protocol # buf.append(i); # serializeNode(buf, t); # transmit(buf.toString()); def serializeNode(self, buf, t): FIXME(33) # int ID = adaptor.getUniqueID(t); # String text = adaptor.getText(t); # int type = adaptor.getType(t); # buf.append("\t"); # buf.append(ID); # buf.append("\t"); # buf.append(type); # Token token = adaptor.getToken(t); # int line = -1; # int pos = -1; # if ( token!=null ) { # line = token.getLine(); # pos = token.getCharPositionInLine(); # } # buf.append("\t"); # buf.append(line); # buf.append("\t"); # buf.append(pos); # int tokenIndex = adaptor.getTokenStartIndex(t); # buf.append("\t"); # buf.append(tokenIndex); # serializeText(buf, text); ## A S T E v e n t s def nilNode(self, t): self.transmit("nilNode\t{}".format(self.adaptor.getUniqueID(t))) def errorNode(self, t): self.transmit('errorNode\t{}\t{}\t"{}'.format( self.adaptor.getUniqueID(t), INVALID_TOKEN_TYPE, self.escapeNewlines(t.toString()))) def createNode(self, node, token=None): if token is not None: self.transmit("createNode\t{}\t{}".format( self.adaptor.getUniqueID(node), token.index)) else: self.transmit('createNodeFromTokenElements\t{}\t{}\t"{}'.format( self.adaptor.getUniqueID(node), self.adaptor.getType(node), self.adaptor.getText(node))) def becomeRoot(self, newRoot, oldRoot): self.transmit("becomeRoot\t{}\t{}".format( self.adaptor.getUniqueID(newRoot), self.adaptor.getUniqueID(oldRoot))) def addChild(self, root, child): self.transmit("addChild\t{}\t{}".format( self.adaptor.getUniqueID(root), self.adaptor.getUniqueID(child))) def setTokenBoundaries(self, t, tokenStartIndex, tokenStopIndex): self.transmit("setTokenBoundaries\t{}\t{}\t{}".format( self.adaptor.getUniqueID(t), tokenStartIndex, tokenStopIndex)) ## support def setTreeAdaptor(self, adaptor): self.adaptor = adaptor def getTreeAdaptor(self): return self.adaptor def serializeToken(self, t): buf = [str(int(t.index)), str(int(t.type)), str(int(t.channel)), str(int(t.line or 0)), str(int(t.charPositionInLine or 0)), '"' + self.escapeNewlines(t.text)] return '\t'.join(buf) def escapeNewlines(self, txt): if txt is None: return '' txt = txt.replace("%","%25") # escape all escape char ;) txt = txt.replace("\n","%0A") # escape \n txt = txt.replace("\r","%0D") # escape \r return txt python3-antlr3-3.5.2/antlr3/dfa.py000066400000000000000000000152001324200532700166310ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. 
# # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] from .constants import EOF from .exceptions import NoViableAltException, BacktrackingFailed class DFA(object): """@brief A DFA implemented as a set of transition tables. Any state that has a semantic predicate edge is special; those states are generated with if-then-else structures in a specialStateTransition() which is generated by cyclicDFA template. """ def __init__( self, recognizer, decisionNumber, eot, eof, min, max, accept, special, transition ): ## Which recognizer encloses this DFA? Needed to check backtracking self.recognizer = recognizer self.decisionNumber = decisionNumber self.eot = eot self.eof = eof self.min = min self.max = max self.accept = accept self.special = special self.transition = transition def predict(self, input): """ From the input stream, predict what alternative will succeed using this DFA (representing the covering regular approximation to the underlying CFL). Return an alternative number 1..n. Throw an exception upon error. """ mark = input.mark() s = 0 # we always start at s0 try: for _ in range(50000): specialState = self.special[s] if specialState >= 0: s = self.specialStateTransition(specialState, input) if s == -1: self.noViableAlt(s, input) return 0 input.consume() continue if self.accept[s] >= 1: return self.accept[s] # look for a normal char transition c = input.LA(1) if c >= self.min[s] and c <= self.max[s]: # move to next state snext = self.transition[s][c-self.min[s]] if snext < 0: # was in range but not a normal transition # must check EOT, which is like the else clause. # eot[s]>=0 indicates that an EOT edge goes to another # state. if self.eot[s] >= 0: # EOT Transition to accept state? s = self.eot[s] input.consume() # TODO: I had this as return accept[eot[s]] # which assumed here that the EOT edge always # went to an accept...faster to do this, but # what about predicated edges coming from EOT # target? continue self.noViableAlt(s, input) return 0 s = snext input.consume() continue if self.eot[s] >= 0: s = self.eot[s] input.consume() continue # EOF Transition to accept state? if c == EOF and self.eof[s] >= 0: return self.accept[self.eof[s]] # not in range and not EOF/EOT, must be invalid symbol self.noViableAlt(s, input) return 0 else: raise RuntimeError("DFA bang!") finally: input.rewind(mark) def noViableAlt(self, s, input): if self.recognizer._state.backtracking > 0: raise BacktrackingFailed nvae = NoViableAltException( self.getDescription(), self.decisionNumber, s, input ) self.error(nvae) raise nvae def error(self, nvae): """A hook for debugging interface""" pass def specialStateTransition(self, s, input): return -1 def getDescription(self): return "n/a" ## def specialTransition(self, state, symbol): ## return 0 @classmethod def unpack(cls, string): """@brief Unpack the runlength encoded table data. 
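Each pair of characters in the packed string is a (count, value) run:
the pair (3, 7) expands to [7, 7, 7], and a value of 0xFFFF is
translated to -1 (so the pair (1, 0xFFFF) contributes [-1]).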
Terence implemented packed table initializers, because Java has a size restriction on .class files and the lookup tables can grow pretty large. The generated JavaLexer.java of the Java.g example would be about 15MB with uncompressed array initializers. Python does not have any size restrictions, but the compilation of such large source files seems to be pretty memory hungry. The memory consumption of the python process grew to >1.5GB when importing a 15MB lexer, eating all my swap space and I was to impacient to see, if it could finish at all. With packed initializers that are unpacked at import time of the lexer module, everything works like a charm. """ ret = [] for i in range(0, len(string) - 1, 2): (n, v) = ord(string[i]), ord(string[i + 1]) if v == 0xFFFF: v = -1 ret += [v] * n return ret python3-antlr3-3.5.2/antlr3/exceptions.py000066400000000000000000000304021324200532700202610ustar00rootroot00000000000000"""ANTLR3 exception hierarchy""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] from .constants import INVALID_TOKEN_TYPE class BacktrackingFailed(Exception): """@brief Raised to signal failed backtrack attempt""" pass class RecognitionException(Exception): """@brief The root of the ANTLR exception hierarchy. To avoid English-only error messages and to generally make things as flexible as possible, these exceptions are not created with strings, but rather the information necessary to generate an error. Then the various reporting methods in Parser and Lexer can be overridden to generate a localized error message. For example, MismatchedToken exceptions are built with the expected token type. So, don't expect getMessage() to return anything. Note that as of Java 1.4, you can access the stack trace, which means that you can compute the complete trace of rules from the start symbol. This gives you considerable context information with which to generate useful error messages. ANTLR generates code that throws exceptions upon recognition error and also generates code to catch these exceptions in each rule. 
If you want to quit upon first error, you can turn off the automatic error handling mechanism using rulecatch action, but you still need to override methods mismatch and recoverFromMismatchSet. In general, the recognition exceptions can track where in a grammar a problem occurred and/or what was the expected input. While the parser knows its state (such as current input symbol and line info) that state can change before the exception is reported so current token index is computed and stored at exception time. From this info, you can perhaps print an entire line of input not just a single token, for example. Better to just say the recognizer had a problem and then let the parser figure out a fancy report. """ def __init__(self, input=None): super().__init__() # What input stream did the error occur in? self.input = None # What is index of token/char were we looking at when the error # occurred? self.index = None # The current Token when an error occurred. Since not all streams # can retrieve the ith Token, we have to track the Token object. # For parsers. Even when it's a tree parser, token might be set. self.token = None # If this is a tree parser exception, node is set to the node with # the problem. self.node = None # The current char when an error occurred. For lexers. self.c = None # Track the line at which the error occurred in case this is # generated from a lexer. We need to track this since the # unexpected char doesn't carry the line info. self.line = None self.charPositionInLine = None # If you are parsing a tree node stream, you will encounter som # imaginary nodes w/o line/col info. We now search backwards looking # for most recent token with line/col info, but notify getErrorHeader() # that info is approximate. self.approximateLineInfo = False if input: self.input = input self.index = input.index() # late import to avoid cyclic dependencies from .streams import TokenStream, CharStream from .tree import TreeNodeStream if isinstance(self.input, TokenStream): self.token = self.input.LT(1) self.line = self.token.line self.charPositionInLine = self.token.charPositionInLine if isinstance(self.input, TreeNodeStream): self.extractInformationFromTreeNodeStream(self.input) else: if isinstance(self.input, CharStream): self.c = self.input.LT(1) self.line = self.input.line self.charPositionInLine = self.input.charPositionInLine else: self.c = self.input.LA(1) def extractInformationFromTreeNodeStream(self, nodes): from .tree import Tree, CommonTree from .tokens import CommonToken self.node = nodes.LT(1) adaptor = nodes.adaptor payload = adaptor.getToken(self.node) if payload: self.token = payload if payload.line <= 0: # imaginary node; no line/pos info; scan backwards i = -1 priorNode = nodes.LT(i) while priorNode: priorPayload = adaptor.getToken(priorNode) if priorPayload and priorPayload.line > 0: # we found the most recent real line / pos info self.line = priorPayload.line self.charPositionInLine = priorPayload.charPositionInLine self.approximateLineInfo = True break i -= 1 priorNode = nodes.LT(i) else: # node created from real token self.line = payload.line self.charPositionInLine = payload.charPositionInLine elif isinstance(self.node, Tree): self.line = self.node.line self.charPositionInLine = self.node.charPositionInLine if isinstance(self.node, CommonTree): self.token = self.node.token else: type = adaptor.getType(self.node) text = adaptor.getText(self.node) self.token = CommonToken(type=type, text=text) def getUnexpectedType(self): """Return the token type or char of the unexpected 
input element""" from .streams import TokenStream from .tree import TreeNodeStream if isinstance(self.input, TokenStream): return self.token.type elif isinstance(self.input, TreeNodeStream): adaptor = self.input.treeAdaptor return adaptor.getType(self.node) else: return self.c unexpectedType = property(getUnexpectedType) class MismatchedTokenException(RecognitionException): """@brief A mismatched char or Token or tree node.""" def __init__(self, expecting, input): super().__init__(input) self.expecting = expecting def __str__(self): return "MismatchedTokenException({!r}!={!r})".format( self.getUnexpectedType(), self.expecting ) __repr__ = __str__ class UnwantedTokenException(MismatchedTokenException): """An extra token while parsing a TokenStream""" def getUnexpectedToken(self): return self.token def __str__(self): exp = ", expected {}".format(self.expecting) if self.expecting == INVALID_TOKEN_TYPE: exp = "" if not self.token: return "UnwantedTokenException(found={}{})".format(None, exp) return "UnwantedTokenException(found={}{})".format(self.token.text, exp) __repr__ = __str__ class MissingTokenException(MismatchedTokenException): """ We were expecting a token but it's not found. The current token is actually what we wanted next. """ def __init__(self, expecting, input, inserted): super().__init__(expecting, input) self.inserted = inserted def getMissingType(self): return self.expecting def __str__(self): if self.token: if self.inserted: return "MissingTokenException(inserted {!r} at {!r})".format( self.inserted, self.token.text) return "MissingTokenException(at {!r})".format(self.token.text) return "MissingTokenException" __repr__ = __str__ class MismatchedRangeException(RecognitionException): """@brief The next token does not match a range of expected types.""" def __init__(self, a, b, input): super().__init__(input) self.a = a self.b = b def __str__(self): return "MismatchedRangeException({!r} not in [{!r}..{!r}])".format( self.getUnexpectedType(), self.a, self.b ) __repr__ = __str__ class MismatchedSetException(RecognitionException): """@brief The next token does not match a set of expected types.""" def __init__(self, expecting, input): super().__init__(input) self.expecting = expecting def __str__(self): return "MismatchedSetException({!r} not in {!r})".format( self.getUnexpectedType(), self.expecting ) __repr__ = __str__ class MismatchedNotSetException(MismatchedSetException): """@brief Used for remote debugger deserialization""" def __str__(self): return "MismatchedNotSetException({!r}!={!r})".format( self.getUnexpectedType(), self.expecting ) __repr__ = __str__ class NoViableAltException(RecognitionException): """@brief Unable to decide which alternative to choose.""" def __init__( self, grammarDecisionDescription, decisionNumber, stateNumber, input ): super().__init__(input) self.grammarDecisionDescription = grammarDecisionDescription self.decisionNumber = decisionNumber self.stateNumber = stateNumber def __str__(self): return "NoViableAltException({!r}!=[{!r}])".format( self.unexpectedType, self.grammarDecisionDescription ) __repr__ = __str__ class EarlyExitException(RecognitionException): """@brief The recognizer did not match anything for a (..)+ loop.""" def __init__(self, decisionNumber, input): super().__init__(input) self.decisionNumber = decisionNumber class FailedPredicateException(RecognitionException): """@brief A semantic predicate failed during validation. Validation of predicates occurs when normally parsing the alternative just like matching a token. 
Disambiguating predicate evaluation occurs when we hoist a predicate into a prediction decision. """ def __init__(self, input, ruleName, predicateText): super().__init__(input) self.ruleName = ruleName self.predicateText = predicateText def __str__(self): return "FailedPredicateException({},{{{}}}?)".format( self.ruleName, self.predicateText) __repr__ = __str__ class MismatchedTreeNodeException(RecognitionException): """@brief The next tree mode does not match the expected type.""" def __init__(self, expecting, input): super().__init__(input) self.expecting = expecting def __str__(self): return "MismatchedTreeNodeException({!r}!={!r})".format( self.getUnexpectedType(), self.expecting ) __repr__ = __str__ python3-antlr3-3.5.2/antlr3/main.py000066400000000000000000000164551324200532700170400ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
# # end[licence] import sys import argparse from .streams import ANTLRStringStream, ANTLRFileStream, \ ANTLRInputStream, CommonTokenStream from .tree import CommonTreeNodeStream class _Main(object): def __init__(self): self.stdin = sys.stdin self.stdout = sys.stdout self.stderr = sys.stderr def parseArgs(self, argv): argParser = argparse.ArgumentParser() argParser.add_argument("--input") argParser.add_argument("--interactive", "-i", action="store_true") argParser.add_argument("--no-output", action="store_true") argParser.add_argument("--profile", action="store_true") argParser.add_argument("--hotshot", action="store_true") argParser.add_argument("--port", type=int) argParser.add_argument("--debug-socket", action='store_true') argParser.add_argument("file", nargs='?') self.setupArgs(argParser) return argParser.parse_args(argv[1:]) def setupArgs(self, argParser): pass def execute(self, argv): args = self.parseArgs(argv) self.setUp(args) if args.interactive: while True: try: input_str = input(">>> ") except (EOFError, KeyboardInterrupt): self.stdout.write("\nBye.\n") break inStream = ANTLRStringStream(input_str) self.parseStream(args, inStream) else: if args.input: inStream = ANTLRStringStream(args.input) elif args.file and args.file != '-': inStream = ANTLRFileStream(args.file) else: inStream = ANTLRInputStream(self.stdin) if args.profile: try: import cProfile as profile except ImportError: import profile profile.runctx( 'self.parseStream(args, inStream)', globals(), locals(), 'profile.dat' ) import pstats stats = pstats.Stats('profile.dat') stats.strip_dirs() stats.sort_stats('time') stats.print_stats(100) elif args.hotshot: import hotshot profiler = hotshot.Profile('hotshot.dat') profiler.runctx( 'self.parseStream(args, inStream)', globals(), locals() ) else: self.parseStream(args, inStream) def setUp(self, args): pass def parseStream(self, args, inStream): raise NotImplementedError def write(self, args, text): if not args.no_output: self.stdout.write(text) def writeln(self, args, text): self.write(args, text + '\n') class LexerMain(_Main): def __init__(self, lexerClass): super().__init__() self.lexerClass = lexerClass def parseStream(self, args, inStream): lexer = self.lexerClass(inStream) for token in lexer: self.writeln(args, str(token)) class ParserMain(_Main): def __init__(self, lexerClassName, parserClass): super().__init__() self.lexerClassName = lexerClassName self.lexerClass = None self.parserClass = parserClass def setupArgs(self, argParser): argParser.add_argument("--lexer", dest="lexerClass", default=self.lexerClassName) argParser.add_argument("--rule", dest="parserRule") def setUp(self, args): lexerMod = __import__(args.lexerClass) self.lexerClass = getattr(lexerMod, args.lexerClass) def parseStream(self, args, inStream): kwargs = {} if args.port is not None: kwargs['port'] = args.port if args.debug_socket: kwargs['debug_socket'] = sys.stderr lexer = self.lexerClass(inStream) tokenStream = CommonTokenStream(lexer) parser = self.parserClass(tokenStream, **kwargs) result = getattr(parser, args.parserRule)() if result: if hasattr(result, 'tree') and result.tree: self.writeln(args, result.tree.toStringTree()) else: self.writeln(args, repr(result)) class WalkerMain(_Main): def __init__(self, walkerClass): super().__init__() self.lexerClass = None self.parserClass = None self.walkerClass = walkerClass def setupArgs(self, argParser): argParser.add_argument("--lexer", dest="lexerClass") argParser.add_argument("--parser", dest="parserClass") argParser.add_argument("--parser-rule", 
dest="parserRule") argParser.add_argument("--rule", dest="walkerRule") def setUp(self, args): lexerMod = __import__(args.lexerClass) self.lexerClass = getattr(lexerMod, args.lexerClass) parserMod = __import__(args.parserClass) self.parserClass = getattr(parserMod, args.parserClass) def parseStream(self, args, inStream): lexer = self.lexerClass(inStream) tokenStream = CommonTokenStream(lexer) parser = self.parserClass(tokenStream) result = getattr(parser, args.parserRule)() if result: assert hasattr(result, 'tree'), "Parser did not return an AST" nodeStream = CommonTreeNodeStream(result.tree) nodeStream.setTokenStream(tokenStream) walker = self.walkerClass(nodeStream) result = getattr(walker, args.walkerRule)() if result: if hasattr(result, 'tree'): self.writeln(args, result.tree.toStringTree()) else: self.writeln(args, repr(result)) python3-antlr3-3.5.2/antlr3/recognizers.py000066400000000000000000001415001324200532700204340ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] import sys import inspect from .constants import compatible_api_versions, DEFAULT_CHANNEL, \ HIDDEN_CHANNEL, EOF, EOR_TOKEN_TYPE, INVALID_TOKEN_TYPE from .exceptions import RecognitionException, MismatchedTokenException, \ MismatchedRangeException, MismatchedTreeNodeException, \ NoViableAltException, EarlyExitException, MismatchedSetException, \ MismatchedNotSetException, FailedPredicateException, \ BacktrackingFailed, UnwantedTokenException, MissingTokenException from .tokens import CommonToken, SKIP_TOKEN class RecognizerSharedState(object): """ The set of fields needed by an abstract recognizer to recognize input and recover from errors etc... As a separate state object, it can be shared among multiple grammars; e.g., when one grammar imports another. These fields are publically visible but the actual state pointer per parser is protected. """ def __init__(self): # Track the set of token types that can follow any rule invocation. # Stack grows upwards. 
self.following = [] # This is true when we see an error and before having successfully # matched a token. Prevents generation of more than one error message # per error. self.errorRecovery = False # The index into the input stream where the last error occurred. # This is used to prevent infinite loops where an error is found # but no token is consumed during recovery...another error is found, # ad naseum. This is a failsafe mechanism to guarantee that at least # one token/tree node is consumed for two errors. self.lastErrorIndex = -1 # If 0, no backtracking is going on. Safe to exec actions etc... # If >0 then it's the level of backtracking. self.backtracking = 0 # An array[size num rules] of (int -> int) dicts that tracks # the stop token index for each rule. ruleMemo[ruleIndex] is # the memoization table for ruleIndex. For key ruleStartIndex, you # get back the stop token for associated rule or MEMO_RULE_FAILED. # # This is only used if rule memoization is on (which it is by default). self.ruleMemo = None ## Did the recognizer encounter a syntax error? Track how many. self.syntaxErrors = 0 # LEXER FIELDS (must be in same state object to avoid casting # constantly in generated code and Lexer object) :( ## The goal of all lexer rules/methods is to create a token object. # This is an instance variable as multiple rules may collaborate to # create a single token. nextToken will return this object after # matching lexer rule(s). If you subclass to allow multiple token # emissions, then set this to the last token to be matched or # something nonnull so that the auto token emit mechanism will not # emit another token. self.token = None ## What character index in the stream did the current token start at? # Needed, for example, to get the text for current token. Set at # the start of nextToken. self.tokenStartCharIndex = -1 ## The line on which the first character of the token resides self.tokenStartLine = None ## The character position of first character within the line self.tokenStartCharPositionInLine = None ## The channel number for the current token self.channel = None ## The token type for the current token self.type = None ## You can set the text for the current token to override what is in # the input char buffer. Use setText() or can set this instance var. self.text = None class BaseRecognizer(object): """ @brief Common recognizer functionality. A generic recognizer that can handle recognizers generated from lexer, parser, and tree grammars. This is all the parsing support code essentially; most of it is error recovery stuff and backtracking. """ MEMO_RULE_FAILED = -2 MEMO_RULE_UNKNOWN = -1 # copies from Token object for convenience in actions DEFAULT_TOKEN_CHANNEL = DEFAULT_CHANNEL # for convenience in actions HIDDEN = HIDDEN_CHANNEL # overridden by generated subclasses grammarFileName = None tokenNames = None # The api_version attribute has been introduced in 3.3. If it is not # overwritten in the generated recognizer, we assume a default of v0. api_version = 0 def __init__(self, state=None): # Input stream of the recognizer. Must be initialized by a subclass. self.input = None ## State of a lexer, parser, or tree parser are collected into a state # object so the state can be shared. This sharing is needed to # have one grammar import others and share same error variables # and other state variables. It's a kind of explicit multiple # inheritance via delegation of methods and shared state. 
if state is None: state = RecognizerSharedState() self._state = state if self.api_version not in compatible_api_versions: raise RuntimeError( "ANTLR version mismatch: " "The recognizer has been generated with API V{}, " "but this runtime does not support this." .format(self.api_version)) # this one only exists to shut up pylint :( def setInput(self, input): self.input = input def reset(self): """ reset the parser's state; subclasses must rewind the input stream """ # wack everything related to error recovery if self._state is None: # no shared state work to do return self._state.following = [] self._state.errorRecovery = False self._state.lastErrorIndex = -1 self._state.syntaxErrors = 0 # wack everything related to backtracking and memoization self._state.backtracking = 0 if self._state.ruleMemo is not None: self._state.ruleMemo = {} def match(self, input, ttype, follow): """ Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException. To turn off single token insertion or deletion error recovery, override recoverFromMismatchedToken() and have it throw an exception. See TreeParser.recoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from rule. Rule would recover by resynchronizing to the set of symbols that can follow rule ref. """ matchedSymbol = self.getCurrentInputSymbol(input) if self.input.LA(1) == ttype: self.input.consume() self._state.errorRecovery = False return matchedSymbol if self._state.backtracking > 0: # FIXME: need to return matchedSymbol here as well. damn!! raise BacktrackingFailed matchedSymbol = self.recoverFromMismatchedToken(input, ttype, follow) return matchedSymbol def matchAny(self): """Match the wildcard: in a symbol""" self._state.errorRecovery = False self.input.consume() def mismatchIsUnwantedToken(self, input, ttype): return input.LA(2) == ttype def mismatchIsMissingToken(self, input, follow): if follow is None: # we have no information about the follow; we can only consume # a single token and hope for the best return False # compute what can follow this grammar element reference if EOR_TOKEN_TYPE in follow: viableTokensFollowingThisRule = self.computeContextSensitiveRuleFOLLOW() follow |= viableTokensFollowingThisRule if len(self._state.following) > 0: # remove EOR if we're not the start symbol follow -= {EOR_TOKEN_TYPE} # if current token is consistent with what could come after set # then we know we're missing a token; error recovery is free to # "insert" the missing token if input.LA(1) in follow or EOR_TOKEN_TYPE in follow: return True return False def reportError(self, e): """Report a recognition problem. This method sets errorRecovery to indicate the parser is recovering not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go: 1. error occurs 2. enter recovery mode, report error 3. consume until token found in resync set 4. try to resume parsing 5. next match() will reset errorRecovery mode If you override, make sure to update syntaxErrors if you care about that. """ # if we've already reported an error and have not matched a token # yet successfully, don't report any errors. 
if self._state.errorRecovery: return self._state.syntaxErrors += 1 # don't count spurious self._state.errorRecovery = True self.displayRecognitionError(e) def displayRecognitionError(self, e): hdr = self.getErrorHeader(e) msg = self.getErrorMessage(e) self.emitErrorMessage(hdr + " " + msg) def getErrorMessage(self, e): """ What error message should be generated for the various exception types? Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclassing exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me--it makes the most sense in terms of flexibility. For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state etc... Override this to change the message generated for one or more exception types. """ if isinstance(e, UnwantedTokenException): if e.expecting == EOF: tokenName = "EOF" else: tokenName = self.tokenNames[e.expecting] msg = "extraneous input {} expecting {}".format( self.getTokenErrorDisplay(e.getUnexpectedToken()), tokenName ) elif isinstance(e, MissingTokenException): if e.expecting == EOF: tokenName = "EOF" else: tokenName = self.tokenNames[e.expecting] msg = "missing {} at {}".format( tokenName, self.getTokenErrorDisplay(e.token) ) elif isinstance(e, MismatchedTokenException): if e.expecting == EOF: tokenName = "EOF" else: tokenName = self.tokenNames[e.expecting] msg = "mismatched input {} expecting {}".format( self.getTokenErrorDisplay(e.token), tokenName ) elif isinstance(e, MismatchedTreeNodeException): if e.expecting == EOF: tokenName = "EOF" else: tokenName = self.tokenNames[e.expecting] msg = "mismatched tree node: {} expecting {}".format( e.node, tokenName) elif isinstance(e, NoViableAltException): msg = "no viable alternative at input {}".format( self.getTokenErrorDisplay(e.token)) elif isinstance(e, EarlyExitException): msg = "required (...)+ loop did not match anything at input {}".format( self.getTokenErrorDisplay(e.token)) elif isinstance(e, MismatchedSetException): msg = "mismatched input {} expecting set {!r}".format( self.getTokenErrorDisplay(e.token), e.expecting ) elif isinstance(e, MismatchedNotSetException): msg = "mismatched input {} expecting set {!r}".format( self.getTokenErrorDisplay(e.token), e.expecting ) elif isinstance(e, FailedPredicateException): msg = "rule {} failed predicate: {{{}}}?".format( e.ruleName, e.predicateText ) else: msg = str(e) return msg def getNumberOfSyntaxErrors(self): """ Get number of recognition errors (lexer, parser, tree parser). Each recognizer tracks its own number. So parser and lexer each have separate count. Does not count the spurious errors found between an error and next valid token match. See also reportError(). """ return self._state.syntaxErrors def getErrorHeader(self, e): """ What is the error header, normally line/character position information? 
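        The default implementation produces, for example, "line 3:10", or
        "t.g line 3:10" when the input has a known source name (both values
        here are purely illustrative). displayRecognitionError() glues this
        header onto the message from getErrorMessage(), so an application
        that wants to collect diagnostics instead of printing them to stderr
        only needs to override emitErrorMessage(); a rough sketch, where
        MyParser stands in for some generated parser class:

            class CollectingParser(MyParser):
                def __init__(self, *args, **kwargs):
                    super().__init__(*args, **kwargs)
                    self.error_messages = []

                def emitErrorMessage(self, msg):
                    # keep the formatted "header message" string around
                    self.error_messages.append(msg)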
""" source_name = self.getSourceName() if source_name is not None: return "{} line {}:{}".format(source_name, e.line, e.charPositionInLine) return "line {}:{}".format(e.line, e.charPositionInLine) def getTokenErrorDisplay(self, t): """ How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.toString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new Java type. """ s = t.text if s is None: if t.type == EOF: s = "" else: s = "<{}>".format(t.typeName) return repr(s) def emitErrorMessage(self, msg): """Override this method to change where error messages go""" sys.stderr.write(msg + '\n') def recover(self, input, re): """ Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions but there could be a mismatched token that the match() routine could not recover from. """ # PROBLEM? what if input stream is not the same as last time # perhaps make lastErrorIndex a member of input if self._state.lastErrorIndex == input.index(): # uh oh, another error at same token index; must be a case # where LT(1) is in the recovery token set so nothing is # consumed; consume a single token so at least to prevent # an infinite loop; this is a failsafe. input.consume() self._state.lastErrorIndex = input.index() followSet = self.computeErrorRecoverySet() self.beginResync() self.consumeUntil(input, followSet) self.endResync() def beginResync(self): """ A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listenter. """ pass def endResync(self): """ A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listenter. """ pass def computeErrorRecoverySet(self): """ Compute the error recovery set for the current rule. During rule invocation, the parser pushes the set of tokens that can follow that rule reference on the stack; this amounts to computing FIRST of what follows the rule reference in the enclosing rule. This local follow set only includes tokens from within the rule; i.e., the FIRST computation done by ANTLR stops at the end of a rule. EXAMPLE When you find a "no viable alt exception", the input is not consistent with any of the alternatives for rule r. The best thing to do is to consume tokens until you see something that can legally follow a call to r *or* any rule that called r. You don't want the exact set of viable next tokens because the input might just be missing a token--you might consume the rest of the input looking for one of the missing tokens. Consider grammar: a : '[' b ']' | '(' b ')' ; b : c '^' INT ; c : ID | INT ; At each rule invocation, the set of tokens that could follow that rule is pushed on a stack. 
Here are the various "local" follow sets: FOLLOW(b1_in_a) = FIRST(']') = ']' FOLLOW(b2_in_a) = FIRST(')') = ')' FOLLOW(c_in_b) = FIRST('^') = '^' Upon erroneous input "[]", the call chain is a -> b -> c and, hence, the follow context stack is: depth local follow set after call to rule 0 \ a (from main()) 1 ']' b 3 '^' c Notice that ')' is not included, because b would have to have been called from a different context in rule a for ')' to be included. For error recovery, we cannot consider FOLLOW(c) (context-sensitive or otherwise). We need the combined set of all context-sensitive FOLLOW sets--the set of all tokens that could follow any reference in the call chain. We need to resync to one of those tokens. Note that FOLLOW(c)='^' and if we resync'd to that token, we'd consume until EOF. We need to sync to context-sensitive FOLLOWs for a, b, and c: {']','^'}. In this case, for input "[]", LA(1) is in this set so we would not consume anything and after printing an error rule c would return normally. It would not find the required '^' though. At this point, it gets a mismatched token error and throws an exception (since LA(1) is not in the viable following token set). The rule exception handler tries to recover, but finds the same recovery set and doesn't consume anything. Rule b exits normally returning to rule a. Now it finds the ']' (and with the successful match exits errorRecovery mode). So, you cna see that the parser walks up call chain looking for the token that was a member of the recovery set. Errors are not generated in errorRecovery mode. ANTLR's error recovery mechanism is based upon original ideas: "Algorithms + Data Structures = Programs" by Niklaus Wirth and "A note on error recovery in recursive descent parsers": http://portal.acm.org/citation.cfm?id=947902.947905 Later, Josef Grosch had some good ideas: "Efficient and Comfortable Error Recovery in Recursive Descent Parsers": ftp://www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip Like Grosch I implemented local FOLLOW sets that are combined at run-time upon error to avoid overhead during parsing. """ return self.combineFollows(False) def computeContextSensitiveRuleFOLLOW(self): """ Compute the context-sensitive FOLLOW set for current rule. This is set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r: FOLLOW(r)={x | S=>*alpha r beta in G and x in FIRST(beta)} where x in T* and alpha, beta in V*; T is set of terminals and V is the set of terminals and nonterminals. In other words, FOLLOW(r) is the set of all tokens that can possibly follow references to r in *any* sentential form (context). At runtime, however, we know precisely which context applies as we have the call chain. We may compute the exact (rather than covering superset) set of following tokens. For example, consider grammar: stat : ID '=' expr ';' // FOLLOW(stat)=={EOF} | "return" expr '.' ; expr : atom ('+' atom)* ; // FOLLOW(expr)=={';','.',')'} atom : INT // FOLLOW(atom)=={'+',')',';','.'} | '(' expr ')' ; The FOLLOW sets are all inclusive whereas context-sensitive FOLLOW sets are precisely what could follow a rule reference. 
For input "i=(3);", here is the derivation: stat => ID '=' expr ';' => ID '=' atom ('+' atom)* ';' => ID '=' '(' expr ')' ('+' atom)* ';' => ID '=' '(' atom ')' ('+' atom)* ';' => ID '=' '(' INT ')' ('+' atom)* ';' => ID '=' '(' INT ')' ';' At the "3" token, you'd have a call chain of stat -> expr -> atom -> expr -> atom What can follow that specific nested ref to atom? Exactly ')' as you can see by looking at the derivation of this specific input. Contrast this with the FOLLOW(atom)={'+',')',';','.'}. You want the exact viable token set when recovering from a token mismatch. Upon token mismatch, if LA(1) is member of the viable next token set, then you know there is most likely a missing token in the input stream. "Insert" one by just not throwing an exception. """ return self.combineFollows(True) def combineFollows(self, exact): followSet = set() for idx, localFollowSet in reversed(list(enumerate(self._state.following))): followSet |= localFollowSet if exact: # can we see end of rule? if EOR_TOKEN_TYPE in localFollowSet: # Only leave EOR in set if at top (start rule); this lets # us know if have to include follow(start rule); i.e., EOF if idx > 0: followSet.remove(EOR_TOKEN_TYPE) else: # can't see end of rule, quit break return followSet def recoverFromMismatchedToken(self, input, ttype, follow): """Attempt to recover from a single missing or extra token. EXTRA TOKEN LA(1) is not what we are looking for. If LA(2) has the right token, however, then assume LA(1) is some extra spurious token. Delete it and LA(2) as if we were doing a normal match(), which advances the input. MISSING TOKEN If current token is consistent with what could come after ttype then it is ok to 'insert' the missing token, else throw exception For example, Input 'i=(3;' is clearly missing the ')'. When the parser returns from the nested call to expr, it will have call chain: stat -> expr -> atom and it will be trying to match the ')' at this point in the derivation: => ID '=' '(' INT ')' ('+' atom)* ';' ^ match() will see that ';' doesn't match ')' and report a mismatched token error. To recover, it sees that LA(1)==';' is in the set of tokens that can follow the ')' token reference in rule atom. It can assume that you forgot the ')'. 
""" e = None # if next token is what we are looking for then "delete" this token if self.mismatchIsUnwantedToken(input, ttype): e = UnwantedTokenException(ttype, input) self.beginResync() input.consume() # simply delete extra token self.endResync() # report after consuming so AW sees the token in the exception self.reportError(e) # we want to return the token we're actually matching matchedSymbol = self.getCurrentInputSymbol(input) # move past ttype token as if all were ok input.consume() return matchedSymbol # can't recover with single token deletion, try insertion if self.mismatchIsMissingToken(input, follow): inserted = self.getMissingSymbol(input, e, ttype, follow) e = MissingTokenException(ttype, input, inserted) # report after inserting so AW sees the token in the exception self.reportError(e) return inserted # even that didn't work; must throw the exception e = MismatchedTokenException(ttype, input) raise e def recoverFromMismatchedSet(self, input, e, follow): """Not currently used""" if self.mismatchIsMissingToken(input, follow): self.reportError(e) # we don't know how to conjure up a token for sets yet return self.getMissingSymbol(input, e, INVALID_TOKEN_TYPE, follow) # TODO do single token deletion like above for Token mismatch raise e def getCurrentInputSymbol(self, input): """ Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is. This is ignored for lexers. """ return None def getMissingSymbol(self, input, e, expectedTokenType, follow): """Conjure up a missing token during error recovery. The recognizer attempts to recover from single missing symbols. But, actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens. """ return None def consumeUntil(self, input, tokenTypes): """ Consume tokens until one matches the given token or token set tokenTypes can be a single token type or a set of token types """ if not isinstance(tokenTypes, (set, frozenset)): tokenTypes = frozenset([tokenTypes]) ttype = input.LA(1) while ttype != EOF and ttype not in tokenTypes: input.consume() ttype = input.LA(1) def getRuleInvocationStack(self): """ Return List of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser java code a rule is invoked. This is very useful for error messages and for context-sensitive error recovery. You must be careful, if you subclass a generated recognizers. 
The default implementation will only search the module of self for rules, but the subclass will not contain any rules. You probably want to override this method to look like def getRuleInvocationStack(self): return self._getRuleInvocationStack(.__module__) where is the class of the generated recognizer, e.g. the superclass of self. """ return self._getRuleInvocationStack(self.__module__) @classmethod def _getRuleInvocationStack(cls, module): """ A more general version of getRuleInvocationStack where you can pass in, for example, a RecognitionException to get it's rule stack trace. This routine is shared with all recognizers, hence, static. TODO: move to a utility class or something; weird having lexer call this """ # mmmhhh,... perhaps look at the first argument # (f_locals[co_varnames[0]]?) and test if it's a (sub)class of # requested recognizer... rules = [] for frame in reversed(inspect.stack()): code = frame[0].f_code codeMod = inspect.getmodule(code) if codeMod is None: continue # skip frames not in requested module if codeMod.__name__ != module: continue # skip some unwanted names if code.co_name in ('nextToken', ''): continue rules.append(code.co_name) return rules def getBacktrackingLevel(self): return self._state.backtracking def setBacktrackingLevel(self, n): self._state.backtracking = n def getGrammarFileName(self): """For debugging and other purposes, might want the grammar name. Have ANTLR generate an implementation for this method. """ return self.grammarFileName def getSourceName(self): raise NotImplementedError def toStrings(self, tokens): """A convenience method for use most often with template rewrites. Convert a Token list to a str list. """ if tokens is None: return None return [token.text for token in tokens] def getRuleMemoization(self, ruleIndex, ruleStartIndex): """ Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing. It returns the index of the last token matched by the rule. """ if ruleIndex not in self._state.ruleMemo: self._state.ruleMemo[ruleIndex] = {} return self._state.ruleMemo[ruleIndex].get( ruleStartIndex, self.MEMO_RULE_UNKNOWN ) def alreadyParsedRule(self, input, ruleIndex): """ Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED. This method has a side-effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time. """ stopIndex = self.getRuleMemoization(ruleIndex, input.index()) if stopIndex == self.MEMO_RULE_UNKNOWN: return False if stopIndex == self.MEMO_RULE_FAILED: raise BacktrackingFailed else: input.seek(stopIndex + 1) return True def memoize(self, input, ruleIndex, ruleStartIndex, success): """ Record whether or not this rule parsed the input at this position successfully. 
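        For illustration only (the rule and token indices below are
        invented), the memoization table ends up shaped like this, which is
        what getRuleMemoization() and alreadyParsedRule() read back:

            self._state.ruleMemo = {
                # rule 4, started at token 0, stopped at token 7
                # rule 4, started at token 9, failed
                4: {0: 7,
                    9: self.MEMO_RULE_FAILED},
            }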
""" if success: stopTokenIndex = input.index() - 1 else: stopTokenIndex = self.MEMO_RULE_FAILED if ruleIndex in self._state.ruleMemo: self._state.ruleMemo[ruleIndex][ruleStartIndex] = stopTokenIndex def traceIn(self, ruleName, ruleIndex, inputSymbol): sys.stdout.write("enter {} {}".format(ruleName, inputSymbol)) if self._state.backtracking > 0: sys.stdout.write(" backtracking={}".format(self._state.backtracking)) sys.stdout.write('\n') def traceOut(self, ruleName, ruleIndex, inputSymbol): sys.stdout.write("exit {} {}".format(ruleName, inputSymbol)) if self._state.backtracking > 0: sys.stdout.write(" backtracking={}".format(self._state.backtracking)) # mmmm... we use BacktrackingFailed exceptions now. So how could we # get that information here? #if self._state.failed: # sys.stdout.write(" failed") #else: # sys.stdout.write(" succeeded") sys.stdout.write('\n') class TokenSource(object): """ @brief Abstract baseclass for token producers. A source of tokens must provide a sequence of tokens via nextToken() and also must reveal it's source of characters; CommonToken's text is computed from a CharStream; it only store indices into the char stream. Errors from the lexer are never passed to the parser. Either you want to keep going or you do not upon token recognition error. If you do not want to continue lexing then you do not want to continue parsing. Just throw an exception not under RecognitionException and Java will naturally toss you all the way out of the recognizers. If you want to continue lexing then you should not throw an exception to the parser--it has already requested a token. Keep lexing until you get a valid one. Just report errors and keep going, looking for a valid token. """ def nextToken(self): """Return a Token object from your input stream (usually a CharStream). Do not fail/return upon lexing error; keep chewing on the characters until you get a good one; errors are not passed through to the parser. """ raise NotImplementedError def __iter__(self): """The TokenSource is an interator. The iteration will not include the final EOF token, see also the note for the __next__() method. """ return self def __next__(self): """Return next token or raise StopIteration. Note that this will raise StopIteration when hitting the EOF token, so EOF will not be part of the iteration. """ token = self.nextToken() if token is None or token.type == EOF: raise StopIteration return token class Lexer(BaseRecognizer, TokenSource): """ @brief Baseclass for generated lexer classes. A lexer is recognizer that draws input symbols from a character stream. lexer grammars result in a subclass of this object. A Lexer object uses simplified match() and error recovery mechanisms in the interest of speed. """ def __init__(self, input, state=None): BaseRecognizer.__init__(self, state) TokenSource.__init__(self) # Where is the lexer drawing characters from? 
self.input = input def reset(self): super().reset() # reset all recognizer state variables if self.input is not None: # rewind the input self.input.seek(0) if self._state is None: # no shared state work to do return # wack Lexer state variables self._state.token = None self._state.type = INVALID_TOKEN_TYPE self._state.channel = DEFAULT_CHANNEL self._state.tokenStartCharIndex = -1 self._state.tokenStartLine = -1 self._state.tokenStartCharPositionInLine = -1 self._state.text = None def makeEOFToken(self): eof = CommonToken( type=EOF, channel=DEFAULT_CHANNEL, input=self.input, start=self.input.index(), stop=self.input.index()) eof.line = self.input.line eof.charPositionInLine = self.input.charPositionInLine return eof def nextToken(self): """ Return a token from this source; i.e., match a token on the char stream. """ while 1: self._state.token = None self._state.channel = DEFAULT_CHANNEL self._state.tokenStartCharIndex = self.input.index() self._state.tokenStartCharPositionInLine = self.input.charPositionInLine self._state.tokenStartLine = self.input.line self._state.text = None if self.input.LA(1) == EOF: return self.makeEOFToken() try: self.mTokens() if self._state.token is None: self.emit() elif self._state.token == SKIP_TOKEN: continue return self._state.token except NoViableAltException as re: self.reportError(re) self.recover(re) # throw out current char and try again except RecognitionException as re: self.reportError(re) # match() routine has already called recover() def skip(self): """ Instruct the lexer to skip creating a token for current lexer rule and look for another token. nextToken() knows to keep looking when a lexer rule finishes with token set to SKIP_TOKEN. Recall that if token==null at end of any token rule, it creates one for you and emits it. """ self._state.token = SKIP_TOKEN def mTokens(self): """This is the lexer entry point that sets instance var 'token'""" # abstract method raise NotImplementedError def setCharStream(self, input): """Set the char stream and reset the lexer""" self.input = None self.reset() self.input = input def getSourceName(self): return self.input.getSourceName() def emit(self, token=None): """ The standard method called to automatically emit a token at the outermost lexical rule. The token object should point into the char buffer start..stop. If there is a text override in 'text', use that to set the token's text. Override this method to emit custom Token objects. If you are building trees, then you should also override Parser or TreeParser.getMissingSymbol(). 
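        A hedged sketch of such an override (MyLexer names some generated
        lexer class; the upper-casing is just an arbitrary example of
        decorating the freshly emitted token):

            class ShoutingLexer(MyLexer):
                def emit(self, token=None):
                    token = super().emit(token)
                    if token.text is not None:
                        token.text = token.text.upper()
                    return token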
""" if token is None: token = CommonToken( input=self.input, type=self._state.type, channel=self._state.channel, start=self._state.tokenStartCharIndex, stop=self.getCharIndex()-1 ) token.line = self._state.tokenStartLine token.text = self._state.text token.charPositionInLine = self._state.tokenStartCharPositionInLine self._state.token = token return token def match(self, s): if isinstance(s, str): for c in s: if self.input.LA(1) != ord(c): if self._state.backtracking > 0: raise BacktrackingFailed mte = MismatchedTokenException(c, self.input) self.recover(mte) raise mte self.input.consume() else: if self.input.LA(1) != s: if self._state.backtracking > 0: raise BacktrackingFailed mte = MismatchedTokenException(chr(s), self.input) self.recover(mte) # don't really recover; just consume in lexer raise mte self.input.consume() def matchAny(self): self.input.consume() def matchRange(self, a, b): if self.input.LA(1) < a or self.input.LA(1) > b: if self._state.backtracking > 0: raise BacktrackingFailed mre = MismatchedRangeException(chr(a), chr(b), self.input) self.recover(mre) raise mre self.input.consume() def getLine(self): return self.input.line def getCharPositionInLine(self): return self.input.charPositionInLine def getCharIndex(self): """What is the index of the current character of lookahead?""" return self.input.index() def getText(self): """ Return the text matched so far for the current token or any text override. """ if self._state.text is not None: return self._state.text return self.input.substring( self._state.tokenStartCharIndex, self.getCharIndex()-1 ) def setText(self, text): """ Set the complete text of this token; it wipes any previous changes to the text. """ self._state.text = text text = property(getText, setText) def reportError(self, e): ## TODO: not thought about recovery in lexer yet. ## # if we've already reported an error and have not matched a token ## # yet successfully, don't report any errors. ## if self.errorRecovery: ## return ## ## self.errorRecovery = True self.displayRecognitionError(e) def getErrorMessage(self, e): msg = None if isinstance(e, MismatchedTokenException): msg = "mismatched character {} expecting {}".format( self.getCharErrorDisplay(e.c), self.getCharErrorDisplay(e.expecting)) elif isinstance(e, NoViableAltException): msg = "no viable alternative at character {}".format( self.getCharErrorDisplay(e.c)) elif isinstance(e, EarlyExitException): msg = "required (...)+ loop did not match anything at character {}".format( self.getCharErrorDisplay(e.c)) elif isinstance(e, MismatchedNotSetException): msg = "mismatched character {} expecting set {!r}".format( self.getCharErrorDisplay(e.c), e.expecting) elif isinstance(e, MismatchedSetException): msg = "mismatched character {} expecting set {!r}".format( self.getCharErrorDisplay(e.c), e.expecting) elif isinstance(e, MismatchedRangeException): msg = "mismatched character {} expecting set {}..{}".format( self.getCharErrorDisplay(e.c), self.getCharErrorDisplay(e.a), self.getCharErrorDisplay(e.b)) else: msg = super().getErrorMessage(e) return msg def getCharErrorDisplay(self, c): if c == EOF: c = '' return repr(c) def recover(self, re): """ Lexers can normally match any char in it's vocabulary after matching a token, so do the easy thing and just kill a character and hope it all works out. You can instead use the rule invocation stack to do sophisticated error recovery if you are in a fragment rule. 
""" self.input.consume() def traceIn(self, ruleName, ruleIndex): inputSymbol = "{} line={}:{}".format(self.input.LT(1), self.getLine(), self.getCharPositionInLine() ) super().traceIn(ruleName, ruleIndex, inputSymbol) def traceOut(self, ruleName, ruleIndex): inputSymbol = "{} line={}:{}".format(self.input.LT(1), self.getLine(), self.getCharPositionInLine() ) super().traceOut(ruleName, ruleIndex, inputSymbol) class Parser(BaseRecognizer): """ @brief Baseclass for generated parser classes. """ def __init__(self, lexer, state=None): super().__init__(state) self.input = lexer def reset(self): super().reset() # reset all recognizer state variables if self.input is not None: self.input.seek(0) # rewind the input def getCurrentInputSymbol(self, input): return input.LT(1) def getMissingSymbol(self, input, e, expectedTokenType, follow): if expectedTokenType == EOF: tokenText = "" else: tokenText = "".format(self.tokenNames[expectedTokenType]) t = CommonToken(type=expectedTokenType, text=tokenText) current = input.LT(1) if current.type == EOF: current = input.LT(-1) if current is not None: t.line = current.line t.charPositionInLine = current.charPositionInLine t.channel = DEFAULT_CHANNEL return t def setTokenStream(self, input): """Set the token stream and reset the parser""" self.input = None self.reset() self.input = input def getTokenStream(self): return self.input def getSourceName(self): return self.input.getSourceName() def traceIn(self, ruleName, ruleIndex): super().traceIn(ruleName, ruleIndex, self.input.LT(1)) def traceOut(self, ruleName, ruleIndex): super().traceOut(ruleName, ruleIndex, self.input.LT(1)) class RuleReturnScope(object): """ Rules can return start/stop info as well as possible trees and templates. """ def getStart(self): """Return the start token or tree.""" return None def getStop(self): """Return the stop token or tree.""" return None def getTree(self): """Has a value potentially if output=AST.""" return None def getTemplate(self): """Has a value potentially if output=template.""" return None class ParserRuleReturnScope(RuleReturnScope): """ Rules that return more than a single value must return an object containing all the values. Besides the properties defined in RuleLabelScope.predefinedRulePropertiesScope there may be user-defined return values. This class simply defines the minimum properties that are always defined and methods to access the others that might be available depending on output option such as template and tree. Note text is not an actual property of the return value, it is computed from start and stop using the input stream's toString() method. I could add a ctor to this so that we can pass in and store the input stream, but I'm not sure we want to do that. It would seem to be undefined to get the .text property anyway if the rule matches tokens from multiple input streams. I do not use getters for fields of objects that are used simply to group values such as this aggregate. The getters/setters are there to satisfy the superclass interface. """ def __init__(self): super().__init__() self.start = None self.stop = None self.tree = None # only used when output=AST def getStart(self): return self.start def getStop(self): return self.stop def getTree(self): return self.tree python3-antlr3-3.5.2/antlr3/streams.py000066400000000000000000001255221324200532700175660ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] from io import StringIO from .constants import DEFAULT_CHANNEL, EOF from .tokens import Token ############################################################################ # # basic interfaces # IntStream # +- CharStream # \- TokenStream # # subclasses must implemented all methods # ############################################################################ class IntStream(object): """ @brief Base interface for streams of integer values. A simple stream of integers used when all I care about is the char or token type sequence (such as interpretation). """ def consume(self): raise NotImplementedError def LA(self, i): """Get int at current input pointer + i ahead where i=1 is next int. Negative indexes are allowed. LA(-1) is previous token (token just matched). LA(-i) where i is before first token should yield -1, invalid char / EOF. """ raise NotImplementedError def mark(self): """ Tell the stream to start buffering if it hasn't already. Return current input position, index(), or some other marker so that when passed to rewind() you get back to the same spot. rewind(mark()) should not affect the input cursor. The Lexer track line/col info as well as input index so its markers are not pure input indexes. Same for tree node streams. """ raise NotImplementedError def index(self): """ Return the current input symbol index 0..n where n indicates the last symbol has been read. The index is the symbol about to be read not the most recently read symbol. """ raise NotImplementedError def rewind(self, marker=None): """ Reset the stream so that next call to index would return marker. The marker will usually be index() but it doesn't have to be. It's just a marker to indicate what state the stream was in. This is essentially calling release() and seek(). If there are markers created after this marker argument, this routine must unroll them like a stack. Assume the state the stream was in when this marker was created. If marker is None: Rewind to the input position of the last marker. Used currently only after a cyclic DFA and just before starting a sem/syn predicate to get the input position back to the start of the decision. 
Do not "pop" the marker off the state. mark(i) and rewind(i) should balance still. It is like invoking rewind(last marker) but it should not "pop" the marker off. It's like seek(last marker's input position). """ raise NotImplementedError def release(self, marker=None): """ You may want to commit to a backtrack but don't want to force the stream to keep bookkeeping objects around for a marker that is no longer necessary. This will have the same behavior as rewind() except it releases resources without the backward seek. This must throw away resources for all markers back to the marker argument. So if you're nested 5 levels of mark(), and then release(2) you have to release resources for depths 2..5. """ raise NotImplementedError def seek(self, index): """ Set the input cursor to the position indicated by index. This is normally used to seek ahead in the input stream. No buffering is required to do this unless you know your stream will use seek to move backwards such as when backtracking. This is different from rewind in its multi-directional requirement and in that its argument is strictly an input cursor (index). For char streams, seeking forward must update the stream state such as line number. For seeking backwards, you will be presumably backtracking using the mark/rewind mechanism that restores state and so this method does not need to update state when seeking backwards. Currently, this method is only used for efficient backtracking using memoization, but in the future it may be used for incremental parsing. The index is 0..n-1. A seek to position i means that LA(1) will return the ith symbol. So, seeking to 0 means LA(1) will return the first element in the stream. """ raise NotImplementedError def size(self): """ Only makes sense for streams that buffer everything up probably, but might be useful to display the entire stream or for testing. This value includes a single EOF. """ raise NotImplementedError def getSourceName(self): """ Where are you getting symbols from? Normally, implementations will pass the buck all the way to the lexer who can ask its input stream for the file name or whatever. """ raise NotImplementedError class CharStream(IntStream): """ @brief A source of characters for an ANTLR lexer. This is an abstract class that must be implemented by a subclass. """ # pylint does not realize that this is an interface, too #pylint: disable-msg=W0223 EOF = -1 def __init__(self): # line number 1..n within the input self._line = 1 # The index of the character relative to the beginning of the # line 0..n-1 self._charPositionInLine = 0 def substring(self, start, stop): """ For infinite streams, you don't need this; primarily I'm providing a useful interface for action code. Just make sure actions don't use this on streams that don't support it. """ raise NotImplementedError def LT(self, i): """ Get the ith character of lookahead. This is the same usually as LA(i). This will be used for labels in the generated lexer code. I'd prefer to return a char here type-wise, but it's probably better to be 32-bit clean and be consistent with LA. 
""" raise NotImplementedError @property def line(self): """ANTLR tracks the line information automatically""" return self._line @line.setter def line(self, value): """ Because this stream can rewind, we need to be able to reset the line """ self._line = value @property def charPositionInLine(self): """ The index of the character relative to the beginning of the line 0..n-1 """ return self._charPositionInLine @charPositionInLine.setter def charPositionInLine(self, pos): self._charPositionInLine = pos class TokenStream(IntStream): """ @brief A stream of tokens accessing tokens from a TokenSource This is an abstract class that must be implemented by a subclass. """ # pylint does not realize that this is an interface, too #pylint: disable-msg=W0223 def LT(self, k): """ Get Token at current input pointer + i ahead where i=1 is next Token. i<0 indicates tokens in the past. So -1 is previous token and -2 is two tokens ago. LT(0) is undefined. For i>=n, return Token.EOFToken. Return null for LT(0) and any index that results in an absolute address that is negative. """ raise NotImplementedError def range(self): """ How far ahead has the stream been asked to look? The return value is a valid index from 0..n-1. """ raise NotImplementedError def get(self, i): """ Get a token at an absolute index i; 0..n-1. This is really only needed for profiling and debugging and token stream rewriting. If you don't want to buffer up tokens, then this method makes no sense for you. Naturally you can't use the rewrite stream feature. I believe DebugTokenStream can easily be altered to not use this method, removing the dependency. """ raise NotImplementedError def getTokenSource(self): """ Where is this stream pulling tokens from? This is not the name, but the object that provides Token objects. """ raise NotImplementedError def toString(self, start=None, stop=None): """ Return the text of all tokens from start to stop, inclusive. If the stream does not buffer all the tokens then it can just return "" or null; Users should not access $ruleLabel.text in an action of course in that case. Because the user is not required to use a token with an index stored in it, we must provide a means for two token objects themselves to indicate the start/end location. Most often this will just delegate to the other toString(int,int). This is also parallel with the TreeNodeStream.toString(Object,Object). """ raise NotImplementedError ############################################################################ # # character streams for use in lexers # CharStream # \- ANTLRStringStream # ############################################################################ class ANTLRStringStream(CharStream): """ @brief CharStream that pull data from a unicode string. A pretty quick CharStream that pulls all data from an array directly. Every method call counts in the lexer. """ def __init__(self, data): """ @param data This should be a unicode string holding the data you want to parse. If you pass in a byte string, the Lexer will choke on non-ascii data. """ super().__init__() # The data being scanned self.strdata = str(data) self.data = [ord(c) for c in self.strdata] # How many characters are actually in the buffer self.n = len(data) # 0..n-1 index into string of next char self.p = 0 # A list of CharStreamState objects that tracks the stream state # values line, charPositionInLine, and p that can change as you # move through the input stream. Indexed from 0..markDepth-1. 
self._markers = [ ] self.lastMarker = None self.markDepth = 0 # What is name or source of this char stream? self.name = None def reset(self): """ Reset the stream so that it's in the same state it was when the object was created *except* the data array is not touched. """ self.p = 0 self._line = 1 self.charPositionInLine = 0 self._markers = [ ] self.lastMarker = None self.markDepth = 0 def consume(self): if self.p < self.n: if self.data[self.p] == 10: # ord('\n') self._line += 1 self.charPositionInLine = 0 else: self.charPositionInLine += 1 self.p += 1 # else we reached EOF # just do nothing def LA(self, i): if i == 0: return 0 # undefined if i < 0: i += 1 # e.g., translate LA(-1) to use offset i=0; then data[p+0-1] if self.p + i - 1 < self.n: return self.data[self.p + i - 1] else: return EOF def LT(self, i): if i == 0: return 0 # undefined if i < 0: i += 1 # e.g., translate LA(-1) to use offset i=0; then data[p+0-1] if self.p + i - 1 < self.n: return self.strdata[self.p + i - 1] else: return EOF def index(self): """ Return the current input symbol index 0..n where n indicates the last symbol has been read. The index is the index of char to be returned from LA(1). """ return self.p def size(self): return self.n def mark(self): state = (self.p, self.line, self.charPositionInLine) if self.markDepth < len(self._markers): self._markers[self.markDepth] = state else: self._markers.append(state) self.markDepth += 1 self.lastMarker = self.markDepth return self.lastMarker def rewind(self, marker=None): if marker is None: marker = self.lastMarker p, line, charPositionInLine = self._markers[marker - 1] self.seek(p) self._line = line self.charPositionInLine = charPositionInLine self.release(marker) def release(self, marker=None): if marker is None: marker = self.lastMarker self.markDepth = marker - 1 def seek(self, index): """ consume() ahead until p==index; can't just set p=index as we must update line and charPositionInLine. """ if index <= self.p: self.p = index # just jump; don't update stream state (line, ...) return # seek forward, consume until p hits index while self.p < index: self.consume() def substring(self, start, stop): return self.strdata[start:stop + 1] def getSourceName(self): return self.name class ANTLRFileStream(ANTLRStringStream): """ @brief CharStream that opens a file to read the data. This is a char buffer stream that is loaded from a file all at once when you construct the object. """ def __init__(self, fileName): """ @param fileName The path to the file to be opened. The file will be opened with mode 'r'. """ self._fileName = fileName with open(fileName, 'r') as fp: super().__init__(fp.read()) @property def fileName(self): return self._fileName class ANTLRInputStream(ANTLRStringStream): """ @brief CharStream that reads data from a file-like object. This is a char buffer stream that is loaded from a file like object all at once when you construct the object. All input is consumed from the file, but it is not closed. """ def __init__(self, file): """ @param file A file-like object holding your input. Only the read() method must be implemented. """ data = file.read() super().__init__(data) # I guess the ANTLR prefix exists only to avoid a name clash with some Java # mumbojumbo. A plain "StringStream" looks better to me, which should be # the preferred name in Python. 
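
# A typical way to wire these streams into ANTLR-generated recognizers
# (illustrative sketch; "MyLexer" and "MyParser" stand for hypothetical
# classes generated from a grammar):
#
#   char_stream = ANTLRStringStream("some input text")
#   lexer = MyLexer(char_stream)
#   tokens = CommonTokenStream(lexer)
#   parser = MyParser(tokens)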
StringStream = ANTLRStringStream FileStream = ANTLRFileStream InputStream = ANTLRInputStream ############################################################################ # # Token streams # TokenStream # +- CommonTokenStream # \- TokenRewriteStream # ############################################################################ class CommonTokenStream(TokenStream): """ @brief The most common stream of tokens The most common stream of tokens is one where every token is buffered up and tokens are prefiltered for a certain channel (the parser will only see these tokens and cannot change the filter channel number during the parse). """ def __init__(self, tokenSource=None, channel=DEFAULT_CHANNEL): """ @param tokenSource A TokenSource instance (usually a Lexer) to pull the tokens from. @param channel Skip tokens on any channel but this one; this is how we skip whitespace... """ super().__init__() self.tokenSource = tokenSource # Record every single token pulled from the source so we can reproduce # chunks of it later. self.tokens = [] # Map to override some Tokens' channel numbers self.channelOverrideMap = {} # Set; discard any tokens with this type self.discardSet = set() # Skip tokens on any channel but this one; this is how we skip # whitespace... self.channel = channel # By default, track all incoming tokens self.discardOffChannelTokens = False # The index into the tokens list of the current token (next token # to consume). p==-1 indicates that the tokens list is empty self.p = -1 # Remember last marked position self.lastMarker = None # how deep have we gone? self._range = -1 def makeEOFToken(self): return self.tokenSource.makeEOFToken() def setTokenSource(self, tokenSource): """Reset this token stream by setting its token source.""" self.tokenSource = tokenSource self.tokens = [] self.p = -1 self.channel = DEFAULT_CHANNEL def reset(self): self.p = 0 self.lastMarker = None def fillBuffer(self): """ Load all tokens from the token source and put in tokens. This is done upon first LT request because you might want to set some token type / channel overrides before filling buffer. """ index = 0 t = self.tokenSource.nextToken() while t and t.type != EOF: discard = False if self.discardSet and t.type in self.discardSet: discard = True elif self.discardOffChannelTokens and t.channel != self.channel: discard = True # is there a channel override for token type? if t.type in self.channelOverrideMap: overrideChannel = self.channelOverrideMap[t.type] if overrideChannel == self.channel: t.channel = overrideChannel else: discard = True if not discard: t.index = index self.tokens.append(t) index += 1 t = self.tokenSource.nextToken() # leave p pointing at first token on channel self.p = 0 self.p = self.skipOffTokenChannels(self.p) def consume(self): """ Move the input pointer to the next incoming token. The stream must become active with LT(1) available. consume() simply moves the input pointer so that LT(1) points at the next input symbol. Consume at least one token. Walk past any token not on the channel the parser is listening to. """ if self.p < len(self.tokens): self.p += 1 self.p = self.skipOffTokenChannels(self.p) # leave p on valid token def skipOffTokenChannels(self, i): """ Given a starting index, return the index of the first on-channel token. 
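
        For example (illustrative): with buffered tokens [ID, WS, ID] where
        WS sits on a hidden channel and this stream filters for
        DEFAULT_CHANNEL, skipOffTokenChannels(1) returns 2, the index of the
        second ID.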
""" n = len(self.tokens) while i < n and self.tokens[i].channel != self.channel: i += 1 return i def skipOffTokenChannelsReverse(self, i): while i >= 0 and self.tokens[i].channel != self.channel: i -= 1 return i def setTokenTypeChannel(self, ttype, channel): """ A simple filter mechanism whereby you can tell this token stream to force all tokens of type ttype to be on channel. For example, when interpreting, we cannot exec actions so we need to tell the stream to force all WS and NEWLINE to be a different, ignored channel. """ self.channelOverrideMap[ttype] = channel def discardTokenType(self, ttype): self.discardSet.add(ttype) def getTokens(self, start=None, stop=None, types=None): """ Given a start and stop index, return a list of all tokens in the token type set. Return None if no tokens were found. This method looks at both on and off channel tokens. """ if self.p == -1: self.fillBuffer() if stop is None or stop > len(self.tokens): stop = len(self.tokens) if start is None or start < 0: start = 0 if start > stop: return None if isinstance(types, int): # called with a single type, wrap into set types = set([types]) filteredTokens = [ token for token in self.tokens[start:stop] if types is None or token.type in types ] if len(filteredTokens) == 0: return None return filteredTokens def LT(self, k): """ Get the ith token from the current position 1..n where k=1 is the first symbol of lookahead. """ if self.p == -1: self.fillBuffer() if k == 0: return None if k < 0: return self.LB(-k) i = self.p n = 1 # find k good tokens while n < k: # skip off-channel tokens i = self.skipOffTokenChannels(i + 1) # leave p on valid token n += 1 if i > self._range: self._range = i if i < len(self.tokens): return self.tokens[i] else: return self.makeEOFToken() def LB(self, k): """Look backwards k tokens on-channel tokens""" if self.p == -1: self.fillBuffer() if k == 0: return None if self.p - k < 0: return None i = self.p n = 1 # find k good tokens looking backwards while n <= k: # skip off-channel tokens i = self.skipOffTokenChannelsReverse(i - 1) # leave p on valid token n += 1 if i < 0: return None return self.tokens[i] def get(self, i): """ Return absolute token i; ignore which channel the tokens are on; that is, count all tokens not just on-channel tokens. 
""" return self.tokens[i] def slice(self, start, stop): if self.p == -1: self.fillBuffer() if start < 0 or stop < 0: return None return self.tokens[start:stop + 1] def LA(self, i): return self.LT(i).type def mark(self): self.lastMarker = self.index() return self.lastMarker def release(self, marker=None): # no resources to release pass def size(self): return len(self.tokens) def range(self): return self._range def index(self): return self.p def rewind(self, marker=None): if marker is None: marker = self.lastMarker self.seek(marker) def seek(self, index): self.p = index def getTokenSource(self): return self.tokenSource def getSourceName(self): return self.tokenSource.getSourceName() def toString(self, start=None, stop=None): """Returns a string of all tokens between start and stop (inclusive).""" if self.p == -1: self.fillBuffer() if start is None: start = 0 elif not isinstance(start, int): start = start.index if stop is None: stop = len(self.tokens) - 1 elif not isinstance(stop, int): stop = stop.index if stop >= len(self.tokens): stop = len(self.tokens) - 1 return ''.join([t.text for t in self.tokens[start:stop + 1]]) class RewriteOperation(object): """@brief Internal helper class.""" def __init__(self, stream, index, text): self.stream = stream # What index into rewrites List are we? self.instructionIndex = None # Token buffer index. self.index = index self.text = text def execute(self, buf): """Execute the rewrite operation by possibly adding to the buffer. Return the index of the next token to operate on. """ return self.index def toString(self): opName = self.__class__.__name__ return '<{opName}@{0.index}:"{0.text}">'.format(self, opName=opName) __str__ = toString __repr__ = toString class InsertBeforeOp(RewriteOperation): """@brief Internal helper class.""" def execute(self, buf): buf.write(self.text) if self.stream.tokens[self.index].type != EOF: buf.write(self.stream.tokens[self.index].text) return self.index + 1 class ReplaceOp(RewriteOperation): """ @brief Internal helper class. I'm going to try replacing range from x..y with (y-x)+1 ReplaceOp instructions. """ def __init__(self, stream, first, last, text): super().__init__(stream, first, text) self.lastIndex = last def execute(self, buf): if self.text is not None: buf.write(self.text) return self.lastIndex + 1 def toString(self): if self.text is None: return ''.format(self) return ''.format(self) __str__ = toString __repr__ = toString class TokenRewriteStream(CommonTokenStream): """@brief CommonTokenStream that can be modified. Useful for dumping out the input stream after doing some augmentation or other manipulations. You can insert stuff, replace, and delete chunks. Note that the operations are done lazily--only if you convert the buffer to a String. This is very efficient because you are not moving data around all the time. As the buffer of tokens is converted to strings, the toString() method(s) check to see if there is an operation at the current index. If so, the operation is done and then normal String rendering continues on the buffer. This is like having multiple Turing machine instruction streams (programs) operating on a single input tape. :) Since the operations are done lazily at toString-time, operations do not screw up the token index values. That is, an insert operation at token index i does not change the index values for tokens i+1..n-1. Because operations never actually alter the buffer, you may always get the original token stream back without undoing anything. 
Since the instructions are queued up, you can easily simulate transactions and roll back any changes if there is an error just by removing instructions. For example, CharStream input = new ANTLRFileStream("input"); TLexer lex = new TLexer(input); TokenRewriteStream tokens = new TokenRewriteStream(lex); T parser = new T(tokens); parser.startRule(); Then in the rules, you can execute Token t,u; ... input.insertAfter(t, "text to put after t");} input.insertAfter(u, "text after u");} System.out.println(tokens.toString()); Actually, you have to cast the 'input' to a TokenRewriteStream. :( You can also have multiple "instruction streams" and get multiple rewrites from a single pass over the input. Just name the instruction streams and use that name again when printing the buffer. This could be useful for generating a C file and also its header file--all from the same buffer: tokens.insertAfter("pass1", t, "text to put after t");} tokens.insertAfter("pass2", u, "text after u");} System.out.println(tokens.toString("pass1")); System.out.println(tokens.toString("pass2")); If you don't use named rewrite streams, a "default" stream is used as the first example shows. """ DEFAULT_PROGRAM_NAME = "default" MIN_TOKEN_INDEX = 0 def __init__(self, tokenSource=None, channel=DEFAULT_CHANNEL): super().__init__(tokenSource, channel) # You may have multiple, named streams of rewrite operations. # I'm calling these things "programs." # Maps String (name) -> rewrite (List) self.programs = {} self.programs[self.DEFAULT_PROGRAM_NAME] = [] # Map String (program name) -> Integer index self.lastRewriteTokenIndexes = {} def rollback(self, *args): """ Rollback the instruction stream for a program so that the indicated instruction (via instructionIndex) is no longer in the stream. UNTESTED! 
""" if len(args) == 2: programName = args[0] instructionIndex = args[1] elif len(args) == 1: programName = self.DEFAULT_PROGRAM_NAME instructionIndex = args[0] else: raise TypeError("Invalid arguments") p = self.programs.get(programName) if p: self.programs[programName] = ( p[self.MIN_TOKEN_INDEX:instructionIndex]) def deleteProgram(self, programName=DEFAULT_PROGRAM_NAME): """Reset the program so that no instructions exist""" self.rollback(programName, self.MIN_TOKEN_INDEX) def insertAfter(self, *args): if len(args) == 2: programName = self.DEFAULT_PROGRAM_NAME index = args[0] text = args[1] elif len(args) == 3: programName = args[0] index = args[1] text = args[2] else: raise TypeError("Invalid arguments") if isinstance(index, Token): # index is a Token, grap the stream index from it index = index.index # to insert after, just insert before next index (even if past end) self.insertBefore(programName, index + 1, text) def insertBefore(self, *args): if len(args) == 2: programName = self.DEFAULT_PROGRAM_NAME index = args[0] text = args[1] elif len(args) == 3: programName = args[0] index = args[1] text = args[2] else: raise TypeError("Invalid arguments") if isinstance(index, Token): # index is a Token, grab the stream index from it index = index.index op = InsertBeforeOp(self, index, text) rewrites = self.getProgram(programName) op.instructionIndex = len(rewrites) rewrites.append(op) def replace(self, *args): if len(args) == 2: programName = self.DEFAULT_PROGRAM_NAME first = args[0] last = args[0] text = args[1] elif len(args) == 3: programName = self.DEFAULT_PROGRAM_NAME first = args[0] last = args[1] text = args[2] elif len(args) == 4: programName = args[0] first = args[1] last = args[2] text = args[3] else: raise TypeError("Invalid arguments") if isinstance(first, Token): # first is a Token, grap the stream index from it first = first.index if isinstance(last, Token): # last is a Token, grap the stream index from it last = last.index if first > last or first < 0 or last < 0 or last >= len(self.tokens): raise ValueError( "replace: range invalid: {}..{} (size={})" .format(first, last, len(self.tokens))) op = ReplaceOp(self, first, last, text) rewrites = self.getProgram(programName) op.instructionIndex = len(rewrites) rewrites.append(op) def delete(self, *args): self.replace(*(list(args) + [None])) def getLastRewriteTokenIndex(self, programName=DEFAULT_PROGRAM_NAME): return self.lastRewriteTokenIndexes.get(programName, -1) def setLastRewriteTokenIndex(self, programName, i): self.lastRewriteTokenIndexes[programName] = i def getProgram(self, name): p = self.programs.get(name) if not p: p = self.initializeProgram(name) return p def initializeProgram(self, name): p = [] self.programs[name] = p return p def toOriginalString(self, start=None, end=None): if self.p == -1: self.fillBuffer() if start is None: start = self.MIN_TOKEN_INDEX if end is None: end = self.size() - 1 buf = StringIO() i = start while i >= self.MIN_TOKEN_INDEX and i <= end and i < len(self.tokens): if self.get(i).type != EOF: buf.write(self.get(i).text) i += 1 return buf.getvalue() def toString(self, *args): if self.p == -1: self.fillBuffer() if len(args) == 0: programName = self.DEFAULT_PROGRAM_NAME start = self.MIN_TOKEN_INDEX end = self.size() - 1 elif len(args) == 1: programName = args[0] start = self.MIN_TOKEN_INDEX end = self.size() - 1 elif len(args) == 2: programName = self.DEFAULT_PROGRAM_NAME start = args[0] end = args[1] if start is None: start = self.MIN_TOKEN_INDEX elif not isinstance(start, int): start = start.index 
if end is None: end = len(self.tokens) - 1 elif not isinstance(end, int): end = end.index # ensure start/end are in range if end >= len(self.tokens): end = len(self.tokens) - 1 if start < 0: start = 0 rewrites = self.programs.get(programName) if not rewrites: # no instructions to execute return self.toOriginalString(start, end) buf = StringIO() # First, optimize instruction stream indexToOp = self.reduceToSingleOperationPerIndex(rewrites) # Walk buffer, executing instructions and emitting tokens i = start while i <= end and i < len(self.tokens): # remove so any left have index size-1 op = indexToOp.pop(i, None) t = self.tokens[i] if op is None: # no operation at that index, just dump token if t.type != EOF: buf.write(t.text) i += 1 # move to next token else: i = op.execute(buf) # execute operation and skip # include stuff after end if it's last index in buffer # So, if they did an insertAfter(lastValidIndex, "foo"), include # foo if end == lastValidIndex. if end == len(self.tokens) - 1: # Scan any remaining operations after last token # should be included (they will be inserts). for i, op in sorted(indexToOp.items()): if op.index >= len(self.tokens) - 1: buf.write(op.text) return buf.getvalue() __str__ = toString def reduceToSingleOperationPerIndex(self, rewrites): """ We need to combine operations and report invalid operations (like overlapping replaces that are not completed nested). Inserts to same index need to be combined etc... Here are the cases: I.i.u I.j.v leave alone, nonoverlapping I.i.u I.i.v combine: Iivu R.i-j.u R.x-y.v | i-j in x-y delete first R R.i-j.u R.i-j.v delete first R R.i-j.u R.x-y.v | x-y in i-j ERROR R.i-j.u R.x-y.v | boundaries overlap ERROR Delete special case of replace (text==null): D.i-j.u D.x-y.v | boundaries overlapcombine to max(min)..max(right) I.i.u R.x-y.v | i in (x+1)-ydelete I (since insert before we're not deleting i) I.i.u R.x-y.v | i not in (x+1)-yleave alone, nonoverlapping R.x-y.v I.i.u | i in x-y ERROR R.x-y.v I.x.u R.x-y.uv (combine, delete I) R.x-y.v I.i.u | i not in x-y leave alone, nonoverlapping I.i.u = insert u before op @ index i R.x-y.u = replace x-y indexed tokens with u First we need to examine replaces. For any replace op: 1. wipe out any insertions before op within that range. 2. Drop any replace op before that is contained completely within that range. 3. Throw exception upon boundary overlap with any previous replace. Then we can deal with inserts: 1. for any inserts to same index, combine even if not adjacent. 2. for any prior replace with same left boundary, combine this insert with replace and delete this replace. 3. throw exception if index in same range as previous replace Don't actually delete; make op null in list. Easier to walk list. Later we can throw as we add to index -> op map. Note that I.2 R.2-2 will wipe out I.2 even though, technically, the inserted stuff would be before the replace range. But, if you add tokens in front of a method body '{' and then delete the method body, I think the stuff before the '{' you added should disappear too. Return a map from token index to operation. 
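
        A small worked example (illustrative):

            tokens.insertBefore(2, "x")
            tokens.replace(2, 4, "y")

        reduces to a single ReplaceOp covering tokens 2..4 whose text is
        "xy"; the separate InsertBeforeOp is dropped.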
""" # WALK REPLACES for i, rop in enumerate(rewrites): if not rop: continue if not isinstance(rop, ReplaceOp): continue # Wipe prior inserts within range for j, iop in self.getKindOfOps(rewrites, InsertBeforeOp, i): if iop.index == rop.index: # E.g., insert before 2, delete 2..2; update replace # text to include insert before, kill insert rewrites[iop.instructionIndex] = None rop.text = self.catOpText(iop.text, rop.text) elif iop.index > rop.index and iop.index <= rop.lastIndex: # delete insert as it's a no-op. rewrites[j] = None # Drop any prior replaces contained within for j, prevRop in self.getKindOfOps(rewrites, ReplaceOp, i): if (prevRop.index >= rop.index and prevRop.lastIndex <= rop.lastIndex): # delete replace as it's a no-op. rewrites[j] = None continue # throw exception unless disjoint or identical disjoint = (prevRop.lastIndex < rop.index or prevRop.index > rop.lastIndex) same = (prevRop.index == rop.index and prevRop.lastIndex == rop.lastIndex) # Delete special case of replace (text==null): # D.i-j.u D.x-y.v| boundaries overlapcombine to # max(min)..max(right) if prevRop.text is None and rop.text is None and not disjoint: # kill first delete rewrites[prevRop.instructionIndex] = None rop.index = min(prevRop.index, rop.index) rop.lastIndex = max(prevRop.lastIndex, rop.lastIndex) elif not disjoint and not same: raise ValueError( "replace op boundaries of {} overlap with previous {}" .format(rop, prevRop)) # WALK INSERTS for i, iop in enumerate(rewrites): if iop is None: continue if not isinstance(iop, InsertBeforeOp): continue # combine current insert with prior if any at same index for j, prevIop in self.getKindOfOps(rewrites, InsertBeforeOp, i): if prevIop.index == iop.index: # combine objects # convert to strings...we're in process of toString'ing # whole token buffer so no lazy eval issue with any # templates iop.text = self.catOpText(iop.text, prevIop.text) # delete redundant prior insert rewrites[j] = None # look for replaces where iop.index is in range; error for j, rop in self.getKindOfOps(rewrites, ReplaceOp, i): if iop.index == rop.index: rop.text = self.catOpText(iop.text, rop.text) # delete current insert rewrites[i] = None continue if iop.index >= rop.index and iop.index <= rop.lastIndex: raise ValueError( "insert op {} within boundaries of previous {}" .format(iop, rop)) m = {} for i, op in enumerate(rewrites): if op is None: # ignore deleted ops continue assert op.index not in m, "should only be one op per index" m[op.index] = op return m def catOpText(self, a, b): x = "" y = "" if a: x = a if b: y = b return x + y def getKindOfOps(self, rewrites, kind, before=None): """Get all operations before an index of a particular kind.""" if before is None: before = len(rewrites) elif before > len(rewrites): before = len(rewrites) for i, op in enumerate(rewrites[:before]): # ignore deleted if op and op.__class__ == kind: yield i, op def toDebugString(self, start=None, end=None): if start is None: start = self.MIN_TOKEN_INDEX if end is None: end = self.size() - 1 buf = StringIO() i = start while i >= self.MIN_TOKEN_INDEX and i <= end and i < len(self.tokens): buf.write(self.get(i)) i += 1 return buf.getvalue() python3-antlr3-3.5.2/antlr3/tokens.py000066400000000000000000000230151324200532700174050ustar00rootroot00000000000000"""ANTLR3 runtime package""" # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] from .constants import DEFAULT_CHANNEL, EOF, INVALID_TOKEN_TYPE ############################################################################ # # basic token interface # ############################################################################ class Token(object): """@brief Abstract token baseclass.""" TOKEN_NAMES_MAP = None @classmethod def registerTokenNamesMap(cls, tokenNamesMap): """@brief Store a mapping from token type to token name. This enables token.typeName to give something more meaningful than, e.g., '6'. """ cls.TOKEN_NAMES_MAP = tokenNamesMap cls.TOKEN_NAMES_MAP[EOF] = "EOF" def __init__(self, type=None, channel=DEFAULT_CHANNEL, text=None, index=-1, line=0, charPositionInLine=-1, input=None): # We use -1 for index and charPositionInLine as an invalid index self._type = type self._channel = channel self._text = text self._index = index self._line = 0 self._charPositionInLine = charPositionInLine self.input = input # To override a property, you'll need to override both the getter and setter. @property def text(self): return self._text @text.setter def text(self, value): self._text = value @property def type(self): return self._type @type.setter def type(self, value): self._type = value # For compatibility def getType(self): return self._type @property def typeName(self): if self.TOKEN_NAMES_MAP: return self.TOKEN_NAMES_MAP.get(self._type, "INVALID_TOKEN_TYPE") else: return str(self._type) @property def line(self): """Lines are numbered 1..n.""" return self._line @line.setter def line(self, value): self._line = value @property def charPositionInLine(self): """Columns are numbered 0..n-1.""" return self._charPositionInLine @charPositionInLine.setter def charPositionInLine(self, pos): self._charPositionInLine = pos @property def channel(self): return self._channel @channel.setter def channel(self, value): self._channel = value @property def index(self): """ An index from 0..n-1 of the token object in the input stream. This must be valid in order to use the ANTLRWorks debugger. 
""" return self._index @index.setter def index(self, value): self._index = value def getInputStream(self): """@brief From what character stream was this token created. You don't have to implement but it's nice to know where a Token comes from if you have include files etc... on the input.""" raise NotImplementedError def setInputStream(self, input): """@brief From what character stream was this token created. You don't have to implement but it's nice to know where a Token comes from if you have include files etc... on the input.""" raise NotImplementedError ############################################################################ # # token implementations # # Token # +- CommonToken # \- ClassicToken # ############################################################################ class CommonToken(Token): """@brief Basic token implementation. This implementation does not copy the text from the input stream upon creation, but keeps start/stop pointers into the stream to avoid unnecessary copy operations. """ def __init__(self, type=None, channel=DEFAULT_CHANNEL, text=None, input=None, start=None, stop=None, oldToken=None): if oldToken: super().__init__(oldToken.type, oldToken.channel, oldToken.text, oldToken.index, oldToken.line, oldToken.charPositionInLine, oldToken.input) if isinstance(oldToken, CommonToken): self.start = oldToken.start self.stop = oldToken.stop else: self.start = start self.stop = stop else: super().__init__(type=type, channel=channel, input=input) # We need to be able to change the text once in a while. If # this is non-null, then getText should return this. Note that # start/stop are not affected by changing this. self._text = text # The char position into the input buffer where this token starts self.start = start # The char position into the input buffer where this token stops # This is the index of the last char, *not* the index after it! self.stop = stop @property def text(self): # Could be the empty string, and we want to return that. if self._text is not None: return self._text if not self.input: return None if self.start < self.input.size() and self.stop < self.input.size(): return self.input.substring(self.start, self.stop) return '' @text.setter def text(self, value): """ Override the text for this token. getText() will return this text rather than pulling from the buffer. Note that this does not mean that start/stop indexes are not valid. It means that that input was converted to a new string in the token object. """ self._text = value def getInputStream(self): return self.input def setInputStream(self, input): self.input = input def __str__(self): if self.type == EOF: return "" channelStr = "" if self.channel > 0: channelStr = ",channel=" + str(self.channel) txt = self.text if txt: # Put 2 backslashes in front of each character txt = txt.replace("\n", r"\\n") txt = txt.replace("\r", r"\\r") txt = txt.replace("\t", r"\\t") else: txt = "" return ("[@{0.index},{0.start}:{0.stop}={txt!r}," "<{0.typeName}>{channelStr}," "{0.line}:{0.charPositionInLine}]" .format(self, txt=txt, channelStr=channelStr)) class ClassicToken(Token): """@brief Alternative token implementation. A Token object like we'd use in ANTLR 2.x; has an actual string created and associated with this object. These objects are needed for imaginary tree nodes that have payload objects. We need to create a Token object that has a string; the tree node will point at this token. CommonToken has indexes into a char stream and hence cannot be used to introduce new strings. 
""" def __init__(self, type=None, text=None, channel=DEFAULT_CHANNEL, oldToken=None): if oldToken: super().__init__(type=oldToken.type, channel=oldToken.channel, text=oldToken.text, line=oldToken.line, charPositionInLine=oldToken.charPositionInLine) else: super().__init__(type=type, channel=channel, text=text, index=None, line=None, charPositionInLine=None) def getInputStream(self): return None def setInputStream(self, input): pass def toString(self): channelStr = "" if self.channel > 0: channelStr = ",channel=" + str(self.channel) txt = self.text if not txt: txt = "" return ("[@{0.index!r},{txt!r},<{0.type!r}>{channelStr}," "{0.line!r}:{0.charPositionInLine!r}]" .format(self, txt=txt, channelStr=channelStr)) __str__ = toString __repr__ = toString INVALID_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE) # In an action, a lexer rule can set token to this SKIP_TOKEN and ANTLR # will avoid creating a token for this symbol and try to fetch another. SKIP_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE) python3-antlr3-3.5.2/antlr3/tree.py000066400000000000000000002363271324200532700170550ustar00rootroot00000000000000""" @package antlr3.tree @brief ANTLR3 runtime package, tree module This module contains all support classes for AST construction and tree parsers. """ # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] # lot's of docstrings are missing, don't complain for now... # pylint: disable-msg=C0111 import re from antlr3.constants import UP, DOWN, EOF, INVALID_TOKEN_TYPE from antlr3.recognizers import BaseRecognizer, RuleReturnScope from antlr3.streams import IntStream from antlr3.tokens import CommonToken, Token, INVALID_TOKEN from antlr3.exceptions import MismatchedTreeNodeException, \ MissingTokenException, UnwantedTokenException, MismatchedTokenException, \ NoViableAltException ############################################################################ # # tree related exceptions # ############################################################################ class RewriteCardinalityException(RuntimeError): """ @brief Base class for all exceptions thrown during AST rewrite construction. 
This signifies a case where the cardinality of two or more elements in a subrule are different: (ID INT)+ where |ID|!=|INT| """ def __init__(self, elementDescription): RuntimeError.__init__(self, elementDescription) self.elementDescription = elementDescription def getMessage(self): return self.elementDescription class RewriteEarlyExitException(RewriteCardinalityException): """@brief No elements within a (...)+ in a rewrite rule""" def __init__(self, elementDescription=None): RewriteCardinalityException.__init__(self, elementDescription) class RewriteEmptyStreamException(RewriteCardinalityException): """ @brief Ref to ID or expr but no tokens in ID stream or subtrees in expr stream """ pass ############################################################################ # # basic Tree and TreeAdaptor interfaces # ############################################################################ class Tree(object): """ @brief Abstract baseclass for tree nodes. What does a tree look like? ANTLR has a number of support classes such as CommonTreeNodeStream that work on these kinds of trees. You don't have to make your trees implement this interface, but if you do, you'll be able to use more support code. NOTE: When constructing trees, ANTLR can build any kind of tree; it can even use Token objects as trees if you add a child list to your tokens. This is a tree node without any payload; just navigation and factory stuff. """ def getChild(self, i): raise NotImplementedError def getChildCount(self): raise NotImplementedError def getParent(self): """Tree tracks parent and child index now > 3.0""" raise NotImplementedError def setParent(self, t): """Tree tracks parent and child index now > 3.0""" raise NotImplementedError def hasAncestor(self, ttype): """Walk upwards looking for ancestor with this token type.""" raise NotImplementedError def getAncestor(self, ttype): """Walk upwards and get first ancestor with this token type.""" raise NotImplementedError def getAncestors(self): """Return a list of all ancestors of this node. The first node of list is the root and the last is the parent of this node. """ raise NotImplementedError def getChildIndex(self): """This node is what child index? 0..n-1""" raise NotImplementedError def setChildIndex(self, index): """This node is what child index? 0..n-1""" raise NotImplementedError def freshenParentAndChildIndexes(self): """Set the parent and child index values for all children""" raise NotImplementedError def addChild(self, t): """ Add t as a child to this node. If t is null, do nothing. If t is nil, add all children of t to this' children. """ raise NotImplementedError def setChild(self, i, t): """Set ith child (0..n-1) to t; t must be non-null and non-nil node""" raise NotImplementedError def deleteChild(self, i): raise NotImplementedError def replaceChildren(self, startChildIndex, stopChildIndex, t): """ Delete children from start to stop and replace with t even if t is a list (nil-root tree). num of children can increase or decrease. For huge child lists, inserting children can force walking rest of children to set their childindex; could be slow. """ raise NotImplementedError def isNil(self): """ Indicates the node is a nil node but may still have children, meaning the tree is a flat list. """ raise NotImplementedError def getTokenStartIndex(self): """ What is the smallest token index (indexing from 0) for this node and its children? 
""" raise NotImplementedError def setTokenStartIndex(self, index): raise NotImplementedError def getTokenStopIndex(self): """ What is the largest token index (indexing from 0) for this node and its children? """ raise NotImplementedError def setTokenStopIndex(self, index): raise NotImplementedError def dupNode(self): raise NotImplementedError def getType(self): """Return a token type; needed for tree parsing.""" raise NotImplementedError def getText(self): raise NotImplementedError def getLine(self): """ In case we don't have a token payload, what is the line for errors? """ raise NotImplementedError def getCharPositionInLine(self): raise NotImplementedError def toStringTree(self): raise NotImplementedError def toString(self): raise NotImplementedError class TreeAdaptor(object): """ @brief Abstract baseclass for tree adaptors. How to create and navigate trees. Rather than have a separate factory and adaptor, I've merged them. Makes sense to encapsulate. This takes the place of the tree construction code generated in the generated code in 2.x and the ASTFactory. I do not need to know the type of a tree at all so they are all generic Objects. This may increase the amount of typecasting needed. :( """ # C o n s t r u c t i o n def createWithPayload(self, payload): """ Create a tree node from Token object; for CommonTree type trees, then the token just becomes the payload. This is the most common create call. Override if you want another kind of node to be built. """ raise NotImplementedError def dupNode(self, treeNode): """Duplicate a single tree node. Override if you want another kind of node to be built.""" raise NotImplementedError def dupTree(self, tree): """Duplicate tree recursively, using dupNode() for each node""" raise NotImplementedError def nil(self): """ Return a nil node (an empty but non-null node) that can hold a list of element as the children. If you want a flat tree (a list) use "t=adaptor.nil(); t.addChild(x); t.addChild(y);" """ raise NotImplementedError def errorNode(self, input, start, stop, exc): """ Return a tree node representing an error. This node records the tokens consumed during error recovery. The start token indicates the input symbol at which the error was detected. The stop token indicates the last symbol consumed during recovery. You must specify the input stream so that the erroneous text can be packaged up in the error node. The exception could be useful to some applications; default implementation stores ptr to it in the CommonErrorNode. This only makes sense during token parsing, not tree parsing. Tree parsing should happen only when parsing and tree construction succeed. """ raise NotImplementedError def isNil(self, tree): """Is tree considered a nil node used to make lists of child nodes?""" raise NotImplementedError def addChild(self, t, child): """ Add a child to the tree t. If child is a flat tree (a list), make all in list children of t. Warning: if t has no children, but child does and child isNil then you can decide it is ok to move children to t via t.children = child.children; i.e., without copying the array. Just make sure that this is consistent with have the user will build ASTs. Do nothing if t or child is null. """ raise NotImplementedError def becomeRoot(self, newRoot, oldRoot): """ If oldRoot is a nil root, just copy or move the children to newRoot. If not a nil root, make oldRoot a child of newRoot. 
old=^(nil a b c), new=r yields ^(r a b c) old=^(a b c), new=r yields ^(r ^(a b c)) If newRoot is a nil-rooted single child tree, use the single child as the new root node. old=^(nil a b c), new=^(nil r) yields ^(r a b c) old=^(a b c), new=^(nil r) yields ^(r ^(a b c)) If oldRoot was null, it's ok, just return newRoot (even if isNil). old=null, new=r yields r old=null, new=^(nil r) yields ^(nil r) Return newRoot. Throw an exception if newRoot is not a simple node or nil root with a single child node--it must be a root node. If newRoot is ^(nil x) return x as newRoot. Be advised that it's ok for newRoot to point at oldRoot's children; i.e., you don't have to copy the list. We are constructing these nodes so we should have this control for efficiency. """ raise NotImplementedError def rulePostProcessing(self, root): """ Given the root of the subtree created for this rule, post process it to do any simplifications or whatever you want. A required behavior is to convert ^(nil singleSubtree) to singleSubtree as the setting of start/stop indexes relies on a single non-nil root for non-flat trees. Flat trees such as for lists like "idlist : ID+ ;" are left alone unless there is only one ID. For a list, the start/stop indexes are set in the nil node. This method is executed after all rule tree construction and right before setTokenBoundaries(). """ raise NotImplementedError def getUniqueID(self, node): """For identifying trees. How to identify nodes so we can say "add node to a prior node"? Even becomeRoot is an issue. Use System.identityHashCode(node) usually. """ raise NotImplementedError # R e w r i t e R u l e s def createFromToken(self, tokenType, fromToken, text=None): """ Create a new node derived from a token, with a new token type and (optionally) new text. This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG[$tokenLabel] or IMAG[$tokenLabel "IMAG"]. This should invoke createToken(Token). """ raise NotImplementedError def createFromType(self, tokenType, text): """Create a new node derived from a token, with a new token type. This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG["IMAG"]. This should invoke createToken(int,String). """ raise NotImplementedError # C o n t e n t def getType(self, t): """For tree parsing, I need to know the token type of a node""" raise NotImplementedError def setType(self, t, type): """Node constructors can set the type of a node""" raise NotImplementedError def getText(self, t): raise NotImplementedError def setText(self, t, text): """Node constructors can set the text of a node""" raise NotImplementedError def getToken(self, t): """Return the token object from which this node was created. Currently used only for printing an error message. The error display routine in BaseRecognizer needs to display where the input the error occurred. If your tree of limitation does not store information that can lead you to the token, you can create a token filled with the appropriate information and pass that back. See BaseRecognizer.getErrorMessage(). """ raise NotImplementedError def setTokenBoundaries(self, t, startToken, stopToken): """ Where are the bounds in the input token stream for this node and all children? Each rule that creates AST nodes will call this method right before returning. Flat trees (i.e., lists) will still usually have a nil root node just to hold the children list. That node would contain the start/stop indexes then. 
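
        For example, generated rule code typically ends with something
        roughly like the following (illustrative sketch only):

            self._adaptor.setTokenBoundaries(retval.tree, retval.start,
                                             retval.stop)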
""" raise NotImplementedError def getTokenStartIndex(self, t): """ Get the token start index for this subtree; return -1 if no such index """ raise NotImplementedError def getTokenStopIndex(self, t): """ Get the token stop index for this subtree; return -1 if no such index """ raise NotImplementedError # N a v i g a t i o n / T r e e P a r s i n g def getChild(self, t, i): """Get a child 0..n-1 node""" raise NotImplementedError def setChild(self, t, i, child): """Set ith child (0..n-1) to t; t must be non-null and non-nil node""" raise NotImplementedError def deleteChild(self, t, i): """Remove ith child and shift children down from right.""" raise NotImplementedError def getChildCount(self, t): """How many children? If 0, then this is a leaf node""" raise NotImplementedError def getParent(self, t): """ Who is the parent node of this node; if null, implies node is root. If your node type doesn't handle this, it's ok but the tree rewrites in tree parsers need this functionality. """ raise NotImplementedError def setParent(self, t, parent): """ Who is the parent node of this node; if null, implies node is root. If your node type doesn't handle this, it's ok but the tree rewrites in tree parsers need this functionality. """ raise NotImplementedError def getChildIndex(self, t): """ What index is this node in the child list? Range: 0..n-1 If your node type doesn't handle this, it's ok but the tree rewrites in tree parsers need this functionality. """ raise NotImplementedError def setChildIndex(self, t, index): """ What index is this node in the child list? Range: 0..n-1 If your node type doesn't handle this, it's ok but the tree rewrites in tree parsers need this functionality. """ raise NotImplementedError def replaceChildren(self, parent, startChildIndex, stopChildIndex, t): """ Replace from start to stop child index of parent with t, which might be a list. Number of children may be different after this call. If parent is null, don't do anything; must be at root of overall tree. Can't replace whatever points to the parent externally. Do nothing. """ raise NotImplementedError # Misc def create(self, *args): """ Deprecated, use createWithPayload, createFromToken or createFromType. This method only exists to mimic the Java interface of TreeAdaptor. 
""" if len(args) == 1 and isinstance(args[0], Token): # Object create(Token payload); ## warnings.warn( ## "Using create() is deprecated, use createWithPayload()", ## DeprecationWarning, ## stacklevel=2 ## ) return self.createWithPayload(args[0]) if (len(args) == 2 and isinstance(args[0], int) and isinstance(args[1], Token)): # Object create(int tokenType, Token fromToken); ## warnings.warn( ## "Using create() is deprecated, use createFromToken()", ## DeprecationWarning, ## stacklevel=2 ## ) return self.createFromToken(args[0], args[1]) if (len(args) == 3 and isinstance(args[0], int) and isinstance(args[1], Token) and isinstance(args[2], str)): # Object create(int tokenType, Token fromToken, String text); ## warnings.warn( ## "Using create() is deprecated, use createFromToken()", ## DeprecationWarning, ## stacklevel=2 ## ) return self.createFromToken(args[0], args[1], args[2]) if (len(args) == 2 and isinstance(args[0], int) and isinstance(args[1], str)): # Object create(int tokenType, String text); ## warnings.warn( ## "Using create() is deprecated, use createFromType()", ## DeprecationWarning, ## stacklevel=2 ## ) return self.createFromType(args[0], args[1]) raise TypeError( "No create method with this signature found: {}" .format(', '.join(type(v).__name__ for v in args))) ############################################################################ # # base implementation of Tree and TreeAdaptor # # Tree # \- BaseTree # # TreeAdaptor # \- BaseTreeAdaptor # ############################################################################ class BaseTree(Tree): """ @brief A generic tree implementation with no payload. You must subclass to actually have any user data. ANTLR v3 uses a list of children approach instead of the child-sibling approach in v2. A flat tree (a list) is an empty node whose children represent the list. An empty, but non-null node is called "nil". """ # BaseTree is abstract, no need to complain about not implemented abstract # methods # pylint: disable-msg=W0223 def __init__(self, node=None): """ Create a new node from an existing node does nothing for BaseTree as there are no fields other than the children list, which cannot be copied as the children are not considered part of this node. """ super().__init__() self.children = [] self.parent = None self.childIndex = 0 def getChild(self, i): try: return self.children[i] except IndexError: return None def getChildren(self): """@brief Get the children internal List Note that if you directly mess with the list, do so at your own risk. """ # FIXME: mark as deprecated return self.children def getFirstChildWithType(self, treeType): for child in self.children: if child.getType() == treeType: return child return None def getChildCount(self): return len(self.children) def addChild(self, childTree): """Add t as child of this node. Warning: if t has no children, but child does and child isNil then this routine moves children to t via t.children = child.children; i.e., without copying the array. """ # this implementation is much simpler and probably less efficient # than the mumbo-jumbo that Ter did for the Java runtime. 
if childTree is None: return if childTree.isNil(): # t is an empty node possibly with children if self.children is childTree.children: raise ValueError("attempt to add child list to itself") # fix parent pointer and childIndex for new children for idx, child in enumerate(childTree.children): child.parent = self child.childIndex = len(self.children) + idx self.children += childTree.children else: # child is not nil (don't care about children) self.children.append(childTree) childTree.parent = self childTree.childIndex = len(self.children) - 1 def addChildren(self, children): """Add all elements of kids list as children of this node""" self.children += children def setChild(self, i, t): if t is None: return if t.isNil(): raise ValueError("Can't set single child to a list") self.children[i] = t t.parent = self t.childIndex = i def deleteChild(self, i): killed = self.children[i] del self.children[i] # walk rest and decrement their child indexes for idx, child in enumerate(self.children[i:]): child.childIndex = i + idx return killed def replaceChildren(self, startChildIndex, stopChildIndex, newTree): """ Delete children from start to stop and replace with t even if t is a list (nil-root tree). num of children can increase or decrease. For huge child lists, inserting children can force walking rest of children to set their childindex; could be slow. """ if (startChildIndex >= len(self.children) or stopChildIndex >= len(self.children)): raise IndexError("indexes invalid") replacingHowMany = stopChildIndex - startChildIndex + 1 # normalize to a list of children to add: newChildren if newTree.isNil(): newChildren = newTree.children else: newChildren = [newTree] replacingWithHowMany = len(newChildren) delta = replacingHowMany - replacingWithHowMany if delta == 0: # if same number of nodes, do direct replace for idx, child in enumerate(newChildren): self.children[idx + startChildIndex] = child child.parent = self child.childIndex = idx + startChildIndex else: # length of children changes... # ...delete replaced segment... del self.children[startChildIndex:stopChildIndex+1] # ...insert new segment... 
self.children[startChildIndex:startChildIndex] = newChildren # ...and fix indeces self.freshenParentAndChildIndexes(startChildIndex) def isNil(self): return False def freshenParentAndChildIndexes(self, offset=0): for idx, child in enumerate(self.children[offset:]): child.childIndex = idx + offset child.parent = self def sanityCheckParentAndChildIndexes(self, parent=None, i=-1): if parent != self.parent: raise ValueError( "parents don't match; expected {!r} found {!r}" .format(parent, self.parent)) if i != self.childIndex: raise ValueError( "child indexes don't match; expected {} found {}" .format(i, self.childIndex)) for idx, child in enumerate(self.children): child.sanityCheckParentAndChildIndexes(self, idx) def getChildIndex(self): """BaseTree doesn't track child indexes.""" return 0 def setChildIndex(self, index): """BaseTree doesn't track child indexes.""" pass def getParent(self): """BaseTree doesn't track parent pointers.""" return None def setParent(self, t): """BaseTree doesn't track parent pointers.""" pass def hasAncestor(self, ttype): """Walk upwards looking for ancestor with this token type.""" return self.getAncestor(ttype) is not None def getAncestor(self, ttype): """Walk upwards and get first ancestor with this token type.""" t = self.getParent() while t is not None: if t.getType() == ttype: return t t = t.getParent() return None def getAncestors(self): """Return a list of all ancestors of this node. The first node of list is the root and the last is the parent of this node. """ if self.getParent() is None: return None ancestors = [] t = self.getParent() while t is not None: ancestors.insert(0, t) # insert at start t = t.getParent() return ancestors def toStringTree(self): """Print out a whole tree not just a node""" if len(self.children) == 0: return self.toString() buf = [] if not self.isNil(): buf.append('(') buf.append(self.toString()) buf.append(' ') for i, child in enumerate(self.children): if i > 0: buf.append(' ') buf.append(child.toStringTree()) if not self.isNil(): buf.append(')') return ''.join(buf) def getLine(self): return 0 def getCharPositionInLine(self): return 0 def toString(self): """Override to say how a node (not a tree) should look as text""" raise NotImplementedError class BaseTreeAdaptor(TreeAdaptor): """ @brief A TreeAdaptor that works with any Tree implementation. """ # BaseTreeAdaptor is abstract, no need to complain about not implemented # abstract methods # pylint: disable-msg=W0223 def nil(self): return self.createWithPayload(None) def errorNode(self, input, start, stop, exc): """ create tree node that holds the start and stop tokens associated with an error. If you specify your own kind of tree nodes, you will likely have to override this method. CommonTree returns Token.INVALID_TOKEN_TYPE if no token payload but you might have to set token type for diff node type. You don't have to subclass CommonErrorNode; you will likely need to subclass your own tree node class to avoid class cast exception. """ return CommonErrorNode(input, start, stop, exc) def isNil(self, tree): return tree.isNil() def dupTree(self, t, parent=None): """ This is generic in the sense that it will work with any kind of tree (not just Tree interface). It invokes the adaptor routines not the tree node routines to do the construction. 
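
        For example (illustrative):

            copy = adaptor.dupTree(tree)
            # 'copy' shares no node objects with 'tree'; changing one does
            # not affect the other.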
""" if t is None: return None newTree = self.dupNode(t) # ensure new subtree root has parent/child index set # same index in new tree self.setChildIndex(newTree, self.getChildIndex(t)) self.setParent(newTree, parent) for i in range(self.getChildCount(t)): child = self.getChild(t, i) newSubTree = self.dupTree(child, t) self.addChild(newTree, newSubTree) return newTree def addChild(self, tree, child): """ Add a child to the tree t. If child is a flat tree (a list), make all in list children of t. Warning: if t has no children, but child does and child isNil then you can decide it is ok to move children to t via t.children = child.children; i.e., without copying the array. Just make sure that this is consistent with have the user will build ASTs. """ #if isinstance(child, Token): # child = self.createWithPayload(child) if tree is not None and child is not None: tree.addChild(child) def becomeRoot(self, newRoot, oldRoot): """ If oldRoot is a nil root, just copy or move the children to newRoot. If not a nil root, make oldRoot a child of newRoot. old=^(nil a b c), new=r yields ^(r a b c) old=^(a b c), new=r yields ^(r ^(a b c)) If newRoot is a nil-rooted single child tree, use the single child as the new root node. old=^(nil a b c), new=^(nil r) yields ^(r a b c) old=^(a b c), new=^(nil r) yields ^(r ^(a b c)) If oldRoot was null, it's ok, just return newRoot (even if isNil). old=null, new=r yields r old=null, new=^(nil r) yields ^(nil r) Return newRoot. Throw an exception if newRoot is not a simple node or nil root with a single child node--it must be a root node. If newRoot is ^(nil x) return x as newRoot. Be advised that it's ok for newRoot to point at oldRoot's children; i.e., you don't have to copy the list. We are constructing these nodes so we should have this control for efficiency. """ if isinstance(newRoot, Token): newRoot = self.create(newRoot) if oldRoot is None: return newRoot if not isinstance(newRoot, CommonTree): newRoot = self.createWithPayload(newRoot) # handle ^(nil real-node) if newRoot.isNil(): nc = newRoot.getChildCount() if nc == 1: newRoot = newRoot.getChild(0) elif nc > 1: # TODO: make tree run time exceptions hierarchy raise RuntimeError("more than one node as root") # add oldRoot to newRoot; addChild takes care of case where oldRoot # is a flat list (i.e., nil-rooted tree). All children of oldRoot # are added to newRoot. 
newRoot.addChild(oldRoot) return newRoot def rulePostProcessing(self, root): """Transform ^(nil x) to x and nil to null""" if root is not None and root.isNil(): if root.getChildCount() == 0: root = None elif root.getChildCount() == 1: root = root.getChild(0) # whoever invokes rule will set parent and child index root.setParent(None) root.setChildIndex(-1) return root def createFromToken(self, tokenType, fromToken, text=None): if fromToken is None: return self.createFromType(tokenType, text) assert isinstance(tokenType, int), type(tokenType).__name__ assert isinstance(fromToken, Token), type(fromToken).__name__ assert text is None or isinstance(text, str), type(text).__name__ fromToken = self.createToken(fromToken) fromToken.type = tokenType if text is not None: fromToken.text = text t = self.createWithPayload(fromToken) return t def createFromType(self, tokenType, text): assert isinstance(tokenType, int), type(tokenType).__name__ assert isinstance(text, str) or text is None, type(text).__name__ fromToken = self.createToken(tokenType=tokenType, text=text) t = self.createWithPayload(fromToken) return t def getType(self, t): return t.getType() def setType(self, t, type): raise RuntimeError("don't know enough about Tree node") def getText(self, t): return t.getText() def setText(self, t, text): raise RuntimeError("don't know enough about Tree node") def getChild(self, t, i): return t.getChild(i) def setChild(self, t, i, child): t.setChild(i, child) def deleteChild(self, t, i): return t.deleteChild(i) def getChildCount(self, t): return t.getChildCount() def getUniqueID(self, node): return hash(node) def createToken(self, fromToken=None, tokenType=None, text=None): """ Tell me how to create a token for use with imaginary token nodes. For example, there is probably no input symbol associated with imaginary token DECL, but you need to create it as a payload or whatever for the DECL node as in ^(DECL type ID). If you care what the token payload objects' type is, you should override this method and any other createToken variant. """ raise NotImplementedError ############################################################################ # # common tree implementation # # Tree # \- BaseTree # \- CommonTree # \- CommonErrorNode # # TreeAdaptor # \- BaseTreeAdaptor # \- CommonTreeAdaptor # ############################################################################ class CommonTree(BaseTree): """@brief A tree node that is wrapper for a Token object. After 3.0 release while building tree rewrite stuff, it became clear that computing parent and child index is very difficult and cumbersome. Better to spend the space in every tree node. If you don't want these extra fields, it's easy to cut them out in your own BaseTree subclass. """ def __init__(self, payload): BaseTree.__init__(self) # What token indexes bracket all tokens associated with this node # and below? self.startIndex = -1 self.stopIndex = -1 # Who is the parent node of this node; if null, implies node is root self.parent = None # What index is this node in the child list? 
Range: 0..n-1 self.childIndex = -1 # A single token is the payload if payload is None: self.token = None elif isinstance(payload, CommonTree): self.token = payload.token self.startIndex = payload.startIndex self.stopIndex = payload.stopIndex elif payload is None or isinstance(payload, Token): self.token = payload else: raise TypeError(type(payload).__name__) def getToken(self): return self.token def dupNode(self): return CommonTree(self) def isNil(self): return self.token is None def getType(self): if self.token is None: return INVALID_TOKEN_TYPE return self.token.type type = property(getType) def getText(self): if self.token is None: return None return self.token.text text = property(getText) def getLine(self): if self.token is None or self.token.line == 0: if self.getChildCount(): return self.getChild(0).getLine() else: return 0 return self.token.line line = property(getLine) def getCharPositionInLine(self): if self.token is None or self.token.charPositionInLine == -1: if self.getChildCount(): return self.getChild(0).getCharPositionInLine() else: return 0 else: return self.token.charPositionInLine charPositionInLine = property(getCharPositionInLine) def getTokenStartIndex(self): if self.startIndex == -1 and self.token: return self.token.index return self.startIndex def setTokenStartIndex(self, index): self.startIndex = index tokenStartIndex = property(getTokenStartIndex, setTokenStartIndex) def getTokenStopIndex(self): if self.stopIndex == -1 and self.token: return self.token.index return self.stopIndex def setTokenStopIndex(self, index): self.stopIndex = index tokenStopIndex = property(getTokenStopIndex, setTokenStopIndex) def setUnknownTokenBoundaries(self): """For every node in this subtree, make sure it's start/stop token's are set. Walk depth first, visit bottom up. Only updates nodes with at least one token index < 0. """ if self.children is None: if self.startIndex < 0 or self.stopIndex < 0: self.startIndex = self.stopIndex = self.token.index return for child in self.children: child.setUnknownTokenBoundaries() if self.startIndex >= 0 and self.stopIndex >= 0: # already set return if self.children: firstChild = self.children[0] lastChild = self.children[-1] self.startIndex = firstChild.getTokenStartIndex() self.stopIndex = lastChild.getTokenStopIndex() def getChildIndex(self): #FIXME: mark as deprecated return self.childIndex def setChildIndex(self, idx): #FIXME: mark as deprecated self.childIndex = idx def getParent(self): #FIXME: mark as deprecated return self.parent def setParent(self, t): #FIXME: mark as deprecated self.parent = t def toString(self): if self.isNil(): return "nil" if self.getType() == INVALID_TOKEN_TYPE: return "" return self.token.text __str__ = toString def toStringTree(self): if not self.children: return self.toString() ret = '' if not self.isNil(): ret += '({!s} '.format(self) ret += ' '.join([child.toStringTree() for child in self.children]) if not self.isNil(): ret += ')' return ret INVALID_NODE = CommonTree(INVALID_TOKEN) class CommonErrorNode(CommonTree): """A node representing erroneous token range in token stream""" def __init__(self, input, start, stop, exc): CommonTree.__init__(self, None) if (stop is None or (stop.index < start.index and stop.type != EOF)): # sometimes resync does not consume a token (when LT(1) is # in follow set. So, stop will be 1 to left to start. adjust. # Also handle case where start is the first token and no token # is consumed during recovery; LT(-1) will return null. 
stop = start self.input = input self.start = start self.stop = stop self.trappedException = exc def isNil(self): return False def getType(self): return INVALID_TOKEN_TYPE def getText(self): if isinstance(self.start, Token): i = self.start.index j = self.stop.index if self.stop.type == EOF: j = self.input.size() badText = self.input.toString(i, j) elif isinstance(self.start, Tree): badText = self.input.toString(self.start, self.stop) else: # people should subclass if they alter the tree type so this # next one is for sure correct. badText = "" return badText def toString(self): if isinstance(self.trappedException, MissingTokenException): return ("") elif isinstance(self.trappedException, UnwantedTokenException): return ("") elif isinstance(self.trappedException, MismatchedTokenException): return ("") elif isinstance(self.trappedException, NoViableAltException): return ("") return "" __str__ = toString class CommonTreeAdaptor(BaseTreeAdaptor): """ @brief A TreeAdaptor that works with any Tree implementation. It provides really just factory methods; all the work is done by BaseTreeAdaptor. If you would like to have different tokens created than ClassicToken objects, you need to override this and then set the parser tree adaptor to use your subclass. To get your parser to build nodes of a different type, override create(Token), errorNode(), and to be safe, YourTreeClass.dupNode(). dupNode is called to duplicate nodes during rewrite operations. """ def dupNode(self, treeNode): """ Duplicate a node. This is part of the factory; override if you want another kind of node to be built. I could use reflection to prevent having to override this but reflection is slow. """ if treeNode is None: return None return treeNode.dupNode() def createWithPayload(self, payload): return CommonTree(payload) def createToken(self, fromToken=None, tokenType=None, text=None): """ Tell me how to create a token for use with imaginary token nodes. For example, there is probably no input symbol associated with imaginary token DECL, but you need to create it as a payload or whatever for the DECL node as in ^(DECL type ID). If you care what the token payload objects' type is, you should override this method and any other createToken variant. """ if fromToken is not None: return CommonToken(oldToken=fromToken) return CommonToken(type=tokenType, text=text) def setTokenBoundaries(self, t, startToken, stopToken): """ Track start/stop token for subtree root created for a rule. Only works with Tree nodes. For rules that match nothing, seems like this will yield start=i and stop=i-1 in a nil node. Might be useful info so I'll not force to be i..i. """ if t is None: return start = 0 stop = 0 if startToken is not None: start = startToken.index if stopToken is not None: stop = stopToken.index t.setTokenStartIndex(start) t.setTokenStopIndex(stop) def getTokenStartIndex(self, t): if t is None: return -1 return t.getTokenStartIndex() def getTokenStopIndex(self, t): if t is None: return -1 return t.getTokenStopIndex() def getText(self, t): if t is None: return None return t.text def getType(self, t): if t is None: return INVALID_TOKEN_TYPE return t.type def getToken(self, t): """ What is the Token associated with this node? If you are not using CommonTree, then you must override this in your own adaptor. 
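        A minimal override sketch for a custom node type (MyAdaptor and the
        'payload' attribute are purely illustrative, not part of this
        runtime):

            class MyAdaptor(CommonTreeAdaptor):
                def getToken(self, t):
                    # our hypothetical node keeps its Token in 'payload'
                    return t.payload if t is not None else None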
""" if isinstance(t, CommonTree): return t.getToken() return None # no idea what to do def getChild(self, t, i): if t is None: return None return t.getChild(i) def getChildCount(self, t): if t is None: return 0 return t.getChildCount() def getParent(self, t): return t.getParent() def setParent(self, t, parent): t.setParent(parent) def getChildIndex(self, t): if t is None: return 0 return t.getChildIndex() def setChildIndex(self, t, index): t.setChildIndex(index) def replaceChildren(self, parent, startChildIndex, stopChildIndex, t): if parent is not None: parent.replaceChildren(startChildIndex, stopChildIndex, t) ############################################################################ # # streams # # TreeNodeStream # \- BaseTree # \- CommonTree # # TreeAdaptor # \- BaseTreeAdaptor # \- CommonTreeAdaptor # ############################################################################ class TreeNodeStream(IntStream): """@brief A stream of tree nodes It accessing nodes from a tree of some kind. """ # TreeNodeStream is abstract, no need to complain about not implemented # abstract methods # pylint: disable-msg=W0223 def get(self, i): """Get a tree node at an absolute index i; 0..n-1. If you don't want to buffer up nodes, then this method makes no sense for you. """ raise NotImplementedError def LT(self, k): """ Get tree node at current input pointer + i ahead where i=1 is next node. i<0 indicates nodes in the past. So LT(-1) is previous node, but implementations are not required to provide results for k < -1. LT(0) is undefined. For i>=n, return null. Return null for LT(0) and any index that results in an absolute address that is negative. This is analogus to the LT() method of the TokenStream, but this returns a tree node instead of a token. Makes code gen identical for both parser and tree grammars. :) """ raise NotImplementedError def getTreeSource(self): """ Where is this stream pulling nodes from? This is not the name, but the object that provides node objects. """ raise NotImplementedError def getTokenStream(self): """ If the tree associated with this stream was created from a TokenStream, you can specify it here. Used to do rule $text attribute in tree parser. Optional unless you use tree parser rule text attribute or output=template and rewrite=true options. """ raise NotImplementedError def getTreeAdaptor(self): """ What adaptor can tell me how to interpret/navigate nodes and trees. E.g., get text of a node. """ raise NotImplementedError def setUniqueNavigationNodes(self, uniqueNavigationNodes): """ As we flatten the tree, we use UP, DOWN nodes to represent the tree structure. When debugging we need unique nodes so we have to instantiate new ones. When doing normal tree parsing, it's slow and a waste of memory to create unique navigation nodes. Default should be false; """ raise NotImplementedError def reset(self): """ Reset the tree node stream in such a way that it acts like a freshly constructed stream. """ raise NotImplementedError def toString(self, start, stop): """ Return the text of all nodes from start to stop, inclusive. If the stream does not buffer all the nodes then it can still walk recursively from start until stop. You can always return null or "" too, but users should not access $ruleLabel.text in an action of course in that case. """ raise NotImplementedError # REWRITING TREES (used by tree parser) def replaceChildren(self, parent, startChildIndex, stopChildIndex, t): """ Replace from start to stop child index of parent with t, which might be a list. 
Number of children may be different after this call. The stream is notified because it is walking the tree and might need to know you are monkeying with the underlying tree. Also, it might be able to modify the node stream to avoid restreaming for future phases. If parent is null, don't do anything; must be at root of overall tree. Can't replace whatever points to the parent externally. Do nothing. """ raise NotImplementedError class CommonTreeNodeStream(TreeNodeStream): """@brief A buffered stream of tree nodes. Nodes can be from a tree of ANY kind. This node stream sucks all nodes out of the tree specified in the constructor during construction and makes pointers into the tree using an array of Object pointers. The stream necessarily includes pointers to DOWN and UP and EOF nodes. This stream knows how to mark/release for backtracking. This stream is most suitable for tree interpreters that need to jump around a lot or for tree parsers requiring speed (at cost of memory). There is some duplicated functionality here with UnBufferedTreeNodeStream but just in bookkeeping, not tree walking etc... @see UnBufferedTreeNodeStream """ def __init__(self, *args): TreeNodeStream.__init__(self) if len(args) == 1: adaptor = CommonTreeAdaptor() tree = args[0] nodes = None down = None up = None eof = None elif len(args) == 2: adaptor = args[0] tree = args[1] nodes = None down = None up = None eof = None elif len(args) == 3: parent = args[0] start = args[1] stop = args[2] adaptor = parent.adaptor tree = parent.root nodes = parent.nodes[start:stop] down = parent.down up = parent.up eof = parent.eof else: raise TypeError("Invalid arguments") # all these navigation nodes are shared and hence they # cannot contain any line/column info if down is not None: self.down = down else: self.down = adaptor.createFromType(DOWN, "DOWN") if up is not None: self.up = up else: self.up = adaptor.createFromType(UP, "UP") if eof is not None: self.eof = eof else: self.eof = adaptor.createFromType(EOF, "EOF") # The complete mapping from stream index to tree node. # This buffer includes pointers to DOWN, UP, and EOF nodes. # It is built upon ctor invocation. The elements are type # Object as we don't what the trees look like. # Load upon first need of the buffer so we can set token types # of interest for reverseIndexing. Slows us down a wee bit to # do all of the if p==-1 testing everywhere though. if nodes is not None: self.nodes = nodes else: self.nodes = [] # Pull nodes from which tree? self.root = tree # IF this tree (root) was created from a token stream, track it. self.tokens = None # What tree adaptor was used to build these trees self.adaptor = adaptor # Reuse same DOWN, UP navigation nodes unless this is true self.uniqueNavigationNodes = False # The index into the nodes list of the current node (next node # to consume). If -1, nodes array not filled yet. self.p = -1 # Track the last mark() call result value for use in rewind(). self.lastMarker = None # Stack of indexes used for push/pop calls self.calls = [] def fillBuffer(self): """Walk tree with depth-first-search and fill nodes buffer. Don't do DOWN, UP nodes if its a list (t is isNil). 
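        Clients normally never call this directly; the buffer is filled
        lazily on first use. As an illustration (token names are arbitrary),
        a tree ^(A B C) is flattened into [A, DOWN, B, C, UP]:

            nodes = CommonTreeNodeStream(tree)
            print(nodes)  # token types of all buffered nodes, incl. DOWN/UP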
""" self._fillBuffer(self.root) self.p = 0 # buffer of nodes intialized now def _fillBuffer(self, t): nil = self.adaptor.isNil(t) if not nil: self.nodes.append(t) # add this node # add DOWN node if t has children n = self.adaptor.getChildCount(t) if not nil and n > 0: self.addNavigationNode(DOWN) # and now add all its children for c in range(n): self._fillBuffer(self.adaptor.getChild(t, c)) # add UP node if t has children if not nil and n > 0: self.addNavigationNode(UP) def getNodeIndex(self, node): """What is the stream index for node? 0..n-1 Return -1 if node not found. """ if self.p == -1: self.fillBuffer() for i, t in enumerate(self.nodes): if t == node: return i return -1 def addNavigationNode(self, ttype): """ As we flatten the tree, we use UP, DOWN nodes to represent the tree structure. When debugging we need unique nodes so instantiate new ones when uniqueNavigationNodes is true. """ navNode = None if ttype == DOWN: if self.hasUniqueNavigationNodes(): navNode = self.adaptor.createFromType(DOWN, "DOWN") else: navNode = self.down else: if self.hasUniqueNavigationNodes(): navNode = self.adaptor.createFromType(UP, "UP") else: navNode = self.up self.nodes.append(navNode) def get(self, i): if self.p == -1: self.fillBuffer() return self.nodes[i] def LT(self, k): if self.p == -1: self.fillBuffer() if k == 0: return None if k < 0: return self.LB(-k) if self.p + k - 1 >= len(self.nodes): return self.eof return self.nodes[self.p + k - 1] def getCurrentSymbol(self): return self.LT(1) def LB(self, k): """Look backwards k nodes""" if k == 0: return None if self.p - k < 0: return None return self.nodes[self.p - k] def isEOF(self, obj): return self.adaptor.getType(obj) == EOF def getTreeSource(self): return self.root def getSourceName(self): return self.getTokenStream().getSourceName() def getTokenStream(self): return self.tokens def setTokenStream(self, tokens): self.tokens = tokens def getTreeAdaptor(self): return self.adaptor def hasUniqueNavigationNodes(self): return self.uniqueNavigationNodes def setUniqueNavigationNodes(self, uniqueNavigationNodes): self.uniqueNavigationNodes = uniqueNavigationNodes def consume(self): if self.p == -1: self.fillBuffer() self.p += 1 def LA(self, i): return self.adaptor.getType(self.LT(i)) def mark(self): if self.p == -1: self.fillBuffer() self.lastMarker = self.index() return self.lastMarker def release(self, marker=None): # no resources to release pass def index(self): return self.p def rewind(self, marker=None): if marker is None: marker = self.lastMarker self.seek(marker) def seek(self, index): if self.p == -1: self.fillBuffer() self.p = index def push(self, index): """ Make stream jump to a new location, saving old location. Switch back with pop(). """ self.calls.append(self.p) # save current index self.seek(index) def pop(self): """ Seek back to previous index saved during last push() call. Return top of stack (return index). 
""" ret = self.calls.pop(-1) self.seek(ret) return ret def reset(self): self.p = 0 self.lastMarker = 0 self.calls = [] def size(self): if self.p == -1: self.fillBuffer() return len(self.nodes) # TREE REWRITE INTERFACE def replaceChildren(self, parent, startChildIndex, stopChildIndex, t): if parent is not None: self.adaptor.replaceChildren( parent, startChildIndex, stopChildIndex, t ) def __str__(self): """Used for testing, just return the token type stream""" if self.p == -1: self.fillBuffer() return ' '.join([str(self.adaptor.getType(node)) for node in self.nodes ]) def toString(self, start, stop): if start is None or stop is None: return None if self.p == -1: self.fillBuffer() #System.out.println("stop: "+stop); #if ( start instanceof CommonTree ) # System.out.print("toString: "+((CommonTree)start).getToken()+", "); #else # System.out.println(start); #if ( stop instanceof CommonTree ) # System.out.println(((CommonTree)stop).getToken()); #else # System.out.println(stop); # if we have the token stream, use that to dump text in order if self.tokens is not None: beginTokenIndex = self.adaptor.getTokenStartIndex(start) endTokenIndex = self.adaptor.getTokenStopIndex(stop) # if it's a tree, use start/stop index from start node # else use token range from start/stop nodes if self.adaptor.getType(stop) == UP: endTokenIndex = self.adaptor.getTokenStopIndex(start) elif self.adaptor.getType(stop) == EOF: endTokenIndex = self.size() -2 # don't use EOF return self.tokens.toString(beginTokenIndex, endTokenIndex) # walk nodes looking for start i, t = 0, None for i, t in enumerate(self.nodes): if t == start: break # now walk until we see stop, filling string buffer with text buf = [] t = self.nodes[i] while t != stop: text = self.adaptor.getText(t) if text is None: text = " " + self.adaptor.getType(t) buf.append(text) i += 1 t = self.nodes[i] # include stop node too text = self.adaptor.getText(stop) if text is None: text = " " +self.adaptor.getType(stop) buf.append(text) return ''.join(buf) ## iterator interface def __iter__(self): if self.p == -1: self.fillBuffer() for node in self.nodes: yield node ############################################################################# # # tree parser # ############################################################################# class TreeParser(BaseRecognizer): """@brief Baseclass for generated tree parsers. A parser for a stream of tree nodes. "tree grammars" result in a subclass of this. All the error reporting and recovery is shared with Parser via the BaseRecognizer superclass. 
""" def __init__(self, input, state=None): BaseRecognizer.__init__(self, state) self.input = None self.setTreeNodeStream(input) def reset(self): BaseRecognizer.reset(self) # reset all recognizer state variables if self.input is not None: self.input.seek(0) # rewind the input def setTreeNodeStream(self, input): """Set the input stream""" self.input = input def getTreeNodeStream(self): return self.input def getSourceName(self): return self.input.getSourceName() def getCurrentInputSymbol(self, input): return input.LT(1) def getMissingSymbol(self, input, e, expectedTokenType, follow): tokenText = "" adaptor = input.adaptor return adaptor.createToken( CommonToken(type=expectedTokenType, text=tokenText)) # precompiled regex used by inContext dotdot = ".*[^.]\\.\\.[^.].*" doubleEtc = ".*\\.\\.\\.\\s+\\.\\.\\..*" dotdotPattern = re.compile(dotdot) doubleEtcPattern = re.compile(doubleEtc) def inContext(self, context, adaptor=None, tokenName=None, t=None): """Check if current node in input has a context. Context means sequence of nodes towards root of tree. For example, you might say context is "MULT" which means my parent must be MULT. "CLASS VARDEF" says current node must be child of a VARDEF and whose parent is a CLASS node. You can use "..." to mean zero-or-more nodes. "METHOD ... VARDEF" means my parent is VARDEF and somewhere above that is a METHOD node. The first node in the context is not necessarily the root. The context matcher stops matching and returns true when it runs out of context. There is no way to force the first node to be the root. """ return self._inContext( self.input.getTreeAdaptor(), self.tokenNames, self.input.LT(1), context) @classmethod def _inContext(cls, adaptor, tokenNames, t, context): """The worker for inContext. It's static and full of parameters for testing purposes. """ if cls.dotdotPattern.match(context): # don't allow "..", must be "..." raise ValueError("invalid syntax: ..") if cls.doubleEtcPattern.match(context): # don't allow double "..." raise ValueError("invalid syntax: ... ...") # ensure spaces around ... context = context.replace("...", " ... ") context = context.strip() nodes = context.split() ni = len(nodes) - 1 t = adaptor.getParent(t) while ni >= 0 and t is not None: if nodes[ni] == "...": # walk upwards until we see nodes[ni-1] then continue walking if ni == 0: # ... at start is no-op return True goal = nodes[ni-1] ancestor = cls._getAncestor(adaptor, tokenNames, t, goal) if ancestor is None: return False t = ancestor ni -= 1 name = tokenNames[adaptor.getType(t)] if name != nodes[ni]: return False # advance to parent and to previous element in context node list ni -= 1 t = adaptor.getParent(t) # at root but more nodes to match if t is None and ni >= 0: return False return True @staticmethod def _getAncestor(adaptor, tokenNames, t, goal): """Helper for static inContext.""" while t is not None: name = tokenNames[adaptor.getType(t)] if name == goal: return t t = adaptor.getParent(t) return None def matchAny(self): """ Match '.' in tree parser has special meaning. Skip node or entire tree if node has children. If children, scan until corresponding UP node. """ self._state.errorRecovery = False look = self.input.LT(1) if self.input.getTreeAdaptor().getChildCount(look) == 0: self.input.consume() # not subtree, consume 1 node and return return # current node is a subtree, skip to corresponding UP. 
# must count nesting level to get right UP level = 0 tokenType = self.input.getTreeAdaptor().getType(look) while tokenType != EOF and not (tokenType == UP and level==0): self.input.consume() look = self.input.LT(1) tokenType = self.input.getTreeAdaptor().getType(look) if tokenType == DOWN: level += 1 elif tokenType == UP: level -= 1 self.input.consume() # consume UP def mismatch(self, input, ttype, follow): """ We have DOWN/UP nodes in the stream that have no line info; override. plus we want to alter the exception type. Don't try to recover from tree parser errors inline... """ raise MismatchedTreeNodeException(ttype, input) def getErrorHeader(self, e): """ Prefix error message with the grammar name because message is always intended for the programmer because the parser built the input tree not the user. """ return (self.getGrammarFileName() + ": node from {}line {}:{}".format( "after " if e.approximateLineInfo else '', e.line, e.charPositionInLine)) def getErrorMessage(self, e): """ Tree parsers parse nodes they usually have a token object as payload. Set the exception token and do the default behavior. """ if isinstance(self, TreeParser): adaptor = e.input.getTreeAdaptor() e.token = adaptor.getToken(e.node) if e.token is not None: # could be an UP/DOWN node e.token = CommonToken( type=adaptor.getType(e.node), text=adaptor.getText(e.node) ) return BaseRecognizer.getErrorMessage(self, e) def traceIn(self, ruleName, ruleIndex): BaseRecognizer.traceIn(self, ruleName, ruleIndex, self.input.LT(1)) def traceOut(self, ruleName, ruleIndex): BaseRecognizer.traceOut(self, ruleName, ruleIndex, self.input.LT(1)) ############################################################################# # # tree visitor # ############################################################################# class TreeVisitor(object): """Do a depth first walk of a tree, applying pre() and post() actions we go. """ def __init__(self, adaptor=None): if adaptor is not None: self.adaptor = adaptor else: self.adaptor = CommonTreeAdaptor() def visit(self, t, pre_action=None, post_action=None): """Visit every node in tree t and trigger an action for each node before/after having visited all of its children. Bottom up walk. Execute both actions even if t has no children. Ignore return results from transforming children since they will have altered the child list of this node (their parent). Return result of applying post action to this node. The Python version differs from the Java version by taking two callables 'pre_action' and 'post_action' instead of a class instance that wraps those methods. Those callables must accept a TreeNode as their single argument and return the (potentially transformed or replaced) TreeNode. """ isNil = self.adaptor.isNil(t) if pre_action is not None and not isNil: # if rewritten, walk children of new t t = pre_action(t) idx = 0 while idx < self.adaptor.getChildCount(t): child = self.adaptor.getChild(t, idx) self.visit(child, pre_action, post_action) idx += 1 if post_action is not None and not isNil: t = post_action(t) return t ############################################################################# # # tree iterator # ############################################################################# class TreeIterator(object): """ Return a node stream from a doubly-linked tree whose nodes know what child index they are. Emit navigation nodes (DOWN, UP, and EOF) to let show tree structure. 
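    A small sketch (assuming 'tree' is a CommonTree built elsewhere):

        for node in TreeIterator(tree):
            print(node)  # includes DOWN/UP markers and a trailing EOF node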
""" def __init__(self, tree, adaptor=None): if adaptor is None: adaptor = CommonTreeAdaptor() self.root = tree self.adaptor = adaptor self.first_time = True self.tree = tree # If we emit UP/DOWN nodes, we need to spit out multiple nodes per # next() call. self.nodes = [] # navigation nodes to return during walk and at end self.down = adaptor.createFromType(DOWN, "DOWN") self.up = adaptor.createFromType(UP, "UP") self.eof = adaptor.createFromType(EOF, "EOF") def reset(self): self.first_time = True self.tree = self.root self.nodes = [] def __iter__(self): return self def has_next(self): if self.first_time: return self.root is not None if len(self.nodes) > 0: return True if self.tree is None: return False if self.adaptor.getChildCount(self.tree) > 0: return True # back at root? return self.adaptor.getParent(self.tree) is not None def __next__(self): if not self.has_next(): raise StopIteration if self.first_time: # initial condition self.first_time = False if self.adaptor.getChildCount(self.tree) == 0: # single node tree (special) self.nodes.append(self.eof) return self.tree return self.tree # if any queued up, use those first if len(self.nodes) > 0: return self.nodes.pop(0) # no nodes left? if self.tree is None: return self.eof # next node will be child 0 if any children if self.adaptor.getChildCount(self.tree) > 0: self.tree = self.adaptor.getChild(self.tree, 0) # real node is next after DOWN self.nodes.append(self.tree) return self.down # if no children, look for next sibling of tree or ancestor parent = self.adaptor.getParent(self.tree) # while we're out of siblings, keep popping back up towards root while (parent is not None and self.adaptor.getChildIndex(self.tree)+1 >= self.adaptor.getChildCount(parent)): # we're moving back up self.nodes.append(self.up) self.tree = parent parent = self.adaptor.getParent(self.tree) # no nodes left? if parent is None: self.tree = None # back at root? nothing left then self.nodes.append(self.eof) # add to queue, might have UP nodes in there return self.nodes.pop(0) # must have found a node with an unvisited sibling # move to it and return it nextSiblingIndex = self.adaptor.getChildIndex(self.tree) + 1 self.tree = self.adaptor.getChild(parent, nextSiblingIndex) self.nodes.append(self.tree) # add to queue, might have UP nodes in there return self.nodes.pop(0) ############################################################################# # # streams for rule rewriting # ############################################################################# class RewriteRuleElementStream(object): """@brief Internal helper class. A generic list of elements tracked in an alternative to be used in a -> rewrite rule. We need to subclass to fill in the next() method, which returns either an AST node wrapped around a token payload or an existing subtree. Once you start next()ing, do not try to add more elements. It will break the cursor tracking I believe. @see org.antlr.runtime.tree.RewriteRuleSubtreeStream @see org.antlr.runtime.tree.RewriteRuleTokenStream TODO: add mechanism to detect/puke on modification after reading from stream """ def __init__(self, adaptor, elementDescription, elements=None): # Cursor 0..n-1. If singleElement!=null, cursor is 0 until you next(), # which bumps it to 1 meaning no more elements. self.cursor = 0 # Track single elements w/o creating a list. 
Upon 2nd add, alloc list self.singleElement = None # The list of tokens or subtrees we are tracking self.elements = None # Once a node / subtree has been used in a stream, it must be dup'd # from then on. Streams are reset after subrules so that the streams # can be reused in future subrules. So, reset must set a dirty bit. # If dirty, then next() always returns a dup. self.dirty = False # The element or stream description; usually has name of the token or # rule reference that this list tracks. Can include rulename too, but # the exception would track that info. self.elementDescription = elementDescription self.adaptor = adaptor if isinstance(elements, (list, tuple)): # Create a stream, but feed off an existing list self.singleElement = None self.elements = elements else: # Create a stream with one element self.add(elements) def reset(self): """ Reset the condition of this stream so that it appears we have not consumed any of its elements. Elements themselves are untouched. Once we reset the stream, any future use will need duplicates. Set the dirty bit. """ self.cursor = 0 self.dirty = True def add(self, el): if el is None: return if self.elements is not None: # if in list, just add self.elements.append(el) return if self.singleElement is None: # no elements yet, track w/o list self.singleElement = el return # adding 2nd element, move to list self.elements = [] self.elements.append(self.singleElement) self.singleElement = None self.elements.append(el) def nextTree(self): """ Return the next element in the stream. If out of elements, throw an exception unless size()==1. If size is 1, then return elements[0]. Return a duplicate node/subtree if stream is out of elements and size==1. If we've already used the element, dup (dirty bit set). """ if (self.dirty or (self.cursor >= len(self) and len(self) == 1) ): # if out of elements and size is 1, dup el = self._next() return self.dup(el) # test size above then fetch el = self._next() return el def _next(self): """ do the work of getting the next element, making sure that it's a tree node or subtree. Deal with the optimization of single- element list versus list of size > 1. Throw an exception if the stream is empty or we're out of elements and size>1. protected so you can override in a subclass if necessary. """ if len(self) == 0: raise RewriteEmptyStreamException(self.elementDescription) if self.cursor >= len(self): # out of elements? if len(self) == 1: # if size is 1, it's ok; return and we'll dup return self.toTree(self.singleElement) # out of elements and size was not 1, so we can't dup raise RewriteCardinalityException(self.elementDescription) # we have elements if self.singleElement is not None: self.cursor += 1 # move cursor even for single element list return self.toTree(self.singleElement) # must have more than one in list, pull from elements o = self.toTree(self.elements[self.cursor]) self.cursor += 1 return o def dup(self, el): """ When constructing trees, sometimes we need to dup a token or AST subtree. Dup'ing a token means just creating another AST node around it. For trees, you must call the adaptor.dupTree() unless the element is for a tree root; then it must be a node dup. """ raise NotImplementedError def toTree(self, el): """ Ensure stream emits trees; tokens must be converted to AST nodes. AST nodes can be passed through unmolested. 
""" return el def hasNext(self): return ( (self.singleElement is not None and self.cursor < 1) or (self.elements is not None and self.cursor < len(self.elements) ) ) def size(self): if self.singleElement is not None: return 1 if self.elements is not None: return len(self.elements) return 0 __len__ = size def getDescription(self): """Deprecated. Directly access elementDescription attribute""" return self.elementDescription class RewriteRuleTokenStream(RewriteRuleElementStream): """@brief Internal helper class.""" def toTree(self, el): # Don't convert to a tree unless they explicitly call nextTree. # This way we can do hetero tree nodes in rewrite. return el def nextNode(self): t = self._next() return self.adaptor.createWithPayload(t) def nextToken(self): return self._next() def dup(self, el): raise TypeError("dup can't be called for a token stream.") class RewriteRuleSubtreeStream(RewriteRuleElementStream): """@brief Internal helper class.""" def nextNode(self): """ Treat next element as a single node even if it's a subtree. This is used instead of next() when the result has to be a tree root node. Also prevents us from duplicating recently-added children; e.g., ^(type ID)+ adds ID to type and then 2nd iteration must dup the type node, but ID has been added. Referencing a rule result twice is ok; dup entire tree as we can't be adding trees as root; e.g., expr expr. Hideous code duplication here with super.next(). Can't think of a proper way to refactor. This needs to always call dup node and super.next() doesn't know which to call: dup node or dup tree. """ if (self.dirty or (self.cursor >= len(self) and len(self) == 1) ): # if out of elements and size is 1, dup (at most a single node # since this is for making root nodes). el = self._next() return self.adaptor.dupNode(el) # test size above then fetch el = self._next() while self.adaptor.isNil(el) and self.adaptor.getChildCount(el) == 1: el = self.adaptor.getChild(el, 0) # dup just the root (want node here) return self.adaptor.dupNode(el) def dup(self, el): return self.adaptor.dupTree(el) class RewriteRuleNodeStream(RewriteRuleElementStream): """ Queues up nodes matched on left side of -> in a tree parser. This is the analog of RewriteRuleTokenStream for normal parsers. """ def nextNode(self): return self._next() def toTree(self, el): return self.adaptor.dupNode(el) def dup(self, el): # we dup every node, so don't have to worry about calling dup; short- #circuited next() so it doesn't call. raise TypeError("dup can't be called for a node stream.") class TreeRuleReturnScope(RuleReturnScope): """ This is identical to the ParserRuleReturnScope except that the start property is a tree nodes not Token object when you are parsing trees. To be generic the tree node types have to be Object. """ def __init__(self): super().__init__() self.start = None self.tree = None def getStart(self): return self.start def getTree(self): return self.tree python3-antlr3-3.5.2/antlr3/treewizard.py000066400000000000000000000433021324200532700202630ustar00rootroot00000000000000""" @package antlr3.tree @brief ANTLR3 runtime package, treewizard module A utility module to create ASTs at runtime. See for an overview. Note that the API of the Python implementation is slightly different. """ # begin[licence] # # [The "BSD licence"] # Copyright (c) 2005-2012 Terence Parr # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. 
Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. The name of the author may not be used to endorse or promote products # derived from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES # OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. # IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT # NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF # THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # end[licence] from .constants import INVALID_TOKEN_TYPE from .tokens import CommonToken from .tree import CommonTree, CommonTreeAdaptor def computeTokenTypes(tokenNames): """ Compute a dict that is an inverted index of tokenNames (which maps int token types to names). """ if tokenNames: return dict((name, type) for type, name in enumerate(tokenNames)) return {} ## token types for pattern parser EOF = -1 BEGIN = 1 END = 2 ID = 3 ARG = 4 PERCENT = 5 COLON = 6 DOT = 7 class TreePatternLexer(object): def __init__(self, pattern): ## The tree pattern to lex like "(A B C)" self.pattern = pattern ## Index into input string self.p = -1 ## Current char self.c = None ## How long is the pattern in char? 
self.n = len(pattern) ## Set when token type is ID or ARG self.sval = None self.error = False self.consume() __idStartChar = frozenset( 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_' ) __idChar = __idStartChar | frozenset('0123456789') def nextToken(self): self.sval = "" while self.c != EOF: if self.c in (' ', '\n', '\r', '\t'): self.consume() continue if self.c in self.__idStartChar: self.sval += self.c self.consume() while self.c in self.__idChar: self.sval += self.c self.consume() return ID if self.c == '(': self.consume() return BEGIN if self.c == ')': self.consume() return END if self.c == '%': self.consume() return PERCENT if self.c == ':': self.consume() return COLON if self.c == '.': self.consume() return DOT if self.c == '[': # grab [x] as a string, returning x self.consume() while self.c != ']': if self.c == '\\': self.consume() if self.c != ']': self.sval += '\\' self.sval += self.c else: self.sval += self.c self.consume() self.consume() return ARG self.consume() self.error = True return EOF return EOF def consume(self): self.p += 1 if self.p >= self.n: self.c = EOF else: self.c = self.pattern[self.p] class TreePatternParser(object): def __init__(self, tokenizer, wizard, adaptor): self.tokenizer = tokenizer self.wizard = wizard self.adaptor = adaptor self.ttype = tokenizer.nextToken() # kickstart def pattern(self): if self.ttype == BEGIN: return self.parseTree() elif self.ttype == ID: node = self.parseNode() if self.ttype == EOF: return node return None # extra junk on end return None def parseTree(self): if self.ttype != BEGIN: return None self.ttype = self.tokenizer.nextToken() root = self.parseNode() if root is None: return None while self.ttype in (BEGIN, ID, PERCENT, DOT): if self.ttype == BEGIN: subtree = self.parseTree() self.adaptor.addChild(root, subtree) else: child = self.parseNode() if child is None: return None self.adaptor.addChild(root, child) if self.ttype != END: return None self.ttype = self.tokenizer.nextToken() return root def parseNode(self): # "%label:" prefix label = None if self.ttype == PERCENT: self.ttype = self.tokenizer.nextToken() if self.ttype != ID: return None label = self.tokenizer.sval self.ttype = self.tokenizer.nextToken() if self.ttype != COLON: return None self.ttype = self.tokenizer.nextToken() # move to ID following colon # Wildcard? if self.ttype == DOT: self.ttype = self.tokenizer.nextToken() wildcardPayload = CommonToken(0, ".") node = WildcardTreePattern(wildcardPayload) if label is not None: node.label = label return node # "ID" or "ID[arg]" if self.ttype != ID: return None tokenName = self.tokenizer.sval self.ttype = self.tokenizer.nextToken() if tokenName == "nil": return self.adaptor.nil() text = tokenName # check for arg arg = None if self.ttype == ARG: arg = self.tokenizer.sval text = arg self.ttype = self.tokenizer.nextToken() # create node treeNodeType = self.wizard.getTokenType(tokenName) if treeNodeType == INVALID_TOKEN_TYPE: return None node = self.adaptor.createFromType(treeNodeType, text) if label is not None and isinstance(node, TreePattern): node.label = label if arg is not None and isinstance(node, TreePattern): node.hasTextArg = True return node class TreePattern(CommonTree): """ When using %label:TOKENNAME in a tree for parse(), we must track the label. 
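    For example, the pattern "(ASSIGN %lhs:ID %rhs:.)" is parsed into
    TreePattern nodes where the ID child carries label "lhs" and the
    wildcard child carries label "rhs".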
""" def __init__(self, payload): super().__init__(payload) self.label = None self.hasTextArg = None def toString(self): if self.label: return '%' + self.label + ':' + super().toString() else: return super().toString() class WildcardTreePattern(TreePattern): pass class TreePatternTreeAdaptor(CommonTreeAdaptor): """This adaptor creates TreePattern objects for use during scan()""" def createWithPayload(self, payload): return TreePattern(payload) class TreeWizard(object): """ Build and navigate trees with this object. Must know about the names of tokens so you have to pass in a map or array of token names (from which this class can build the map). I.e., Token DECL means nothing unless the class can translate it to a token type. In order to create nodes and navigate, this class needs a TreeAdaptor. This class can build a token type -> node index for repeated use or for iterating over the various nodes with a particular type. This class works in conjunction with the TreeAdaptor rather than moving all this functionality into the adaptor. An adaptor helps build and navigate trees using methods. This class helps you do it with string patterns like "(A B C)". You can create a tree from that pattern or match subtrees against it. """ def __init__(self, adaptor=None, tokenNames=None, typeMap=None): if adaptor is None: self.adaptor = CommonTreeAdaptor() else: self.adaptor = adaptor if typeMap is None: self.tokenNameToTypeMap = computeTokenTypes(tokenNames) else: if tokenNames: raise ValueError("Can't have both tokenNames and typeMap") self.tokenNameToTypeMap = typeMap def getTokenType(self, tokenName): """Using the map of token names to token types, return the type.""" if tokenName in self.tokenNameToTypeMap: return self.tokenNameToTypeMap[tokenName] else: return INVALID_TOKEN_TYPE def create(self, pattern): """ Create a tree or node from the indicated tree pattern that closely follows ANTLR tree grammar tree element syntax: (root child1 ... child2). You can also just pass in a node: ID Any node can have a text argument: ID[foo] (notice there are no quotes around foo--it's clear it's a string). nil is a special name meaning "give me a nil node". Useful for making lists: (nil A B C) is a list of A B C. """ tokenizer = TreePatternLexer(pattern) parser = TreePatternParser(tokenizer, self, self.adaptor) return parser.pattern() def index(self, tree): """Walk the entire tree and make a node name to nodes mapping. For now, use recursion but later nonrecursive version may be more efficient. Returns a dict int -> list where the list is of your AST node type. The int is the token type of the node. """ m = {} self._index(tree, m) return m def _index(self, t, m): """Do the work for index""" if t is None: return ttype = self.adaptor.getType(t) elements = m.get(ttype) if elements is None: m[ttype] = elements = [] elements.append(t) for i in range(self.adaptor.getChildCount(t)): child = self.adaptor.getChild(t, i) self._index(child, m) def find(self, tree, what): """Return a list of matching token. what may either be an integer specifzing the token type to find or a string with a pattern that must be matched. 
""" if isinstance(what, int): return self._findTokenType(tree, what) elif isinstance(what, str): return self._findPattern(tree, what) else: raise TypeError("'what' must be string or integer") def _findTokenType(self, t, ttype): """Return a List of tree nodes with token type ttype""" nodes = [] def visitor(tree, parent, childIndex, labels): nodes.append(tree) self.visit(t, ttype, visitor) return nodes def _findPattern(self, t, pattern): """Return a List of subtrees matching pattern.""" subtrees = [] # Create a TreePattern from the pattern tokenizer = TreePatternLexer(pattern) parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor()) tpattern = parser.pattern() # don't allow invalid patterns if (tpattern is None or tpattern.isNil() or isinstance(tpattern, WildcardTreePattern)): return None rootTokenType = tpattern.getType() def visitor(tree, parent, childIndex, label): if self._parse(tree, tpattern, None): subtrees.append(tree) self.visit(t, rootTokenType, visitor) return subtrees def visit(self, tree, what, visitor): """Visit every node in tree matching what, invoking the visitor. If what is a string, it is parsed as a pattern and only matching subtrees will be visited. The implementation uses the root node of the pattern in combination with visit(t, ttype, visitor) so nil-rooted patterns are not allowed. Patterns with wildcard roots are also not allowed. If what is an integer, it is used as a token type and visit will match all nodes of that type (this is faster than the pattern match). The labels arg of the visitor action method is never set (it's None) since using a token type rather than a pattern doesn't let us set a label. """ if isinstance(what, int): self._visitType(tree, None, 0, what, visitor) elif isinstance(what, str): self._visitPattern(tree, what, visitor) else: raise TypeError("'what' must be string or integer") def _visitType(self, t, parent, childIndex, ttype, visitor): """Do the recursive work for visit""" if t is None: return if self.adaptor.getType(t) == ttype: visitor(t, parent, childIndex, None) for i in range(self.adaptor.getChildCount(t)): child = self.adaptor.getChild(t, i) self._visitType(child, t, i, ttype, visitor) def _visitPattern(self, tree, pattern, visitor): """ For all subtrees that match the pattern, execute the visit action. """ # Create a TreePattern from the pattern tokenizer = TreePatternLexer(pattern) parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor()) tpattern = parser.pattern() # don't allow invalid patterns if (tpattern is None or tpattern.isNil() or isinstance(tpattern, WildcardTreePattern)): return rootTokenType = tpattern.getType() def rootvisitor(tree, parent, childIndex, labels): labels = {} if self._parse(tree, tpattern, labels): visitor(tree, parent, childIndex, labels) self.visit(tree, rootTokenType, rootvisitor) def parse(self, t, pattern, labels=None): """ Given a pattern like (ASSIGN %lhs:ID %rhs:.) with optional labels on the various nodes and '.' (dot) as the node/subtree wildcard, return true if the pattern matches and fill the labels Map with the labels pointing at the appropriate nodes. Return false if the pattern is malformed or the tree does not match. If a node specifies a text arg in pattern, then that must match for that node in t. """ tokenizer = TreePatternLexer(pattern) parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor()) tpattern = parser.pattern() return self._parse(t, tpattern, labels) def _parse(self, t1, tpattern, labels): """ Do the work for parse. 
Check to see if the tpattern fits the structure and token types in t1. Check text if the pattern has text arguments on nodes. Fill labels map with pointers to nodes in tree matched against nodes in pattern with labels. """ # make sure both are non-null if t1 is None or tpattern is None: return False # check roots (wildcard matches anything) if not isinstance(tpattern, WildcardTreePattern): if self.adaptor.getType(t1) != tpattern.getType(): return False # if pattern has text, check node text if (tpattern.hasTextArg and self.adaptor.getText(t1) != tpattern.getText()): return False if tpattern.label is not None and labels is not None: # map label in pattern to node in t1 labels[tpattern.label] = t1 # check children n1 = self.adaptor.getChildCount(t1) n2 = tpattern.getChildCount() if n1 != n2: return False for i in range(n1): child1 = self.adaptor.getChild(t1, i) child2 = tpattern.getChild(i) if not self._parse(child1, child2, labels): return False return True def equals(self, t1, t2, adaptor=None): """ Compare t1 and t2; return true if token types/text, structure match exactly. The trees are examined in their entirety so that (A B) does not match (A B C) nor (A (B C)). """ if adaptor is None: adaptor = self.adaptor return self._equals(t1, t2, adaptor) def _equals(self, t1, t2, adaptor): # make sure both are non-null if t1 is None or t2 is None: return False # check roots if adaptor.getType(t1) != adaptor.getType(t2): return False if adaptor.getText(t1) != adaptor.getText(t2): return False # check children n1 = adaptor.getChildCount(t1) n2 = adaptor.getChildCount(t2) if n1 != n2: return False for i in range(n1): child1 = adaptor.getChild(t1, i) child2 = adaptor.getChild(t2, i) if not self._equals(child1, child2, adaptor): return False return True python3-antlr3-3.5.2/doxyfile000066400000000000000000000235121324200532700160750ustar00rootroot00000000000000# -*- mode: doxymacs -*- #--------------------------------------------------------------------------- # Project related configuration options #--------------------------------------------------------------------------- DOXYFILE_ENCODING = UTF-8 PROJECT_NAME = "ANTLR Python3 API" PROJECT_NUMBER = 3.3 OUTPUT_DIRECTORY = api CREATE_SUBDIRS = NO OUTPUT_LANGUAGE = English BRIEF_MEMBER_DESC = YES REPEAT_BRIEF = YES ABBREVIATE_BRIEF = "The $name class" \ "The $name widget" \ "The $name file" \ is \ provides \ specifies \ contains \ represents \ a \ an \ the ALWAYS_DETAILED_SEC = YES INLINE_INHERITED_MEMB = NO FULL_PATH_NAMES = YES STRIP_FROM_PATH = build/doc/ STRIP_FROM_INC_PATH = SHORT_NAMES = NO JAVADOC_AUTOBRIEF = NO MULTILINE_CPP_IS_BRIEF = NO DETAILS_AT_TOP = NO INHERIT_DOCS = YES SEPARATE_MEMBER_PAGES = NO TAB_SIZE = 8 ALIASES = OPTIMIZE_OUTPUT_FOR_C = NO OPTIMIZE_OUTPUT_JAVA = YES BUILTIN_STL_SUPPORT = NO CPP_CLI_SUPPORT = NO DISTRIBUTE_GROUP_DOC = NO SUBGROUPING = YES #--------------------------------------------------------------------------- # Build related configuration options #--------------------------------------------------------------------------- EXTRACT_ALL = YES EXTRACT_PRIVATE = YES EXTRACT_STATIC = YES EXTRACT_LOCAL_CLASSES = YES EXTRACT_LOCAL_METHODS = NO HIDE_UNDOC_MEMBERS = NO HIDE_UNDOC_CLASSES = NO HIDE_FRIEND_COMPOUNDS = NO HIDE_IN_BODY_DOCS = NO INTERNAL_DOCS = NO CASE_SENSE_NAMES = NO HIDE_SCOPE_NAMES = NO SHOW_INCLUDE_FILES = YES INLINE_INFO = YES SORT_MEMBER_DOCS = YES SORT_BRIEF_DOCS = NO SORT_BY_SCOPE_NAME = NO GENERATE_TODOLIST = YES GENERATE_TESTLIST = NO GENERATE_BUGLIST = NO GENERATE_DEPRECATEDLIST= NO 
ENABLED_SECTIONS = MAX_INITIALIZER_LINES = 30 SHOW_USED_FILES = YES SHOW_DIRECTORIES = NO FILE_VERSION_FILTER = #--------------------------------------------------------------------------- # configuration options related to warning and progress messages #--------------------------------------------------------------------------- QUIET = NO WARNINGS = YES WARN_IF_UNDOCUMENTED = YES WARN_IF_DOC_ERROR = YES WARN_NO_PARAMDOC = NO WARN_FORMAT = "$file:$line: $text" WARN_LOGFILE = #--------------------------------------------------------------------------- # configuration options related to the input files #--------------------------------------------------------------------------- INPUT = build/doc INPUT_ENCODING = UTF-8 FILE_PATTERNS = *.c \ *.cc \ *.cxx \ *.cpp \ *.c++ \ *.d \ *.java \ *.ii \ *.ixx \ *.ipp \ *.i++ \ *.inl \ *.h \ *.hh \ *.hxx \ *.hpp \ *.h++ \ *.idl \ *.odl \ *.cs \ *.php \ *.php3 \ *.inc \ *.m \ *.mm \ *.dox \ *.py RECURSIVE = YES EXCLUDE = build/doc/antlr3/__init__.py EXCLUDE_SYMLINKS = NO EXCLUDE_PATTERNS = EXCLUDE_SYMBOLS = dfa exceptions recognizers streams tokens constants EXAMPLE_PATH = EXAMPLE_PATTERNS = * EXAMPLE_RECURSIVE = NO IMAGE_PATH = INPUT_FILTER = FILTER_PATTERNS = FILTER_SOURCE_FILES = NO #--------------------------------------------------------------------------- # configuration options related to source browsing #--------------------------------------------------------------------------- SOURCE_BROWSER = YES INLINE_SOURCES = NO STRIP_CODE_COMMENTS = YES REFERENCED_BY_RELATION = NO REFERENCES_RELATION = NO REFERENCES_LINK_SOURCE = YES USE_HTAGS = NO VERBATIM_HEADERS = YES #--------------------------------------------------------------------------- # configuration options related to the alphabetical class index #--------------------------------------------------------------------------- ALPHABETICAL_INDEX = NO COLS_IN_ALPHA_INDEX = 5 IGNORE_PREFIX = #--------------------------------------------------------------------------- # configuration options related to the HTML output #--------------------------------------------------------------------------- GENERATE_HTML = YES HTML_OUTPUT = . 
HTML_FILE_EXTENSION = .html HTML_HEADER = HTML_FOOTER = HTML_STYLESHEET = HTML_ALIGN_MEMBERS = YES GENERATE_HTMLHELP = NO CHM_FILE = HHC_LOCATION = GENERATE_CHI = NO BINARY_TOC = NO TOC_EXPAND = NO DISABLE_INDEX = NO ENUM_VALUES_PER_LINE = 4 GENERATE_TREEVIEW = NO TREEVIEW_WIDTH = 250 #--------------------------------------------------------------------------- # configuration options related to the LaTeX output #--------------------------------------------------------------------------- GENERATE_LATEX = NO LATEX_OUTPUT = latex LATEX_CMD_NAME = latex MAKEINDEX_CMD_NAME = makeindex COMPACT_LATEX = NO PAPER_TYPE = a4wide EXTRA_PACKAGES = LATEX_HEADER = PDF_HYPERLINKS = NO USE_PDFLATEX = YES LATEX_BATCHMODE = NO LATEX_HIDE_INDICES = NO #--------------------------------------------------------------------------- # configuration options related to the RTF output #--------------------------------------------------------------------------- GENERATE_RTF = NO RTF_OUTPUT = rtf COMPACT_RTF = NO RTF_HYPERLINKS = NO RTF_STYLESHEET_FILE = RTF_EXTENSIONS_FILE = #--------------------------------------------------------------------------- # configuration options related to the man page output #--------------------------------------------------------------------------- GENERATE_MAN = NO MAN_OUTPUT = man MAN_EXTENSION = .3 MAN_LINKS = NO #--------------------------------------------------------------------------- # configuration options related to the XML output #--------------------------------------------------------------------------- GENERATE_XML = NO XML_OUTPUT = xml XML_SCHEMA = XML_DTD = XML_PROGRAMLISTING = YES #--------------------------------------------------------------------------- # configuration options for the AutoGen Definitions output #--------------------------------------------------------------------------- GENERATE_AUTOGEN_DEF = NO #--------------------------------------------------------------------------- # configuration options related to the Perl module output #--------------------------------------------------------------------------- GENERATE_PERLMOD = NO PERLMOD_LATEX = NO PERLMOD_PRETTY = YES PERLMOD_MAKEVAR_PREFIX = #--------------------------------------------------------------------------- # Configuration options related to the preprocessor #--------------------------------------------------------------------------- ENABLE_PREPROCESSING = YES MACRO_EXPANSION = YES EXPAND_ONLY_PREDEF = NO SEARCH_INCLUDES = YES INCLUDE_PATH = INCLUDE_FILE_PATTERNS = PREDEFINED = DOXYGEN_SHOULD_SKIP_THIS EXPAND_AS_DEFINED = SKIP_FUNCTION_MACROS = YES #--------------------------------------------------------------------------- # Configuration::additions related to external references #--------------------------------------------------------------------------- TAGFILES = GENERATE_TAGFILE = ALLEXTERNALS = NO EXTERNAL_GROUPS = YES PERL_PATH = /usr/bin/perl #--------------------------------------------------------------------------- # Configuration options related to the dot tool #--------------------------------------------------------------------------- CLASS_DIAGRAMS = NO MSCGEN_PATH = HIDE_UNDOC_RELATIONS = YES HAVE_DOT = YES CLASS_GRAPH = YES COLLABORATION_GRAPH = YES GROUP_GRAPHS = YES UML_LOOK = NO TEMPLATE_RELATIONS = NO INCLUDE_GRAPH = YES INCLUDED_BY_GRAPH = YES CALL_GRAPH = NO CALLER_GRAPH = NO GRAPHICAL_HIERARCHY = YES DIRECTORY_GRAPH = YES DOT_IMAGE_FORMAT = png DOT_PATH = DOTFILE_DIRS = DOT_GRAPH_MAX_NODES = 50 DOT_TRANSPARENT = NO DOT_MULTI_TARGETS = NO GENERATE_LEGEND = YES DOT_CLEANUP 
= YES #--------------------------------------------------------------------------- # Configuration::additions related to the search engine #--------------------------------------------------------------------------- SEARCHENGINE = NO #--------------------------------------------------------------------------- # doxypy integration #--------------------------------------------------------------------------- FILTER_SOURCE_FILES = YES INPUT_FILTER = "python doxypy.py" python3-antlr3-3.5.2/ez_setup.py000066400000000000000000000366151324200532700165470ustar00rootroot00000000000000#!python """Bootstrap distribute installation If you want to use setuptools in your package's setup.py, just include this file in the same directory with it, and add this to the top of your setup.py:: from distribute_setup import use_setuptools use_setuptools() If you want to require a specific version of setuptools, set a download mirror, or use an alternate download directory, you can do so by supplying the appropriate options to ``use_setuptools()``. This file can also be run as a script to install or upgrade setuptools. """ import os import sys import time import fnmatch import tempfile import tarfile from distutils import log try: from site import USER_SITE except ImportError: USER_SITE = None try: import subprocess def _python_cmd(*args): args = (sys.executable,) + args return subprocess.call(args) == 0 except ImportError: # will be used for python 2.3 def _python_cmd(*args): args = (sys.executable,) + args # quoting arguments if windows if sys.platform == 'win32': def quote(arg): if ' ' in arg: return '"%s"' % arg return arg args = [quote(arg) for arg in args] return os.spawnl(os.P_WAIT, sys.executable, *args) == 0 DEFAULT_VERSION = "0.6.14" DEFAULT_URL = "http://pypi.python.org/packages/source/d/distribute/" SETUPTOOLS_FAKED_VERSION = "0.6c11" SETUPTOOLS_PKG_INFO = """\ Metadata-Version: 1.0 Name: setuptools Version: %s Summary: xxxx Home-page: xxx Author: xxx Author-email: xxx License: xxx Description: xxx """ % SETUPTOOLS_FAKED_VERSION def _install(tarball): # extracting the tarball tmpdir = tempfile.mkdtemp() log.warn('Extracting in %s', tmpdir) old_wd = os.getcwd() try: os.chdir(tmpdir) tar = tarfile.open(tarball) _extractall(tar) tar.close() # going in the directory subdir = os.path.join(tmpdir, os.listdir(tmpdir)[0]) os.chdir(subdir) log.warn('Now working in %s', subdir) # installing log.warn('Installing Distribute') if not _python_cmd('setup.py', 'install'): log.warn('Something went wrong during the installation.') log.warn('See the error message above.') finally: os.chdir(old_wd) def _build_egg(egg, tarball, to_dir): # extracting the tarball tmpdir = tempfile.mkdtemp() log.warn('Extracting in %s', tmpdir) old_wd = os.getcwd() try: os.chdir(tmpdir) tar = tarfile.open(tarball) _extractall(tar) tar.close() # going in the directory subdir = os.path.join(tmpdir, os.listdir(tmpdir)[0]) os.chdir(subdir) log.warn('Now working in %s', subdir) # building an egg log.warn('Building a Distribute egg in %s', to_dir) _python_cmd('setup.py', '-q', 'bdist_egg', '--dist-dir', to_dir) finally: os.chdir(old_wd) # returning the result log.warn(egg) if not os.path.exists(egg): raise IOError('Could not build the egg.') def _do_download(version, download_base, to_dir, download_delay): egg = os.path.join(to_dir, 'distribute-%s-py%d.%d.egg' % (version, sys.version_info[0], sys.version_info[1])) if not os.path.exists(egg): tarball = download_setuptools(version, download_base, to_dir, download_delay) _build_egg(egg, tarball, 
to_dir) sys.path.insert(0, egg) import setuptools setuptools.bootstrap_install_from = egg def use_setuptools(version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, download_delay=15, no_fake=True): # making sure we use the absolute path to_dir = os.path.abspath(to_dir) was_imported = 'pkg_resources' in sys.modules or \ 'setuptools' in sys.modules try: try: import pkg_resources if not hasattr(pkg_resources, '_distribute'): if not no_fake: _fake_setuptools() raise ImportError except ImportError: return _do_download(version, download_base, to_dir, download_delay) try: pkg_resources.require("distribute>="+version) return except pkg_resources.VersionConflict: e = sys.exc_info()[1] if was_imported: sys.stderr.write( "The required version of distribute (>=%s) is not available,\n" "and can't be installed while this script is running. Please\n" "install a more recent version first, using\n" "'easy_install -U distribute'." "\n\n(Currently using %r)\n" % (version, e.args[0])) sys.exit(2) else: del pkg_resources, sys.modules['pkg_resources'] # reload ok return _do_download(version, download_base, to_dir, download_delay) except pkg_resources.DistributionNotFound: return _do_download(version, download_base, to_dir, download_delay) finally: if not no_fake: _create_fake_setuptools_pkg_info(to_dir) def download_setuptools(version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, delay=15): """Download distribute from a specified location and return its filename `version` should be a valid distribute version number that is available as an egg for download under the `download_base` URL (which should end with a '/'). `to_dir` is the directory where the egg will be downloaded. `delay` is the number of seconds to pause before an actual download attempt. """ # making sure we use the absolute path to_dir = os.path.abspath(to_dir) try: from urllib.request import urlopen except ImportError: from urllib2 import urlopen tgz_name = "distribute-%s.tar.gz" % version url = download_base + tgz_name saveto = os.path.join(to_dir, tgz_name) src = dst = None if not os.path.exists(saveto): # Avoid repeated downloads try: log.warn("Downloading %s", url) src = urlopen(url) # Read/write all in one block, so we don't create a corrupt file # if the download is interrupted. 
data = src.read() dst = open(saveto, "wb") dst.write(data) finally: if src: src.close() if dst: dst.close() return os.path.realpath(saveto) def _no_sandbox(function): def __no_sandbox(*args, **kw): try: from setuptools.sandbox import DirectorySandbox if not hasattr(DirectorySandbox, '_old'): def violation(*args): pass DirectorySandbox._old = DirectorySandbox._violation DirectorySandbox._violation = violation patched = True else: patched = False except ImportError: patched = False try: return function(*args, **kw) finally: if patched: DirectorySandbox._violation = DirectorySandbox._old del DirectorySandbox._old return __no_sandbox def _patch_file(path, content): """Will backup the file then patch it""" existing_content = open(path).read() if existing_content == content: # already patched log.warn('Already patched.') return False log.warn('Patching...') _rename_path(path) f = open(path, 'w') try: f.write(content) finally: f.close() return True _patch_file = _no_sandbox(_patch_file) def _same_content(path, content): return open(path).read() == content def _rename_path(path): new_name = path + '.OLD.%s' % time.time() log.warn('Renaming %s into %s', path, new_name) os.rename(path, new_name) return new_name def _remove_flat_installation(placeholder): if not os.path.isdir(placeholder): log.warn('Unknown installation at %s', placeholder) return False found = False for file in os.listdir(placeholder): if fnmatch.fnmatch(file, 'setuptools*.egg-info'): found = True break if not found: log.warn('Could not locate setuptools*.egg-info') return log.warn('Removing elements out of the way...') pkg_info = os.path.join(placeholder, file) if os.path.isdir(pkg_info): patched = _patch_egg_dir(pkg_info) else: patched = _patch_file(pkg_info, SETUPTOOLS_PKG_INFO) if not patched: log.warn('%s already patched.', pkg_info) return False # now let's move the files out of the way for element in ('setuptools', 'pkg_resources.py', 'site.py'): element = os.path.join(placeholder, element) if os.path.exists(element): _rename_path(element) else: log.warn('Could not find the %s element of the ' 'Setuptools distribution', element) return True _remove_flat_installation = _no_sandbox(_remove_flat_installation) def _after_install(dist): log.warn('After install bootstrap.') placeholder = dist.get_command_obj('install').install_purelib _create_fake_setuptools_pkg_info(placeholder) def _create_fake_setuptools_pkg_info(placeholder): if not placeholder or not os.path.exists(placeholder): log.warn('Could not find the install location') return pyver = '%s.%s' % (sys.version_info[0], sys.version_info[1]) setuptools_file = 'setuptools-%s-py%s.egg-info' % \ (SETUPTOOLS_FAKED_VERSION, pyver) pkg_info = os.path.join(placeholder, setuptools_file) if os.path.exists(pkg_info): log.warn('%s already exists', pkg_info) return log.warn('Creating %s', pkg_info) f = open(pkg_info, 'w') try: f.write(SETUPTOOLS_PKG_INFO) finally: f.close() pth_file = os.path.join(placeholder, 'setuptools.pth') log.warn('Creating %s', pth_file) f = open(pth_file, 'w') try: f.write(os.path.join(os.curdir, setuptools_file)) finally: f.close() _create_fake_setuptools_pkg_info = _no_sandbox(_create_fake_setuptools_pkg_info) def _patch_egg_dir(path): # let's check if it's already patched pkg_info = os.path.join(path, 'EGG-INFO', 'PKG-INFO') if os.path.exists(pkg_info): if _same_content(pkg_info, SETUPTOOLS_PKG_INFO): log.warn('%s already patched.', pkg_info) return False _rename_path(path) os.mkdir(path) os.mkdir(os.path.join(path, 'EGG-INFO')) pkg_info = os.path.join(path,
'EGG-INFO', 'PKG-INFO') f = open(pkg_info, 'w') try: f.write(SETUPTOOLS_PKG_INFO) finally: f.close() return True _patch_egg_dir = _no_sandbox(_patch_egg_dir) def _before_install(): log.warn('Before install bootstrap.') _fake_setuptools() def _under_prefix(location): if 'install' not in sys.argv: return True args = sys.argv[sys.argv.index('install')+1:] for index, arg in enumerate(args): for option in ('--root', '--prefix'): if arg.startswith('%s=' % option): top_dir = arg.split('%s=' % option)[-1] return location.startswith(top_dir) elif arg == option: if len(args) > index: top_dir = args[index+1] return location.startswith(top_dir) if arg == '--user' and USER_SITE is not None: return location.startswith(USER_SITE) return True def _fake_setuptools(): log.warn('Scanning installed packages') try: import pkg_resources except ImportError: # we're cool log.warn('Setuptools or Distribute does not seem to be installed.') return ws = pkg_resources.working_set try: setuptools_dist = ws.find(pkg_resources.Requirement.parse('setuptools', replacement=False)) except TypeError: # old distribute API setuptools_dist = ws.find(pkg_resources.Requirement.parse('setuptools')) if setuptools_dist is None: log.warn('No setuptools distribution found') return # detecting if it was already faked setuptools_location = setuptools_dist.location log.warn('Setuptools installation detected at %s', setuptools_location) # if --root or --prefix was provided, and if # setuptools is not located in them, we don't patch it if not _under_prefix(setuptools_location): log.warn('Not patching, --root or --prefix is installing Distribute' ' in another location') return # let's see if it's an egg if not setuptools_location.endswith('.egg'): log.warn('Non-egg installation') res = _remove_flat_installation(setuptools_location) if not res: return else: log.warn('Egg installation') pkg_info = os.path.join(setuptools_location, 'EGG-INFO', 'PKG-INFO') if (os.path.exists(pkg_info) and _same_content(pkg_info, SETUPTOOLS_PKG_INFO)): log.warn('Already patched.') return log.warn('Patching...') # let's create a fake egg replacing setuptools one res = _patch_egg_dir(setuptools_location) if not res: return log.warn('Patching done.') _relaunch() def _relaunch(): log.warn('Relaunching...') # we have to relaunch the process # pip marker to avoid a relaunch bug if sys.argv[:3] == ['-c', 'install', '--single-version-externally-managed']: sys.argv[0] = 'setup.py' args = [sys.executable] + sys.argv sys.exit(subprocess.call(args)) def _extractall(self, path=".", members=None): """Extract all members from the archive to the current working directory and set owner, modification time and permissions on directories afterwards. `path' specifies a different directory to extract to. `members' is optional and must be a subset of the list returned by getmembers(). """ import copy import operator from tarfile import ExtractError directories = [] if members is None: members = self for tarinfo in members: if tarinfo.isdir(): # Extract directories with a safe mode. directories.append(tarinfo) tarinfo = copy.copy(tarinfo) tarinfo.mode = 448 # decimal for oct 0700 self.extract(tarinfo, path) # Reverse sort directories. if sys.version_info < (2, 4): def sorter(dir1, dir2): return cmp(dir1.name, dir2.name) directories.sort(sorter) directories.reverse() else: directories.sort(key=operator.attrgetter('name'), reverse=True) # Set correct owner, mtime and filemode on directories.
for tarinfo in directories: dirpath = os.path.join(path, tarinfo.name) try: self.chown(tarinfo, dirpath) self.utime(tarinfo, dirpath) self.chmod(tarinfo, dirpath) except ExtractError: e = sys.exc_info()[1] if self.errorlevel > 1: raise else: self._dbg(1, "tarfile: %s" % e) def main(argv, version=DEFAULT_VERSION): """Install or upgrade setuptools and EasyInstall""" tarball = download_setuptools() _install(tarball) if __name__ == '__main__': main(sys.argv[1:]) python3-antlr3-3.5.2/mkdoxy.sh000077500000000000000000000006361324200532700162030ustar00rootroot00000000000000#!/bin/bash if [ -e doxygen.sh ]; then . doxygen.sh fi rm -fr build/doc mkdir -p build/doc/antlr3 for f in __init__ exceptions constants dfa tokens streams recognizers; do sed -e '/begin\[licence\]/,/end\[licence\]/d' antlr3/$f.py \ >>build/doc/antlr3.py done touch build/doc/antlr3/__init__.py cp -f antlr3/tree.py build/doc/antlr3 cp -f antlr3/treewizard.py build/doc/antlr3 doxygen doxyfile python3-antlr3-3.5.2/pylintrc000066400000000000000000000166021324200532700161200ustar00rootroot00000000000000# lint Python modules using external checkers [MASTER] # Specify a configuration file. #rcfile= # Python code to execute, usually for sys.path manipulation such as # pygtk.require(). #init-hook= # Profiled execution. profile=no # Add to the black list. It should be a base name, not a # path. You may set this option multiple times. ignore=CVS # Pickle collected data for later comparisons. persistent=yes # List of plugins (as comma separated values of python modules names) to load, # usually to register additional checkers. load-plugins= [MESSAGES CONTROL] # Enable the message, report, category or checker with the given id(s). You can # either give multiple identifier separated by comma (,) or put this option # multiple time. #enable= # Disable the message, report, category or checker with the given id(s). You # can either give multiple identifier separated by comma (,) or put this option # multiple time (only on the command line, not in the configuration file where # it should appear only once). # W0622: Redefining built-in '...' # C0103: Invalid name # R0904: Too many public methods # R0201: Method could be a function # C0302: Too many lines in a module # R0902: Too many instance attributes # R0913: Too many arguments # R0912: Too many branches # R0903: Too few public methods # C0111: Missing docstring # W0403: Relative import # W0401: Wildcard import # W0142: */** magic disable=W0622,C0103,R0904,R0201,C0302,R0902,R0913,R0912,R0903,C0111,W0403,W0401,W0142 [REPORTS] # Set the output format. Available formats are text, parseable, colorized, msvs # (visual studio) and html output-format=text # Include message's id in output include-ids=yes # Put messages in a separate file for each module / package specified on the # command line instead of printing them on stdout. Reports (if any) will be # written in a file name "pylint_global.[txt|html]". files-output=no # Tells whether to display a full report or only the messages reports=yes # Python expression which should return a note less than 10 (10 is the highest # note). You have access to the variables errors warning, statement which # respectively contain the number of errors / warnings messages and the total # number of statements analyzed. This is used by the global evaluation report # (RP0004). evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10) # Add a comment according to your evaluation note. 
This is used by the global # evaluation report (RP0004). comment=no [BASIC] # Required attributes for module, separated by a comma required-attributes= # List of builtins function names that should not be used, separated by a comma bad-functions=map,filter,apply,input # Regular expression which should only match correct module names module-rgx=(([a-z_][a-z0-9_]*)|([A-Z][a-zA-Z0-9]+))$ # Regular expression which should only match correct module level names const-rgx=(([A-Z_][A-Z0-9_]*)|(__.*__))$ # Regular expression which should only match correct class names class-rgx=[A-Z_][a-zA-Z0-9]+$ # Regular expression which should only match correct function names function-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct method names method-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct instance attribute names attr-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct argument names argument-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct variable names variable-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct list comprehension / # generator expression variable names inlinevar-rgx=[A-Za-z_][A-Za-z0-9_]*$ # Good variable names which should always be accepted, separated by a comma good-names=i,j,k,ex,Run,_ # Bad variable names which should always be refused, separated by a comma bad-names=foo,bar,baz,toto,tutu,tata # Regular expression which should only match functions or classes name which do # not require a docstring no-docstring-rgx=__.*__ [FORMAT] # Maximum number of characters on a single line. max-line-length=80 # Maximum number of lines in a module max-module-lines=1000 # String used as indentation unit. This is usually " " (4 spaces) or "\t" (1 # tab). indent-string=' ' [MISCELLANEOUS] # List of note tags to take in consideration, separated by a comma. notes=FIXME,XXX,TODO [VARIABLES] # Tells whether we should check for unused import in __init__ files. init-import=no # A regular expression matching the beginning of the name of dummy variables # (i.e. not used). dummy-variables-rgx=_|dummy # List of additional names supposed to be defined in builtins. Remember that # you should avoid to define new builtins when possible. additional-builtins= [SIMILARITIES] # Minimum lines number of a similarity. min-similarity-lines=4 # Ignore comments when computing similarities. ignore-comments=yes # Ignore docstrings when computing similarities. ignore-docstrings=yes [TYPECHECK] # Tells whether missing members accessed in mixin class should be ignored. A # mixin class is detected if its name ends with "mixin" (case insensitive). ignore-mixin-members=yes # List of classes names for which member attributes should not be checked # (useful for classes with attributes dynamically set). ignored-classes=SQLObject # When zope mode is activated, add a predefined set of Zope acquired attributes # to generated-members. zope=no # List of members which are set dynamically and missed by pylint inference # system, and so shouldn't trigger E0201 when accessed. generated-members=REQUEST,acl_users,aq_parent [CLASSES] # List of interface methods to ignore, separated by a comma. This is used for # instance to not check methods defines in Zope's Interface base class. 
ignore-iface-methods=isImplementedBy,deferred,extends,names,namesAndDescriptions,queryDescriptionFor,getBases,getDescriptionFor,getDoc,getName,getTaggedValue,getTaggedValueTags,isEqualOrExtendedBy,setTaggedValue,isImplementedByInstancesOf,adaptWith,is_implemented_by # List of method names used to declare (i.e. assign) instance attributes. defining-attr-methods=__init__,__new__,setUp [DESIGN] # Maximum number of arguments for function / method max-args=5 # Argument names that match this expression will be ignored. Default to name # with leading underscore ignored-argument-names=_.* # Maximum number of locals for function / method body max-locals=15 # Maximum number of return / yield for function / method body max-returns=6 # Maximum number of branch for function / method body max-branchs=12 # Maximum number of statements in function / method body max-statements=50 # Maximum number of parents for a class (see R0901). max-parents=7 # Maximum number of attributes for a class (see R0902). max-attributes=7 # Minimum number of public methods for a class (see R0903). min-public-methods=2 # Maximum number of public methods for a class (see R0904). max-public-methods=20 [IMPORTS] # Deprecated modules which should not be used, separated by a comma deprecated-modules=regsub,string,TERMIOS,Bastion,rexec # Create a graph of every (i.e. internal and external) dependencies in the # given file (report RP0402 must not be disabled) import-graph= # Create a graph of external dependencies in the given file (report RP0402 must # not be disabled) ext-import-graph= # Create a graph of internal dependencies in the given file (report RP0402 must # not be disabled) int-import-graph= python3-antlr3-3.5.2/setup.py000066400000000000000000000210171324200532700160370ustar00rootroot00000000000000 import sys if sys.version_info < (3, 2): print('This antlr3 module requires Python 3.2 or later. You can ' 'download Python 3 from\nhttps://python.org/, ' 'or visit http://www.antlr.org/ for the Python target.') sys.exit(1) # bootstrapping setuptools import ez_setup ez_setup.use_setuptools() import os import textwrap from distutils.errors import * from distutils.command.clean import clean as _clean from distutils.cmd import Command from setuptools import setup from distutils import log from distutils.core import setup class clean(_clean): """Also cleanup local temp files.""" def run(self): _clean.run(self) import fnmatch # kill temporary files patterns = [ # generic tempfiles '*~', '*.bak', '*.pyc', # tempfiles generated by ANTLR runs 't[0-9]*Lexer.py', 't[0-9]*Parser.py', '*.tokens', '*__.g', ] for path in ('antlr3', 'unittests', 'tests'): path = os.path.join(os.path.dirname(__file__), path) if os.path.isdir(path): for root, dirs, files in os.walk(path, topdown=True): graveyard = [] for pat in patterns: graveyard.extend(fnmatch.filter(files, pat)) for name in graveyard: filePath = os.path.join(root, name) try: log.info("removing '%s'", filePath) os.unlink(filePath) except OSError as exc: log.warn( "Failed to delete '%s': %s", filePath, exc ) class TestError(DistutilsError): pass # grml.. the class name appears in the --help output: # ... # Options for 'CmdUnitTest' command # ... # so I have to use a rather ugly name... 
class unittest(Command): """Run unit tests for package""" description = "run unit tests for package" user_options = [] boolean_options = [] def initialize_options(self): pass def finalize_options(self): pass def run(self): testDir = os.path.join(os.path.dirname(__file__), 'unittests') if not os.path.isdir(testDir): raise DistutilsFileError( "There is no 'unittests' directory. Did you fetch the " "development version?", ) import glob import imp import unittest import traceback import io suite = unittest.TestSuite() loadFailures = [] # collect tests from all unittests/test*.py files testFiles = [] for testPath in glob.glob(os.path.join(testDir, 'test*.py')): testFiles.append(testPath) testFiles.sort() for testPath in testFiles: testID = os.path.basename(testPath)[:-3] try: modFile, modPathname, modDescription \ = imp.find_module(testID, [testDir]) testMod = imp.load_module( testID, modFile, modPathname, modDescription ) suite.addTests( unittest.defaultTestLoader.loadTestsFromModule(testMod) ) except Exception: buf = io.StringIO() traceback.print_exc(file=buf) loadFailures.append( (os.path.basename(testPath), buf.getvalue()) ) runner = unittest.TextTestRunner(verbosity=2) result = runner.run(suite) for testName, error in loadFailures: sys.stderr.write('\n' + '='*70 + '\n') sys.stderr.write( "Failed to load test module {}\n".format(testName) ) sys.stderr.write(error) sys.stderr.write('\n') if not result.wasSuccessful() or loadFailures: raise TestError( "Unit test suite failed!", ) class functest(Command): """Run functional tests for package""" description = "run functional tests for package" user_options = [ ('testcase=', None, "testcase to run [default: run all]"), ('antlr-version=', None, "ANTLR version to use [default: HEAD (in ../../build)]"), ('antlr-jar=', None, "Explicit path to an antlr jar (overrides --antlr-version)"), ] boolean_options = [] def initialize_options(self): self.testcase = None self.antlr_version = 'HEAD' self.antlr_jar = None def finalize_options(self): pass def run(self): import glob import imp import unittest import traceback import io testDir = os.path.join(os.path.dirname(__file__), 'tests') if not os.path.isdir(testDir): raise DistutilsFileError( "There is no 'tests' directory. 
Did you fetch the " "development version?", ) # make sure, relative imports from testcases work sys.path.insert(0, testDir) rootDir = os.path.abspath( os.path.join(os.path.dirname(__file__), '..', '..')) if self.antlr_jar is not None: classpath = [self.antlr_jar] elif self.antlr_version == 'HEAD': classpath = [ os.path.join(rootDir, 'tool', 'target', 'classes'), os.path.join(rootDir, 'runtime', 'Java', 'target', 'classes') ] else: classpath = [ os.path.join(rootDir, 'archive', 'antlr-{}.jar'.format(self.antlr_version)) ] classpath.extend([ os.path.join(rootDir, 'lib', 'antlr-3.4.1-SNAPSHOT.jar'), os.path.join(rootDir, 'lib', 'antlr-runtime-3.4.jar'), os.path.join(rootDir, 'lib', 'ST-4.0.5.jar'), ]) os.environ['CLASSPATH'] = ':'.join(classpath) os.environ['ANTLRVERSION'] = self.antlr_version suite = unittest.TestSuite() loadFailures = [] # collect tests from all tests/t*.py files testFiles = [] test_glob = 't[0-9][0-9][0-9]*.py' for testPath in glob.glob(os.path.join(testDir, test_glob)): if testPath.endswith('Lexer.py') or testPath.endswith('Parser.py'): continue # if a single testcase has been selected, filter out all other # tests if (self.testcase is not None and not os.path.basename(testPath)[:-3].startswith(self.testcase)): continue testFiles.append(testPath) testFiles.sort() for testPath in testFiles: testID = os.path.basename(testPath)[:-3] try: modFile, modPathname, modDescription \ = imp.find_module(testID, [testDir]) testMod = imp.load_module( testID, modFile, modPathname, modDescription) suite.addTests( unittest.defaultTestLoader.loadTestsFromModule(testMod)) except Exception: buf = io.StringIO() traceback.print_exc(file=buf) loadFailures.append( (os.path.basename(testPath), buf.getvalue())) runner = unittest.TextTestRunner(verbosity=2) result = runner.run(suite) for testName, error in loadFailures: sys.stderr.write('\n' + '='*70 + '\n') sys.stderr.write( "Failed to load test module {}\n".format(testName) ) sys.stderr.write(error) sys.stderr.write('\n') if not result.wasSuccessful() or loadFailures: raise TestError( "Functional test suite failed!", ) setup(name='antlr_python3_runtime', version='3.4', packages=['antlr3'], author="Benjamin S Wolf", author_email="jokeserver+antlr3@gmail.com", url="http://www.antlr.org/", download_url="http://www.antlr.org/download.html", license="BSD", description="Runtime package for ANTLR3", long_description=textwrap.dedent('''\ This is the runtime package for ANTLR3, which is required to use parsers generated by ANTLR3. '''), cmdclass={'unittest': unittest, 'functest': functest, 'clean': clean }, ) python3-antlr3-3.5.2/tests/000077500000000000000000000000001324200532700154665ustar00rootroot00000000000000python3-antlr3-3.5.2/tests/t001lexer.g000066400000000000000000000001071324200532700173600ustar00rootroot00000000000000lexer grammar t001lexer; options { language = Python3; } ZERO: '0'; python3-antlr3-3.5.2/tests/t001lexer.py000066400000000000000000000026231324200532700175670ustar00rootroot00000000000000import antlr3 import testbase import unittest class t001lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('0') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.ZERO) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testIteratorInterface(self): stream = antlr3.StringStream('0') lexer = self.getLexer(stream) types = [token.type for token in lexer] self.assertEqual(types, [self.lexerModule.ZERO]) def testMalformedInput(self): stream = antlr3.StringStream('1') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.MismatchedTokenException as exc: self.assertEqual(exc.expecting, '0') self.assertEqual(exc.unexpectedType, '1') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t002lexer.g000066400000000000000000000001211324200532700173550ustar00rootroot00000000000000lexer grammar t002lexer; options { language = Python3; } ZERO: '0'; ONE: '1'; python3-antlr3-3.5.2/tests/t002lexer.py000066400000000000000000000023231324200532700175650ustar00rootroot00000000000000import antlr3 import testbase import unittest class t002lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! raise re return TLexer def testValid(self): stream = antlr3.StringStream('01') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.ZERO) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.ONE) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('2') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, '2') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t003lexer.g000066400000000000000000000001411324200532700173600ustar00rootroot00000000000000lexer grammar t003lexer; options { language = Python3; } ZERO: '0'; ONE: '1'; FOOZE: 'fooze'; python3-antlr3-3.5.2/tests/t003lexer.py000066400000000000000000000024701324200532700175710ustar00rootroot00000000000000import antlr3 import testbase import unittest class t003lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('0fooze1') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.ZERO) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOOZE) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.ONE) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('2') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, '2') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t004lexer.g000066400000000000000000000001131324200532700173600ustar00rootroot00000000000000lexer grammar t004lexer; options { language = Python3; } FOO: 'f' 'o'*; python3-antlr3-3.5.2/tests/t004lexer.py000066400000000000000000000036761324200532700176030ustar00rootroot00000000000000import antlr3 import testbase import unittest class t004lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! raise re return TLexer def testValid(self): stream = antlr3.StringStream('ffofoofooo') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 0) self.assertEqual(token.text, 'f') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 1) self.assertEqual(token.stop, 2) self.assertEqual(token.text, 'fo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 3) self.assertEqual(token.stop, 5) self.assertEqual(token.text, 'foo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 6) self.assertEqual(token.stop, 9) self.assertEqual(token.text, 'fooo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('2') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.MismatchedTokenException as exc: self.assertEqual(exc.expecting, 'f') self.assertEqual(exc.unexpectedType, '2') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t005lexer.g000066400000000000000000000001131324200532700173610ustar00rootroot00000000000000lexer grammar t005lexer; options { language = Python3; } FOO: 'f' 'o'+; python3-antlr3-3.5.2/tests/t005lexer.py000066400000000000000000000040241324200532700175700ustar00rootroot00000000000000import antlr3 import testbase import unittest class t005lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('fofoofooo') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 1) self.assertEqual(token.text, 'fo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 2) self.assertEqual(token.stop, 4) self.assertEqual(token.text, 'foo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 5) self.assertEqual(token.stop, 8) self.assertEqual(token.text, 'fooo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput1(self): stream = antlr3.StringStream('2') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.MismatchedTokenException as exc: self.assertEqual(exc.expecting, 'f') self.assertEqual(exc.unexpectedType, '2') def testMalformedInput2(self): stream = antlr3.StringStream('f') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail() except antlr3.EarlyExitException as exc: self.assertEqual(exc.unexpectedType, antlr3.EOF) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t006lexer.g000066400000000000000000000001231324200532700173630ustar00rootroot00000000000000lexer grammar t006lexer; options { language = Python3; } FOO: 'f' ('o' | 'a')*; python3-antlr3-3.5.2/tests/t006lexer.py000066400000000000000000000032471324200532700175770ustar00rootroot00000000000000import antlr3 import testbase import unittest class t006lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! raise re return TLexer def testValid(self): stream = antlr3.StringStream('fofaaooa') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 1) self.assertEqual(token.text, 'fo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 2) self.assertEqual(token.stop, 7) self.assertEqual(token.text, 'faaooa') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('fofoaooaoa2') lexer = self.getLexer(stream) lexer.nextToken() lexer.nextToken() try: token = lexer.nextToken() self.fail(token) except antlr3.MismatchedTokenException as exc: self.assertEqual(exc.expecting, 'f') self.assertEqual(exc.unexpectedType, '2') self.assertEqual(exc.charPositionInLine, 10) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t007lexer.g000066400000000000000000000001301324200532700173620ustar00rootroot00000000000000lexer grammar t007lexer; options { language = Python3; } FOO: 'f' ('o' | 'a' 'b'+)*; python3-antlr3-3.5.2/tests/t007lexer.py000066400000000000000000000031031324200532700175670ustar00rootroot00000000000000import antlr3 import testbase import unittest class t007lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('fofababbooabb') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 1) self.assertEqual(token.text, 'fo') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 2) self.assertEqual(token.stop, 12) self.assertEqual(token.text, 'fababbooabb') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('foaboao') lexer = self.getLexer(stream) try: token = lexer.nextToken() self.fail(token) except antlr3.EarlyExitException as exc: self.assertEqual(exc.unexpectedType, 'o') self.assertEqual(exc.charPositionInLine, 6) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t008lexer.g000066400000000000000000000001131324200532700173640ustar00rootroot00000000000000lexer grammar t008lexer; options { language = Python3; } FOO: 'f' 'a'?; python3-antlr3-3.5.2/tests/t008lexer.py000066400000000000000000000034761324200532700176050ustar00rootroot00000000000000import antlr3 import testbase import unittest class t008lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! raise re return TLexer def testValid(self): stream = antlr3.StringStream('ffaf') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 0) self.assertEqual(token.text, 'f') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 1) self.assertEqual(token.stop, 2) self.assertEqual(token.text, 'fa') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.FOO) self.assertEqual(token.start, 3) self.assertEqual(token.stop, 3) self.assertEqual(token.text, 'f') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('fafb') lexer = self.getLexer(stream) lexer.nextToken() lexer.nextToken() try: token = lexer.nextToken() self.fail(token) except antlr3.MismatchedTokenException as exc: self.assertEqual(exc.unexpectedType, 'b') self.assertEqual(exc.charPositionInLine, 3) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t009lexer.g000066400000000000000000000001171324200532700173710ustar00rootroot00000000000000lexer grammar t009lexer; options { language = Python3; } DIGIT: '0' .. '9'; python3-antlr3-3.5.2/tests/t009lexer.py000066400000000000000000000035511324200532700176000ustar00rootroot00000000000000import antlr3 import testbase import unittest class t009lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('085') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.DIGIT) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 0) self.assertEqual(token.text, '0') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.DIGIT) self.assertEqual(token.start, 1) self.assertEqual(token.stop, 1) self.assertEqual(token.text, '8') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.DIGIT) self.assertEqual(token.start, 2) self.assertEqual(token.stop, 2) self.assertEqual(token.text, '5') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('2a') lexer = self.getLexer(stream) lexer.nextToken() try: token = lexer.nextToken() self.fail(token) except antlr3.MismatchedSetException as exc: # TODO: This should provide more useful information self.assertIsNone(exc.expecting) self.assertEqual(exc.unexpectedType, 'a') self.assertEqual(exc.charPositionInLine, 1) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t010lexer.g000066400000000000000000000002261324200532700173620ustar00rootroot00000000000000lexer grammar t010lexer; options { language = Python3; } IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*; WS: (' ' | '\n')+; python3-antlr3-3.5.2/tests/t010lexer.py000066400000000000000000000044141324200532700175670ustar00rootroot00000000000000import antlr3 import testbase import unittest class t010lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('foobar _Ab98 \n A12sdf') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 5) self.assertEqual(token.text, 'foobar') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.WS) self.assertEqual(token.start, 6) self.assertEqual(token.stop, 6) self.assertEqual(token.text, ' ') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 7) self.assertEqual(token.stop, 11) self.assertEqual(token.text, '_Ab98') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.WS) self.assertEqual(token.start, 12) self.assertEqual(token.stop, 14) self.assertEqual(token.text, ' \n ') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 15) self.assertEqual(token.stop, 20) self.assertEqual(token.text, 'A12sdf') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('a-b') lexer = self.getLexer(stream) lexer.nextToken() try: token = lexer.nextToken() self.fail(token) except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, '-') self.assertEqual(exc.charPositionInLine, 1) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t011lexer.g000066400000000000000000000004541324200532700173660ustar00rootroot00000000000000lexer grammar t011lexer; options { language = Python3; } IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z' |'A'..'Z' |'0'..'9' |'_' { print("Underscore") print("foo") } )* ; WS: (' ' | '\n')+; python3-antlr3-3.5.2/tests/t011lexer.py000066400000000000000000000044141324200532700175700ustar00rootroot00000000000000import antlr3 import testbase import unittest class t011lexer(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): stream = antlr3.StringStream('foobar _Ab98 \n A12sdf') lexer = self.getLexer(stream) token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 0) self.assertEqual(token.stop, 5) self.assertEqual(token.text, 'foobar') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.WS) self.assertEqual(token.start, 6) self.assertEqual(token.stop, 6) self.assertEqual(token.text, ' ') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 7) self.assertEqual(token.stop, 11) self.assertEqual(token.text, '_Ab98') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.WS) self.assertEqual(token.start, 12) self.assertEqual(token.stop, 14) self.assertEqual(token.text, ' \n ') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.IDENTIFIER) self.assertEqual(token.start, 15) self.assertEqual(token.stop, 20) self.assertEqual(token.text, 'A12sdf') token = lexer.nextToken() self.assertEqual(token.type, self.lexerModule.EOF) def testMalformedInput(self): stream = antlr3.StringStream('a-b') lexer = self.getLexer(stream) lexer.nextToken() try: token = lexer.nextToken() self.fail(token) except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, '-') self.assertEqual(exc.charPositionInLine, 1) self.assertEqual(exc.line, 1) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t012lexerXML.input000066400000000000000000000005341324200532700206600ustar00rootroot00000000000000 ]> Text öäüß & < python3-antlr3-3.5.2/tests/t012lexerXML.output000066400000000000000000000010241324200532700210540ustar00rootroot00000000000000XML declaration Attr: version='1.0' ROOTELEMENT: component INTERNAL DTD: [ ] Start Tag: component Attr: attr="val'ue" Attr: attr2='val"ue' PCDATA: " " Comment: "" PCDATA: " Text " CDATA: "" PCDATA: " öäüß & < " PI: xtal Attr: cursor='11' PCDATA: " " Empty Element: sub PCDATA: " " Start Tag: sub End Tag: sub PCDATA: " " End Tag: component python3-antlr3-3.5.2/tests/t012lexerXML.py000066400000000000000000000062521324200532700201540ustar00rootroot00000000000000import antlr3 import testbase import unittest import os import sys from io import StringIO import textwrap class t012lexerXML(testbase.ANTLRTest): def setUp(self): self.compileGrammar('t012lexerXMLLexer.g') def lexerClass(self, base): class TLexer(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TLexer def testValid(self): inputPath = os.path.splitext(__file__)[0] + '.input' with open(inputPath) as f: data = f.read() stream = antlr3.StringStream(data) lexer = self.getLexer(stream) while True: token = lexer.nextToken() if token.type == self.lexerModule.EOF: break output = lexer.outbuf.getvalue() outputPath = os.path.splitext(__file__)[0] + '.output' with open(outputPath) as f: testOutput = f.read() self.assertEqual(output, testOutput) def testMalformedInput1(self): input = textwrap.dedent("""\ """) stream = antlr3.StringStream(input) lexer = self.getLexer(stream) try: while True: token = lexer.nextToken() # Should raise NoViableAltException before hitting EOF if token.type == antlr3.EOF: self.fail() except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, '>') self.assertEqual(exc.charPositionInLine, 11) self.assertEqual(exc.line, 2) def testMalformedInput2(self): input = textwrap.dedent("""\ """) stream = antlr3.StringStream(input) lexer = self.getLexer(stream) try: while True: token = lexer.nextToken() # Should raise NoViableAltException before hitting EOF if token.type == antlr3.EOF: self.fail() except antlr3.MismatchedSetException as exc: self.assertEqual(exc.unexpectedType, 't') self.assertEqual(exc.charPositionInLine, 2) self.assertEqual(exc.line, 1) def testMalformedInput3(self): input = textwrap.dedent("""\ """) stream = antlr3.StringStream(input) lexer = self.getLexer(stream) try: while True: token = lexer.nextToken() # Should raise NoViableAltException before hitting EOF if token.type == antlr3.EOF: self.fail() except antlr3.NoViableAltException as exc: self.assertEqual(exc.unexpectedType, 'a') self.assertEqual(exc.charPositionInLine, 11) self.assertEqual(exc.line, 2) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t012lexerXMLLexer.g000066400000000000000000000052461324200532700207540ustar00rootroot00000000000000lexer grammar t012lexerXMLLexer; options { language = Python3; } @header { from io import StringIO } @lexer::init { self.outbuf = StringIO() } @lexer::members { def output(self, line): self.outbuf.write(line + "\n") } DOCUMENT : XMLDECL? WS? DOCTYPE? WS? ELEMENT WS? ; fragment DOCTYPE : '' ; fragment INTERNAL_DTD : '[' (options {greedy=false;} : .)* ']' ; fragment PI : '' ; fragment XMLDECL : '' ; fragment ELEMENT : ( START_TAG (ELEMENT | t=PCDATA {self.output('PCDATA: "{}"'.format($t.text))} | t=CDATA {self.output('CDATA: "{}"'.format($t.text))} | t=COMMENT {self.output('Comment: "{}"'.format($t.text))} | pi=PI )* END_TAG | EMPTY_ELEMENT ) ; fragment START_TAG : '<' WS? name=GENERIC_ID WS? {self.output("Start Tag: "+name.text)} ( ATTRIBUTE WS? )* '>' ; fragment EMPTY_ELEMENT : '<' WS? name=GENERIC_ID WS? {self.output("Empty Element: "+name.text)} ( ATTRIBUTE WS? )* '/>' ; fragment ATTRIBUTE : name=GENERIC_ID WS? '=' WS? value=VALUE {self.output("Attr: {}={}".format(name.text, value.text))} ; fragment END_TAG : '' {self.output("End Tag: "+name.text)} ; fragment COMMENT : '' ; fragment CDATA : '' ; fragment PCDATA : (~'<')+ ; fragment VALUE : ( '\"' (~'\"')* '\"' | '\'' (~'\'')* '\'' ) ; fragment GENERIC_ID : ( LETTER | '_' | ':') ( options {greedy=true;} : LETTER | '0'..'9' | '.' 
| '-' | '_' | ':' )* ; fragment LETTER : 'a'..'z' | 'A'..'Z' ; fragment WS : ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+ ; python3-antlr3-3.5.2/tests/t013parser.g000066400000000000000000000006511324200532700175440ustar00rootroot00000000000000grammar t013parser; options { language = Python3; } @parser::init { self.identifiers = [] self.reportedErrors = [] } @parser::members { def foundIdentifier(self, name): self.identifiers.append(name) def emitErrorMessage(self, msg): self.reportedErrors.append(msg) } document: t=IDENTIFIER {self.foundIdentifier($t.text)} ; IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*; python3-antlr3-3.5.2/tests/t013parser.py000066400000000000000000000017271324200532700177530ustar00rootroot00000000000000import antlr3 import testbase import unittest class t013parser(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() self.assertEqual(parser.reportedErrors, []) self.assertEqual(parser.identifiers, ['foobar']) def testMalformedInput1(self): cStream = antlr3.StringStream('') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() # FIXME: currently strings with formatted errors are collected # can't check error locations yet self.assertEqual(len(parser.reportedErrors), 1, parser.reportedErrors) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t014parser.g000066400000000000000000000011331324200532700175410ustar00rootroot00000000000000grammar t014parser; options { language = Python3; } @parser::init { self.events = [] self.reportedErrors = [] } @parser::members { def emitErrorMessage(self, msg): self.reportedErrors.append(msg) } document: ( declaration | call )* EOF ; declaration: 'var' t=IDENTIFIER ';' {self.events.append(('decl', $t.text))} ; call: t=IDENTIFIER '(' ')' ';' {self.events.append(('call', $t.text))} ; IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*; WS: (' '|'\r'|'\t'|'\n') {$channel=HIDDEN;}; python3-antlr3-3.5.2/tests/t014parser.py000066400000000000000000000042611324200532700177500ustar00rootroot00000000000000import antlr3 import testbase import unittest class t014parser(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid(self): cStream = antlr3.StringStream('var foobar; gnarz(); var blupp; flupp ( ) ;') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() self.assertEqual(parser.reportedErrors, []) self.assertEqual(parser.events, [('decl', 'foobar'), ('call', 'gnarz'), ('decl', 'blupp'), ('call', 'flupp')]) def testMalformedInput1(self): cStream = antlr3.StringStream('var; foo();') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() # FIXME: currently strings with formatted errors are collected # can't check error locations yet self.assertEqual(len(parser.reportedErrors), 1, parser.reportedErrors) self.assertEqual(parser.events, []) def testMalformedInput2(self): cStream = antlr3.StringStream('var foobar(); gnarz();') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() # FIXME: currently strings with formatted errors are collected # can't check error locations yet 
self.assertEqual(len(parser.reportedErrors), 1, parser.reportedErrors) self.assertEqual(parser.events, [('call', 'gnarz')]) def testMalformedInput3(self): cStream = antlr3.StringStream('gnarz(; flupp();') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() # FIXME: currently strings with formatted errors are collected # can't check error locations yet self.assertEqual(len(parser.reportedErrors), 1, parser.reportedErrors) self.assertEqual(parser.events, [('call', 'gnarz'), ('call', 'flupp')]) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t015calc.g000066400000000000000000000017471324200532700171630ustar00rootroot00000000000000grammar t015calc; options { language = Python3; } @header { import math } @parser::init { self.reportedErrors = [] } @parser::members { def emitErrorMessage(self, msg): self.reportedErrors.append(msg) } evaluate returns [result]: r=expression {result = r}; expression returns [result]: r=mult ( '+' r2=mult {r += r2} | '-' r2=mult {r -= r2} )* {result = r}; mult returns [result]: r=log ( '*' r2=log {r *= r2} | '/' r2=log {r /= r2} // | '%' r2=log {r %= r2} )* {result = r}; log returns [result]: 'ln' r=exp {result = math.log(r)} | r=exp {result = r} ; exp returns [result]: r=atom ('^' r2=atom {r = math.pow(r,r2)} )? {result = r} ; atom returns [result]: n=INTEGER {result = int($n.text)} | n=DECIMAL {result = float($n.text)} | '(' r=expression {result = r} ')' | 'PI' {result = math.pi} | 'E' {result = math.e} ; INTEGER: DIGIT+; DECIMAL: DIGIT+ '.' DIGIT+; fragment DIGIT: '0'..'9'; WS: (' ' | '\n' | '\t')+ {$channel = HIDDEN}; python3-antlr3-3.5.2/tests/t015calc.py000066400000000000000000000022551324200532700173600ustar00rootroot00000000000000import antlr3 import testbase import unittest class t015calc(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def _evaluate(self, expr, expected, errors=[]): cStream = antlr3.StringStream(expr) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) result = parser.evaluate() self.assertEqual(result, expected) self.assertEqual(len(parser.reportedErrors), len(errors), parser.reportedErrors) def testValid01(self): self._evaluate("1 + 2", 3) def testValid02(self): self._evaluate("1 + 2 * 3", 7) def testValid03(self): self._evaluate("10 / 2", 5) def testValid04(self): self._evaluate("6 + 2*(3+1) - 4", 10) def testMalformedInput(self): self._evaluate("6 - (2*1", 4, ["mismatched token at pos 8"]) # FIXME: most parse errors result in TypeErrors in action code, because # rules return None, which is then added/multiplied... to integers. 
# evaluate("6 - foo 2", 4, ["some error"]) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t016actions.g000066400000000000000000000007351324200532700177160ustar00rootroot00000000000000grammar t016actions; options { language = Python3; } declaration returns [name] : functionHeader ';' {$name = $functionHeader.name} ; functionHeader returns [name] : type ID {$name = $ID.text} ; type : 'int' | 'char' | 'void' ; ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ; WS : ( ' ' | '\t' | '\r' | '\n' )+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t016actions.py000066400000000000000000000007571324200532700201240ustar00rootroot00000000000000import antlr3 import testbase import unittest class t016actions(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid(self): cStream = antlr3.StringStream("int foo;") lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) name = parser.declaration() self.assertEqual(name, 'foo') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t017parser.g000066400000000000000000000022261324200532700175500ustar00rootroot00000000000000grammar t017parser; options { language = Python3; } program : declaration+ ; declaration : variable | functionHeader ';' | functionHeader block ; variable : type declarator ';' ; declarator : ID ; functionHeader : type ID '(' ( formalParameter ( ',' formalParameter )* )? ')' ; formalParameter : type declarator ; type : 'int' | 'char' | 'void' | ID ; block : '{' variable* stat* '}' ; stat: forStat | expr ';' | block | assignStat ';' | ';' ; forStat : 'for' '(' assignStat ';' expr ';' assignStat ')' block ; assignStat : ID '=' expr ; expr: condExpr ; condExpr : aexpr ( ('==' | '<') aexpr )? 
; aexpr : atom ( '+' atom )* ; atom : ID | INT | '(' expr ')' ; ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ; INT : ('0'..'9')+ ; WS : ( ' ' | '\t' | '\r' | '\n' )+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t017parser.py000066400000000000000000000032731324200532700177550ustar00rootroot00000000000000import antlr3 import testbase import unittest class t017parser(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def parserClass(self, base): class TestParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.reportedErrors = [] def emitErrorMessage(self, msg): self.reportedErrors.append(msg) return TestParser def testValid(self): cStream = antlr3.StringStream("int foo;") lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.program() self.assertEqual(parser.reportedErrors, []) def testMalformedInput1(self): cStream = antlr3.StringStream('int foo() { 1+2 }') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.program() # FIXME: currently strings with formatted errors are collected # can't check error locations yet self.assertEqual(len(parser.reportedErrors), 1, parser.reportedErrors) def testMalformedInput2(self): cStream = antlr3.StringStream('int foo() { 1+; 1+2 }') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.program() # FIXME: currently strings with formatted errors are collected # can't check error locations yet self.assertEqual(len(parser.reportedErrors), 2, parser.reportedErrors) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t018llstar.g000066400000000000000000000035671324200532700175670ustar00rootroot00000000000000grammar t018llstar; options { language = Python3; } @header { from io import StringIO } @init { self.output = StringIO() } program : declaration+ ; /** In this rule, the functionHeader left prefix on the last two * alternatives is not LL(k) for a fixed k. However, it is * LL(*). The LL(*) algorithm simply scans ahead until it sees * either the ';' or the '{' of the block and then it picks * the appropriate alternative. Lookhead can be arbitrarily * long in theory, but is <=10 in most cases. Works great. * Use ANTLRWorks to see the lookahead use (step by Location) * and look for blue tokens in the input window pane. :) */ declaration : variable | functionHeader ';' {self.output.write($functionHeader.name+" is a declaration\n")} | functionHeader block {self.output.write($functionHeader.name+" is a definition\n")} ; variable : type declarator ';' ; declarator : ID ; functionHeader returns [name] : type ID '(' ( formalParameter ( ',' formalParameter )* )? ')' {$name = $ID.text} ; formalParameter : type declarator ; type : 'int' | 'char' | 'void' | ID ; block : '{' variable* stat* '}' ; stat: forStat | expr ';' | block | assignStat ';' | ';' ; forStat : 'for' '(' assignStat ';' expr ';' assignStat ')' block ; assignStat : ID '=' expr ; expr: condExpr ; condExpr : aexpr ( ('==' | '<') aexpr )? 
; aexpr : atom ( '+' atom )* ; atom : ID | INT | '(' expr ')' ; ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ; INT : ('0'..'9')+ ; WS : ( ' ' | '\t' | '\r' | '\n' )+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t018llstar.input000066400000000000000000000001661324200532700204700ustar00rootroot00000000000000char c; int x; void bar(int x); int foo(int y, char d) { int i; for (i=0; i<3; i=i+1) { x=3; y=5; } } python3-antlr3-3.5.2/tests/t018llstar.output000066400000000000000000000000511324200532700206620ustar00rootroot00000000000000bar is a declaration foo is a definition python3-antlr3-3.5.2/tests/t018llstar.py000066400000000000000000000014261324200532700177610ustar00rootroot00000000000000import antlr3 import testbase import unittest import os import sys from io import StringIO class t018llstar(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid(self): inputPath = os.path.splitext(__file__)[0] + '.input' with open(inputPath) as f: cStream = antlr3.StringStream(f.read()) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.program() output = parser.output.getvalue() outputPath = os.path.splitext(__file__)[0] + '.output' with open(outputPath) as f: testOutput = f.read() self.assertEqual(output, testOutput) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t019lexer.g000066400000000000000000000020001324200532700173630ustar00rootroot00000000000000lexer grammar t019lexer; options { language=Python3; filter=true; } IMPORT : 'import' WS name=QIDStar WS? ';' ; /** Avoids having "return foo;" match as a field */ RETURN : 'return' (options {greedy=false;}:.)* ';' ; CLASS : 'class' WS name=ID WS? ('extends' WS QID WS?)? ('implements' WS QID WS? (',' WS? QID WS?)*)? '{' ; COMMENT : '/*' (options {greedy=false;} : . )* '*/' ; STRING : '"' (options {greedy=false;}: ESC | .)* '"' ; CHAR : '\'' (options {greedy=false;}: ESC | .)* '\'' ; WS : (' '|'\t'|'\n')+ ; fragment QID : ID ('.' ID)* ; /** QID cannot see beyond end of token so using QID '.*'? somewhere won't * ever match since k=1 lookahead in the QID loop of '.' will make it loop. * I made this rule to compensate. */ fragment QIDStar : ID ('.' ID)* '.*'? ; fragment TYPE: QID '[]'? 
; fragment ARG : TYPE WS ID ; fragment ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ; fragment ESC : '\\' ('"'|'\''|'\\') ; python3-antlr3-3.5.2/tests/t019lexer.input000066400000000000000000000005441324200532700203070ustar00rootroot00000000000000import org.antlr.runtime.*; public class Main { public static void main(String[] args) throws Exception { for (int i=0; i ID // only visible, if b was called with True | NUM ; /* rule scopes, from the book, final beta, p.148 */ c returns [res] scope { symbols } @init { $c::symbols = set(); } : '{' c1* c2+ '}' { $res = $c::symbols; } ; c1 : 'int' ID {$c::symbols.add($ID.text)} ';' ; c2 : ID '=' NUM ';' { if $ID.text not in $c::symbols: raise RuntimeError($ID.text) } ; /* recursive rule scopes, from the book, final beta, p.150 */ d returns [res] scope { symbols } @init { $d::symbols = set(); } : '{' d1* d2* '}' { $res = $d::symbols; } ; d1 : 'int' ID {$d::symbols.add($ID.text)} ';' ; d2 : ID '=' NUM ';' { for s in reversed(range(len($d))): if $ID.text in $d[s]::symbols: break else: raise RuntimeError($ID.text) } | d ; /* recursive rule scopes, access bottom-most scope */ e returns [res] scope { a } @after { $res = $e::a; } : NUM { $e[0]::a = int($NUM.text); } | '{' e '}' ; /* recursive rule scopes, access with negative index */ f returns [res] scope { a } @after { $res = $f::a; } : NUM { $f[-2]::a = int($NUM.text); } | '{' f '}' ; /* tokens */ ID : ('a'..'z')+ ; NUM : ('0'..'9')+ ; WS : (' '|'\n'|'\r')+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t022scopes.py000066400000000000000000000072451324200532700177540ustar00rootroot00000000000000import antlr3 import testbase import unittest import textwrap class t022scopes(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def parserClass(self, base): class TParser(base): def emitErrorMessage(self, msg): # report errors to /dev/null pass def reportError(self, re): # no error recovery yet, just crash! 
raise re return TParser def testa1(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.a() def testb1(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) self.assertRaises(antlr3.RecognitionException, parser.b, False) def testb2(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.b(True) def testc1(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { int i; int j; i = 0; } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) symbols = parser.c() self.assertEqual( symbols, set(['i', 'j']) ) def testc2(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { int i; int j; i = 0; x = 4; } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) self.assertRaisesRegex(RuntimeError, r'x', parser.c) def testd1(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { int i; int j; i = 0; { int i; int x; x = 5; } } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) symbols = parser.d() self.assertEqual( symbols, set(['i', 'j']) ) def teste1(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { { { { 12 } } } } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) res = parser.e() self.assertEqual(res, 12) def testf1(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { { { { 12 } } } } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) res = parser.f() self.assertIsNone(res) def testf2(self): cStream = antlr3.StringStream( textwrap.dedent('''\ { { 12 } } ''')) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) res = parser.f() self.assertIsNone(res) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t023scopes.g000066400000000000000000000003021324200532700175360ustar00rootroot00000000000000grammar t023scopes; options { language=Python3; } prog scope { name } : ID {$prog::name=$ID.text;} ; ID : ('a'..'z')+ ; WS : (' '|'\n'|'\r')+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t023scopes.py000066400000000000000000000006461324200532700177530ustar00rootroot00000000000000import antlr3 import testbase import unittest class t023scopes(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.prog() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t024finally.g000066400000000000000000000005171324200532700177110ustar00rootroot00000000000000grammar t024finally; options { language=Python3; } prog returns [events] @init {events = []} @after {events.append('after')} : ID {raise RuntimeError} ; catch [RuntimeError] {events.append('catch')} finally {events.append('finally')} ID : ('a'..'z')+ ; WS : (' '|'\n'|'\r')+ {$channel=HIDDEN} ; python3-antlr3-3.5.2/tests/t024finally.py000066400000000000000000000007511324200532700201130ustar00rootroot00000000000000import antlr3 import testbase import unittest class 
t024finally(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.prog() self.assertEqual(events, ['catch', 'finally']) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t025lexerRulePropertyRef.g000066400000000000000000000005111324200532700224170ustar00rootroot00000000000000lexer grammar t025lexerRulePropertyRef; options { language = Python3; } @lexer::init { self.properties = [] } IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* { self.properties.append( ($text, $type, $line, $pos, $index, $channel, $start, $stop) ) } ; WS: (' ' | '\n')+; python3-antlr3-3.5.2/tests/t025lexerRulePropertyRef.py000066400000000000000000000043421324200532700226270ustar00rootroot00000000000000import antlr3 import testbase import unittest class t025lexerRulePropertyRef(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): stream = antlr3.StringStream('foobar _Ab98 \n A12sdf') lexer = self.getLexer(stream) while True: token = lexer.nextToken() if token.type == antlr3.EOF: break self.assertEqual(len(lexer.properties), 3, lexer.properties) text, type, line, pos, index, channel, start, stop = lexer.properties[0] self.assertEqual(text, 'foobar', lexer.properties[0]) self.assertEqual(type, self.lexerModule.IDENTIFIER, lexer.properties[0]) self.assertEqual(line, 1, lexer.properties[0]) self.assertEqual(pos, 0, lexer.properties[0]) self.assertEqual(index, -1, lexer.properties[0]) self.assertEqual(channel, antlr3.DEFAULT_CHANNEL, lexer.properties[0]) self.assertEqual(start, 0, lexer.properties[0]) self.assertEqual(stop, 5, lexer.properties[0]) text, type, line, pos, index, channel, start, stop = lexer.properties[1] self.assertEqual(text, '_Ab98', lexer.properties[1]) self.assertEqual(type, self.lexerModule.IDENTIFIER, lexer.properties[1]) self.assertEqual(line, 1, lexer.properties[1]) self.assertEqual(pos, 7, lexer.properties[1]) self.assertEqual(index, -1, lexer.properties[1]) self.assertEqual(channel, antlr3.DEFAULT_CHANNEL, lexer.properties[1]) self.assertEqual(start, 7, lexer.properties[1]) self.assertEqual(stop, 11, lexer.properties[1]) text, type, line, pos, index, channel, start, stop = lexer.properties[2] self.assertEqual(text, 'A12sdf', lexer.properties[2]) self.assertEqual(type, self.lexerModule.IDENTIFIER, lexer.properties[2]) self.assertEqual(line, 2, lexer.properties[2]) self.assertEqual(pos, 1, lexer.properties[2]) self.assertEqual(index, -1, lexer.properties[2]) self.assertEqual(channel, antlr3.DEFAULT_CHANNEL, lexer.properties[2]) self.assertEqual(start, 15, lexer.properties[2]) self.assertEqual(stop, 20, lexer.properties[2]) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t026actions.g000066400000000000000000000013451324200532700177150ustar00rootroot00000000000000grammar t026actions; options { language = Python3; } @lexer::init { self.foobar = 'attribute;' } prog @init { self.capture('init;') } @after { self.capture('after;') } : IDENTIFIER EOF ; catch [ RecognitionException as exc ] { self.capture('catch;') raise } finally { self.capture('finally;') } IDENTIFIER : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* { # a comment self.capture('action;') self.capture('{!r} {!r} {!r} {!r} {!r} {!r} {!r} {!r};'.format($text, $type, $line, $pos, $index, $channel, $start, $stop)) if True: self.capture(self.foobar) } ; WS: (' ' 
| '\n')+; python3-antlr3-3.5.2/tests/t026actions.py000066400000000000000000000031031324200532700201110ustar00rootroot00000000000000import antlr3 import testbase import unittest class t026actions(testbase.ANTLRTest): def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._errors = [] self._output = "" def capture(self, t): self._output += t def emitErrorMessage(self, msg): self._errors.append(msg) return TParser def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._errors = [] self._output = "" def capture(self, t): self._output += t def emitErrorMessage(self, msg): self._errors.append(msg) return TLexer def setUp(self): self.compileGrammar() def testValid1(self): cStream = antlr3.StringStream('foobar _Ab98 \n A12sdf') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.prog() self.assertEqual( parser._output, 'init;after;finally;') self.assertEqual( lexer._output, "action;'foobar' 4 1 0 -1 0 0 5;attribute;action;" "'_Ab98' 4 1 7 -1 0 7 11;attribute;action;" "'A12sdf' 4 2 1 -1 0 15 20;attribute;") if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t027eof.g000066400000000000000000000001211324200532700170160ustar00rootroot00000000000000lexer grammar t027eof; options { language=Python3; } END: EOF; SPACE: ' '; python3-antlr3-3.5.2/tests/t027eof.py000066400000000000000000000011131324200532700172220ustar00rootroot00000000000000import antlr3 import testbase import unittest class t027eof(testbase.ANTLRTest): def setUp(self): self.compileGrammar() @testbase.broken("That's not how EOF is supposed to be used", Exception) def testValid1(self): cStream = antlr3.StringStream(' ') lexer = self.getLexer(cStream) tok = lexer.nextToken() self.assertEqual(tok.type, self.lexerModule.SPACE, tok) tok = lexer.nextToken() self.assertEqual(tok.type, self.lexerModule.END, tok) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t028labelExpr.g.disabled000066400000000000000000000001721324200532700217400ustar00rootroot00000000000000lexer grammar t028labelExpr; ETAGO: (' ' ' '<'; CDATA: '<'; python3-antlr3-3.5.2/tests/t029synpredgate.g000066400000000000000000000002231324200532700205770ustar00rootroot00000000000000lexer grammar t029synpredgate; options { language = Python3; } FOO : ('ab')=> A | ('ac')=> B ; fragment A: 'a'; fragment B: 'a'; python3-antlr3-3.5.2/tests/t029synpredgate.py000066400000000000000000000005311324200532700210030ustar00rootroot00000000000000import antlr3 import testbase import unittest class t029synpredgate(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): stream = antlr3.StringStream('ac') lexer = self.getLexer(stream) token = lexer.nextToken() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t030specialStates.g000066400000000000000000000005741324200532700210570ustar00rootroot00000000000000grammar t030specialStates; options { language = Python3; } @init { self.cond = True } @members { def recover(self, input, re): # no error recovery yet, just crash! raise re } r : ( {self.cond}? NAME | {not self.cond}? NAME WS+ NAME ) ( WS+ NAME )? 
EOF ; NAME: ('a'..'z') ('a'..'z' | '0'..'9')+; NUMBER: ('0'..'9')+; WS: ' '+; python3-antlr3-3.5.2/tests/t030specialStates.py000066400000000000000000000022661324200532700212610ustar00rootroot00000000000000import antlr3 import testbase import unittest class t030specialStates(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): cStream = antlr3.StringStream('foo') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.r() def testValid2(self): cStream = antlr3.StringStream('foo name1') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.r() def testValid3(self): cStream = antlr3.StringStream('bar name1') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.cond = False events = parser.r() def testValid4(self): cStream = antlr3.StringStream('bar name1 name2') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.cond = False events = parser.r() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t031emptyAlt.g000066400000000000000000000003311324200532700200420ustar00rootroot00000000000000grammar t031emptyAlt; options { language = Python3; } r : NAME ( {self.cond}?=> WS+ NAME | ) EOF ; NAME: ('a'..'z') ('a'..'z' | '0'..'9')+; NUMBER: ('0'..'9')+; WS: ' '+; python3-antlr3-3.5.2/tests/t031emptyAlt.py000066400000000000000000000006541324200532700202540ustar00rootroot00000000000000import antlr3 import testbase import unittest class t031emptyAlt(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid1(self): cStream = antlr3.StringStream('foo') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.r() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t032subrulePredict.g000066400000000000000000000001611324200532700212410ustar00rootroot00000000000000grammar t032subrulePredict; options { language = Python3; } a: 'BEGIN' b WS+ 'END'; b: ( WS+ 'A' )+; WS: ' '; python3-antlr3-3.5.2/tests/t032subrulePredict.py000066400000000000000000000017701324200532700214520ustar00rootroot00000000000000import antlr3 import testbase import unittest class t032subrulePredict(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! 
raise return TParser def testValid1(self): cStream = antlr3.StringStream( 'BEGIN A END' ) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.a() @testbase.broken("DFA tries to look beyond end of rule b", Exception) def testValid2(self): cStream = antlr3.StringStream( ' A' ) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.b() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t033backtracking.g000066400000000000000000000237251324200532700207040ustar00rootroot00000000000000grammar t033backtracking; options { language=Python3; backtrack=true; memoize=true; k=2; } scope Symbols { types; } @members { def isTypeName(self, name): for scope in reversed(self.Symbols_stack): if name in scope.types: return True return False } translation_unit scope Symbols; // entire file is a scope @init { $Symbols::types = set() } : external_declaration+ ; /** Either a function definition or any other kind of C decl/def. * The LL(*) analysis algorithm fails to deal with this due to * recursion in the declarator rules. I'm putting in a * manual predicate here so that we don't backtrack over * the entire function. Further, you get a better error * as errors within the function itself don't make it fail * to predict that it's a function. Weird errors previously. * Remember: the goal is to avoid backtrack like the plague * because it makes debugging, actions, and errors harder. * * Note that k=1 results in a much smaller predictor for the * fixed lookahead; k=2 made a few extra thousand lines. ;) * I'll have to optimize that in the future. */ external_declaration options {k=1;} : ( declaration_specifiers? declarator declaration* '{' )=> function_definition | declaration ; function_definition scope Symbols; // put parameters and locals into same scope for now @init { $Symbols::types = set() } : declaration_specifiers? declarator // ( declaration+ compound_statement // K&R style // | compound_statement // ANSI style // ) ; declaration scope { isTypedef; } @init { $declaration::isTypedef = False } : 'typedef' declaration_specifiers? {$declaration::isTypedef = True} init_declarator_list ';' // special case, looking for typedef | declaration_specifiers init_declarator_list? ';' ; declaration_specifiers : ( storage_class_specifier | type_specifier | type_qualifier )+ ; init_declarator_list : init_declarator (',' init_declarator)* ; init_declarator : declarator //('=' initializer)? ; storage_class_specifier : 'extern' | 'static' | 'auto' | 'register' ; type_specifier : 'void' | 'char' | 'short' | 'int' | 'long' | 'float' | 'double' | 'signed' | 'unsigned' // | struct_or_union_specifier // | enum_specifier | type_id ; type_id : {self.isTypeName(self.input.LT(1).getText())}? IDENTIFIER // {System.out.println($IDENTIFIER.text+" is a type");} ; // struct_or_union_specifier // options {k=3;} // scope Symbols; // structs are scopes // @init { // $Symbols::types = set() // } // : struct_or_union IDENTIFIER? 
'{' struct_declaration_list '}' // | struct_or_union IDENTIFIER // ; // struct_or_union // : 'struct' // | 'union' // ; // struct_declaration_list // : struct_declaration+ // ; // struct_declaration // : specifier_qualifier_list struct_declarator_list ';' // ; // specifier_qualifier_list // : ( type_qualifier | type_specifier )+ // ; // struct_declarator_list // : struct_declarator (',' struct_declarator)* // ; // struct_declarator // : declarator (':' constant_expression)? // | ':' constant_expression // ; // enum_specifier // options {k=3;} // : 'enum' '{' enumerator_list '}' // | 'enum' IDENTIFIER '{' enumerator_list '}' // | 'enum' IDENTIFIER // ; // enumerator_list // : enumerator (',' enumerator)* // ; // enumerator // : IDENTIFIER ('=' constant_expression)? // ; type_qualifier : 'const' | 'volatile' ; declarator : pointer? direct_declarator | pointer ; direct_declarator : ( IDENTIFIER { if $declaration and $declaration::isTypedef: $Symbols::types.add($IDENTIFIER.text) print("define type "+$IDENTIFIER.text) } | '(' declarator ')' ) declarator_suffix* ; declarator_suffix : /*'[' constant_expression ']' |*/ '[' ']' // | '(' parameter_type_list ')' // | '(' identifier_list ')' | '(' ')' ; pointer : '*' type_qualifier+ pointer? | '*' pointer | '*' ; // parameter_type_list // : parameter_list (',' '...')? // ; // parameter_list // : parameter_declaration (',' parameter_declaration)* // ; // parameter_declaration // : declaration_specifiers (declarator|abstract_declarator)* // ; // identifier_list // : IDENTIFIER (',' IDENTIFIER)* // ; // type_name // : specifier_qualifier_list abstract_declarator? // ; // abstract_declarator // : pointer direct_abstract_declarator? // | direct_abstract_declarator // ; // direct_abstract_declarator // : ( '(' abstract_declarator ')' | abstract_declarator_suffix ) abstract_declarator_suffix* // ; // abstract_declarator_suffix // : '[' ']' // | '[' constant_expression ']' // | '(' ')' // | '(' parameter_type_list ')' // ; // initializer // : assignment_expression // | '{' initializer_list ','? '}' // ; // initializer_list // : initializer (',' initializer)* // ; // // E x p r e s s i o n s // argument_expression_list // : assignment_expression (',' assignment_expression)* // ; // additive_expression // : (multiplicative_expression) ('+' multiplicative_expression | '-' multiplicative_expression)* // ; // multiplicative_expression // : (cast_expression) ('*' cast_expression | '/' cast_expression | '%' cast_expression)* // ; // cast_expression // : '(' type_name ')' cast_expression // | unary_expression // ; // unary_expression // : postfix_expression // | '++' unary_expression // | '--' unary_expression // | unary_operator cast_expression // | 'sizeof' unary_expression // | 'sizeof' '(' type_name ')' // ; // postfix_expression // : primary_expression // ( '[' expression ']' // | '(' ')' // | '(' argument_expression_list ')' // | '.' IDENTIFIER // | '*' IDENTIFIER // | '->' IDENTIFIER // | '++' // | '--' // )* // ; // unary_operator // : '&' // | '*' // | '+' // | '-' // | '~' // | '!' 
// ; // primary_expression // : IDENTIFIER // | constant // | '(' expression ')' // ; // constant // : HEX_LITERAL // | OCTAL_LITERAL // | DECIMAL_LITERAL // | CHARACTER_LITERAL // | STRING_LITERAL // | FLOATING_POINT_LITERAL // ; // ///// // expression // : assignment_expression (',' assignment_expression)* // ; // constant_expression // : conditional_expression // ; // assignment_expression // : lvalue assignment_operator assignment_expression // | conditional_expression // ; // lvalue // : unary_expression // ; // assignment_operator // : '=' // | '*=' // | '/=' // | '%=' // | '+=' // | '-=' // | '<<=' // | '>>=' // | '&=' // | '^=' // | '|=' // ; // conditional_expression // : logical_or_expression ('?' expression ':' conditional_expression)? // ; // logical_or_expression // : logical_and_expression ('||' logical_and_expression)* // ; // logical_and_expression // : inclusive_or_expression ('&&' inclusive_or_expression)* // ; // inclusive_or_expression // : exclusive_or_expression ('|' exclusive_or_expression)* // ; // exclusive_or_expression // : and_expression ('^' and_expression)* // ; // and_expression // : equality_expression ('&' equality_expression)* // ; // equality_expression // : relational_expression (('=='|'!=') relational_expression)* // ; // relational_expression // : shift_expression (('<'|'>'|'<='|'>=') shift_expression)* // ; // shift_expression // : additive_expression (('<<'|'>>') additive_expression)* // ; // // S t a t e m e n t s // statement // : labeled_statement // | compound_statement // | expression_statement // | selection_statement // | iteration_statement // | jump_statement // ; // labeled_statement // : IDENTIFIER ':' statement // | 'case' constant_expression ':' statement // | 'default' ':' statement // ; // compound_statement // scope Symbols; // blocks have a scope of symbols // @init { // $Symbols::types = {} // } // : '{' declaration* statement_list? '}' // ; // statement_list // : statement+ // ; // expression_statement // : ';' // | expression ';' // ; // selection_statement // : 'if' '(' expression ')' statement (options {k=1; backtrack=false;}:'else' statement)? // | 'switch' '(' expression ')' statement // ; // iteration_statement // : 'while' '(' expression ')' statement // | 'do' statement 'while' '(' expression ')' ';' // | 'for' '(' expression_statement expression_statement expression? ')' statement // ; // jump_statement // : 'goto' IDENTIFIER ';' // | 'continue' ';' // | 'break' ';' // | 'return' ';' // | 'return' expression ';' // ; IDENTIFIER : LETTER (LETTER|'0'..'9')* ; fragment LETTER : '$' | 'A'..'Z' | 'a'..'z' | '_' ; CHARACTER_LITERAL : '\'' ( EscapeSequence | ~('\''|'\\') ) '\'' ; STRING_LITERAL : '"' ( EscapeSequence | ~('\\'|'"') )* '"' ; HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ; DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ; OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ; fragment HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ; fragment IntegerTypeSuffix : ('u'|'U')? ('l'|'L') | ('u'|'U') ('l'|'L')? ; FLOATING_POINT_LITERAL : ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix? | '.' ('0'..'9')+ Exponent? FloatTypeSuffix? | ('0'..'9')+ Exponent FloatTypeSuffix? | ('0'..'9')+ Exponent? FloatTypeSuffix ; fragment Exponent : ('e'|'E') ('+'|'-')? 
('0'..'9')+ ; fragment FloatTypeSuffix : ('f'|'F'|'d'|'D') ; fragment EscapeSequence : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\') | OctalEscape ; fragment OctalEscape : '\\' ('0'..'3') ('0'..'7') ('0'..'7') | '\\' ('0'..'7') ('0'..'7') | '\\' ('0'..'7') ; fragment UnicodeEscape : '\\' 'u' HexDigit HexDigit HexDigit HexDigit ; WS : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;} ; COMMENT : '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;} ; LINE_COMMENT : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;} ; // ignore #line info for now LINE_COMMAND : '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;} ; python3-antlr3-3.5.2/tests/t033backtracking.py000066400000000000000000000013011324200532700210700ustar00rootroot00000000000000import antlr3 import testbase import unittest class t033backtracking(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser @testbase.broken("Some bug in the tool", SyntaxError) def testValid1(self): cStream = antlr3.StringStream('int a;') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.translation_unit() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t034tokenLabelPropertyRef.g000066400000000000000000000007321324200532700225350ustar00rootroot00000000000000grammar t034tokenLabelPropertyRef; options { language = Python3; } a: t=A { print($t.text) print($t.type) print($t.line) print($t.pos) print($t.channel) print($t.index) #print($t.tree) } ; A: 'a'..'z'; WS : ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+ { $channel = HIDDEN } ; python3-antlr3-3.5.2/tests/t034tokenLabelPropertyRef.py000066400000000000000000000015521324200532700227400ustar00rootroot00000000000000import antlr3 import testbase import unittest class t034tokenLabelPropertyRef(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def testValid1(self): cStream = antlr3.StringStream(' a') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.a() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t035ruleLabelPropertyRef.g000066400000000000000000000003321324200532700223610ustar00rootroot00000000000000grammar t035ruleLabelPropertyRef; options { language = Python3; } a returns [bla]: t=b { $bla = $t.start, $t.stop, $t.text } ; b: A+; A: 'a'..'z'; WS: ' '+ { $channel = HIDDEN }; python3-antlr3-3.5.2/tests/t035ruleLabelPropertyRef.py000066400000000000000000000022321324200532700225640ustar00rootroot00000000000000import antlr3 import testbase import unittest class t035ruleLabelPropertyRef(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! 
raise return TParser def testValid1(self): cStream = antlr3.StringStream(' a a a a ') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) start, stop, text = parser.a() # first token of rule b is the 2nd token (counting hidden tokens) self.assertEqual(start.index, 1, start) # first token of rule b is the 7th token (counting hidden tokens) self.assertEqual(stop.index, 7, stop) self.assertEqual(text, "a a a a") if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t036multipleReturnValues.g000066400000000000000000000005351324200532700224710ustar00rootroot00000000000000grammar t036multipleReturnValues; options { language = Python3; } a returns [foo, bar]: A { $foo = "foo"; $bar = "bar"; } ; A: 'a'..'z'; WS : ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+ { $channel = HIDDEN } ; python3-antlr3-3.5.2/tests/t036multipleReturnValues.py000066400000000000000000000016711324200532700226750ustar00rootroot00000000000000import antlr3 import testbase import unittest class t036multipleReturnValues(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def testValid1(self): cStream = antlr3.StringStream(' a') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) ret = parser.a() self.assertEqual(ret.foo, 'foo') self.assertEqual(ret.bar, 'bar') if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t037rulePropertyRef.g000066400000000000000000000002721324200532700214260ustar00rootroot00000000000000grammar t037rulePropertyRef; options { language = Python3; } a returns [bla] @after { $bla = $start, $stop, $text } : A+ ; A: 'a'..'z'; WS: ' '+ { $channel = HIDDEN }; python3-antlr3-3.5.2/tests/t037rulePropertyRef.py000066400000000000000000000022411324200532700216260ustar00rootroot00000000000000import antlr3 import testbase import unittest class t037rulePropertyRef(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! 
raise return TParser def testValid1(self): cStream = antlr3.StringStream(' a a a a ') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) start, stop, text = parser.a().bla # first token of rule b is the 2nd token (counting hidden tokens) self.assertEqual(start.index, 1, start) # first token of rule b is the 7th token (counting hidden tokens) self.assertEqual(stop.index, 7, stop) self.assertEqual(text, "a a a a") if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t038lexerRuleLabel.g000066400000000000000000000007461324200532700211730ustar00rootroot00000000000000lexer grammar t038lexerRuleLabel; options { language = Python3; } A: 'a'..'z' WS '0'..'9' { print($WS) print($WS.type) print($WS.line) print($WS.pos) print($WS.channel) print($WS.index) print($WS.text) } ; fragment WS : ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+ { $channel = HIDDEN } ; python3-antlr3-3.5.2/tests/t038lexerRuleLabel.py000066400000000000000000000012311324200532700213630ustar00rootroot00000000000000import antlr3 import testbase import unittest class t038lexerRuleLabel(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def testValid1(self): cStream = antlr3.StringStream('a 2') lexer = self.getLexer(cStream) while True: t = lexer.nextToken() if t.type == antlr3.EOF: break print(t) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t039labels.g000066400000000000000000000005121324200532700175160ustar00rootroot00000000000000grammar t039labels; options { language = Python3; } a returns [l] : ids+=A ( ',' ids+=(A|B) )* C D w=. ids+=. F EOF { l = ($ids, $w) } ; A: 'a'..'z'; B: '0'..'9'; C: a='A' { print($a) }; D: a='FOOBAR' { print($a) }; E: 'GNU' a=. { print($a) }; F: 'BLARZ' a=EOF { print($a) }; WS: ' '+ { $channel = HIDDEN }; python3-antlr3-3.5.2/tests/t039labels.py000066400000000000000000000024401324200532700177220ustar00rootroot00000000000000import antlr3 import testbase import unittest class t039labels(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def testValid1(self): cStream = antlr3.StringStream( 'a, b, c, 1, 2 A FOOBAR GNU1 A BLARZ' ) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) ids, w = parser.a() self.assertEqual(len(ids), 6, ids) self.assertEqual(ids[0].text, 'a', ids[0]) self.assertEqual(ids[1].text, 'b', ids[1]) self.assertEqual(ids[2].text, 'c', ids[2]) self.assertEqual(ids[3].text, '1', ids[3]) self.assertEqual(ids[4].text, '2', ids[4]) self.assertEqual(ids[5].text, 'A', ids[5]) self.assertEqual(w.text, 'GNU1', w) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t040bug80.g000066400000000000000000000003221324200532700171700ustar00rootroot00000000000000lexer grammar t040bug80; options { language = Python3; } ID_LIKE : 'defined' | {False}? 
Identifier | Identifier ; fragment Identifier: 'a'..'z'+ ; // with just 'a', output compiles python3-antlr3-3.5.2/tests/t040bug80.py000066400000000000000000000012231324200532700173730ustar00rootroot00000000000000import antlr3 import testbase import unittest class t040bug80(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def testValid1(self): cStream = antlr3.StringStream('defined') lexer = self.getLexer(cStream) while True: t = lexer.nextToken() if t.type == antlr3.EOF: break print(t) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t041parameters.g000066400000000000000000000003511324200532700204110ustar00rootroot00000000000000grammar t041parameters; options { language = Python3; } a[arg1, arg2] returns [l] : A+ EOF { l = ($arg1, $arg2) $arg1 = "gnarz" } ; A: 'a'..'z'; WS: ' '+ { $channel = HIDDEN }; python3-antlr3-3.5.2/tests/t041parameters.py000066400000000000000000000016271324200532700206220ustar00rootroot00000000000000import antlr3 import testbase import unittest class t041parameters(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def testValid1(self): cStream = antlr3.StringStream('a a a') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) r = parser.a('foo', 'bar') self.assertEqual(r, ('foo', 'bar')) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t042ast.g000066400000000000000000000121521324200532700170400ustar00rootroot00000000000000grammar t042ast; options { language = Python3; output = AST; } tokens { VARDEF; FLOAT; EXPR; BLOCK; VARIABLE; FIELD; CALL; INDEX; FIELDACCESS; } @init { self.flag = False } r1 : INT ('+'^ INT)* ; r2 : 'assert'^ x=expression (':'! y=expression)? ';'! ; r3 : 'if'^ expression s1=statement ('else'! s2=statement)? ; r4 : 'while'^ expression statement ; r5 : 'return'^ expression? ';'! ; r6 : (INT|ID)+ ; r7 : INT -> ; r8 : 'var' ID ':' type -> ^('var' type ID) ; r9 : type ID ';' -> ^(VARDEF type ID) ; r10 : INT -> {CommonTree(CommonToken(type=FLOAT, text=$INT.text + ".0"))} ; r11 : expression -> ^(EXPR expression) | -> EXPR ; r12 : ID (',' ID)* -> ID+ ; r13 : type ID (',' ID)* ';' -> ^(type ID+) ; r14 : expression? statement* type+ -> ^(EXPR expression? statement* type+) ; r15 : INT -> INT INT ; r16 : 'int' ID (',' ID)* -> ^('int' ID)+ ; r17 : 'for' '(' start=statement ';' expression ';' next=statement ')' statement -> ^('for' $start expression $next statement) ; r18 : t='for' -> ^(BLOCK) ; r19 : t='for' -> ^(BLOCK[$t]) ; r20 : t='for' -> ^(BLOCK[$t,"FOR"]) ; r21 : t='for' -> BLOCK ; r22 : t='for' -> BLOCK[$t] ; r23 : t='for' -> BLOCK[$t,"FOR"] ; r24 : r=statement expression -> ^($r expression) ; r25 : r+=statement (',' r+=statement)+ expression -> ^($r expression) ; r26 : r+=statement (',' r+=statement)+ -> ^(BLOCK $r+) ; r27 : r=statement expression -> ^($r ^($r expression)) ; r28 : ('foo28a'|'foo28b') -> ; r29 : (r+=statement)* -> ^(BLOCK $r+) ; r30 : statement* -> ^(BLOCK statement?) ; r31 : modifier type ID ('=' expression)? ';' -> {self.flag == 0}? ^(VARDEF ID modifier* type expression?) -> {self.flag == 1}? 
^(VARIABLE ID modifier* type expression?) -> ^(FIELD ID modifier* type expression?) ; r32[which] : ID INT -> {which==1}? ID -> {which==2}? INT -> // yield nothing as else-clause ; r33 : modifiers! statement ; r34 : modifiers! r34a[$modifiers.tree] //| modifiers! r33b[$modifiers.tree] ; r34a[mod] : 'class' ID ('extends' sup=type)? ( 'implements' i+=type (',' i+=type)*)? '{' statement* '}' -> ^('class' ID {$mod} ^('extends' $sup)? ^('implements' $i+)? statement* ) ; r35 : '{' 'extends' (sup=type)? '}' -> ^('extends' $sup)? ; r36 : 'if' '(' expression ')' s1=statement ( 'else' s2=statement -> ^('if' ^(EXPR expression) $s1 $s2) | -> ^('if' ^(EXPR expression) $s1) ) ; r37 : (INT -> INT) ('+' i=INT -> ^('+' $r37 $i) )* ; r38 : INT ('+'^ INT)* ; r39 : (primary->primary) // set return tree to just primary ( '(' arg=expression ')' -> ^(CALL $r39 $arg) | '[' ie=expression ']' -> ^(INDEX $r39 $ie) | '.' p=primary -> ^(FIELDACCESS $r39 $p) )* ; r40 : (INT -> INT) ( ('+' i+=INT)* -> ^('+' $r40 $i*) ) ';' ; r41 : (INT -> INT) ( ('+' i=INT) -> ^($i $r41) )* ';' ; r42 : ids+=ID (','! ids+=ID)* ; r43 returns [res] : ids+=ID! (','! ids+=ID!)* {$res = [id.text for id in $ids]} ; r44 : ids+=ID^ (','! ids+=ID^)* ; r45 : primary^ ; r46 returns [res] : ids+=primary! (','! ids+=primary!)* {$res = [id.text for id in $ids]} ; r47 : ids+=primary (','! ids+=primary)* ; r48 : ids+=. (','! ids+=.)* ; r49 : .^ ID ; r50 : ID -> ^({CommonTree(CommonToken(type=FLOAT, text="1.0"))} ID) ; /** templates tested: tokenLabelPropertyRef_tree */ r51 returns [res] : ID t=ID ID { $res = $t.tree } ; /** templates tested: rulePropertyRef_tree */ r52 returns [res] @after { $res = $tree } : ID ; /** templates tested: ruleLabelPropertyRef_tree */ r53 returns [res] : t=primary { $res = $t.tree } ; /** templates tested: ruleSetPropertyRef_tree */ r54 returns [res] @after { $tree = $t.tree; } : ID t=expression ID ; /** backtracking */ r55 options { backtrack=true; k=1; } : (modifier+ INT)=> modifier+ expression | modifier+ statement ; /** templates tested: rewriteTokenRef with len(args)>0 */ r56 : t=ID* -> ID[$t,'foo'] ; /** templates tested: rewriteTokenRefRoot with len(args)>0 */ r57 : t=ID* -> ^(ID[$t,'foo']) ; /** templates tested: ??? */ r58 : ({CommonTree(CommonToken(type=FLOAT, text="2.0"))})^ ; /** templates tested: rewriteTokenListLabelRefRoot */ r59 : (t+=ID)+ statement -> ^($t statement)+ ; primary : ID ; expression : r1 ; statement : 'fooze' | 'fooze2' ; modifiers : modifier+ ; modifier : 'public' | 'private' ; type : 'int' | 'bool' ; ID : 'a'..'z' + ; INT : '0'..'9' +; WS: (' ' | '\n' | '\t')+ {$channel = HIDDEN;}; python3-antlr3-3.5.2/tests/t042ast.py000066400000000000000000000352731324200532700172530ustar00rootroot00000000000000import unittest import textwrap import antlr3 import testbase class t042ast(testbase.ANTLRTest): ## def lexerClass(self, base): ## class TLexer(base): ## def reportError(self, re): ## # no error recovery yet, just crash! ## raise re ## return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! 
raise return TParser def parse(self, text, method, rArgs=(), **kwargs): self.compileGrammar() #options='-trace') cStream = antlr3.StringStream(text) self.lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(self.lexer) self.parser = self.getParser(tStream) for attr, val in kwargs.items(): setattr(self.parser, attr, val) return getattr(self.parser, method)(*rArgs) def testR1(self): r = self.parse("1 + 2", 'r1') self.assertEqual( r.tree.toStringTree(), '(+ 1 2)' ) def testR2a(self): r = self.parse("assert 2+3;", 'r2') self.assertEqual( r.tree.toStringTree(), '(assert (+ 2 3))' ) def testR2b(self): r = self.parse("assert 2+3 : 5;", 'r2') self.assertEqual( r.tree.toStringTree(), '(assert (+ 2 3) 5)' ) def testR3a(self): r = self.parse("if 1 fooze", 'r3') self.assertEqual( r.tree.toStringTree(), '(if 1 fooze)' ) def testR3b(self): r = self.parse("if 1 fooze else fooze", 'r3') self.assertEqual( r.tree.toStringTree(), '(if 1 fooze fooze)' ) def testR4a(self): r = self.parse("while 2 fooze", 'r4') self.assertEqual( r.tree.toStringTree(), '(while 2 fooze)' ) def testR5a(self): r = self.parse("return;", 'r5') self.assertEqual( r.tree.toStringTree(), 'return' ) def testR5b(self): r = self.parse("return 2+3;", 'r5') self.assertEqual( r.tree.toStringTree(), '(return (+ 2 3))' ) def testR6a(self): r = self.parse("3", 'r6') self.assertEqual( r.tree.toStringTree(), '3' ) def testR6b(self): r = self.parse("3 a", 'r6') self.assertEqual( r.tree.toStringTree(), '3 a' ) def testR7(self): r = self.parse("3", 'r7') self.assertIsNone(r.tree) def testR8(self): r = self.parse("var foo:bool", 'r8') self.assertEqual( r.tree.toStringTree(), '(var bool foo)' ) def testR9(self): r = self.parse("int foo;", 'r9') self.assertEqual( r.tree.toStringTree(), '(VARDEF int foo)' ) def testR10(self): r = self.parse("10", 'r10') self.assertEqual( r.tree.toStringTree(), '10.0' ) def testR11a(self): r = self.parse("1+2", 'r11') self.assertEqual( r.tree.toStringTree(), '(EXPR (+ 1 2))' ) def testR11b(self): r = self.parse("", 'r11') self.assertEqual( r.tree.toStringTree(), 'EXPR' ) def testR12a(self): r = self.parse("foo", 'r12') self.assertEqual( r.tree.toStringTree(), 'foo' ) def testR12b(self): r = self.parse("foo, bar, gnurz", 'r12') self.assertEqual( r.tree.toStringTree(), 'foo bar gnurz' ) def testR13a(self): r = self.parse("int foo;", 'r13') self.assertEqual( r.tree.toStringTree(), '(int foo)' ) def testR13b(self): r = self.parse("bool foo, bar, gnurz;", 'r13') self.assertEqual( r.tree.toStringTree(), '(bool foo bar gnurz)' ) def testR14a(self): r = self.parse("1+2 int", 'r14') self.assertEqual( r.tree.toStringTree(), '(EXPR (+ 1 2) int)' ) def testR14b(self): r = self.parse("1+2 int bool", 'r14') self.assertEqual( r.tree.toStringTree(), '(EXPR (+ 1 2) int bool)' ) def testR14c(self): r = self.parse("int bool", 'r14') self.assertEqual( r.tree.toStringTree(), '(EXPR int bool)' ) def testR14d(self): r = self.parse("fooze fooze int bool", 'r14') self.assertEqual( r.tree.toStringTree(), '(EXPR fooze fooze int bool)' ) def testR14e(self): r = self.parse("7+9 fooze fooze int bool", 'r14') self.assertEqual( r.tree.toStringTree(), '(EXPR (+ 7 9) fooze fooze int bool)' ) def testR15(self): r = self.parse("7", 'r15') self.assertEqual( r.tree.toStringTree(), '7 7' ) def testR16a(self): r = self.parse("int foo", 'r16') self.assertEqual( r.tree.toStringTree(), '(int foo)' ) def testR16b(self): r = self.parse("int foo, bar, gnurz", 'r16') self.assertEqual( r.tree.toStringTree(), '(int foo) (int bar) (int gnurz)' ) def 
testR17a(self): r = self.parse("for ( fooze ; 1 + 2 ; fooze ) fooze", 'r17') self.assertEqual( r.tree.toStringTree(), '(for fooze (+ 1 2) fooze fooze)' ) def testR18a(self): r = self.parse("for", 'r18') self.assertEqual( r.tree.toStringTree(), 'BLOCK' ) def testR19a(self): r = self.parse("for", 'r19') self.assertEqual( r.tree.toStringTree(), 'for' ) def testR20a(self): r = self.parse("for", 'r20') self.assertEqual( r.tree.toStringTree(), 'FOR' ) def testR21a(self): r = self.parse("for", 'r21') self.assertEqual( r.tree.toStringTree(), 'BLOCK' ) def testR22a(self): r = self.parse("for", 'r22') self.assertEqual( r.tree.toStringTree(), 'for' ) def testR23a(self): r = self.parse("for", 'r23') self.assertEqual( r.tree.toStringTree(), 'FOR' ) def testR24a(self): r = self.parse("fooze 1 + 2", 'r24') self.assertEqual( r.tree.toStringTree(), '(fooze (+ 1 2))' ) def testR25a(self): r = self.parse("fooze, fooze2 1 + 2", 'r25') self.assertEqual( r.tree.toStringTree(), '(fooze (+ 1 2))' ) def testR26a(self): r = self.parse("fooze, fooze2", 'r26') self.assertEqual( r.tree.toStringTree(), '(BLOCK fooze fooze2)' ) def testR27a(self): r = self.parse("fooze 1 + 2", 'r27') self.assertEqual( r.tree.toStringTree(), '(fooze (fooze (+ 1 2)))' ) def testR28(self): r = self.parse("foo28a", 'r28') self.assertIsNone(r.tree) def testR29(self): self.assertRaises(RuntimeError, self.parse, "", 'r29') # FIXME: broken upstream? ## def testR30(self): ## try: ## r = self.parse("fooze fooze", 'r30') ## self.fail(r.tree.toStringTree()) ## except RuntimeError: ## pass def testR31a(self): r = self.parse("public int gnurz = 1 + 2;", 'r31', flag=0) self.assertEqual( r.tree.toStringTree(), '(VARDEF gnurz public int (+ 1 2))' ) def testR31b(self): r = self.parse("public int gnurz = 1 + 2;", 'r31', flag=1) self.assertEqual( r.tree.toStringTree(), '(VARIABLE gnurz public int (+ 1 2))' ) def testR31c(self): r = self.parse("public int gnurz = 1 + 2;", 'r31', flag=2) self.assertEqual( r.tree.toStringTree(), '(FIELD gnurz public int (+ 1 2))' ) def testR32a(self): r = self.parse("gnurz 32", 'r32', [1], flag=2) self.assertEqual( r.tree.toStringTree(), 'gnurz' ) def testR32b(self): r = self.parse("gnurz 32", 'r32', [2], flag=2) self.assertEqual( r.tree.toStringTree(), '32' ) def testR32c(self): r = self.parse("gnurz 32", 'r32', [3], flag=2) self.assertIsNone(r.tree) def testR33a(self): r = self.parse("public private fooze", 'r33') self.assertEqual( r.tree.toStringTree(), 'fooze' ) def testR34a(self): r = self.parse("public class gnurz { fooze fooze2 }", 'r34') self.assertEqual( r.tree.toStringTree(), '(class gnurz public fooze fooze2)' ) def testR34b(self): r = self.parse("public class gnurz extends bool implements int, bool { fooze fooze2 }", 'r34') self.assertEqual( r.tree.toStringTree(), '(class gnurz public (extends bool) (implements int bool) fooze fooze2)' ) def testR35(self): self.assertRaises(RuntimeError, self.parse, "{ extends }", 'r35') def testR36a(self): r = self.parse("if ( 1 + 2 ) fooze", 'r36') self.assertEqual( r.tree.toStringTree(), '(if (EXPR (+ 1 2)) fooze)' ) def testR36b(self): r = self.parse("if ( 1 + 2 ) fooze else fooze2", 'r36') self.assertEqual( r.tree.toStringTree(), '(if (EXPR (+ 1 2)) fooze fooze2)' ) def testR37(self): r = self.parse("1 + 2 + 3", 'r37') self.assertEqual( r.tree.toStringTree(), '(+ (+ 1 2) 3)' ) def testR38(self): r = self.parse("1 + 2 + 3", 'r38') self.assertEqual( r.tree.toStringTree(), '(+ (+ 1 2) 3)' ) def testR39a(self): r = self.parse("gnurz[1]", 'r39') self.assertEqual( 
r.tree.toStringTree(), '(INDEX gnurz 1)' ) def testR39b(self): r = self.parse("gnurz(2)", 'r39') self.assertEqual( r.tree.toStringTree(), '(CALL gnurz 2)' ) def testR39c(self): r = self.parse("gnurz.gnarz", 'r39') self.assertEqual( r.tree.toStringTree(), '(FIELDACCESS gnurz gnarz)' ) def testR39d(self): r = self.parse("gnurz.gnarz.gnorz", 'r39') self.assertEqual( r.tree.toStringTree(), '(FIELDACCESS (FIELDACCESS gnurz gnarz) gnorz)' ) def testR40(self): r = self.parse("1 + 2 + 3;", 'r40') self.assertEqual( r.tree.toStringTree(), '(+ 1 2 3)' ) def testR41(self): r = self.parse("1 + 2 + 3;", 'r41') self.assertEqual( r.tree.toStringTree(), '(3 (2 1))' ) def testR42(self): r = self.parse("gnurz, gnarz, gnorz", 'r42') self.assertEqual( r.tree.toStringTree(), 'gnurz gnarz gnorz' ) def testR43(self): r = self.parse("gnurz, gnarz, gnorz", 'r43') self.assertIsNone(r.tree) self.assertEqual( r.res, ['gnurz', 'gnarz', 'gnorz'] ) def testR44(self): r = self.parse("gnurz, gnarz, gnorz", 'r44') self.assertEqual( r.tree.toStringTree(), '(gnorz (gnarz gnurz))' ) def testR45(self): r = self.parse("gnurz", 'r45') self.assertEqual( r.tree.toStringTree(), 'gnurz' ) def testR46(self): r = self.parse("gnurz, gnarz, gnorz", 'r46') self.assertIsNone(r.tree) self.assertEqual( r.res, ['gnurz', 'gnarz', 'gnorz'] ) def testR47(self): r = self.parse("gnurz, gnarz, gnorz", 'r47') self.assertEqual( r.tree.toStringTree(), 'gnurz gnarz gnorz' ) def testR48(self): r = self.parse("gnurz, gnarz, gnorz", 'r48') self.assertEqual( r.tree.toStringTree(), 'gnurz gnarz gnorz' ) def testR49(self): r = self.parse("gnurz gnorz", 'r49') self.assertEqual( r.tree.toStringTree(), '(gnurz gnorz)' ) def testR50(self): r = self.parse("gnurz", 'r50') self.assertEqual( r.tree.toStringTree(), '(1.0 gnurz)' ) def testR51(self): r = self.parse("gnurza gnurzb gnurzc", 'r51') self.assertEqual( r.res.toStringTree(), 'gnurzb' ) def testR52(self): r = self.parse("gnurz", 'r52') self.assertEqual( r.res.toStringTree(), 'gnurz' ) def testR53(self): r = self.parse("gnurz", 'r53') self.assertEqual( r.res.toStringTree(), 'gnurz' ) def testR54(self): r = self.parse("gnurza 1 + 2 gnurzb", 'r54') self.assertEqual( r.tree.toStringTree(), '(+ 1 2)' ) def testR55a(self): r = self.parse("public private 1 + 2", 'r55') self.assertEqual( r.tree.toStringTree(), 'public private (+ 1 2)' ) def testR55b(self): r = self.parse("public fooze", 'r55') self.assertEqual( r.tree.toStringTree(), 'public fooze' ) def testR56(self): r = self.parse("a b c d", 'r56') self.assertEqual( r.tree.toStringTree(), 'foo' ) def testR57(self): r = self.parse("a b c d", 'r57') self.assertEqual( r.tree.toStringTree(), 'foo' ) def testR59(self): r = self.parse("a b c fooze", 'r59') self.assertEqual( r.tree.toStringTree(), '(a fooze) (b fooze) (c fooze)' ) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t043synpred.g000066400000000000000000000001741324200532700177370ustar00rootroot00000000000000grammar t043synpred; options { language = Python3; } a: ((s+ P)=> s+ b)? E; b: P 'foo'; s: S; S: ' '; P: '+'; E: '>'; python3-antlr3-3.5.2/tests/t043synpred.py000066400000000000000000000015431324200532700201420ustar00rootroot00000000000000import antlr3 import testbase import unittest class t043synpred(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def lexerClass(self, base): class TLexer(base): def recover(self, input, re): # no error recovery yet, just crash! 
raise return TLexer def parserClass(self, base): class TParser(base): def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def testValid1(self): cStream = antlr3.StringStream(' +foo>') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) events = parser.a() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t044trace.g000066400000000000000000000004701324200532700173510ustar00rootroot00000000000000grammar t044trace; options { language = Python3; } @init { self._stack = None } a: '<' ((INT '+')=>b|c) '>'; b: c ('+' c)*; c: INT { if self._stack is None: self._stack = self.getRuleInvocationStack() } ; INT: ('0'..'9')+; WS: (' ' | '\n' | '\t')+ {$channel = HIDDEN;}; python3-antlr3-3.5.2/tests/t044trace.py000066400000000000000000000046551324200532700175640ustar00rootroot00000000000000import antlr3 import testbase import unittest class T(testbase.ANTLRTest): def setUp(self): self.compileGrammar(options='-trace') def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.traces = [] def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.traces = [] def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! raise def getRuleInvocationStack(self): return self._getRuleInvocationStack(base.__module__) return TParser def testTrace(self): cStream = antlr3.StringStream('< 1 + 2 + 3 >') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.a() self.assertEqual( lexer.traces, [ '>T__7', 'WS', 'INT', 'WS', 'T__6', 'WS', 'INT', 'WS', 'T__6', 'WS', 'INT', 'WS', 'T__8', 'a', '>synpred1_t044trace_fragment', 'b', '>c', 'c', 'c', '') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.a() self.assertEqual(parser._stack, ['a', 'b', 'c']) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t045dfabug.g000066400000000000000000000007131324200532700175040ustar00rootroot00000000000000grammar t045dfabug; options { language = Python3; output = AST; } // this rule used to generate an infinite loop in DFA.predict r options { backtrack=true; } : (modifier+ INT)=> modifier+ expression | modifier+ statement ; expression : INT '+' INT ; statement : 'fooze' | 'fooze2' ; modifier : 'public' | 'private' ; ID : 'a'..'z' + ; INT : '0'..'9' +; WS: (' ' | '\n' | '\t')+ {$channel = HIDDEN;}; python3-antlr3-3.5.2/tests/t045dfabug.py000066400000000000000000000006311324200532700177050ustar00rootroot00000000000000import unittest import textwrap import antlr3 import testbase class T(testbase.ANTLRTest): def testbug(self): self.compileGrammar() cStream = antlr3.StringStream("public fooze") lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.r() if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t046rewrite.g000066400000000000000000000015011324200532700177320ustar00rootroot00000000000000grammar t046rewrite; 
options { language=Python3; } program @init { start = self.input.LT(1) } : method+ { self.input.insertBefore(start,"public class Wrapper {\n") self.input.insertAfter($method.stop, "\n}\n") } ; method : m='method' ID '(' ')' body {self.input.replace($m, "public void");} ; body scope { decls } @init { $body::decls = set() } : lcurly='{' stat* '}' { for it in $body::decls: self.input.insertAfter($lcurly, "\nint "+it+";") } ; stat: ID '=' expr ';' {$body::decls.add($ID.text);} ; expr: mul ('+' mul)* ; mul : atom ('*' atom)* ; atom: ID | INT ; ID : ('a'..'z'|'A'..'Z')+ ; INT : ('0'..'9')+ ; WS : (' '|'\t'|'\n')+ {$channel=HIDDEN;} ; python3-antlr3-3.5.2/tests/t046rewrite.py000066400000000000000000000017501324200532700201420ustar00rootroot00000000000000import unittest import textwrap import antlr3 import testbase class T(testbase.ANTLRTest): def testRewrite(self): self.compileGrammar() input = textwrap.dedent( '''\ method foo() { i = 3; k = i; i = k*4; } method bar() { j = i*2; } ''') cStream = antlr3.StringStream(input) lexer = self.getLexer(cStream) tStream = antlr3.TokenRewriteStream(lexer) parser = self.getParser(tStream) parser.program() expectedOutput = textwrap.dedent('''\ public class Wrapper { public void foo() { int k; int i; i = 3; k = i; i = k*4; } public void bar() { int j; j = i*2; } } ''') self.assertEqual(str(tStream), expectedOutput) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t047treeparser.g000066400000000000000000000032231324200532700204310ustar00rootroot00000000000000grammar t047treeparser; options { language=Python3; output=AST; } tokens { VAR_DEF; ARG_DEF; FUNC_HDR; FUNC_DECL; FUNC_DEF; BLOCK; } program : declaration+ ; declaration : variable | functionHeader ';' -> ^(FUNC_DECL functionHeader) | functionHeader block -> ^(FUNC_DEF functionHeader block) ; variable : type declarator ';' -> ^(VAR_DEF type declarator) ; declarator : ID ; functionHeader : type ID '(' ( formalParameter ( ',' formalParameter )* )? ')' -> ^(FUNC_HDR type ID formalParameter+) ; formalParameter : type declarator -> ^(ARG_DEF type declarator) ; type : 'int' | 'char' | 'void' | ID ; block : lc='{' variable* stat* '}' -> ^(BLOCK[$lc,"BLOCK"] variable* stat*) ; stat: forStat | expr ';'! | block | assignStat ';'! | ';'! ; forStat : 'for' '(' start=assignStat ';' expr ';' next=assignStat ')' block -> ^('for' $start expr $next block) ; assignStat : ID EQ expr -> ^(EQ ID expr) ; expr: condExpr ; condExpr : aexpr ( ('=='^ | '<'^) aexpr )? ; aexpr : atom ( '+'^ atom )* ; atom : ID | INT | '(' expr ')' -> expr ; FOR : 'for' ; INT_TYPE : 'int' ; CHAR: 'char'; VOID: 'void'; ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ; INT : ('0'..'9')+ ; EQ : '=' ; EQEQ : '==' ; LT : '<' ; PLUS : '+' ; WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel=HIDDEN } ; python3-antlr3-3.5.2/tests/t047treeparser.py000066400000000000000000000103471324200532700206400ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase class T(testbase.ANTLRTest): def walkerClass(self, base): class TWalker(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.traces = [] def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! 
raise return TWalker def setUp(self): self.compileGrammar() self.compileGrammar('t047treeparserWalker.g', options='-trace') def testWalker(self): input = textwrap.dedent( '''\ char c; int x; void bar(int x); int foo(int y, char d) { int i; for (i=0; i<3; i=i+1) { x=3; y=5; } } ''') cStream = antlr3.StringStream(input) lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) r = parser.program() self.assertEqual( r.tree.toStringTree(), "(VAR_DEF char c) (VAR_DEF int x) (FUNC_DECL (FUNC_HDR void bar (ARG_DEF int x))) (FUNC_DEF (FUNC_HDR int foo (ARG_DEF int y) (ARG_DEF char d)) (BLOCK (VAR_DEF int i) (for (= i 0) (< i 3) (= i (+ i 1)) (BLOCK (= x 3) (= y 5)))))" ) nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = self.getWalker(nodes) walker.program() # FIXME: need to crosscheck with Java target (compile walker with # -trace option), if this is the real list. For now I'm happy that # it does not crash ;) self.assertEqual( walker.traces, [ '>program', '>declaration', '>variable', '>type', 'declarator', 'declaration', '>variable', '>type', 'declarator', 'declaration', '>functionHeader', '>type', 'formalParameter', '>type', 'declarator', 'declaration', '>functionHeader', '>type', 'formalParameter', '>type', 'declarator', 'formalParameter', '>type', 'declarator', 'block', '>variable', '>type', 'declarator', 'stat', '>forStat', '>expr', '>expr', '>atom', 'expr', '>expr', '>atom', 'expr', '>atom', 'expr', '>expr', '>expr', '>atom', 'expr', '>atom', 'block', '>stat', '>expr', '>expr', '>atom', 'stat', '>expr', '>expr', '>atom', ' within boundaries of ' r'previous '), tokens.toString) def testInsertThenReplaceSameIndex(self): tokens = self._parse("abc") tokens.insertBefore(0, "0") tokens.replace(0, "x") # supercedes insert at 0 result = tokens.toString() expecting = "0xbc" self.assertEqual(result, expecting) def test2InsertMiddleIndex(self): tokens = self._parse("abc") tokens.insertBefore(1, "x") tokens.insertBefore(1, "y") result = tokens.toString() expecting = "ayxbc" self.assertEqual(result, expecting) def test2InsertThenReplaceIndex0(self): tokens = self._parse("abc") tokens.insertBefore(0, "x") tokens.insertBefore(0, "y") tokens.replace(0, "z") result = tokens.toString() expecting = "yxzbc" self.assertEqual(result, expecting) def testReplaceThenInsertBeforeLastIndex(self): tokens = self._parse("abc") tokens.replace(2, "x") tokens.insertBefore(2, "y") result = tokens.toString() expecting = "abyx" self.assertEqual(result, expecting) def testInsertThenReplaceLastIndex(self): tokens = self._parse("abc") tokens.insertBefore(2, "y") tokens.replace(2, "x") result = tokens.toString() expecting = "abyx" self.assertEqual(result, expecting) def testReplaceThenInsertAfterLastIndex(self): tokens = self._parse("abc") tokens.replace(2, "x") tokens.insertAfter(2, "y") result = tokens.toString() expecting = "abxy" self.assertEqual(result, expecting) def testReplaceRangeThenInsertAtLeftEdge(self): tokens = self._parse("abcccba") tokens.replace(2, 4, "x") tokens.insertBefore(2, "y") result = tokens.toString() expecting = "abyxba" self.assertEqual(result, expecting) def testReplaceRangeThenInsertAtRightEdge(self): tokens = self._parse("abcccba") tokens.replace(2, 4, "x") tokens.insertBefore(4, "y") # no effect; within range of a replace self.assertRaisesRegex( ValueError, (r'insert op within boundaries of ' r'previous '), tokens.toString) def testReplaceRangeThenInsertAfterRightEdge(self): tokens = self._parse("abcccba") 
tokens.replace(2, 4, "x") tokens.insertAfter(4, "y") result = tokens.toString() expecting = "abxyba" self.assertEqual(result, expecting) def testReplaceAll(self): tokens = self._parse("abcccba") tokens.replace(0, 6, "x") result = tokens.toString() expecting = "x" self.assertEqual(result, expecting) def testReplaceSubsetThenFetch(self): tokens = self._parse("abcccba") tokens.replace(2, 4, "xyz") result = tokens.toString(0, 6) expecting = "abxyzba" self.assertEqual(result, expecting) def testReplaceThenReplaceSuperset(self): tokens = self._parse("abcccba") tokens.replace(2, 4, "xyz") tokens.replace(3, 5, "foo") # overlaps, error self.assertRaisesRegex( ValueError, (r'replace op boundaries of overlap ' r'with previous '), tokens.toString) def testReplaceThenReplaceLowerIndexedSuperset(self): tokens = self._parse("abcccba") tokens.replace(2, 4, "xyz") tokens.replace(1, 3, "foo") # overlap, error self.assertRaisesRegex( ValueError, (r'replace op boundaries of overlap ' r'with previous '), tokens.toString) def testReplaceSingleMiddleThenOverlappingSuperset(self): tokens = self._parse("abcba") tokens.replace(2, 2, "xyz") tokens.replace(0, 3, "foo") result = tokens.toString() expecting = "fooa" self.assertEqual(result, expecting) def testCombineInserts(self): tokens = self._parse("abc") tokens.insertBefore(0, "x") tokens.insertBefore(0, "y") result = tokens.toString() expecting = "yxabc" self.assertEqual(expecting, result) def testCombine3Inserts(self): tokens = self._parse("abc") tokens.insertBefore(1, "x") tokens.insertBefore(0, "y") tokens.insertBefore(1, "z") result = tokens.toString() expecting = "yazxbc" self.assertEqual(expecting, result) def testCombineInsertOnLeftWithReplace(self): tokens = self._parse("abc") tokens.replace(0, 2, "foo") tokens.insertBefore(0, "z") # combine with left edge of rewrite result = tokens.toString() expecting = "zfoo" self.assertEqual(expecting, result) def testCombineInsertOnLeftWithDelete(self): tokens = self._parse("abc") tokens.delete(0, 2) tokens.insertBefore(0, "z") # combine with left edge of rewrite result = tokens.toString() expecting = "z" # make sure combo is not znull self.assertEqual(expecting, result) def testDisjointInserts(self): tokens = self._parse("abc") tokens.insertBefore(1, "x") tokens.insertBefore(2, "y") tokens.insertBefore(0, "z") result = tokens.toString() expecting = "zaxbyc" self.assertEqual(expecting, result) def testOverlappingReplace(self): tokens = self._parse("abcc") tokens.replace(1, 2, "foo") tokens.replace(0, 3, "bar") # wipes prior nested replace result = tokens.toString() expecting = "bar" self.assertEqual(expecting, result) def testOverlappingReplace2(self): tokens = self._parse("abcc") tokens.replace(0, 3, "bar") tokens.replace(1, 2, "foo") # cannot split earlier replace self.assertRaisesRegex( ValueError, (r'replace op boundaries of overlap ' r'with previous '), tokens.toString) def testOverlappingReplace3(self): tokens = self._parse("abcc") tokens.replace(1, 2, "foo") tokens.replace(0, 2, "bar") # wipes prior nested replace result = tokens.toString() expecting = "barc" self.assertEqual(expecting, result) def testOverlappingReplace4(self): tokens = self._parse("abcc") tokens.replace(1, 2, "foo") tokens.replace(1, 3, "bar") # wipes prior nested replace result = tokens.toString() expecting = "abar" self.assertEqual(expecting, result) def testDropIdenticalReplace(self): tokens = self._parse("abcc") tokens.replace(1, 2, "foo") tokens.replace(1, 2, "foo") # drop previous, identical result = tokens.toString() expecting = 
"afooc" self.assertEqual(expecting, result) def testDropPrevCoveredInsert(self): tokens = self._parse("abc") tokens.insertBefore(1, "foo") tokens.replace(1, 2, "foo") # kill prev insert result = tokens.toString() expecting = "afoofoo" self.assertEqual(expecting, result) def testLeaveAloneDisjointInsert(self): tokens = self._parse("abcc") tokens.insertBefore(1, "x") tokens.replace(2, 3, "foo") result = tokens.toString() expecting = "axbfoo" self.assertEqual(expecting, result) def testLeaveAloneDisjointInsert2(self): tokens = self._parse("abcc") tokens.replace(2, 3, "foo") tokens.insertBefore(1, "x") result = tokens.toString() expecting = "axbfoo" self.assertEqual(expecting, result) def testInsertBeforeTokenThenDeleteThatToken(self): tokens = self._parse("abc") tokens.insertBefore(2, "y") tokens.delete(2) result = tokens.toString() expecting = "aby" self.assertEqual(expecting, result) class T2(testbase.ANTLRTest): def setUp(self): self.compileGrammar('t048rewrite2.g') def _parse(self, input): cStream = antlr3.StringStream(input) lexer = self.getLexer(cStream) tStream = antlr3.TokenRewriteStream(lexer) tStream.fillBuffer() return tStream def testToStringStartStop(self): # Tokens: 0123456789 # Input: x = 3 * 0 tokens = self._parse("x = 3 * 0;") tokens.replace(4, 8, "0") # replace 3 * 0 with 0 result = tokens.toOriginalString() expecting = "x = 3 * 0;" self.assertEqual(expecting, result) result = tokens.toString() expecting = "x = 0;" self.assertEqual(expecting, result) result = tokens.toString(0, 9) expecting = "x = 0;" self.assertEqual(expecting, result) result = tokens.toString(4, 8) expecting = "0" self.assertEqual(expecting, result) def testToStringStartStop2(self): # Tokens: 012345678901234567 # Input: x = 3 * 0 + 2 * 0 tokens = self._parse("x = 3 * 0 + 2 * 0;") result = tokens.toOriginalString() expecting = "x = 3 * 0 + 2 * 0;" self.assertEqual(expecting, result) tokens.replace(4, 8, "0") # replace 3 * 0 with 0 result = tokens.toString() expecting = "x = 0 + 2 * 0;" self.assertEqual(expecting, result) result = tokens.toString(0, 17) expecting = "x = 0 + 2 * 0;" self.assertEqual(expecting, result) result = tokens.toString(4, 8) expecting = "0" self.assertEqual(expecting, result) result = tokens.toString(0, 8) expecting = "x = 0" self.assertEqual(expecting, result) result = tokens.toString(12, 16) expecting = "2 * 0" self.assertEqual(expecting, result) tokens.insertAfter(17, "// comment") result = tokens.toString(12, 18) expecting = "2 * 0;// comment" self.assertEqual(expecting, result) result = tokens.toString(0, 8) # try again after insert at end expecting = "x = 0" self.assertEqual(expecting, result) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t048rewrite2.g000066400000000000000000000002341324200532700200200ustar00rootroot00000000000000lexer grammar t048rewrite2; options { language=Python3; } ID : 'a'..'z'+; INT : '0'..'9'+; SEMI : ';'; PLUS : '+'; MUL : '*'; ASSIGN : '='; WS : ' '+; python3-antlr3-3.5.2/tests/t049treeparser.py000066400000000000000000000310471324200532700206420ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase class T(testbase.ANTLRTest): def walkerClass(self, base): class TWalker(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, 
re): # no error recovery yet, just crash! raise return TWalker def execTreeParser(self, grammar, grammarEntry, treeGrammar, treeEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) walkerCls = self.compileInlineGrammar(treeGrammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = walkerCls(nodes) getattr(walker, treeEntry)() return walker._output def testFlatList(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ID INT {self.capture("{}, {}".format($ID, $INT))} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("abc, 34", found) def testSimpleTree(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ^(ID INT) {self.capture(str($ID)+", "+str($INT))} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("abc, 34", found) def testFlatVsTreeDecision(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : b c ; b : ID INT -> ^(ID INT); c : ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : b b ; b : ID INT {self.capture(str($ID)+" "+str($INT)+'\n')} | ^(ID INT) {self.capture("^("+str($ID)+" "+str($INT)+')');} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 1 b 2" ) self.assertEqual("^(a 1)b 2\n", found) def testFlatVsTreeDecision2(self): grammar = textwrap.dedent( r"""grammar T; options { language=Python3; output=AST; } a : b c ; b : ID INT+ -> ^(ID INT+); c : ID INT+; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : b b ; b : ID INT+ {self.capture(str($ID)+" "+str($INT)+"\n")} | ^(x=ID (y=INT)+) {self.capture("^("+str($x)+' '+str($y)+')')} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 1 2 3 b 4 5" ) self.assertEqual("^(a 3)b 5\n", found) def testCyclicDFALookahead(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : ID INT+ PERIOD; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; PERIOD : '.' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ID INT+ PERIOD {self.capture("alt 1")} | ID INT+ SEMI {self.capture("alt 2")} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 1 2 3." ) self.assertEqual("alt 1", found) def testNullableChildList(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : ID INT? 
-> ^(ID INT?); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ^(ID INT?) {self.capture(str($ID))} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc" ) self.assertEqual("abc", found) def testNullableChildList2(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : ID INT? SEMI -> ^(ID INT?) SEMI ; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ^(ID INT?) SEMI {self.capture(str($ID))} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc;" ) self.assertEqual("abc", found) def testNullableChildList3(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : x=ID INT? (y=ID)? SEMI -> ^($x INT? $y?) SEMI ; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a : ^(ID INT? b) SEMI {self.capture(str($ID)+", "+str($b.text))} ; b : ID? ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc def;" ) self.assertEqual("abc, def", found) def testActionsAfterRoot(self): grammar = textwrap.dedent( r'''grammar T; options { language=Python3; output=AST; } a : x=ID INT? SEMI -> ^($x INT?) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r'''tree grammar TP; options { language=Python3; ASTLabelType=CommonTree; } a @init {x=0} : ^(ID {x=1} {x=2} INT?) {self.capture(str($ID)+", "+str(x))} ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc;" ) self.assertEqual("abc, 2", found) def testWildcardLookahead(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID '+'^ INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; PERIOD : '.' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; tokenVocab=T; ASTLabelType=CommonTree;} a : ^('+' . INT) { self.capture("alt 1") } ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a + 2") self.assertEqual("alt 1", found) def testWildcardLookahead2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID '+'^ INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; PERIOD : '.' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; tokenVocab=T; ASTLabelType=CommonTree;} a : ^('+' . INT) { self.capture("alt 1") } | ^('+' . .) { self.capture("alt 2") } ; ''') # AMBIG upon '+' DOWN INT UP etc.. but so what. found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a + 2") self.assertEqual("alt 1", found) def testWildcardLookahead3(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID '+'^ INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; PERIOD : '.' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; tokenVocab=T; ASTLabelType=CommonTree;} a : ^('+' ID INT) { self.capture("alt 1") } | ^('+' . .) { self.capture("alt 2") } ; ''') # AMBIG upon '+' DOWN INT UP etc.. but so what. 
found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a + 2") self.assertEqual("alt 1", found) def testWildcardPlusLookahead(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID '+'^ INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; SEMI : ';' ; PERIOD : '.' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; tokenVocab=T; ASTLabelType=CommonTree;} a : ^('+' INT INT ) { self.capture("alt 1") } | ^('+' .+) { self.capture("alt 2") } ; ''') # AMBIG upon '+' DOWN INT UP etc.. but so what. found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a + 2") self.assertEqual("alt 2", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t050decorate.g000066400000000000000000000007661324200532700200460ustar00rootroot00000000000000grammar t050decorate; options { language = Python3; } @header { def logme(func): def decorated(self, *args, **kwargs): self.events.append('before') try: return func(self, *args, **kwargs) finally: self.events.append('after') return decorated } @parser::init { self.events = [] } document @decorate { @logme } : IDENTIFIER ; IDENTIFIER: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*; python3-antlr3-3.5.2/tests/t050decorate.py000066400000000000000000000007701324200532700202430ustar00rootroot00000000000000import antlr3 import testbase import unittest class t013parser(testbase.ANTLRTest): def setUp(self): self.compileGrammar() def testValid(self): cStream = antlr3.StringStream('foobar') lexer = self.getLexer(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = self.getParser(tStream) parser.document() self.assertEqual(parser.events, ['before', 'after']) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t051treeRewriteAST.py000066400000000000000000001162241324200532700213310ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase class T(testbase.ANTLRTest): def walkerClass(self, base): class TWalker(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.buf = "" def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! 
raise return TWalker def execTreeParser(self, grammar, grammarEntry, treeGrammar, treeEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) walkerCls = self.compileInlineGrammar(treeGrammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = walkerCls(nodes) r = getattr(walker, treeEntry)() if r.tree: return r.tree.toStringTree() return "" def testFlatList(self): grammar = textwrap.dedent( r''' grammar T1; options { language=Python3; output=AST; } a : ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP1; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T1; } a : ID INT -> INT ID; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("34 abc", found) def testSimpleTree(self): grammar = textwrap.dedent( r''' grammar T2; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP2; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T2; } a : ^(ID INT) -> ^(INT ID); ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("(34 abc)", found) def testCombinedRewriteAndAuto(self): grammar = textwrap.dedent( r''' grammar T3; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT) | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP3; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T3; } a : ^(ID INT) -> ^(INT ID) | INT; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("(34 abc)", found) found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "34" ) self.assertEqual("34", found) def testAvoidDup(self): grammar = textwrap.dedent( r''' grammar T4; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP4; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T4; } a : ID -> ^(ID ID); ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc" ) self.assertEqual("(abc abc)", found) def testLoop(self): grammar = textwrap.dedent( r''' grammar T5; options { language=Python3; output=AST; } a : ID+ INT+ -> (^(ID INT))+ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP5; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T5; } a : (^(ID INT))+ -> INT+ ID+; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a b c 3 4 5" ) self.assertEqual("3 4 5 a b c", found) def testAutoDup(self): grammar = textwrap.dedent( r''' grammar T6; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP6; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T6; } a : ID; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc" ) self.assertEqual("abc", found) def 
testAutoDupRule(self): grammar = textwrap.dedent( r''' grammar T7; options { language=Python3; output=AST; } a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP7; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T7; } a : b c ; b : ID ; c : INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 1" ) self.assertEqual("a 1", found) def testAutoWildcard(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3;output=AST; ASTLabelType=CommonTree; tokenVocab=T;} a : ID . ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34") self.assertEqual("abc 34", found) def testAutoWildcard2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3;output=AST; ASTLabelType=CommonTree; tokenVocab=T;} a : ^(ID .) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34") self.assertEqual("(abc 34)", found) def testAutoWildcardWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3;output=AST; ASTLabelType=CommonTree; tokenVocab=T;} a : ID c=. ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34") self.assertEqual("abc 34", found) def testAutoWildcardWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3;output=AST; ASTLabelType=CommonTree; tokenVocab=T;} a : ID c+=. 
; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34") self.assertEqual("abc 34", found) def testAutoDupMultiple(self): grammar = textwrap.dedent( r''' grammar T8; options { language=Python3; output=AST; } a : ID ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP8; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T8; } a : ID ID INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a b 3" ) self.assertEqual("a b 3", found) def testAutoDupTree(self): grammar = textwrap.dedent( r''' grammar T9; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP9; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T9; } a : ^(ID INT) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) self.assertEqual("(a 3)", found) def testAutoDupTreeWithLabels(self): grammar = textwrap.dedent( r''' grammar T10; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP10; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T10; } a : ^(x=ID y=INT) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) self.assertEqual("(a 3)", found) def testAutoDupTreeWithListLabels(self): grammar = textwrap.dedent( r''' grammar T11; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP11; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T11; } a : ^(x+=ID y+=INT) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) self.assertEqual("(a 3)", found) def testAutoDupTreeWithRuleRoot(self): grammar = textwrap.dedent( r''' grammar T12; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP12; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T12; } a : ^(b INT) ; b : ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) self.assertEqual("(a 3)", found) def testAutoDupTreeWithRuleRootAndLabels(self): grammar = textwrap.dedent( r''' grammar T13; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP13; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T13; } a : ^(x=b INT) ; b : ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) self.assertEqual("(a 3)", found) def testAutoDupTreeWithRuleRootAndListLabels(self): grammar = textwrap.dedent( r''' grammar T14; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP14; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T14; } a : ^(x+=b y+=c) ; b : ID ; c : INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a 3" ) 
self.assertEqual("(a 3)", found) def testAutoDupNestedTree(self): grammar = textwrap.dedent( r''' grammar T15; options { language=Python3; output=AST; } a : x=ID y=ID INT -> ^($x ^($y INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP15; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T15; } a : ^(ID ^(ID INT)) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "a b 3" ) self.assertEqual("(a (b 3))", found) def testDelete(self): grammar = textwrap.dedent( r''' grammar T16; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP16; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T16; } a : ID -> ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc" ) self.assertEqual("", found) def testSetMatchNoRewrite(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } a : b INT; b : ID | INT; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("abc 34", found) def testSetOptionalMatchNoRewrite(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } a : (ID|INT)? 
INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34") self.assertEqual("abc 34", found) def testSetMatchNoRewriteLevel2(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : x=ID INT -> ^($x INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } a : ^(ID (ID | INT) ) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("(abc 34)", found) def testSetMatchNoRewriteLevel2Root(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : x=ID INT -> ^($x INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } a : ^((ID | INT) INT) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("(abc 34)", found) ## REWRITE MODE def testRewriteModeCombinedRewriteAndAuto(self): grammar = textwrap.dedent( r''' grammar T17; options { language=Python3; output=AST; } a : ID INT -> ^(ID INT) | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP17; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T17; rewrite=true; } a : ^(ID INT) -> ^(ID["ick"] INT) | INT // leaves it alone, returning $a.start ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc 34" ) self.assertEqual("(ick 34)", found) found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "34" ) self.assertEqual("34", found) def testRewriteModeFlatTree(self): grammar = textwrap.dedent( r''' grammar T18; options { language=Python3; output=AST; } a : ID INT -> ID INT | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP18; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T18; rewrite=true; } s : ID a ; a : INT -> INT["1"] ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34" ) self.assertEqual("abc 1", found) def testRewriteModeChainRuleFlatTree(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID INT -> ID INT | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : a ; a : b ; b : ID INT -> INT ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("34 abc", found) def testRewriteModeChainRuleTree(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID INT -> ^(ID INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : a ; a : b ; // a.tree must become b.tree b : ^(ID INT) -> INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("34", found) def testRewriteModeChainRuleTree2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID INT -> 
^(ID INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} tokens { X; } s : a* b ; // only b contributes to tree, but it's after a*; s.tree = b.tree a : X ; b : ^(ID INT) -> INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("34", found) def testRewriteModeChainRuleTree3(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : 'boo' ID INT -> 'boo' ^(ID INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} tokens { X; } s : 'boo' a* b ; // don't reset s.tree to b.tree due to 'boo' a : X ; b : ^(ID INT) -> INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "boo abc 34") self.assertEqual("boo 34", found) def testRewriteModeChainRuleTree4(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : 'boo' ID INT -> ^('boo' ^(ID INT)) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} tokens { X; } s : ^('boo' a* b) ; // don't reset s.tree to b.tree due to 'boo' a : X ; b : ^(ID INT) -> INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "boo abc 34") self.assertEqual("(boo 34)", found) def testRewriteModeChainRuleTree5(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : 'boo' ID INT -> ^('boo' ^(ID INT)) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} tokens { X; } s : ^(a b) ; // s.tree is a.tree a : 'boo' ; b : ^(ID INT) -> INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "boo abc 34") self.assertEqual("(boo 34)", found) def testRewriteOfRuleRef(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT -> ID INT | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : a -> a ; a : ID INT -> ID INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("abc 34", found) def testRewriteOfRuleRefRoot(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT INT -> ^(INT ^(ID INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(a ^(ID INT)) -> a ; a : INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 12 34") # emits whole tree when you ref the root since I can't know whether # you want the children or not. You might be returning a whole new # tree. Hmm...still seems weird. oh well. 
self.assertEqual("(12 (abc 34))", found) def testRewriteOfRuleRefRootLabeled(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT INT -> ^(INT ^(ID INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(label=a ^(ID INT)) -> a ; a : INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 12 34") # emits whole tree when you ref the root since I can't know whether # you want the children or not. You might be returning a whole new # tree. Hmm...still seems weird. oh well. self.assertEqual("(12 (abc 34))", found) def testRewriteOfRuleRefRootListLabeled(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT INT -> ^(INT ^(ID INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(label+=a ^(ID INT)) -> a ; a : INT ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 12 34") # emits whole tree when you ref the root since I can't know whether # you want the children or not. You might be returning a whole new # tree. Hmm...still seems weird. oh well. self.assertEqual("(12 (abc 34))", found) def testRewriteOfRuleRefChild(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT -> ^(ID ^(INT INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(ID a) -> a ; a : ^(INT INT) ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("(34 34)", found) def testRewriteOfRuleRefLabel(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT -> ^(ID ^(INT INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(ID label=a) -> a ; a : ^(INT INT) ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("(34 34)", found) def testRewriteOfRuleRefListLabel(self): grammar = textwrap.dedent( r""" grammar T; options {language=Python3; output=AST;} a : ID INT -> ^(ID ^(INT INT)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; """) treeGrammar = textwrap.dedent( r""" tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(ID label+=a) -> a ; a : ^(INT INT) ; """) found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("(34 34)", found) def testRewriteModeWithPredicatedRewrites(self): grammar = textwrap.dedent( r''' grammar T19; options { language=Python3; output=AST; } a : ID INT -> ^(ID["root"] ^(ID INT)) | INT -> ^(ID["root"] INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP19; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T19; rewrite=true; } s : ^(ID a) { self.buf += $s.start.toStringTree() 
}; a : ^(ID INT) -> {True}? ^(ID["ick"] INT) -> INT ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34" ) self.assertEqual("(root (ick 34))", found) def testWildcardSingleNode(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID INT -> ^(ID["root"] INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } s : ^(ID c=.) -> $c ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34" ) self.assertEqual("34", found) def testWildcardUnlabeledSingleNode(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID INT -> ^(ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T;} s : ^(ID .) -> ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 34") self.assertEqual("abc", found) def testWildcardGrabsSubtree(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID x=INT y=INT z=INT -> ^(ID[\"root\"] ^($x $y $z)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T;} s : ^(ID c=.) -> $c ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 1 2 3") self.assertEqual("(1 2 3)", found) def testWildcardGrabsSubtree2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : ID x=INT y=INT z=INT -> ID ^($x $y $z); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T;} s : ID c=. 
-> $c ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "abc 1 2 3") self.assertEqual("(1 2 3)", found) def testWildcardListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST;} a : INT INT INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T;} s : (c+=.)+ -> $c+ ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "1 2 3") self.assertEqual("1 2 3", found) def testWildcardListLabel2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3; output=AST; ASTLabelType=CommonTree;} a : x=INT y=INT z=INT -> ^($x ^($y $z) ^($y $z)); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options {language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; rewrite=true;} s : ^(INT (c+=.)+) -> $c+ ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 's', "1 2 3") self.assertEqual("(2 3) (2 3)", found) def testRuleResultAsRoot(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID '=' INT -> ^('=' ID INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; COLON : ':' ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; rewrite=true; ASTLabelType=CommonTree; tokenVocab=T; } a : ^(eq e1=ID e2=.) -> ^(eq $e2 $e1) ; eq : '=' | ':' {pass} ; // bug in set match, doesn't add to tree!! booh. force nonset. ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', "abc = 34") self.assertEqual("(= 34 abc)", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t052import.py000066400000000000000000000234101324200532700177650ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase import sys class T(testbase.ANTLRTest): def setUp(self): self.oldPath = sys.path[:] sys.path.insert(0, self.baseDir) def tearDown(self): sys.path = self.oldPath def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input): # no error recovery yet, just crash! 
raise return TLexer def execParser(self, grammar, grammarEntry, slaves, input): for slave in slaves: parserName = self.writeInlineGrammar(slave)[0] # slave parsers are imported as normal python modules # to force reloading current version, purge module from sys.modules if parserName + 'Parser' in sys.modules: del sys.modules[parserName + 'Parser'] lexerCls, parserCls = self.compileInlineGrammar(grammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) getattr(parser, grammarEntry)() return parser._output def execLexer(self, grammar, slaves, input): for slave in slaves: parserName = self.writeInlineGrammar(slave)[0] # slave parsers are imported as normal python modules # to force reloading current version, purge module from sys.modules if parserName + 'Parser' in sys.modules: del sys.modules[parserName + 'Parser'] lexerCls = self.compileInlineGrammar(grammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) while True: token = lexer.nextToken() if token is None or token.type == antlr3.EOF: break lexer._output += token.text return lexer._output def testDelegatorInvokesDelegateRule(self): slave = textwrap.dedent( r''' parser grammar S1; options { language=Python3; } @members { def capture(self, t): self.gM1.capture(t) } a : B { self.capture("S.a") } ; ''') master = textwrap.dedent( r''' grammar M1; options { language=Python3; } import S1; s : a ; B : 'b' ; // defines B from inherited token space WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 's', slaves=[slave], input="b" ) self.assertEqual("S.a", found) def testDelegatorInvokesDelegateRuleWithArgs(self): slave = textwrap.dedent( r''' parser grammar S2; options { language=Python3; } @members { def capture(self, t): self.gM2.capture(t) } a[x] returns [y] : B {self.capture("S.a"); $y="1000";} ; ''') master = textwrap.dedent( r''' grammar M2; options { language=Python3; } import S2; s : label=a[3] {self.capture($label.y);} ; B : 'b' ; // defines B from inherited token space WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 's', slaves=[slave], input="b" ) self.assertEqual("S.a1000", found) def testDelegatorAccessesDelegateMembers(self): slave = textwrap.dedent( r''' parser grammar S3; options { language=Python3; } @members { def capture(self, t): self.gM3.capture(t) def foo(self): self.capture("foo") } a : B ; ''') master = textwrap.dedent( r''' grammar M3; // uses no rules from the import options { language=Python3; } import S3; s : 'b' {self.gS3.foo();} ; // gS is import pointer WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 's', slaves=[slave], input="b" ) self.assertEqual("foo", found) def testDelegatorInvokesFirstVersionOfDelegateRule(self): slave = textwrap.dedent( r''' parser grammar S4; options { language=Python3; } @members { def capture(self, t): self.gM4.capture(t) } a : b {self.capture("S.a");} ; b : B ; ''') slave2 = textwrap.dedent( r''' parser grammar T4; options { language=Python3; } @members { def capture(self, t): self.gM4.capture(t) } a : B {self.capture("T.a");} ; // hidden by S.a ''') master = textwrap.dedent( r''' grammar M4; options { language=Python3; } import S4,T4; s : a ; B : 'b' ; WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 's', slaves=[slave, slave2], input="b" ) self.assertEqual("S.a", found) def testDelegatesSeeSameTokenType(self): slave = textwrap.dedent( r''' parser grammar S5; // A, B, C token type order options { 
language=Python3; } tokens { A; B; C; } @members { def capture(self, t): self.gM5.capture(t) } x : A {self.capture("S.x ");} ; ''') slave2 = textwrap.dedent( r''' parser grammar T5; options { language=Python3; } tokens { C; B; A; } /// reverse order @members { def capture(self, t): self.gM5.capture(t) } y : A {self.capture("T.y");} ; ''') master = textwrap.dedent( r''' grammar M5; options { language=Python3; } import S5,T5; s : x y ; // matches AA, which should be "aa" B : 'b' ; // another order: B, A, C A : 'a' ; C : 'c' ; WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 's', slaves=[slave, slave2], input="aa" ) self.assertEqual("S.x T.y", found) def testDelegatorRuleOverridesDelegate(self): slave = textwrap.dedent( r''' parser grammar S6; options { language=Python3; } @members { def capture(self, t): self.gM6.capture(t) } a : b {self.capture("S.a");} ; b : B ; ''') master = textwrap.dedent( r''' grammar M6; options { language=Python3; } import S6; b : 'b'|'c' ; WS : (' '|'\n') {self.skip()} ; ''') found = self.execParser( master, 'a', slaves=[slave], input="c" ) self.assertEqual("S.a", found) # LEXER INHERITANCE def testLexerDelegatorInvokesDelegateRule(self): slave = textwrap.dedent( r''' lexer grammar S7; options { language=Python3; } @members { def capture(self, t): self.gM7.capture(t) } A : 'a' {self.capture("S.A ");} ; C : 'c' ; ''') master = textwrap.dedent( r''' lexer grammar M7; options { language=Python3; } import S7; B : 'b' ; WS : (' '|'\n') {self.skip()} ; ''') found = self.execLexer( master, slaves=[slave], input="abc" ) self.assertEqual("S.A abc", found) def testLexerDelegatorRuleOverridesDelegate(self): slave = textwrap.dedent( r''' lexer grammar S8; options { language=Python3; } @members { def capture(self, t): self.gM8.capture(t) } A : 'a' {self.capture("S.A")} ; ''') master = textwrap.dedent( r''' lexer grammar M8; options { language=Python3; } import S8; A : 'a' {self.capture("M.A ");} ; WS : (' '|'\n') {self.skip()} ; ''') found = self.execLexer( master, slaves=[slave], input="a" ) self.assertEqual("M.A a", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t053hetero.py000066400000000000000000000534731324200532700177560ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase import sys class T(testbase.ANTLRTest): def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! raise return TParser def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! 
raise return TLexer def execParser(self, grammar, grammarEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() if r: return r.tree.toStringTree() return "" def execTreeParser(self, grammar, grammarEntry, treeGrammar, treeEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) walkerCls = self.compileInlineGrammar(treeGrammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = walkerCls(nodes) r = getattr(walker, treeEntry)() if r: return r.tree.toStringTree() return "" # PARSERS -- AUTO AST def testToken(self): grammar = textwrap.dedent( r''' grammar T1; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testTokenCommonTree(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a") self.assertEqual("a", found) def testTokenWithQualifiedType(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } @members { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID ; // TParser.V is qualified name ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testNamedType(self): grammar = textwrap.dedent( r""" grammar $T; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID ; ID : 'a'..'z'+ ; WS : (' '|'\\n') {$channel=HIDDEN;} ; """) found = self.execParser(grammar, 'a', input="a") self.assertEqual("a", found) def testTokenWithLabel(self): grammar = textwrap.dedent( r''' grammar T2; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : x=ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testTokenWithListLabel(self): grammar = textwrap.dedent( r''' grammar T3; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : x+=ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testTokenRoot(self): grammar = textwrap.dedent( r''' grammar T4; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID^ ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testTokenRootWithListLabel(self): grammar = textwrap.dedent( r''' grammar T5; options { language=Python3; output=AST; } @header { class V(CommonTree): def 
toString(self): return self.token.text + "" __str__ = toString } a : x+=ID^ ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testString(self): grammar = textwrap.dedent( r''' grammar T6; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : 'begin' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="begin" ) self.assertEqual("begin", found) def testStringRoot(self): grammar = textwrap.dedent( r''' grammar T7; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : 'begin'^ ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="begin" ) self.assertEqual("begin", found) # PARSERS -- REWRITE AST def testRewriteToken(self): grammar = textwrap.dedent( r''' grammar T8; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID -> ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual("a", found) def testRewriteTokenWithArgs(self): grammar = textwrap.dedent( r''' grammar T9; options { language=Python3; output=AST; } @header { class V(CommonTree): def __init__(self, *args): if len(args) == 4: ttype = args[0] x = args[1] y = args[2] z = args[3] token = CommonToken(type=ttype, text="") elif len(args) == 3: ttype = args[0] token = args[1] x = args[2] y, z = 0, 0 else: raise TypeError("Invalid args {!r}".format(args)) super().__init__(token) self.x = x self.y = y self.z = z def toString(self): txt = "" if self.token: txt += self.token.text txt +=";{0.x}{0.y}{0.z}".format(self) return txt __str__ = toString } a : ID -> ID[42,19,30] ID[$ID,99]; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a" ) self.assertEqual(";421930 a;9900", found) def testRewriteTokenRoot(self): grammar = textwrap.dedent( r''' grammar T10; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID INT -> ^(ID INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a 2" ) self.assertEqual("(a 2)", found) def testRewriteString(self): grammar = textwrap.dedent( r''' grammar T11; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : 'begin' -> 'begin' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="begin" ) self.assertEqual("begin", found) def testRewriteStringRoot(self): grammar = textwrap.dedent( r''' grammar T12; options { language=Python3; output=AST; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : 'begin' INT -> ^('begin' INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="begin 2" ) self.assertEqual("(begin 2)", found) def testRewriteRuleResults(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } tokens {LIST;} @header { class V(CommonTree): def 
toString(self): return self.token.text + "" __str__ = toString class W(CommonTree): def __init__(self, tokenType, txt): super().__init__( CommonToken(type=tokenType, text=txt)) def toString(self): return self.token.text + "" __str__ = toString } a : id (',' id)* -> ^(LIST["LIST"] id+); id : ID -> ID; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="a,b,c") self.assertEqual("(LIST a b c)", found) def testCopySemanticsWithHetero(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } @header { class V(CommonTree): def dupNode(self): return V(self) def toString(self): return self.token.text + "" __str__ = toString } a : type ID (',' ID)* ';' -> ^(type ID)+; type : 'int' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\\n') {$channel=HIDDEN;} ; ''') found = self.execParser( grammar, 'a', input="int a, b, c;") self.assertEqual("(int a) (int b) (int c)", found) # TREE PARSERS -- REWRITE AST def testTreeParserRewriteFlatList(self): grammar = textwrap.dedent( r''' grammar T13; options { language=Python3; output=AST; } a : ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP13; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T13; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString class W(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID INT -> INT ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc 34" ) self.assertEqual("34 abc", found) def testTreeParserRewriteTree(self): grammar = textwrap.dedent( r''' grammar T14; options { language=Python3; output=AST; } a : ID INT; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP14; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T14; } @header { class V(CommonTree): def toString(self): return self.token.text + "" __str__ = toString class W(CommonTree): def toString(self): return self.token.text + "" __str__ = toString } a : ID INT -> ^(INT ID) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc 34" ) self.assertEqual("(34 abc)", found) def testTreeParserRewriteImaginary(self): grammar = textwrap.dedent( r''' grammar T15; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP15; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T15; } tokens { ROOT; } @header { class V(CommonTree): def __init__(self, tokenType): super().__init__(CommonToken(tokenType)) def toString(self): return tokenNames[self.token.type] + "" __str__ = toString } a : ID -> ROOT ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc" ) self.assertEqual("ROOT abc", found) def testTreeParserRewriteImaginaryWithArgs(self): grammar = textwrap.dedent( r''' grammar T16; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP16; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T16; } tokens { ROOT; } @header { class V(CommonTree): def __init__(self, tokenType, x): super().__init__(CommonToken(tokenType)) self.x = x def 
toString(self): return tokenNames[self.token.type] + ";" + str(self.x) __str__ = toString } a : ID -> ROOT[42] ID ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc" ) self.assertEqual("ROOT;42 abc", found) def testTreeParserRewriteImaginaryRoot(self): grammar = textwrap.dedent( r''' grammar T17; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP17; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T17; } tokens { ROOT; } @header { class V(CommonTree): def __init__(self, tokenType): super().__init__(CommonToken(tokenType)) def toString(self): return tokenNames[self.token.type] + "" __str__ = toString } a : ID -> ^(ROOT ID) ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc" ) self.assertEqual("(ROOT abc)", found) def testTreeParserRewriteImaginaryFromReal(self): grammar = textwrap.dedent( r''' grammar T18; options { language=Python3; output=AST; } a : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP18; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T18; } tokens { ROOT; } @header { class V(CommonTree): def __init__(self, tokenType, tree=None): if tree is None: super().__init__(CommonToken(tokenType)) else: super().__init__(tree) self.token.type = tokenType def toString(self): return tokenNames[self.token.type]+"@"+str(self.token.line) __str__ = toString } a : ID -> ROOT[$ID] ; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc" ) self.assertEqual("ROOT@1", found) def testTreeParserAutoHeteroAST(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ID ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN;} ; ''') treeGrammar = textwrap.dedent( r''' tree grammar TP; options { language=Python3; output=AST; ASTLabelType=CommonTree; tokenVocab=T; } tokens { ROOT; } @header { class V(CommonTree): def toString(self): return CommonTree.toString(self) + "" __str__ = toString } a : ID ';'; ''') found = self.execTreeParser( grammar, 'a', treeGrammar, 'a', input="abc;" ) self.assertEqual("abc ;", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t054main.py000066400000000000000000000170731324200532700174110ustar00rootroot00000000000000 import unittest import textwrap import antlr3 import antlr3.tree import testbase import sys from io import StringIO class T(testbase.ANTLRTest): def setUp(self): self.oldPath = sys.path[:] sys.path.insert(0, self.baseDir) def tearDown(self): sys.path = self.oldPath def testOverrideMain(self): grammar = textwrap.dedent( r"""lexer grammar T3; options { language = Python3; } @main { def main(argv): raise RuntimeError("no") } ID: ('a'..'z' | '\u00c0'..'\u00ff')+; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod = self.compileInlineGrammar(grammar, returnModule=True) self.assertRaises(RuntimeError, lexerMod.main, ['lexer.py']) def testLexerFromFile(self): input = "foo bar" inputPath = self.writeFile("input.txt", input) grammar = textwrap.dedent( r"""lexer grammar T1; options { language = Python3; } ID: 'a'..'z'+; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod = self.compileInlineGrammar(grammar, returnModule=True) lexerMod.main( ['lexer.py', inputPath], stdout=stdout ) 
self.assertEqual(len(stdout.getvalue().splitlines()), 3) def testLexerFromStdIO(self): input = "foo bar" grammar = textwrap.dedent( r"""lexer grammar T2; options { language = Python3; } ID: 'a'..'z'+; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod = self.compileInlineGrammar(grammar, returnModule=True) lexerMod.main( ['lexer.py'], stdin=StringIO(input), stdout=stdout ) self.assertEqual(len(stdout.getvalue().splitlines()), 3) def testLexerEncoding(self): input = "föö bär" grammar = textwrap.dedent( r"""lexer grammar T3; options { language = Python3; } ID: ('a'..'z' | '\u00c0'..'\u00ff')+; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod = self.compileInlineGrammar(grammar, returnModule=True) lexerMod.main( ['lexer.py'], stdin=StringIO(input), stdout=stdout ) self.assertEqual(len(stdout.getvalue().splitlines()), 3) def testCombined(self): input = "foo bar" grammar = textwrap.dedent( r"""grammar T4; options { language = Python3; } r returns [res]: (ID)+ EOF { $res = $text }; ID: 'a'..'z'+; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod, parserMod = self.compileInlineGrammar(grammar, returnModule=True) parserMod.main( ['combined.py', '--rule', 'r'], stdin=StringIO(input), stdout=stdout ) stdout = stdout.getvalue() self.assertEqual(len(stdout.splitlines()), 1, stdout) def testCombinedOutputAST(self): input = "foo + bar" grammar = textwrap.dedent( r"""grammar T5; options { language = Python3; output = AST; } r: ID OP^ ID EOF!; ID: 'a'..'z'+; OP: '+'; WS: ' '+ { $channel = HIDDEN }; """) stdout = StringIO() lexerMod, parserMod = self.compileInlineGrammar(grammar, returnModule=True) parserMod.main( ['combined.py', '--rule', 'r'], stdin=StringIO(input), stdout=stdout ) stdout = stdout.getvalue().strip() self.assertEqual(stdout, "(+ foo bar)") def testTreeParser(self): grammar = textwrap.dedent( r'''grammar T6; options { language = Python3; output = AST; } r: ID OP^ ID EOF!; ID: 'a'..'z'+; OP: '+'; WS: ' '+ { $channel = HIDDEN }; ''') treeGrammar = textwrap.dedent( r'''tree grammar T6Walker; options { language=Python3; ASTLabelType=CommonTree; tokenVocab=T6; } r returns [res]: ^(OP a=ID b=ID) { $res = "{} {} {}".format($a.text, $OP.text, $b.text) } ; ''') lexerMod, parserMod = self.compileInlineGrammar(grammar, returnModule=True) walkerMod = self.compileInlineGrammar(treeGrammar, returnModule=True) stdout = StringIO() walkerMod.main( ['walker.py', '--rule', 'r', '--parser', 'T6Parser', '--parser-rule', 'r', '--lexer', 'T6Lexer'], stdin=StringIO("a+b"), stdout=stdout ) stdout = stdout.getvalue().strip() self.assertEqual(stdout, "'a + b'") def testTreeParserRewrite(self): grammar = textwrap.dedent( r'''grammar T7; options { language = Python3; output = AST; } r: ID OP^ ID EOF!; ID: 'a'..'z'+; OP: '+'; WS: ' '+ { $channel = HIDDEN }; ''') treeGrammar = textwrap.dedent( r'''tree grammar T7Walker; options { language=Python3; ASTLabelType=CommonTree; tokenVocab=T7; output=AST; } tokens { ARG; } r: ^(OP a=ID b=ID) -> ^(OP ^(ARG ID) ^(ARG ID)); ''') lexerMod, parserMod = self.compileInlineGrammar(grammar, returnModule=True) walkerMod = self.compileInlineGrammar(treeGrammar, returnModule=True) stdout = StringIO() walkerMod.main( ['walker.py', '--rule', 'r', '--parser', 'T7Parser', '--parser-rule', 'r', '--lexer', 'T7Lexer'], stdin=StringIO("a+b"), stdout=stdout ) stdout = stdout.getvalue().strip() self.assertEqual(stdout, "(+ (ARG a) (ARG b))") def testGrammarImport(self): slave = textwrap.dedent( r''' parser grammar T8S; options { language=Python3; } a 
: B; ''') parserName = self.writeInlineGrammar(slave)[0] # slave parsers are imported as normal python modules # to force reloading current version, purge module from sys.modules if parserName + 'Parser' in sys.modules: del sys.modules[parserName+'Parser'] master = textwrap.dedent( r''' grammar T8M; options { language=Python3; } import T8S; s returns [res]: a { $res = $a.text }; B : 'b' ; // defines B from inherited token space WS : (' '|'\n') {self.skip()} ; ''') stdout = StringIO() lexerMod, parserMod = self.compileInlineGrammar(master, returnModule=True) parserMod.main( ['import.py', '--rule', 's'], stdin=StringIO("b"), stdout=stdout ) stdout = stdout.getvalue().strip() self.assertEqual(stdout, "'b'") if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t057autoAST.py000066400000000000000000000720761324200532700200140ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase import sys class TestAutoAST(testbase.ANTLRTest): def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._errors = [] self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def emitErrorMessage(self, msg): self._errors.append(msg) return TParser def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! 
raise return TLexer def execParser(self, grammar, grammarEntry, input, expectErrors=False): lexerCls, parserCls = self.compileInlineGrammar(grammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() if not expectErrors: self.assertEqual(len(parser._errors), 0, parser._errors) result = "" if r: if hasattr(r, 'result'): result += r.result if r.tree: result += r.tree.toStringTree() if not expectErrors: return result else: return result, parser._errors def execTreeParser(self, grammar, grammarEntry, treeGrammar, treeEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) walkerCls = self.compileInlineGrammar(treeGrammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = walkerCls(nodes) r = getattr(walker, treeEntry)() if r: return r.tree.toStringTree() return "" def testTokenList(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN}; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("abc 34", found); def testTokenListInSingleAltBlock(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : (ID INT) ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar,"a", "abc 34") self.assertEqual("abc 34", found) def testSimpleRootAtOuterLevel(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID^ INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("(abc 34)", found) def testSimpleRootAtOuterLevelReverse(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : INT ID^ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34 abc") self.assertEqual("(abc 34)", found) def testBang(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT! ID! INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34 dag 4532") self.assertEqual("abc 4532", found) def testOptionalThenRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ( ID INT )? ID^ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 1 b") self.assertEqual("(b a 1)", found) def testLabeledStringRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void'^ ID ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("(void foo ;)", found) def testWildcard(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void'^ . 
';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("(void foo ;)", found) def testWildcardRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void' .^ ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("(foo void ;)", found) def testWildcardRootWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void' x=.^ ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("(foo void ;)", found) def testWildcardRootWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void' x=.^ ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("(foo void ;)", found) def testWildcardBangWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : v='void' x=.! ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void foo;") self.assertEqual("void ;", found) def testRootRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID^ INT^ ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 34 c") self.assertEqual("(34 a c)", found) def testRootRoot2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT^ ID^ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 34 c") self.assertEqual("(c (34 a))", found) def testRootThenRootInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID^ (INT '*'^ ID)+ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 34 * b 9 * c") self.assertEqual("(* (* (a 34) b 9) c)", found) def testNestedSubrule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'void' (({pass}ID|INT) ID | 'null' ) ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void a b;") self.assertEqual("void a b ;", found) def testInvokeRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : type ID ; type : {pass}'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a") self.assertEqual("int a", found) def testInvokeRuleAsRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : type^ ID ; type : {pass}'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a") self.assertEqual("(int a)", found) def testInvokeRuleAsRootWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x=type^ ID ; type : {pass}'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a") self.assertEqual("(int 
a)", found) def testInvokeRuleAsRootWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x+=type^ ID ; type : {pass}'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a") self.assertEqual("(int a)", found) def testRuleRootInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID ('+'^ ID)* ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a+b+c+d") self.assertEqual("(+ (+ (+ a b) c) d)", found) def testRuleInvocationRuleRootInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID (op^ ID)* ; op : {pass}'+' | '-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a+b+c-d") self.assertEqual("(- (+ (+ a b) c) d)", found) def testTailRecursion(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} s : a ; a : atom ('exp'^ a)? ; atom : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "s", "3 exp 4 exp 5") self.assertEqual("(exp 3 (exp 4 5))", found) def testSet(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID|INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testSetRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ('+' | '-')^ ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "+abc") self.assertEqual("(+ abc)", found) @testbase.broken( "FAILS until antlr.g rebuilt in v3", testbase.GrammarCompileError) def testSetRootWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x=('+' | '-')^ ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "+abc") self.assertEqual("(+ abc)", found) def testSetAsRuleRootInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID (('+'|'-')^ ID)* ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a+b-c") self.assertEqual("(- (+ a b) c)", found) def testNotSet(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ~ID '+' INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34+2") self.assertEqual("34 + 2", found) def testNotSetWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x=~ID '+' INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34+2") self.assertEqual("34 + 2", found) def testNotSetWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x=~ID '+' INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34+2") self.assertEqual("34 + 2", found) def testNotSetRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ~'+'^ INT ; ID : 'a'..'z'+ ; INT : 
'0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34 55") self.assertEqual("(34 55)", found) def testNotSetRootWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ~'+'^ INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34 55") self.assertEqual("(34 55)", found) def testNotSetRootWithListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ~'+'^ INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "34 55") self.assertEqual("(34 55)", found) def testNotSetRuleRootInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : INT (~INT^ INT)* ; blort : '+' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "3+4+5") self.assertEqual("(+ (+ 3 4) 5)", found) @testbase.broken("FIXME: What happened to the semicolon?", AssertionError) def testTokenLabelReuse(self): # check for compilation problem due to multiple defines grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result] : id=ID id=ID {$result = "2nd id="+$id.text+";"} ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("2nd id=b;a b", found) def testTokenLabelReuse2(self): # check for compilation problem due to multiple defines grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result]: id=ID id=ID^ {$result = "2nd id="+$id.text+','} ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("2nd id=b,(b a)", found) def testTokenListLabelReuse(self): # check for compilation problem due to multiple defines # make sure ids has both ID tokens grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result] : ids+=ID ids+=ID {$result = "id list=[{}],".format(",".join([t.text for t in $ids]))} ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") expecting = "id list=[a,b],a b" self.assertEqual(expecting, found) def testTokenListLabelReuse2(self): # check for compilation problem due to multiple defines # make sure ids has both ID tokens grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result] : ids+=ID^ ids+=ID {$result = "id list=[{}],".format(",".join([t.text for t in $ids]))} ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") expecting = "id list=[a,b],(a b)" self.assertEqual(expecting, found) def testTokenListLabelRuleRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : id+=ID^ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a", found) def testTokenListLabelBang(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : id+=ID! 
; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("", found) def testRuleListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result]: x+=b x+=b { t=$x[1] $result = "2nd x="+t.toStringTree()+','; }; b : ID; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("2nd x=b,a b", found) def testRuleListLabelRuleRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result] : ( x+=b^ )+ { $result = "x="+$x[1].toStringTree()+','; } ; b : ID; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("x=(b a),(b a)", found) def testRuleListLabelBang(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a returns [result] : x+=b! x+=b { $result = "1st x="+$x[0].toStringTree()+','; } ; b : ID; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("1st x=a,b", found) def testComplicatedMelange(self): # check for compilation problem grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : A b=B b=B c+=C c+=C D {s = $D.text} ; A : 'a' ; B : 'b' ; C : 'c' ; D : 'd' ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b b c c d") self.assertEqual("a b b c c d", found) def testReturnValueWithAST(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a returns [result] : ID b { $result = str($b.i) + '\n';} ; b returns [i] : INT {$i=int($INT.text);} ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("34\nabc 34", found) def testSetLoop(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3;output=AST; } r : (INT|ID)+ ; ID : 'a'..'z' + ; INT : '0'..'9' +; WS: (' ' | '\n' | '\\t')+ {$channel = HIDDEN}; ''') found = self.execParser(grammar, "r", "abc 34 d") self.assertEqual("abc 34 d", found) def testExtraTokenInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} decl : type^ ID '='! INT ';'! ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "int 34 x=1;", expectErrors=True) self.assertEqual(["line 1:4 extraneous input '34' expecting ID"], errors) self.assertEqual("(int x 1)", found) # tree gets correct x and 1 tokens def testMissingIDInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} tokens {EXPR;} decl : type^ ID '='! INT ';'! ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "int =1;", expectErrors=True) self.assertEqual(["line 1:4 missing ID at '='"], errors) self.assertEqual("(int 1)", found) # tree gets invented ID token def testMissingSetInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} tokens {EXPR;} decl : type^ ID '='! INT ';'! 
; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "x=1;", expectErrors=True) self.assertEqual(["line 1:0 mismatched input 'x' expecting set None"], errors) self.assertEqual("( x 1)", found) # tree gets invented ID token def testMissingTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID INT ; // follow is EOF ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "abc", expectErrors=True) self.assertEqual(["line 1:3 missing INT at ''"], errors) self.assertEqual("abc ", found) def testMissingTokenGivesErrorNodeInInvokedRule(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b ; b : ID INT ; // follow should see EOF ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "abc", expectErrors=True) self.assertEqual(["line 1:3 mismatched input '' expecting INT"], errors) self.assertEqual(", resync=abc>", found) def testExtraTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b c ; b : ID ; c : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "abc ick 34", expectErrors=True) self.assertEqual(["line 1:4 extraneous input 'ick' expecting INT"], errors) self.assertEqual("abc 34", found) def testMissingFirstTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "34", expectErrors=True) self.assertEqual(["line 1:0 missing ID at '34'"], errors) self.assertEqual(" 34", found) def testMissingFirstTokenGivesErrorNode2(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b c ; b : ID ; c : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "34", expectErrors=True) # finds an error at the first token, 34, and re-syncs. # re-synchronizing does not consume a token because 34 follows # ref to rule b (start of c). It then matches 34 in c. 
self.assertEqual(["line 1:0 missing ID at '34'"], errors) self.assertEqual(" 34", found) def testNoViableAltGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b | c ; b : ID ; c : INT ; ID : 'a'..'z'+ ; S : '*' ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "*", expectErrors=True) self.assertEqual(["line 1:0 no viable alternative at input '*'"], errors) self.assertEqual(",1:0], resync=*>", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t058rewriteAST.py000066400000000000000000001271161324200532700205220ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import testbase import sys class TestRewriteAST(testbase.ANTLRTest): def parserClass(self, base): class TParser(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._errors = [] self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def emitErrorMessage(self, msg): self._errors.append(msg) return TParser def lexerClass(self, base): class TLexer(base): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._output = "" def capture(self, t): self._output += t def traceIn(self, ruleName, ruleIndex): self.traces.append('>'+ruleName) def traceOut(self, ruleName, ruleIndex): self.traces.append('<'+ruleName) def recover(self, input, re): # no error recovery yet, just crash! raise return TLexer def execParser(self, grammar, grammarEntry, input, expectErrors=False): lexerCls, parserCls = self.compileInlineGrammar(grammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() if not expectErrors: self.assertEqual(len(parser._errors), 0, parser._errors) result = "" if r: if hasattr(r, 'result'): result += r.result if r.tree: result += r.tree.toStringTree() if not expectErrors: return result else: return result, parser._errors def execTreeParser(self, grammar, grammarEntry, treeGrammar, treeEntry, input): lexerCls, parserCls = self.compileInlineGrammar(grammar) walkerCls = self.compileInlineGrammar(treeGrammar) cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream) r = getattr(parser, grammarEntry)() nodes = antlr3.tree.CommonTreeNodeStream(r.tree) nodes.setTokenStream(tStream) walker = walkerCls(nodes) r = getattr(walker, treeEntry)() if r: return r.tree.toStringTree() return "" def testDelete(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("", found) def testSingleToken(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> ID; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testSingleTokenToNewNode(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> ID["x"]; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") 
self.assertEqual("x", found) def testSingleTokenToNewNodeRoot(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> ^(ID["x"] INT); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("(x INT)", found) def testSingleTokenToNewNode2(self): # Allow creation of new nodes w/o args. grammar = textwrap.dedent( r''' grammar TT; options {language=Python3;output=AST;} a : ID -> ID[ ]; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("ID", found) def testSingleCharLiteral(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'c' -> 'c'; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "c") self.assertEqual("c", found) def testSingleStringLiteral(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'ick' -> 'ick'; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "ick") self.assertEqual("ick", found) def testSingleRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : b -> b; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testReorderTokens(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> INT ID; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("34 abc", found) def testReorderTokenAndRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : b INT -> INT b; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("34 abc", found) def testTokenTree(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> ^(INT ID); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("(34 abc)", found) def testTokenTreeAfterOtherStuff(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'void' ID INT -> 'void' ^(INT ID); ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "void abc 34") self.assertEqual("void (34 abc)", found) def testNestedTokenTreeWithOuterLoop(self): # verify that ID and INT both iterate over outer index variable grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {DUH;} a : ID INT ID INT -> ^( DUH ID ^( DUH INT) )+ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 1 b 2") self.assertEqual("(DUH a (DUH 1)) (DUH b (DUH 2))", found) def testOptionalSingleToken(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> ID? 
; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testClosureSingleToken(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID ID -> ID* ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testPositiveClosureSingleToken(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID ID -> ID+ ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptionalSingleRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : b -> b?; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testClosureSingleRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : b b -> b*; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testClosureOfLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x+=b x+=b -> $x*; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptionalLabelNoListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : (x=ID)? -> $x?; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a", found) def testPositiveClosureSingleRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : b b -> b+; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testSinglePredicateT(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> {True}? ID -> ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("abc", found) def testSinglePredicateF(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID -> {False}? ID -> ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc") self.assertEqual("", found) def testMultiplePredicate(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> {False}? ID -> {True}? INT -> ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 2") self.assertEqual("2", found) def testMultiplePredicateTrees(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID INT -> {False}? ^(ID INT) -> {True}? 
^(INT ID) -> ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 2") self.assertEqual("(2 a)", found) def testSimpleTree(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : op INT -> ^(op INT); op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "-34") self.assertEqual("(- 34)", found) def testSimpleTree2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : op INT -> ^(INT op); op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "+ 34") self.assertEqual("(34 +)", found) def testNestedTrees(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'var' (ID ':' type ';')+ -> ^('var' ^(':' ID type)+) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "var a:int; b:float;") self.assertEqual("(var (: a int) (: b float))", found) def testImaginaryTokenCopy(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {VAR;} a : ID (',' ID)*-> ^(VAR ID)+ ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a,b,c") self.assertEqual("(VAR a) (VAR b) (VAR c)", found) def testTokenUnreferencedOnLeftButDefined(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {VAR;} a : b -> ID ; b : ID ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("ID", found) def testImaginaryTokenCopySetText(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {VAR;} a : ID (',' ID)*-> ^(VAR["var"] ID)+ ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a,b,c") self.assertEqual("(var a) (var b) (var c)", found) def testImaginaryTokenNoCopyFromToken(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : lc='{' ID+ '}' -> ^(BLOCK[$lc] ID+) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "{a b c}") self.assertEqual("({ a b c)", found) def testImaginaryTokenNoCopyFromTokenSetText(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : lc='{' ID+ '}' -> ^(BLOCK[$lc,"block"] ID+) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "{a b c}") self.assertEqual("(block a b c)", found) def testMixedRewriteAndAutoAST(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : b b^ ; // 2nd b matches only an INT; can make it root b : ID INT -> INT ID | INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 1 2") self.assertEqual("(2 1 a)", found) def testSubruleWithRewrite(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : b b ; b : (ID INT -> INT ID | INT INT -> INT+ ) ; ID : 'a'..'z'+ ; 
INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a 1 2 3") self.assertEqual("1 a 2 3", found) def testSubruleWithRewrite2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {TYPE;} a : b b ; b : 'int' ( ID -> ^(TYPE 'int' ID) | ID '=' INT -> ^(TYPE 'int' ID INT) ) ';' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a; int b=3;") self.assertEqual("(TYPE int a) (TYPE int b 3)", found) def testNestedRewriteShutsOffAutoAST(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : b b ; b : ID ( ID (last=ID -> $last)+ ) ';' // get last ID | INT // should still get auto AST construction ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b c d; 42") self.assertEqual("d 42", found) def testRewriteActions(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : atom -> ^({self.adaptor.create(INT,"9")} atom) ; atom : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "3") self.assertEqual("(9 3)", found) def testRewriteActions2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : atom -> {self.adaptor.create(INT,"9")} atom ; atom : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "3") self.assertEqual("9 3", found) def testRefToOldValue(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : (atom -> atom) (op='+' r=atom -> ^($op $a $r) )* ; atom : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "3+4+5") self.assertEqual("(+ (+ 3 4) 5)", found) def testCopySemanticsForRules(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : atom -> ^(atom atom) ; // NOT CYCLE! (dup atom) atom : INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "3") self.assertEqual("(3 3)", found) def testCopySemanticsForRules2(self): # copy type as a root for each invocation of (...)+ in rewrite grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : type ID (',' ID)* ';' -> ^(type ID)+ ; type : 'int' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a,b,c;") self.assertEqual("(int a) (int b) (int c)", found) def testCopySemanticsForRules3(self): # copy type *and* modifier even though it's optional # for each invocation of (...)+ in rewrite grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : modifier? type ID (',' ID)* ';' -> ^(type modifier? ID)+ ; type : 'int' ; modifier : 'public' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "public int a,b,c;") self.assertEqual("(int public a) (int public b) (int public c)", found) def testCopySemanticsForRules3Double(self): # copy type *and* modifier even though it's optional # for each invocation of (...)+ in rewrite grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : modifier? type ID (',' ID)* ';' -> ^(type modifier? ID)+ ^(type modifier? 
ID)+ ; type : 'int' ; modifier : 'public' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "public int a,b,c;") self.assertEqual("(int public a) (int public b) (int public c) (int public a) (int public b) (int public c)", found) def testCopySemanticsForRules4(self): # copy type *and* modifier even though it's optional # for each invocation of (...)+ in rewrite grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {MOD;} a : modifier? type ID (',' ID)* ';' -> ^(type ^(MOD modifier)? ID)+ ; type : 'int' ; modifier : 'public' ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "public int a,b,c;") self.assertEqual("(int (MOD public) a) (int (MOD public) b) (int (MOD public) c)", found) def testCopySemanticsLists(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {MOD;} a : ID (',' ID)* ';' -> ID+ ID+ ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a,b,c;") self.assertEqual("a b c a b c", found) def testCopyRuleLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=b -> $x $x; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a a", found) def testCopyRuleLabel2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=b -> ^($x $x); b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("(a a)", found) def testQueueingOfTokens(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'int' ID (',' ID)* ';' -> ^('int' ID+) ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a,b,c;") self.assertEqual("(int a b c)", found) def testCopyOfTokens(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'int' ID ';' -> 'int' ID 'int' ID ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a;") self.assertEqual("int a int a", found) def testTokenCopyInLoop(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'int' ID (',' ID)* ';' -> ^('int' ID)+ ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a,b,c;") self.assertEqual("(int a) (int b) (int c)", found) def testTokenCopyInLoopAgainstTwoOthers(self): # must smear 'int' copies across as root of multiple trees grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : 'int' ID ':' INT (',' ID ':' INT)* ';' -> ^('int' ID INT)+ ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "int a:1,b:2,c:3;") self.assertEqual("(int a 1) (int b 2) (int c 3)", found) def testListRefdOneAtATime(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID+ -> ID ID ID ; // works if 3 input IDs op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b c") self.assertEqual("a b c", found) def testSplitListWithLabels(self): 
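# Descriptive note: this test exercises list labels in a rewrite. In the
# grammar below, 'others+=ID*' collects every remaining ID token into the
# list label $others, and the rewrite '-> $first VAR $others+' emits the
# first token, an imaginary VAR node, and then each collected token in turn,
# which is why the input "a b c" yields the flat result "a VAR b c".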
grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {VAR;} a : first=ID others+=ID* -> $first VAR $others+ ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b c") self.assertEqual("a VAR b c", found) def testComplicatedMelange(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : A A b=B B b=B c+=C C c+=C D {s=$D.text} -> A+ B+ C+ D ; type : 'int' | 'float' ; A : 'a' ; B : 'b' ; C : 'c' ; D : 'd' ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a a b b b c c c d") self.assertEqual("a a b b b c c c d", found) def testRuleLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=b -> $x; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a", found) def testAmbiguousRule(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID a -> a | INT ; ID : 'a'..'z'+ ; INT: '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("34", found) def testRuleListLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x+=b x+=b -> $x+; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testRuleListLabel2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x+=b x+=b -> $x $x*; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptional(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=b (y=b)? -> $x $y?; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a", found) def testOptional2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=ID (y=b)? -> $x $y?; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptional3(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x=ID (y=b)? -> ($x $y)?; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptional4(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x+=ID (y=b)? -> ($x $y)?; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("a b", found) def testOptional5(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : ID -> ID? 
; // match an ID to optional ID b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a") self.assertEqual("a", found) def testArbitraryExprType(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : x+=b x+=b -> {CommonTree(None)}; b : ID ; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "a b") self.assertEqual("", found) def testSet(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a: (INT|ID)+ -> INT+ ID+ ; INT: '0'..'9'+; ID : 'a'..'z'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "2 a 34 de") self.assertEqual("2 34 a de", found) def testSet2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a: (INT|ID) -> INT? ID? ; INT: '0'..'9'+; ID : 'a'..'z'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "2") self.assertEqual("2", found) @testbase.broken("http://www.antlr.org:8888/browse/ANTLR-162", antlr3.tree.RewriteEmptyStreamException) def testSetWithLabel(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : x=(INT|ID) -> $x ; INT: '0'..'9'+; ID : 'a'..'z'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "2") self.assertEqual("2", found) def testRewriteAction(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens { FLOAT; } r : INT -> {CommonTree(CommonToken(type=FLOAT, text=$INT.text+".0"))} ; INT : '0'..'9'+; WS: (' ' | '\n' | '\t')+ {$channel = HIDDEN}; ''') found = self.execParser(grammar, "r", "25") self.assertEqual("25.0", found) def testOptionalSubruleWithoutRealElements(self): # copy type *and* modifier even though it's optional # for each invocation of (...)+ in rewrite grammar = textwrap.dedent( r""" grammar T; options {language=Python3;output=AST;} tokens {PARMS;} modulo : 'modulo' ID ('(' parms+ ')')? -> ^('modulo' ID ^(PARMS parms+)?) ; parms : '#'|ID; ID : ('a'..'z' | 'A'..'Z')+; WS : (' '|'\n') {$channel=HIDDEN} ; """) found = self.execParser(grammar, "modulo", "modulo abc (x y #)") self.assertEqual("(modulo abc (PARMS x y #))", found) ## C A R D I N A L I T Y I S S U E S def testCardinality(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} tokens {BLOCK;} a : ID ID INT INT INT -> (ID INT)+; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') self.assertRaises(antlr3.tree.RewriteCardinalityException, self.execParser, grammar, "a", "a b 3 4 5") def testCardinality2(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID+ -> ID ID ID ; // only 2 input IDs op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') self.assertRaises(antlr3.tree.RewriteCardinalityException, self.execParser, grammar, "a", "a b") def testCardinality3(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID? INT -> ID INT ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') self.assertRaises(antlr3.tree.RewriteEmptyStreamException, self.execParser, grammar, "a", "3") def testLoopCardinality(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID? 
INT -> ID+ INT ; op : '+'|'-' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') self.assertRaises(antlr3.tree.RewriteEarlyExitException, self.execParser, grammar, "a", "3") def testWildcard(self): grammar = textwrap.dedent( r''' grammar T; options {language=Python3;output=AST;} a : ID c=. -> $c; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found = self.execParser(grammar, "a", "abc 34") self.assertEqual("34", found) # E R R O R S def testExtraTokenInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} tokens {EXPR;} decl : type ID '=' INT ';' -> ^(EXPR type ID INT) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "int 34 x=1;", expectErrors=True) self.assertEqual(["line 1:4 extraneous input '34' expecting ID"], errors) self.assertEqual("(EXPR int x 1)", found) # tree gets correct x and 1 tokens #@testbase.broken("FIXME", AssertionError) def testMissingIDInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} tokens {EXPR;} decl : type ID '=' INT ';' -> ^(EXPR type ID INT) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "int =1;", expectErrors=True) self.assertEqual(["line 1:4 missing ID at '='"], errors) self.assertEqual("(EXPR int 1)", found) # tree gets invented ID token def testMissingSetInSimpleDecl(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} tokens {EXPR;} decl : type ID '=' INT ';' -> ^(EXPR type ID INT) ; type : 'int' | 'float' ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "decl", "x=1;", expectErrors=True) self.assertEqual(["line 1:0 mismatched input 'x' expecting set None"], errors); self.assertEqual("(EXPR x 1)", found) # tree gets invented ID token def testMissingTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID INT -> ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "abc", expectErrors=True) self.assertEqual(["line 1:3 missing INT at ''"], errors) # doesn't do in-line recovery for sets (yet?) 
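# As with the "invented ID token" cases above, the recognizer conjures up a
# placeholder node for the INT that never arrived, so the rewrite still
# emits a second node after 'abc'; the assertion below compares against that
# recovered tree.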
self.assertEqual("abc ", found) def testExtraTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b c -> b c; b : ID -> ID ; c : INT -> INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "abc ick 34", expectErrors=True) self.assertEqual(["line 1:4 extraneous input 'ick' expecting INT"], errors) self.assertEqual("abc 34", found) #@testbase.broken("FIXME", AssertionError) def testMissingFirstTokenGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : ID INT -> ID INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "34", expectErrors=True) self.assertEqual(["line 1:0 missing ID at '34'"], errors) self.assertEqual(" 34", found) #@testbase.broken("FIXME", AssertionError) def testMissingFirstTokenGivesErrorNode2(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b c -> b c; b : ID -> ID ; c : INT -> INT ; ID : 'a'..'z'+ ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "34", expectErrors=True) # finds an error at the first token, 34, and re-syncs. # re-synchronizing does not consume a token because 34 follows # ref to rule b (start of c). It then matches 34 in c. self.assertEqual(["line 1:0 missing ID at '34'"], errors) self.assertEqual(" 34", found) def testNoViableAltGivesErrorNode(self): grammar = textwrap.dedent( r''' grammar foo; options {language=Python3;output=AST;} a : b -> b | c -> c; b : ID -> ID ; c : INT -> INT ; ID : 'a'..'z'+ ; S : '*' ; INT : '0'..'9'+; WS : (' '|'\n') {$channel=HIDDEN} ; ''') found, errors = self.execParser(grammar, "a", "*", expectErrors=True) # finds an error at the first token, 34, and re-syncs. # re-synchronizing does not consume a token because 34 follows # ref to rule b (start of c). It then matches 34 in c. 
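# (In this particular test the offending token is '*': no alternative of
# rule a can start with it, so a NoViableAltException is reported and the
# resync'd input shows up as an error node in the result tree, as the
# assertions below check.)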
self.assertEqual(["line 1:0 no viable alternative at input '*'"], errors); self.assertEqual(",1:0], resync=*>", found) if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t059debug.py000066400000000000000000000675511324200532700175660ustar00rootroot00000000000000import unittest import textwrap import antlr3 import antlr3.tree import antlr3.debug import testbase import sys import threading import socket import errno import time class Debugger(threading.Thread): def __init__(self, port): super().__init__() self.events = [] self.success = False self.port = port def run(self): # create listening socket s = None tstart = time.time() while time.time() - tstart < 10: try: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(('127.0.0.1', self.port)) break except socket.error as exc: if s: s.close() if exc.args[0] != errno.ECONNREFUSED: raise time.sleep(0.1) if s is None: self.events.append(['nosocket']) return s.setblocking(1) s.settimeout(10.0) output = s.makefile('w', 1) input = s.makefile('r', 1) try: # handshake l = input.readline().strip() assert l == 'ANTLR 2' l = input.readline().strip() assert l.startswith('grammar "'), l output.write('ACK\n') output.flush() while True: event = input.readline().strip() self.events.append(event.split('\t')) output.write('ACK\n') output.flush() if event == 'terminate': self.success = True break except socket.timeout: self.events.append(['timeout']) except socket.error as exc: self.events.append(['socketerror', exc.args]) finally: output.close() input.close() s.close() class T(testbase.ANTLRTest): def execParser(self, grammar, grammarEntry, input, listener, parser_args={}): if listener is None: port = 49100 debugger = Debugger(port) debugger.start() # TODO(pink): install alarm, so it doesn't hang forever in case of a bug else: port = None try: lexerCls, parserCls = self.compileInlineGrammar( grammar, options='-debug') cStream = antlr3.StringStream(input) lexer = lexerCls(cStream) tStream = antlr3.CommonTokenStream(lexer) parser = parserCls(tStream, dbg=listener, port=port, **parser_args) getattr(parser, grammarEntry)() finally: if listener is None: debugger.join() return debugger def testBasicParser(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID EOF; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') listener = antlr3.debug.RecordDebugEventListener() self.execParser( grammar, 'a', input="a", listener=listener) # We only check that some LT events are present. How many is subject # to change (at the time of writing there are two, which is one too # many). lt_events = [event for event in listener.events if event.startswith("LT ")] self.assertNotEqual(lt_events, []) # For the rest, filter out LT events to get a reliable test. 
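# The surviving events trace rule entry and exit plus 'location <line>:<col>'
# markers that point back into the grammar text (rule 'a' sits on line 6 of
# the inline grammar above). For reference, a debug listener is attached
# roughly like this -- the class and rule names here are illustrative, not
# part of this test:
#
#     listener = antlr3.debug.RecordDebugEventListener()
#     parser = TParser(tokenStream, dbg=listener)   # generated parser class
#     parser.a()                                    # invoke the start rule
#     print(listener.events)                        # human-readable event strings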
expected = ["enterRule a", "location 6:1", "location 6:5", "location 6:8", "location 6:11", "exitRule a"] found = [event for event in listener.events if not event.startswith("LT ")] self.assertListEqual(found, expected) def testSocketProxy(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID EOF; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['location', '6', '8'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['consumeToken', '-1', '-1', '0', '1', '1', '"'], ['location', '6', '11'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testRecognitionException(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID EOF; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a b", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['consumeHiddenToken', '1', '5', '99', '1', '1', '"'], ['location', '6', '8'], ['LT', '1', '2', '4', '0', '1', '2', '"b'], ['LT', '1', '2', '4', '0', '1', '2', '"b'], ['LT', '2', '-1', '-1', '0', '1', '3', '"'], ['LT', '1', '2', '4', '0', '1', '2', '"b'], ['LT', '1', '2', '4', '0', '1', '2', '"b'], ['beginResync'], ['consumeToken', '2', '4', '0', '1', '2', '"b'], ['endResync'], ['exception', 'UnwantedTokenException', '2', '1', '2'], ['LT', '1', '-1', '-1', '0', '1', '3', '"'], ['consumeToken', '-1', '-1', '0', '1', '3', '"'], ['location', '6', '11'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testSemPred(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : {True}? 
ID EOF; ID : 'a'..'z'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['semanticPredicate', '1', 'True'], ['location', '6', '13'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['location', '6', '16'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['consumeToken', '-1', '-1', '0', '1', '1', '"'], ['location', '6', '19'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testPositiveClosureBlock(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID ( ID | INT )+ EOF; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a 1 b c 3", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['consumeHiddenToken', '1', '6', '99', '1', '1', '"'], ['location', '6', '8'], ['enterSubRule', '1'], ['enterDecision', '1', '0'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['consumeToken', '2', '5', '0', '1', '2', '"1'], ['consumeHiddenToken', '3', '6', '99', '1', '3', '"'], ['enterDecision', '1', '0'], ['LT', '1', '4', '4', '0', '1', '4', '"b'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '4', '4', '0', '1', '4', '"b'], ['consumeToken', '4', '4', '0', '1', '4', '"b'], ['consumeHiddenToken', '5', '6', '99', '1', '5', '"'], ['enterDecision', '1', '0'], ['LT', '1', '6', '4', '0', '1', '6', '"c'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '6', '4', '0', '1', '6', '"c'], ['consumeToken', '6', '4', '0', '1', '6', '"c'], ['consumeHiddenToken', '7', '6', '99', '1', '7', '"'], ['enterDecision', '1', '0'], ['LT', '1', '8', '5', '0', '1', '8', '"3'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '8', '5', '0', '1', '8', '"3'], ['consumeToken', '8', '5', '0', '1', '8', '"3'], ['enterDecision', '1', '0'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['exitDecision', '1'], ['exitSubRule', '1'], ['location', '6', '22'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['consumeToken', '-1', '-1', '0', '1', '9', '"'], ['location', '6', '25'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testClosureBlock(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID ( ID | INT )* EOF; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a 1 b c 3", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['consumeHiddenToken', '1', '6', '99', '1', '1', '"'], ['location', '6', '8'], 
['enterSubRule', '1'], ['enterDecision', '1', '0'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['consumeToken', '2', '5', '0', '1', '2', '"1'], ['consumeHiddenToken', '3', '6', '99', '1', '3', '"'], ['enterDecision', '1', '0'], ['LT', '1', '4', '4', '0', '1', '4', '"b'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '4', '4', '0', '1', '4', '"b'], ['consumeToken', '4', '4', '0', '1', '4', '"b'], ['consumeHiddenToken', '5', '6', '99', '1', '5', '"'], ['enterDecision', '1', '0'], ['LT', '1', '6', '4', '0', '1', '6', '"c'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '6', '4', '0', '1', '6', '"c'], ['consumeToken', '6', '4', '0', '1', '6', '"c'], ['consumeHiddenToken', '7', '6', '99', '1', '7', '"'], ['enterDecision', '1', '0'], ['LT', '1', '8', '5', '0', '1', '8', '"3'], ['exitDecision', '1'], ['enterAlt', '1'], ['location', '6', '8'], ['LT', '1', '8', '5', '0', '1', '8', '"3'], ['consumeToken', '8', '5', '0', '1', '8', '"3'], ['enterDecision', '1', '0'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['exitDecision', '1'], ['exitSubRule', '1'], ['location', '6', '22'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['LT', '1', '-1', '-1', '0', '1', '9', '"'], ['consumeToken', '-1', '-1', '0', '1', '9', '"'], ['location', '6', '25'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testMismatchedSetException(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID ( ID | INT ) EOF; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['location', '6', '8'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['exception', 'MismatchedSetException', '1', '1', '1'], ['exception', 'MismatchedSetException', '1', '1', '1'], ['beginResync'], ['LT', '1', '-1', '-1', '0', '1', '1', '"'], ['endResync'], ['location', '6', '24'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testBlock(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID ( b | c ) EOF; b : ID; c : INT; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a 1", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['consumeHiddenToken', '1', '6', '99', '1', '1', '"'], ['location', '6', '8'], ['enterSubRule', '1'], ['enterDecision', '1', '0'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['exitDecision', '1'], ['enterAlt', '2'], ['location', '6', '14'], ['enterRule', 'T.g', 'c'], ['location', '8', '1'], ['enterAlt', '1'], ['location', '8', '5'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['LT', '1', '2', '5', '0', '1', '2', '"1'], ['consumeToken', '2', '5', '0', '1', '2', 
'"1'], ['location', '8', '8'], ['exitRule', 'T.g', 'c'], ['exitSubRule', '1'], ['location', '6', '18'], ['LT', '1', '-1', '-1', '0', '1', '3', '"'], ['LT', '1', '-1', '-1', '0', '1', '3', '"'], ['consumeToken', '-1', '-1', '0', '1', '3', '"'], ['location', '6', '21'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testNoViableAlt(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ID ( b | c ) EOF; b : ID; c : INT; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; BANG : '!' ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a !", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['consumeToken', '0', '5', '0', '1', '0', '"a'], ['consumeHiddenToken', '1', '7', '99', '1', '1', '"'], ['location', '6', '8'], ['enterSubRule', '1'], ['enterDecision', '1', '0'], ['LT', '1', '2', '4', '0', '1', '2', '"!'], ['LT', '1', '2', '4', '0', '1', '2', '"!'], ['LT', '1', '2', '4', '0', '1', '2', '"!'], ['exception', 'NoViableAltException', '2', '1', '2'], ['exitDecision', '1'], ['exitSubRule', '1'], ['exception', 'NoViableAltException', '2', '1', '2'], ['beginResync'], ['LT', '1', '2', '4', '0', '1', '2', '"!'], ['consumeToken', '2', '4', '0', '1', '2', '"!'], ['LT', '1', '-1', '-1', '0', '1', '3', '"'], ['endResync'], ['location', '6', '21'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testRuleBlock(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : b | c; b : ID; c : INT; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="1", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterDecision', '1', '0'], ['LT', '1', '0', '5', '0', '1', '0', '"1'], ['exitDecision', '1'], ['enterAlt', '2'], ['location', '6', '9'], ['enterRule', 'T.g', 'c'], ['location', '8', '1'], ['enterAlt', '1'], ['location', '8', '5'], ['LT', '1', '0', '5', '0', '1', '0', '"1'], ['LT', '1', '0', '5', '0', '1', '0', '"1'], ['consumeToken', '0', '5', '0', '1', '0', '"1'], ['location', '8', '8'], ['exitRule', 'T.g', 'c'], ['location', '6', '10'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testRuleBlockSingleAlt(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : b; b : ID; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['enterRule', 'T.g', 'b'], ['location', '7', '1'], ['enterAlt', '1'], ['location', '7', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['location', '7', '7'], ['exitRule', 'T.g', 'b'], ['location', '6', '6'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testBlockSingleAlt(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ( b ); b : ID; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = 
self.execParser( grammar, 'a', input="a", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['enterAlt', '1'], ['location', '6', '7'], ['enterRule', 'T.g', 'b'], ['location', '7', '1'], ['enterAlt', '1'], ['location', '7', '5'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['LT', '1', '0', '4', '0', '1', '0', '"a'], ['consumeToken', '0', '4', '0', '1', '0', '"a'], ['location', '7', '7'], ['exitRule', 'T.g', 'b'], ['location', '6', '10'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testDFA(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; } a : ( b | c ) EOF; b : ID* INT; c : ID+ BANG; ID : 'a'..'z'+ ; INT : '0'..'9'+ ; BANG : '!'; WS : (' '|'\n') {$channel=HIDDEN} ; ''') debugger = self.execParser( grammar, 'a', input="a!", listener=None) self.assertTrue(debugger.success) expected = [['enterRule', 'T.g', 'a'], ['location', '6', '1'], ['enterAlt', '1'], ['location', '6', '5'], ['enterSubRule', '1'], ['enterDecision', '1', '0'], ['mark', '0'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['consumeToken', '0', '5', '0', '1', '0', '"a'], ['LT', '1', '1', '4', '0', '1', '1', '"!'], ['consumeToken', '1', '4', '0', '1', '1', '"!'], ['rewind', '0'], ['exitDecision', '1'], ['enterAlt', '2'], ['location', '6', '11'], ['enterRule', 'T.g', 'c'], ['location', '8', '1'], ['enterAlt', '1'], ['location', '8', '5'], ['enterSubRule', '3'], ['enterDecision', '3', '0'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['exitDecision', '3'], ['enterAlt', '1'], ['location', '8', '5'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['LT', '1', '0', '5', '0', '1', '0', '"a'], ['consumeToken', '0', '5', '0', '1', '0', '"a'], ['enterDecision', '3', '0'], ['LT', '1', '1', '4', '0', '1', '1', '"!'], ['exitDecision', '3'], ['exitSubRule', '3'], ['location', '8', '9'], ['LT', '1', '1', '4', '0', '1', '1', '"!'], ['LT', '1', '1', '4', '0', '1', '1', '"!'], ['consumeToken', '1', '4', '0', '1', '1', '"!'], ['location', '8', '13'], ['exitRule', 'T.g', 'c'], ['exitSubRule', '1'], ['location', '6', '15'], ['LT', '1', '-1', '-1', '0', '1', '2', '"'], ['LT', '1', '-1', '-1', '0', '1', '2', '"'], ['consumeToken', '-1', '-1', '0', '1', '2', '"'], ['location', '6', '18'], ['exitRule', 'T.g', 'a'], ['terminate']] self.assertListEqual(debugger.events, expected) def testBasicAST(self): grammar = textwrap.dedent( r''' grammar T; options { language=Python3; output=AST; } a : ( b | c ) EOF!; b : ID* INT -> ^(INT ID*); c : ID+ BANG -> ^(BANG ID+); ID : 'a'..'z'+ ; INT : '0'..'9'+ ; BANG : '!'; WS : (' '|'\n') {$channel=HIDDEN} ; ''') listener = antlr3.debug.RecordDebugEventListener() self.execParser( grammar, 'a', input="a!", listener=listener) # don't check output for now (too dynamic), I'm satisfied if it # doesn't crash if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/t060leftrecursion.py000066400000000000000000000347131324200532700213460ustar00rootroot00000000000000import unittest import re import textwrap import antlr3 import testbase # Left-recursion resolution is not yet enabled in the tool. # class TestLeftRecursion(testbase.ANTLRTest): # def parserClass(self, base): # class TParser(base): # def __init__(self, *args, **kwargs): # super().__init__(*args, **kwargs) # self._output = "" # def capture(self, t): # self._output += str(t) # def recover(self, input, re): # # no error recovery yet, just crash! 
# raise # return TParser # def execParser(self, grammar, grammarEntry, input): # lexerCls, parserCls = self.compileInlineGrammar(grammar) # cStream = antlr3.StringStream(input) # lexer = lexerCls(cStream) # tStream = antlr3.CommonTokenStream(lexer) # parser = parserCls(tStream) # getattr(parser, grammarEntry)() # return parser._output # def runTests(self, grammar, tests, grammarEntry): # lexerCls, parserCls = self.compileInlineGrammar(grammar) # build_ast = re.search(r'output\s*=\s*AST', grammar) # for input, expecting in tests: # cStream = antlr3.StringStream(input) # lexer = lexerCls(cStream) # tStream = antlr3.CommonTokenStream(lexer) # parser = parserCls(tStream) # r = getattr(parser, grammarEntry)() # found = parser._output # if build_ast: # found += r.tree.toStringTree() # self.assertEqual( # expecting, found, # "{!r} != {!r} (for input {!r})".format(expecting, found, input)) # def testSimple(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # } # s : a { self.capture($a.text) } ; # a : a ID # | ID # ; # ID : 'a'..'z'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # found = self.execParser(grammar, 's', 'a b c') # expecting = "abc" # self.assertEqual(expecting, found) # def testSemPred(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # } # s : a { self.capture($a.text) } ; # a : a {True}? ID # | ID # ; # ID : 'a'..'z'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # found = self.execParser(grammar, "s", "a b c") # expecting = "abc" # self.assertEqual(expecting, found) # def testTernaryExpr(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # e : e '*'^ e # | e '+'^ e # | e '?'^ e ':'! e # | e '='^ e # | ID # ; # ID : 'a'..'z'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("a+b", "(+ a b)"), # ("a*b", "(* a b)"), # ("a?b:c", "(? a b c)"), # ("a=b=c", "(= a (= b c))"), # ("a?b+c:d", "(? a (+ b c) d)"), # ("a?b=c:d", "(? a (= b c) d)"), # ("a? b?c:d : e", "(? a (? b c d) e)"), # ("a?b: c?d:e", "(? a b (? c d e))"), # ] # self.runTests(grammar, tests, "e") # def testDeclarationsUsingASTOperators(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # declarator # : declarator '['^ e ']'! # | declarator '['^ ']'! # | declarator '('^ ')'! # | '*'^ declarator // binds less tight than suffixes # | '('! declarator ')'! 
# | ID # ; # e : INT ; # ID : 'a'..'z'+ ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("*a", "(* a)"), # ("**a", "(* (* a))"), # ("a[3]", "([ a 3)"), # ("b[]", "([ b)"), # ("(a)", "a"), # ("a[]()", "(( ([ a))"), # ("a[][]", "([ ([ a))"), # ("*a[]", "(* ([ a))"), # ("(*a)[]", "([ (* a))"), # ] # self.runTests(grammar, tests, "declarator") # def testDeclarationsUsingRewriteOperators(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # declarator # : declarator '[' e ']' -> ^('[' declarator e) # | declarator '[' ']' -> ^('[' declarator) # | declarator '(' ')' -> ^('(' declarator) # | '*' declarator -> ^('*' declarator) // binds less tight than suffixes # | '(' declarator ')' -> declarator # | ID -> ID # ; # e : INT ; # ID : 'a'..'z'+ ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("*a", "(* a)"), # ("**a", "(* (* a))"), # ("a[3]", "([ a 3)"), # ("b[]", "([ b)"), # ("(a)", "a"), # ("a[]()", "(( ([ a))"), # ("a[][]", "([ ([ a))"), # ("*a[]", "(* ([ a))"), # ("(*a)[]", "([ (* a))"), # ] # self.runTests(grammar, tests, "declarator") # def testExpressionsUsingASTOperators(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # e : e '.'^ ID # | e '.'^ 'this' # | '-'^ e # | e '*'^ e # | e ('+'^|'-'^) e # | INT # | ID # ; # ID : 'a'..'z'+ ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("1", "1"), # ("a+1", "(+ a 1)"), # ("a*1", "(* a 1)"), # ("a.b", "(. a b)"), # ("a.this", "(. a this)"), # ("a-b+c", "(+ (- a b) c)"), # ("a+b*c", "(+ a (* b c))"), # ("a.b+1", "(+ (. a b) 1)"), # ("-a", "(- a)"), # ("-a+b", "(+ (- a) b)"), # ("-a.b", "(- (. a b))"), # ] # self.runTests(grammar, tests, "e") # @testbase.broken( # "Grammar compilation returns errors", testbase.GrammarCompileError) # def testExpressionsUsingRewriteOperators(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # e : e '.' ID -> ^('.' e ID) # | e '.' 'this' -> ^('.' e 'this') # | '-' e -> ^('-' e) # | e '*' b=e -> ^('*' e $b) # | e (op='+'|op='-') b=e -> ^($op e $b) # | INT -> INT # | ID -> ID # ; # ID : 'a'..'z'+ ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("1", "1"), # ("a+1", "(+ a 1)"), # ("a*1", "(* a 1)"), # ("a.b", "(. a b)"), # ("a.this", "(. a this)"), # ("a+b*c", "(+ a (* b c))"), # ("a.b+1", "(+ (. a b) 1)"), # ("-a", "(- a)"), # ("-a+b", "(+ (- a) b)"), # ("-a.b", "(- (. a b))"), # ] # self.runTests(grammar, tests, "e") # def testExpressionAssociativity(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # e # : e '.'^ ID # | '-'^ e # | e '^'^ e # | e '*'^ e # | e ('+'^|'-'^) e # | e ('='^ |'+='^) e # | INT # | ID # ; # ID : 'a'..'z'+ ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("1", "1"), # ("a+1", "(+ a 1)"), # ("a*1", "(* a 1)"), # ("a.b", "(. a b)"), # ("a-b+c", "(+ (- a b) c)"), # ("a+b*c", "(+ a (* b c))"), # ("a.b+1", "(+ (. a b) 1)"), # ("-a", "(- a)"), # ("-a+b", "(+ (- a) b)"), # ("-a.b", "(- (. a b))"), # ("a^b^c", "(^ a (^ b c))"), # ("a=b=c", "(= a (= b c))"), # ("a=b=c+d.e", "(= a (= b (+ c (. 
d e))))"), # ] # self.runTests(grammar, tests, "e") # def testJavaExpressions(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # expressionList # : e (','! e)* # ; # e : '('! e ')'! # | 'this' # | 'super' # | INT # | ID # | type '.'^ 'class' # | e '.'^ ID # | e '.'^ 'this' # | e '.'^ 'super' '('^ expressionList? ')'! # | e '.'^ 'new'^ ID '('! expressionList? ')'! # | 'new'^ type ( '(' expressionList? ')'! | (options {k=1;}:'[' e ']'!)+) // ugly; simplified # | e '['^ e ']'! # | '('^ type ')'! e # | e ('++'^ | '--'^) # | e '('^ expressionList? ')'! # | ('+'^|'-'^|'++'^|'--'^) e # | ('~'^|'!'^) e # | e ('*'^|'/'^|'%'^) e # | e ('+'^|'-'^) e # | e ('<'^ '<' | '>'^ '>' '>' | '>'^ '>') e # | e ('<='^ | '>='^ | '>'^ | '<'^) e # | e 'instanceof'^ e # | e ('=='^ | '!='^) e # | e '&'^ e # | e '^'^ e # | e '|'^ e # | e '&&'^ e # | e '||'^ e # | e '?' e ':' e # | e ('='^ # |'+='^ # |'-='^ # |'*='^ # |'/='^ # |'&='^ # |'|='^ # |'^='^ # |'>>='^ # |'>>>='^ # |'<<='^ # |'%='^) e # ; # type: ID # | ID '['^ ']'! # | 'int' # | 'int' '['^ ']'! # ; # ID : ('a'..'z'|'A'..'Z'|'_'|'$')+; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("a", "a"), # ("1", "1"), # ("a+1", "(+ a 1)"), # ("a*1", "(* a 1)"), # ("a.b", "(. a b)"), # ("a-b+c", "(+ (- a b) c)"), # ("a+b*c", "(+ a (* b c))"), # ("a.b+1", "(+ (. a b) 1)"), # ("-a", "(- a)"), # ("-a+b", "(+ (- a) b)"), # ("-a.b", "(- (. a b))"), # ("a^b^c", "(^ a (^ b c))"), # ("a=b=c", "(= a (= b c))"), # ("a=b=c+d.e", "(= a (= b (+ c (. d e))))"), # ("a|b&c", "(| a (& b c))"), # ("(a|b)&c", "(& (| a b) c)"), # ("a > b", "(> a b)"), # ("a >> b", "(> a b)"), # text is from one token # ("a < b", "(< a b)"), # ("(T)x", "(( T x)"), # ("new A().b", "(. (new A () b)"), # ("(T)t.f()", "(( (( T (. t f)))"), # ("a.f(x)==T.c", "(== (( (. a f) x) (. T c))"), # ("a.f().g(x,1)", "(( (. (( (. a f)) g) x 1)"), # ("new T[((n-1) * x) + 1]", "(new T [ (+ (* (- n 1) x) 1))"), # ] # self.runTests(grammar, tests, "e") # def testReturnValueAndActions(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # } # s : e { self.capture($e.v) } ; # e returns [v, ignored] # : e '*' b=e {$v *= $b.v;} # | e '+' b=e {$v += $b.v;} # | INT {$v = int($INT.text);} # ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("4", "4"), # ("1+2", "3") # ] # self.runTests(grammar, tests, "s") # def testReturnValueAndActionsAndASTs(self): # grammar = textwrap.dedent( # r""" # grammar T; # options { # language=Python3; # output=AST; # } # s : e { self.capture("v={}, ".format($e.v)) } ; # e returns [v, ignored] # : e '*'^ b=e {$v *= $b.v;} # | e '+'^ b=e {$v += $b.v;} # | INT {$v = int($INT.text);} # ; # INT : '0'..'9'+ ; # WS : (' '|'\n') {self.skip()} ; # """) # tests = [ # ("4", "v=4, 4"), # ("1+2", "v=3, (+ 1 2)"), # ] # self.runTests(grammar, tests, "s") if __name__ == '__main__': unittest.main() python3-antlr3-3.5.2/tests/testbase.py000066400000000000000000000322731324200532700176610ustar00rootroot00000000000000from distutils.errors import * import errno import glob import hashlib import imp import inspect import os import re import shutil import sys import tempfile import unittest import antlr3 def unlink(path): try: os.unlink(path) except OSError as exc: if exc.errno != errno.ENOENT: raise class GrammarCompileError(Exception): """Grammar failed to compile.""" pass # At least on MacOSX tempdir (/tmp) is a symlink. 
It's sometimes dereferences, # sometimes not, breaking the inspect.getmodule() function. testbasedir = os.path.join( os.path.realpath(tempfile.gettempdir()), 'antlr3-test') class BrokenTest(unittest.TestCase.failureException): def __repr__(self): name, reason = self.args return '{}: {}: {} works now'.format( (self.__class__.__name__, name, reason)) def broken(reason, *exceptions): '''Indicates a failing (or erroneous) test case fails that should succeed. If the test fails with an exception, list the exception type in args''' def wrapper(test_method): def replacement(*args, **kwargs): try: test_method(*args, **kwargs) except exceptions or unittest.TestCase.failureException: pass else: raise BrokenTest(test_method.__name__, reason) replacement.__doc__ = test_method.__doc__ replacement.__name__ = 'XXX_' + test_method.__name__ replacement.todo = reason return replacement return wrapper dependencyCache = {} compileErrorCache = {} # setup java CLASSPATH if 'CLASSPATH' not in os.environ: cp = [] baseDir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..', '..')) libDir = os.path.join(baseDir, 'lib') jar = os.path.join(libDir, 'ST-4.0.5.jar') if not os.path.isfile(jar): raise DistutilsFileError( "Missing file '{}'. Grab it from a distribution package.".format(jar) ) cp.append(jar) jar = os.path.join(libDir, 'antlr-3.4.1-SNAPSHOT.jar') if not os.path.isfile(jar): raise DistutilsFileError( "Missing file '{}'. Grab it from a distribution package.".format(jar) ) cp.append(jar) jar = os.path.join(libDir, 'antlr-runtime-3.4.jar') if not os.path.isfile(jar): raise DistutilsFileError( "Missing file '{}'. Grab it from a distribution package.".format(jar) ) cp.append(jar) cp.append(os.path.join(baseDir, 'runtime', 'Python', 'build')) classpath = '-cp "' + ':'.join([os.path.abspath(p) for p in cp]) + '"' else: classpath = '' class ANTLRTest(unittest.TestCase): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.moduleName = os.path.splitext(os.path.basename(sys.modules[self.__module__].__file__))[0] self.className = self.__class__.__name__ self._baseDir = None self.lexerModule = None self.parserModule = None self.grammarName = None self.grammarType = None @property def baseDir(self): if self._baseDir is None: testName = 'unknownTest' for frame in inspect.stack(): code = frame[0].f_code codeMod = inspect.getmodule(code) if codeMod is None: continue # skip frames not in requested module if codeMod is not sys.modules[self.__module__]: continue # skip some unwanted names if code.co_name in ('nextToken', ''): continue if code.co_name.startswith('test'): testName = code.co_name break self._baseDir = os.path.join( testbasedir, self.moduleName, self.className, testName) if not os.path.isdir(self._baseDir): os.makedirs(self._baseDir) return self._baseDir def _invokeantlr(self, dir, file, options, javaOptions=''): cmd = 'cd {}; java {} {} org.antlr.Tool -o . 
{} {} 2>&1'.format( dir, javaOptions, classpath, options, file ) fp = os.popen(cmd) output = '' failed = False for line in fp: output += line if line.startswith('error('): failed = True rc = fp.close() if rc: failed = True if failed: raise GrammarCompileError( "Failed to compile grammar '{}':\n{}\n\n{}".format(file, cmd, output) ) def compileGrammar(self, grammarName=None, options='', javaOptions=''): if grammarName is None: grammarName = self.moduleName + '.g' self._baseDir = os.path.join( testbasedir, self.moduleName) if not os.path.isdir(self._baseDir): os.makedirs(self._baseDir) if self.grammarName is None: self.grammarName = os.path.splitext(grammarName)[0] grammarPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), grammarName) # get type and name from first grammar line with open(grammarPath, 'r') as fp: grammar = fp.read() m = re.match(r'\s*((lexer|parser|tree)\s+|)grammar\s+(\S+);', grammar, re.MULTILINE) self.assertIsNotNone(m, grammar) self.grammarType = m.group(2) or 'combined' self.assertIn(self.grammarType, ('lexer', 'parser', 'tree', 'combined')) # don't try to rebuild grammar, if it already failed if grammarName in compileErrorCache: return try: # # get dependencies from antlr # if grammarName in dependencyCache: # dependencies = dependencyCache[grammarName] # else: # dependencies = [] # cmd = ('cd %s; java %s %s org.antlr.Tool -o . -depend %s 2>&1' # % (self.baseDir, javaOptions, classpath, grammarPath)) # output = "" # failed = False # fp = os.popen(cmd) # for line in fp: # output += line # if line.startswith('error('): # failed = True # elif ':' in line: # a, b = line.strip().split(':', 1) # dependencies.append( # (os.path.join(self.baseDir, a.strip()), # [os.path.join(self.baseDir, b.strip())]) # ) # rc = fp.close() # if rc is not None: # failed = True # if failed: # raise GrammarCompileError( # "antlr -depend failed with code {} on grammar '{}':\n\n{}\n{}".format( # rc, grammarName, cmd, output) # ) # # add dependencies to my .stg files # templateDir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..', '..', 'tool', 'src', 'main', 'resources', 'org', 'antlr', 'codegen', 'templates', 'Python')) # templates = glob.glob(os.path.join(templateDir, '*.stg')) # for dst, src in dependencies: # src.extend(templates) # dependencyCache[grammarName] = dependencies # rebuild = False # for dest, sources in dependencies: # if not os.path.isfile(dest): # rebuild = True # break # for source in sources: # if os.path.getmtime(source) > os.path.getmtime(dest): # rebuild = True # break # if rebuild: # self._invokeantlr(self.baseDir, grammarPath, options, javaOptions) self._invokeantlr(self.baseDir, grammarPath, options, javaOptions) except: # mark grammar as broken compileErrorCache[grammarName] = True raise def lexerClass(self, base): """Optionally build a subclass of generated lexer class""" return base def parserClass(self, base): """Optionally build a subclass of generated parser class""" return base def walkerClass(self, base): """Optionally build a subclass of generated walker class""" return base def __load_module(self, name): modFile, modPathname, modDescription = imp.find_module(name, [self.baseDir]) with modFile: return imp.load_module(name, modFile, modPathname, modDescription) def getLexer(self, *args, **kwargs): """Build lexer instance. 
Arguments are passed to lexer.__init__().""" if self.grammarType == 'lexer': self.lexerModule = self.__load_module(self.grammarName) cls = getattr(self.lexerModule, self.grammarName) else: self.lexerModule = self.__load_module(self.grammarName + 'Lexer') cls = getattr(self.lexerModule, self.grammarName + 'Lexer') cls = self.lexerClass(cls) lexer = cls(*args, **kwargs) return lexer def getParser(self, *args, **kwargs): """Build parser instance. Arguments are passed to parser.__init__().""" if self.grammarType == 'parser': self.lexerModule = self.__load_module(self.grammarName) cls = getattr(self.lexerModule, self.grammarName) else: self.parserModule = self.__load_module(self.grammarName + 'Parser') cls = getattr(self.parserModule, self.grammarName + 'Parser') cls = self.parserClass(cls) parser = cls(*args, **kwargs) return parser def getWalker(self, *args, **kwargs): """Build walker instance. Arguments are passed to walker.__init__().""" self.walkerModule = self.__load_module(self.grammarName + 'Walker') cls = getattr(self.walkerModule, self.grammarName + 'Walker') cls = self.walkerClass(cls) walker = cls(*args, **kwargs) return walker def writeInlineGrammar(self, grammar): # Create a unique ID for this test and use it as the grammar name, # to avoid class name reuse. This kinda sucks. Need to find a way so # tests can use the same grammar name without messing up the namespace. # Well, first I should figure out what the exact problem is... id = hashlib.md5(self.baseDir.encode('utf-8')).hexdigest()[-8:] grammar = grammar.replace('$TP', 'TP' + id) grammar = grammar.replace('$T', 'T' + id) # get type and name from first grammar line m = re.match(r'\s*((lexer|parser|tree)\s+|)grammar\s+(\S+);', grammar, re.MULTILINE) self.assertIsNotNone(m, grammar) grammarType = m.group(2) or 'combined' grammarName = m.group(3) self.assertIn(grammarType, ('lexer', 'parser', 'tree', 'combined')) grammarPath = os.path.join(self.baseDir, grammarName + '.g') # dump temp grammar file with open(grammarPath, 'w') as fp: fp.write(grammar) return grammarName, grammarPath, grammarType def writeFile(self, name, contents): testDir = os.path.dirname(os.path.abspath(__file__)) path = os.path.join(self.baseDir, name) with open(path, 'w') as fp: fp.write(contents) return path def compileInlineGrammar(self, grammar, options='', javaOptions='', returnModule=False): # write grammar file grammarName, grammarPath, grammarType = self.writeInlineGrammar(grammar) # compile it self._invokeantlr( os.path.dirname(grammarPath), os.path.basename(grammarPath), options, javaOptions ) if grammarType == 'combined': lexerMod = self.__load_module(grammarName + 'Lexer') parserMod = self.__load_module(grammarName + 'Parser') if returnModule: return lexerMod, parserMod lexerCls = getattr(lexerMod, grammarName + 'Lexer') lexerCls = self.lexerClass(lexerCls) parserCls = getattr(parserMod, grammarName + 'Parser') parserCls = self.parserClass(parserCls) return lexerCls, parserCls if grammarType == 'lexer': lexerMod = self.__load_module(grammarName) if returnModule: return lexerMod lexerCls = getattr(lexerMod, grammarName) lexerCls = self.lexerClass(lexerCls) return lexerCls if grammarType == 'parser': parserMod = self.__load_module(grammarName) if returnModule: return parserMod parserCls = getattr(parserMod, grammarName) parserCls = self.parserClass(parserCls) return parserCls if grammarType == 'tree': walkerMod = self.__load_module(grammarName) if returnModule: return walkerMod walkerCls = getattr(walkerMod, grammarName) walkerCls = 
self.walkerClass(walkerCls) return walkerCls python3-antlr3-3.5.2/unittests/000077500000000000000000000000001324200532700163665ustar00rootroot00000000000000python3-antlr3-3.5.2/unittests/testantlr3.py000066400000000000000000000001771324200532700210500ustar00rootroot00000000000000 import unittest import antlr3 if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/testbase.py000066400000000000000000000016211324200532700205520ustar00rootroot00000000000000import unittest class BrokenTest(unittest.TestCase.failureException): def __repr__(self): name, reason = self.args return '{}: {}: {} works now'.format( self.__class__.__name__, name, reason) def broken(reason, *exceptions): '''Indicates a failing (or erroneous) test case fails that should succeed. If the test fails with an exception, list the exception type in args''' def wrapper(test_method): def replacement(*args, **kwargs): try: test_method(*args, **kwargs) except exceptions or unittest.TestCase.failureException: pass else: raise BrokenTest(test_method.__name__, reason) replacement.__doc__ = test_method.__doc__ replacement.__name__ = 'XXX_' + test_method.__name__ replacement.todo = reason return replacement return wrapper python3-antlr3-3.5.2/unittests/testdfa.py000066400000000000000000000027731324200532700204030ustar00rootroot00000000000000 import unittest import antlr3 class TestDFA(unittest.TestCase): """Test case for the DFA class.""" def setUp(self): """Setup test fixure. We need a Recognizer in order to instanciate a DFA. """ class TRecognizer(antlr3.BaseRecognizer): api_version = 'HEAD' self.recog = TRecognizer() def testInit(self): """DFA.__init__() Just a smoke test. """ dfa = antlr3.DFA( self.recog, 1, eot=[], eof=[], min=[], max=[], accept=[], special=[], transition=[] ) def testUnpack(self): """DFA.unpack()""" self.assertEqual( antlr3.DFA.unpack( "\1\3\1\4\2\uffff\1\5\22\uffff\1\2\31\uffff\1\6\6\uffff" "\32\6\4\uffff\1\6\1\uffff\32\6" ), [ 3, 4, -1, -1, 5, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 6, -1, -1, -1, -1, -1, -1, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, -1, -1, -1, -1, 6, -1, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6 ] ) if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/testexceptions.py000066400000000000000000000052711324200532700220260ustar00rootroot00000000000000import unittest import antlr3 import testbase class TestRecognitionException(unittest.TestCase): """Tests for the antlr3.RecognitionException class""" def testInitNone(self): """RecognitionException.__init__()""" exc = antlr3.RecognitionException() class TestEarlyExitException(unittest.TestCase): """Tests for the antlr3.EarlyExitException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """EarlyExitException.__init__()""" exc = antlr3.EarlyExitException() class TestFailedPredicateException(unittest.TestCase): """Tests for the antlr3.FailedPredicateException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """FailedPredicateException.__init__()""" exc = antlr3.FailedPredicateException() class TestMismatchedNotSetException(unittest.TestCase): """Tests for the antlr3.MismatchedNotSetException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): 
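# Note on the decorator above: testbase.broken() wraps the test so that the
# listed exception types (here any Exception) are silently swallowed while
# the no-argument constructor remains unsupported; if the call ever starts
# succeeding, the wrapper raises BrokenTest ("... works now") so the marker
# can be removed.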
"""MismatchedNotSetException.__init__()""" exc = antlr3.MismatchedNotSetException() class TestMismatchedRangeException(unittest.TestCase): """Tests for the antlr3.MismatchedRangeException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """MismatchedRangeException.__init__()""" exc = antlr3.MismatchedRangeException() class TestMismatchedSetException(unittest.TestCase): """Tests for the antlr3.MismatchedSetException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """MismatchedSetException.__init__()""" exc = antlr3.MismatchedSetException() class TestMismatchedTokenException(unittest.TestCase): """Tests for the antlr3.MismatchedTokenException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """MismatchedTokenException.__init__()""" exc = antlr3.MismatchedTokenException() class TestMismatchedTreeNodeException(unittest.TestCase): """Tests for the antlr3.MismatchedTreeNodeException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """MismatchedTreeNodeException.__init__()""" exc = antlr3.MismatchedTreeNodeException() class TestNoViableAltException(unittest.TestCase): """Tests for the antlr3.NoViableAltException class""" @testbase.broken("FIXME", Exception) def testInitNone(self): """NoViableAltException.__init__()""" exc = antlr3.NoViableAltException() if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/testrecognizers.py000066400000000000000000000030431324200532700221720ustar00rootroot00000000000000import sys import unittest import antlr3 class TestBaseRecognizer(unittest.TestCase): """Tests for BaseRecognizer class""" def testGetRuleInvocationStack(self): """BaseRecognizer._getRuleInvocationStack()""" rules = antlr3.BaseRecognizer._getRuleInvocationStack(__name__) self.assertEqual( rules, ['testGetRuleInvocationStack'] ) class TestTokenSource(unittest.TestCase): """Testcase to the antlr3.TokenSource class""" def testIteratorInterface(self): """TokenSource.next()""" class TrivialToken(object): def __init__(self, type): self.type = type class TestSource(antlr3.TokenSource): def __init__(self): self.tokens = [ TrivialToken(1), TrivialToken(2), TrivialToken(3), TrivialToken(4), TrivialToken(antlr3.EOF), ] def nextToken(self): return self.tokens.pop(0) src = TestSource() tokens = [] for token in src: tokens.append(token.type) self.assertEqual(tokens, [1, 2, 3, 4]) class TestLexer(unittest.TestCase): def testInit(self): """Lexer.__init__()""" class TLexer(antlr3.Lexer): api_version = 'HEAD' stream = antlr3.StringStream('foo') TLexer(stream) if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/teststreams.input1000066400000000000000000000000071324200532700221030ustar00rootroot00000000000000foo barpython3-antlr3-3.5.2/unittests/teststreams.input2000066400000000000000000000000101324200532700220760ustar00rootroot00000000000000foo bärpython3-antlr3-3.5.2/unittests/teststreams.py000066400000000000000000000405651324200532700213300ustar00rootroot00000000000000 from io import StringIO import os import unittest import antlr3 class TestStringStream(unittest.TestCase): """Test case for the StringStream class.""" def testSize(self): """StringStream.size()""" stream = antlr3.StringStream('foo') self.assertEqual(stream.size(), 3) def testIndex(self): """StringStream.index()""" stream = antlr3.StringStream('foo') self.assertEqual(stream.index(), 0) def testConsume(self): 
"""StringStream.consume()""" stream = antlr3.StringStream('foo\nbar') stream.consume() # f self.assertEqual(stream.index(), 1) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.line, 1) stream.consume() # o self.assertEqual(stream.index(), 2) self.assertEqual(stream.charPositionInLine, 2) self.assertEqual(stream.line, 1) stream.consume() # o self.assertEqual(stream.index(), 3) self.assertEqual(stream.charPositionInLine, 3) self.assertEqual(stream.line, 1) stream.consume() # \n self.assertEqual(stream.index(), 4) self.assertEqual(stream.charPositionInLine, 0) self.assertEqual(stream.line, 2) stream.consume() # b self.assertEqual(stream.index(), 5) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.line, 2) stream.consume() # a self.assertEqual(stream.index(), 6) self.assertEqual(stream.charPositionInLine, 2) self.assertEqual(stream.line, 2) stream.consume() # r self.assertEqual(stream.index(), 7) self.assertEqual(stream.charPositionInLine, 3) self.assertEqual(stream.line, 2) stream.consume() # EOF self.assertEqual(stream.index(), 7) self.assertEqual(stream.charPositionInLine, 3) self.assertEqual(stream.line, 2) stream.consume() # EOF self.assertEqual(stream.index(), 7) self.assertEqual(stream.charPositionInLine, 3) self.assertEqual(stream.line, 2) def testReset(self): """StringStream.reset()""" stream = antlr3.StringStream('foo') stream.consume() stream.consume() stream.reset() self.assertEqual(stream.index(), 0) self.assertEqual(stream.line, 1) self.assertEqual(stream.charPositionInLine, 0) self.assertEqual(stream.LT(1), 'f') def testLA(self): """StringStream.LA()""" stream = antlr3.StringStream('foo') self.assertEqual(stream.LT(1), 'f') self.assertEqual(stream.LT(2), 'o') self.assertEqual(stream.LT(3), 'o') stream.consume() stream.consume() self.assertEqual(stream.LT(1), 'o') self.assertEqual(stream.LT(2), antlr3.EOF) self.assertEqual(stream.LT(3), antlr3.EOF) def testSubstring(self): """StringStream.substring()""" stream = antlr3.StringStream('foobar') self.assertEqual(stream.substring(0, 0), 'f') self.assertEqual(stream.substring(0, 1), 'fo') self.assertEqual(stream.substring(0, 5), 'foobar') self.assertEqual(stream.substring(3, 5), 'bar') def testSeekForward(self): """StringStream.seek(): forward""" stream = antlr3.StringStream('foo\nbar') stream.seek(4) self.assertEqual(stream.index(), 4) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 0) self.assertEqual(stream.LT(1), 'b') ## # not yet implemented ## def testSeekBackward(self): ## """StringStream.seek(): backward""" ## stream = antlr3.StringStream('foo\nbar') ## stream.seek(4) ## stream.seek(1) ## self.assertEqual(stream.index(), 1) ## self.assertEqual(stream.line, 1) ## self.assertEqual(stream.charPositionInLine, 1) ## self.assertEqual(stream.LA(1), 'o') def testMark(self): """StringStream.mark()""" stream = antlr3.StringStream('foo\nbar') stream.seek(4) marker = stream.mark() self.assertEqual(marker, 1) self.assertEqual(stream.markDepth, 1) stream.consume() marker = stream.mark() self.assertEqual(marker, 2) self.assertEqual(stream.markDepth, 2) def testReleaseLast(self): """StringStream.release(): last marker""" stream = antlr3.StringStream('foo\nbar') stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.release() self.assertEqual(stream.markDepth, 1) # release same marker again, nothing has changed stream.release() self.assertEqual(stream.markDepth, 1) def testReleaseNested(self): """StringStream.release(): nested""" stream = 
antlr3.StringStream('foo\nbar') stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.release(marker2) self.assertEqual(stream.markDepth, 1) def testRewindLast(self): """StringStream.rewind(): last marker""" stream = antlr3.StringStream('foo\nbar') stream.seek(4) marker = stream.mark() stream.consume() stream.consume() stream.rewind() self.assertEqual(stream.markDepth, 0) self.assertEqual(stream.index(), 4) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 0) self.assertEqual(stream.LT(1), 'b') def testRewindNested(self): """StringStream.rewind(): nested""" stream = antlr3.StringStream('foo\nbar') stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.rewind(marker2) self.assertEqual(stream.markDepth, 1) self.assertEqual(stream.index(), 5) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.LT(1), 'a') class TestFileStream(unittest.TestCase): """Test case for the FileStream class.""" def testNoEncoding(self): path = os.path.join(os.path.dirname(__file__), 'teststreams.input1') stream = antlr3.FileStream(path) stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.rewind(marker2) self.assertEqual(stream.markDepth, 1) self.assertEqual(stream.index(), 5) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.LT(1), 'a') self.assertEqual(stream.LA(1), ord('a')) def testEncoded(self): path = os.path.join(os.path.dirname(__file__), 'teststreams.input2') stream = antlr3.FileStream(path) stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.rewind(marker2) self.assertEqual(stream.markDepth, 1) self.assertEqual(stream.index(), 5) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.LT(1), 'ä') self.assertEqual(stream.LA(1), ord('ä')) class TestInputStream(unittest.TestCase): """Test case for the InputStream class.""" def testNoEncoding(self): file = StringIO('foo\nbar') stream = antlr3.InputStream(file) stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.rewind(marker2) self.assertEqual(stream.markDepth, 1) self.assertEqual(stream.index(), 5) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.LT(1), 'a') self.assertEqual(stream.LA(1), ord('a')) def testEncoded(self): file = StringIO('foo\nbär') stream = antlr3.InputStream(file) stream.seek(4) marker1 = stream.mark() stream.consume() marker2 = stream.mark() stream.consume() marker3 = stream.mark() stream.rewind(marker2) self.assertEqual(stream.markDepth, 1) self.assertEqual(stream.index(), 5) self.assertEqual(stream.line, 2) self.assertEqual(stream.charPositionInLine, 1) self.assertEqual(stream.LT(1), 'ä') self.assertEqual(stream.LA(1), ord('ä')) class TestCommonTokenStream(unittest.TestCase): """Test case for the StringStream class.""" def setUp(self): """Setup test fixure The constructor of CommonTokenStream needs a token source. This is a simple mock class providing just the nextToken() method. 
""" class MockSource(object): def __init__(self): self.tokens = [] def makeEOFToken(self): return antlr3.CommonToken(type=antlr3.EOF) def nextToken(self): if self.tokens: return self.tokens.pop(0) return None self.source = MockSource() def testInit(self): """CommonTokenStream.__init__()""" stream = antlr3.CommonTokenStream(self.source) self.assertEqual(stream.index(), -1) def testSetTokenSource(self): """CommonTokenStream.setTokenSource()""" stream = antlr3.CommonTokenStream(None) stream.setTokenSource(self.source) self.assertEqual(stream.index(), -1) self.assertEqual(stream.channel, antlr3.DEFAULT_CHANNEL) def testLTEmptySource(self): """CommonTokenStream.LT(): EOF (empty source)""" stream = antlr3.CommonTokenStream(self.source) lt1 = stream.LT(1) self.assertEqual(lt1.type, antlr3.EOF) def testLT1(self): """CommonTokenStream.LT(1)""" self.source.tokens.append( antlr3.CommonToken(type=12) ) stream = antlr3.CommonTokenStream(self.source) lt1 = stream.LT(1) self.assertEqual(lt1.type, 12) def testLT1WithHidden(self): """CommonTokenStream.LT(1): with hidden tokens""" self.source.tokens.append( antlr3.CommonToken(type=12, channel=antlr3.HIDDEN_CHANNEL) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) stream = antlr3.CommonTokenStream(self.source) lt1 = stream.LT(1) self.assertEqual(lt1.type, 13) def testLT2BeyondEnd(self): """CommonTokenStream.LT(2): beyond end""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13, channel=antlr3.HIDDEN_CHANNEL) ) stream = antlr3.CommonTokenStream(self.source) lt1 = stream.LT(2) self.assertEqual(lt1.type, antlr3.EOF) # not yet implemented def testLTNegative(self): """CommonTokenStream.LT(-1): look back""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) stream = antlr3.CommonTokenStream(self.source) stream.fillBuffer() stream.consume() lt1 = stream.LT(-1) self.assertEqual(lt1.type, 12) def testLB1(self): """CommonTokenStream.LB(1)""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) stream = antlr3.CommonTokenStream(self.source) stream.fillBuffer() stream.consume() self.assertEqual(stream.LB(1).type, 12) def testLTZero(self): """CommonTokenStream.LT(0)""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) stream = antlr3.CommonTokenStream(self.source) lt1 = stream.LT(0) self.assertIsNone(lt1) def testLBBeyondBegin(self): """CommonTokenStream.LB(-1): beyond begin""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=12, channel=antlr3.HIDDEN_CHANNEL) ) self.source.tokens.append( antlr3.CommonToken(type=12, channel=antlr3.HIDDEN_CHANNEL) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) stream = antlr3.CommonTokenStream(self.source) self.assertIsNone(stream.LB(1)) stream.consume() stream.consume() self.assertIsNone(stream.LB(3)) def testFillBuffer(self): """CommonTokenStream.fillBuffer()""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) self.source.tokens.append( antlr3.CommonToken(type=14) ) self.source.tokens.append( antlr3.CommonToken(type=antlr3.EOF) ) stream = antlr3.CommonTokenStream(self.source) stream.fillBuffer() self.assertEqual(len(stream.tokens), 3) self.assertEqual(stream.tokens[0].type, 12) self.assertEqual(stream.tokens[1].type, 13) 
self.assertEqual(stream.tokens[2].type, 14) def testConsume(self): """CommonTokenStream.consume()""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) self.source.tokens.append( antlr3.CommonToken(type=antlr3.EOF) ) stream = antlr3.CommonTokenStream(self.source) self.assertEqual(stream.LA(1), 12) stream.consume() self.assertEqual(stream.LA(1), 13) stream.consume() self.assertEqual(stream.LA(1), antlr3.EOF) stream.consume() self.assertEqual(stream.LA(1), antlr3.EOF) def testSeek(self): """CommonTokenStream.seek()""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) self.source.tokens.append( antlr3.CommonToken(type=antlr3.EOF) ) stream = antlr3.CommonTokenStream(self.source) self.assertEqual(stream.LA(1), 12) stream.seek(2) self.assertEqual(stream.LA(1), antlr3.EOF) stream.seek(0) self.assertEqual(stream.LA(1), 12) def testMarkRewind(self): """CommonTokenStream.mark()/rewind()""" self.source.tokens.append( antlr3.CommonToken(type=12) ) self.source.tokens.append( antlr3.CommonToken(type=13) ) self.source.tokens.append( antlr3.CommonToken(type=antlr3.EOF) ) stream = antlr3.CommonTokenStream(self.source) stream.fillBuffer() stream.consume() marker = stream.mark() stream.consume() stream.rewind(marker) self.assertEqual(stream.LA(1), 13) def testToString(self): """CommonTokenStream.toString()""" self.source.tokens.append( antlr3.CommonToken(type=12, text="foo") ) self.source.tokens.append( antlr3.CommonToken(type=13, text="bar") ) self.source.tokens.append( antlr3.CommonToken(type=14, text="gnurz") ) self.source.tokens.append( antlr3.CommonToken(type=15, text="blarz") ) stream = antlr3.CommonTokenStream(self.source) self.assertEqual(stream.toString(), "foobargnurzblarz") self.assertEqual(stream.toString(1, 2), "bargnurz") self.assertEqual(stream.toString(stream.tokens[1], stream.tokens[-2]), "bargnurz") if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/testtree.py000066400000000000000000001266261324200532700206140ustar00rootroot00000000000000 from io import StringIO import os import unittest from antlr3.tree import (CommonTreeNodeStream, CommonTree, CommonTreeAdaptor, TreeParser, TreeVisitor, TreeIterator) from antlr3 import CommonToken, UP, DOWN, EOF from antlr3.treewizard import TreeWizard class TestTreeNodeStream(unittest.TestCase): """Test case for the TreeNodeStream class.""" def setUp(self): self.adaptor = CommonTreeAdaptor() def newStream(self, t): """Build new stream; let's us override to test other streams.""" return CommonTreeNodeStream(t) def testSingleNode(self): t = CommonTree(CommonToken(101)) stream = self.newStream(t) expecting = "101" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101" found = str(stream) self.assertEqual(expecting, found) def testTwoChildrenOfNilRoot(self): class V(CommonTree): def __init__(self, token=None, ttype=None): if token: self.token = token elif ttype: self.token = CommonToken(type=ttype) def __str__(self): if self.token: txt = self.token.text else: txt = "" txt += "" return txt root_0 = self.adaptor.nil() t = V(ttype=101) u = V(token=CommonToken(type=102, text="102")) self.adaptor.addChild(root_0, t) self.adaptor.addChild(root_0, u) self.assertIsNone(root_0.parent) self.assertEqual(-1, root_0.childIndex) self.assertEqual(0, t.childIndex) self.assertEqual(1, u.childIndex) def test4Nodes(self): # ^(101 ^(102 103) 104) 
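        # In the serialized node stream below, token type 2 is DOWN and 3 is
        # UP, which is why str(stream) is expected to be
        # "101 2 102 2 103 3 104 3" while the nodes-only view is just
        # "101 102 103 104".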
t = CommonTree(CommonToken(101)) t.addChild(CommonTree(CommonToken(102))) t.getChild(0).addChild(CommonTree(CommonToken(103))) t.addChild(CommonTree(CommonToken(104))) stream = self.newStream(t) expecting = "101 102 103 104" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101 2 102 2 103 3 104 3" found = str(stream) self.assertEqual(expecting, found) def testList(self): root = CommonTree(None) t = CommonTree(CommonToken(101)) t.addChild(CommonTree(CommonToken(102))) t.getChild(0).addChild(CommonTree(CommonToken(103))) t.addChild(CommonTree(CommonToken(104))) u = CommonTree(CommonToken(105)) root.addChild(t) root.addChild(u) stream = CommonTreeNodeStream(root) expecting = "101 102 103 104 105" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101 2 102 2 103 3 104 3 105" found = str(stream) self.assertEqual(expecting, found) def testFlatList(self): root = CommonTree(None) root.addChild(CommonTree(CommonToken(101))) root.addChild(CommonTree(CommonToken(102))) root.addChild(CommonTree(CommonToken(103))) stream = CommonTreeNodeStream(root) expecting = "101 102 103" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101 102 103" found = str(stream) self.assertEqual(expecting, found) def testListWithOneNode(self): root = CommonTree(None) root.addChild(CommonTree(CommonToken(101))) stream = CommonTreeNodeStream(root) expecting = "101" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101" found = str(stream) self.assertEqual(expecting, found) def testAoverB(self): t = CommonTree(CommonToken(101)) t.addChild(CommonTree(CommonToken(102))) stream = self.newStream(t) expecting = "101 102" found = self.toNodesOnlyString(stream) self.assertEqual(expecting, found) expecting = "101 2 102 3" found = str(stream) self.assertEqual(expecting, found) def testLT(self): # ^(101 ^(102 103) 104) t = CommonTree(CommonToken(101)) t.addChild(CommonTree(CommonToken(102))) t.getChild(0).addChild(CommonTree(CommonToken(103))) t.addChild(CommonTree(CommonToken(104))) stream = self.newStream(t) self.assertEqual(101, stream.LT(1).getType()) self.assertEqual(DOWN, stream.LT(2).getType()) self.assertEqual(102, stream.LT(3).getType()) self.assertEqual(DOWN, stream.LT(4).getType()) self.assertEqual(103, stream.LT(5).getType()) self.assertEqual(UP, stream.LT(6).getType()) self.assertEqual(104, stream.LT(7).getType()) self.assertEqual(UP, stream.LT(8).getType()) self.assertEqual(EOF, stream.LT(9).getType()) # check way ahead self.assertEqual(EOF, stream.LT(100).getType()) def testMarkRewindEntire(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) m = stream.mark() # MARK for _ in range(13): # consume til end stream.LT(1) stream.consume() self.assertEqual(EOF, stream.LT(1).getType()) self.assertEqual(UP, stream.LT(-1).getType()) #TODO: remove? 
stream.rewind(m) # REWIND # consume til end again :) for _ in range(13): # consume til end stream.LT(1) stream.consume() self.assertEqual(EOF, stream.LT(1).getType()) self.assertEqual(UP, stream.LT(-1).getType()) #TODO: remove? def testMarkRewindInMiddle(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) for _ in range(7): # consume til middle #System.out.println(tream.LT(1).getType()) stream.consume() self.assertEqual(107, stream.LT(1).getType()) m = stream.mark() # MARK stream.consume() # consume 107 stream.consume() # consume UP stream.consume() # consume UP stream.consume() # consume 104 stream.rewind(m) # REWIND self.assertEqual(107, stream.LT(1).getType()) stream.consume() self.assertEqual(UP, stream.LT(1).getType()) stream.consume() self.assertEqual(UP, stream.LT(1).getType()) stream.consume() self.assertEqual(104, stream.LT(1).getType()) stream.consume() # now we're past rewind position self.assertEqual(105, stream.LT(1).getType()) stream.consume() self.assertEqual(UP, stream.LT(1).getType()) stream.consume() self.assertEqual(EOF, stream.LT(1).getType()) self.assertEqual(UP, stream.LT(-1).getType()) def testMarkRewindNested(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) m = stream.mark() # MARK at start stream.consume() # consume 101 stream.consume() # consume DN m2 = stream.mark() # MARK on 102 stream.consume() # consume 102 stream.consume() # consume DN stream.consume() # consume 103 stream.consume() # consume 106 stream.rewind(m2) # REWIND to 102 self.assertEqual(102, stream.LT(1).getType()) stream.consume() self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # stop at 103 and rewind to start stream.rewind(m) # REWIND to 101 self.assertEqual(101, stream.LT(1).getType()) stream.consume() self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() self.assertEqual(102, stream.LT(1).getType()) stream.consume() self.assertEqual(DOWN, stream.LT(1).getType()) def testSeek(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) stream.consume() # consume 101 stream.consume() # consume DN stream.consume() # consume 102 stream.seek(7) # seek to 107 self.assertEqual(107, stream.LT(1).getType()) stream.consume() # consume 107 stream.consume() # consume UP stream.consume() # consume UP 
self.assertEqual(104, stream.LT(1).getType()) def testSeekFromStart(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) stream.seek(7) # seek to 107 self.assertEqual(107, stream.LT(1).getType()) stream.consume() # consume 107 stream.consume() # consume UP stream.consume() # consume UP self.assertEqual(104, stream.LT(1).getType()) def testReset(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) # stream has 7 real + 6 nav nodes # Sequence of types: 101 DN 102 DN 103 106 DN 107 UP UP 104 105 UP EOF r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) v1 = self.toNodesOnlyString(stream) # scan all stream.reset() v2 = self.toNodesOnlyString(stream) # scan all self.assertEqual(v1, v2) def testIterator(self): r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) stream = CommonTreeNodeStream(r0) expecting = [ 101, DOWN, 102, DOWN, 103, 106, DOWN, 107, UP, UP, 104, 105, UP] found = [t.type for t in stream] self.assertEqual(expecting, found) def toNodesOnlyString(self, nodes): buf = [] for i in range(nodes.size()): t = nodes.LT(i + 1) type = nodes.getTreeAdaptor().getType(t) if type not in {DOWN, UP}: buf.append(str(type)) return ' '.join(buf) class TestCommonTreeNodeStream(unittest.TestCase): """Test case for the CommonTreeNodeStream class.""" def testPushPop(self): # ^(101 ^(102 103) ^(104 105) ^(106 107) 108 109) # stream has 9 real + 8 nav nodes # Sequence of types: 101 DN 102 DN 103 UP 104 DN 105 UP 106 DN 107 UP 108 109 UP r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r1.addChild(CommonTree(CommonToken(103))) r0.addChild(r1) r2 = CommonTree(CommonToken(104)) r2.addChild(CommonTree(CommonToken(105))) r0.addChild(r2) r3 = CommonTree(CommonToken(106)) r3.addChild(CommonTree(CommonToken(107))) r0.addChild(r3) r0.addChild(CommonTree(CommonToken(108))) r0.addChild(CommonTree(CommonToken(109))) stream = CommonTreeNodeStream(r0) expecting = "101 2 102 2 103 3 104 2 105 3 106 2 107 3 108 109 3" found = str(stream) self.assertEqual(expecting, found) # Assume we want to hit node 107 and then "call 102" then return indexOf102 = 2 indexOf107 = 12 for _ in range(indexOf107):# consume til 107 node stream.consume() # CALL 102 self.assertEqual(107, stream.LT(1).getType()) stream.push(indexOf102) self.assertEqual(102, stream.LT(1).getType()) stream.consume() # consume 102 self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # consume DN self.assertEqual(103, stream.LT(1).getType()) stream.consume() # consume 103 self.assertEqual(UP, stream.LT(1).getType()) # RETURN stream.pop() self.assertEqual(107, stream.LT(1).getType()) def 
testNestedPushPop(self): # ^(101 ^(102 103) ^(104 105) ^(106 107) 108 109) # stream has 9 real + 8 nav nodes # Sequence of types: 101 DN 102 DN 103 UP 104 DN 105 UP 106 DN 107 UP 108 109 UP r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r1.addChild(CommonTree(CommonToken(103))) r0.addChild(r1) r2 = CommonTree(CommonToken(104)) r2.addChild(CommonTree(CommonToken(105))) r0.addChild(r2) r3 = CommonTree(CommonToken(106)) r3.addChild(CommonTree(CommonToken(107))) r0.addChild(r3) r0.addChild(CommonTree(CommonToken(108))) r0.addChild(CommonTree(CommonToken(109))) stream = CommonTreeNodeStream(r0) # Assume we want to hit node 107 and then "call 102", which # calls 104, then return indexOf102 = 2 indexOf107 = 12 for _ in range(indexOf107): # consume til 107 node stream.consume() self.assertEqual(107, stream.LT(1).getType()) # CALL 102 stream.push(indexOf102) self.assertEqual(102, stream.LT(1).getType()) stream.consume() # consume 102 self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # consume DN self.assertEqual(103, stream.LT(1).getType()) stream.consume() # consume 103 # CALL 104 indexOf104 = 6 stream.push(indexOf104) self.assertEqual(104, stream.LT(1).getType()) stream.consume() # consume 102 self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # consume DN self.assertEqual(105, stream.LT(1).getType()) stream.consume() # consume 103 self.assertEqual(UP, stream.LT(1).getType()) # RETURN (to UP node in 102 subtree) stream.pop() self.assertEqual(UP, stream.LT(1).getType()) # RETURN (to empty stack) stream.pop() self.assertEqual(107, stream.LT(1).getType()) def testPushPopFromEOF(self): # ^(101 ^(102 103) ^(104 105) ^(106 107) 108 109) # stream has 9 real + 8 nav nodes # Sequence of types: 101 DN 102 DN 103 UP 104 DN 105 UP 106 DN 107 UP 108 109 UP r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r1.addChild(CommonTree(CommonToken(103))) r0.addChild(r1) r2 = CommonTree(CommonToken(104)) r2.addChild(CommonTree(CommonToken(105))) r0.addChild(r2) r3 = CommonTree(CommonToken(106)) r3.addChild(CommonTree(CommonToken(107))) r0.addChild(r3) r0.addChild(CommonTree(CommonToken(108))) r0.addChild(CommonTree(CommonToken(109))) stream = CommonTreeNodeStream(r0) while stream.LA(1) != EOF: stream.consume() indexOf102 = 2 indexOf104 = 6 self.assertEqual(EOF, stream.LT(1).getType()) # CALL 102 stream.push(indexOf102) self.assertEqual(102, stream.LT(1).getType()) stream.consume() # consume 102 self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # consume DN self.assertEqual(103, stream.LT(1).getType()) stream.consume() # consume 103 self.assertEqual(UP, stream.LT(1).getType()) # RETURN (to empty stack) stream.pop() self.assertEqual(EOF, stream.LT(1).getType()) # CALL 104 stream.push(indexOf104) self.assertEqual(104, stream.LT(1).getType()) stream.consume() # consume 102 self.assertEqual(DOWN, stream.LT(1).getType()) stream.consume() # consume DN self.assertEqual(105, stream.LT(1).getType()) stream.consume() # consume 103 self.assertEqual(UP, stream.LT(1).getType()) # RETURN (to empty stack) stream.pop() self.assertEqual(EOF, stream.LT(1).getType()) class TestCommonTree(unittest.TestCase): """Test case for the CommonTree class.""" def setUp(self): """Setup test fixure""" self.adaptor = CommonTreeAdaptor() def testSingleNode(self): t = CommonTree(CommonToken(101)) self.assertIsNone(t.parent) self.assertEqual(-1, t.childIndex) def test4Nodes(self): # ^(101 ^(102 103) 104) r0 = CommonTree(CommonToken(101)) 
r0.addChild(CommonTree(CommonToken(102))) r0.getChild(0).addChild(CommonTree(CommonToken(103))) r0.addChild(CommonTree(CommonToken(104))) self.assertIsNone(r0.parent) self.assertEqual(-1, r0.childIndex) def testList(self): # ^(nil 101 102 103) r0 = CommonTree(None) c0=CommonTree(CommonToken(101)) r0.addChild(c0) c1=CommonTree(CommonToken(102)) r0.addChild(c1) c2=CommonTree(CommonToken(103)) r0.addChild(c2) self.assertIsNone(r0.parent) self.assertEqual(-1, r0.childIndex) self.assertEqual(r0, c0.parent) self.assertEqual(0, c0.childIndex) self.assertEqual(r0, c1.parent) self.assertEqual(1, c1.childIndex) self.assertEqual(r0, c2.parent) self.assertEqual(2, c2.childIndex) def testList2(self): # Add child ^(nil 101 102 103) to root 5 # should pull 101 102 103 directly to become 5's child list root = CommonTree(CommonToken(5)) # child tree r0 = CommonTree(None) c0=CommonTree(CommonToken(101)) r0.addChild(c0) c1=CommonTree(CommonToken(102)) r0.addChild(c1) c2=CommonTree(CommonToken(103)) r0.addChild(c2) root.addChild(r0) self.assertIsNone(root.parent) self.assertEqual(-1, root.childIndex) # check children of root all point at root self.assertEqual(root, c0.parent) self.assertEqual(0, c0.childIndex) self.assertEqual(root, c0.parent) self.assertEqual(1, c1.childIndex) self.assertEqual(root, c0.parent) self.assertEqual(2, c2.childIndex) def testAddListToExistChildren(self): # Add child ^(nil 101 102 103) to root ^(5 6) # should add 101 102 103 to end of 5's child list root = CommonTree(CommonToken(5)) root.addChild(CommonTree(CommonToken(6))) # child tree r0 = CommonTree(None) c0=CommonTree(CommonToken(101)) r0.addChild(c0) c1=CommonTree(CommonToken(102)) r0.addChild(c1) c2=CommonTree(CommonToken(103)) r0.addChild(c2) root.addChild(r0) self.assertIsNone(root.parent) self.assertEqual(-1, root.childIndex) # check children of root all point at root self.assertEqual(root, c0.parent) self.assertEqual(1, c0.childIndex) self.assertEqual(root, c0.parent) self.assertEqual(2, c1.childIndex) self.assertEqual(root, c0.parent) self.assertEqual(3, c2.childIndex) def testDupTree(self): # ^(101 ^(102 103 ^(106 107) ) 104 105) r0 = CommonTree(CommonToken(101)) r1 = CommonTree(CommonToken(102)) r0.addChild(r1) r1.addChild(CommonTree(CommonToken(103))) r2 = CommonTree(CommonToken(106)) r2.addChild(CommonTree(CommonToken(107))) r1.addChild(r2) r0.addChild(CommonTree(CommonToken(104))) r0.addChild(CommonTree(CommonToken(105))) dup = self.adaptor.dupTree(r0) self.assertIsNone(dup.parent) self.assertEqual(-1, dup.childIndex) dup.sanityCheckParentAndChildIndexes() def testBecomeRoot(self): # 5 becomes root of ^(nil 101 102 103) newRoot = CommonTree(CommonToken(5)) oldRoot = CommonTree(None) oldRoot.addChild(CommonTree(CommonToken(101))) oldRoot.addChild(CommonTree(CommonToken(102))) oldRoot.addChild(CommonTree(CommonToken(103))) self.adaptor.becomeRoot(newRoot, oldRoot) newRoot.sanityCheckParentAndChildIndexes() def testBecomeRoot2(self): # 5 becomes root of ^(101 102 103) newRoot = CommonTree(CommonToken(5)) oldRoot = CommonTree(CommonToken(101)) oldRoot.addChild(CommonTree(CommonToken(102))) oldRoot.addChild(CommonTree(CommonToken(103))) self.adaptor.becomeRoot(newRoot, oldRoot) newRoot.sanityCheckParentAndChildIndexes() def testBecomeRoot3(self): # ^(nil 5) becomes root of ^(nil 101 102 103) newRoot = CommonTree(None) newRoot.addChild(CommonTree(CommonToken(5))) oldRoot = CommonTree(None) oldRoot.addChild(CommonTree(CommonToken(101))) oldRoot.addChild(CommonTree(CommonToken(102))) 
oldRoot.addChild(CommonTree(CommonToken(103))) self.adaptor.becomeRoot(newRoot, oldRoot) newRoot.sanityCheckParentAndChildIndexes() def testBecomeRoot5(self): # ^(nil 5) becomes root of ^(101 102 103) newRoot = CommonTree(None) newRoot.addChild(CommonTree(CommonToken(5))) oldRoot = CommonTree(CommonToken(101)) oldRoot.addChild(CommonTree(CommonToken(102))) oldRoot.addChild(CommonTree(CommonToken(103))) self.adaptor.becomeRoot(newRoot, oldRoot) newRoot.sanityCheckParentAndChildIndexes() def testBecomeRoot6(self): # emulates construction of ^(5 6) root_0 = self.adaptor.nil() root_1 = self.adaptor.nil() root_1 = self.adaptor.becomeRoot(CommonTree(CommonToken(5)), root_1) self.adaptor.addChild(root_1, CommonTree(CommonToken(6))) self.adaptor.addChild(root_0, root_1) root_0.sanityCheckParentAndChildIndexes() # Test replaceChildren def testReplaceWithNoChildren(self): t = CommonTree(CommonToken(101)) newChild = CommonTree(CommonToken(5)) error = False self.assertRaises(IndexError, t.replaceChildren, 0, 0, newChild) def testReplaceWithOneChildren(self): # assume token type 99 and use text t = CommonTree(CommonToken(99, text="a")) c0 = CommonTree(CommonToken(99, text="b")) t.addChild(c0) newChild = CommonTree(CommonToken(99, text="c")) t.replaceChildren(0, 0, newChild) expecting = "(a c)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceInMiddle(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) # index 1 t.addChild(CommonTree(CommonToken(99, text="d"))) newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(1, 1, newChild) expecting = "(a b x d)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceAtLeft(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) # index 0 t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(0, 0, newChild) expecting = "(a x c d)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceAtRight(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) # index 2 newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(2, 2, newChild) expecting = "(a b c x)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceOneWithTwoAtLeft(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChildren = self.adaptor.nil() newChildren.addChild(CommonTree(CommonToken(99, text="x"))) newChildren.addChild(CommonTree(CommonToken(99, text="y"))) t.replaceChildren(0, 0, newChildren) expecting = "(a x y c d)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceOneWithTwoAtRight(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChildren = self.adaptor.nil() newChildren.addChild(CommonTree(CommonToken(99, text="x"))) newChildren.addChild(CommonTree(CommonToken(99, 
text="y"))) t.replaceChildren(2, 2, newChildren) expecting = "(a b c x y)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceOneWithTwoInMiddle(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChildren = self.adaptor.nil() newChildren.addChild(CommonTree(CommonToken(99, text="x"))) newChildren.addChild(CommonTree(CommonToken(99, text="y"))) t.replaceChildren(1, 1, newChildren) expecting = "(a b x y d)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceTwoWithOneAtLeft(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(0, 1, newChild) expecting = "(a x d)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceTwoWithOneAtRight(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(1, 2, newChild) expecting = "(a b x)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceAllWithOne(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChild = CommonTree(CommonToken(99, text="x")) t.replaceChildren(0, 2, newChild) expecting = "(a x)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() def testReplaceAllWithTwo(self): t = CommonTree(CommonToken(99, text="a")) t.addChild(CommonTree(CommonToken(99, text="b"))) t.addChild(CommonTree(CommonToken(99, text="c"))) t.addChild(CommonTree(CommonToken(99, text="d"))) newChildren = self.adaptor.nil() newChildren.addChild(CommonTree(CommonToken(99, text="x"))) newChildren.addChild(CommonTree(CommonToken(99, text="y"))) t.replaceChildren(0, 2, newChildren) expecting = "(a x y)" self.assertEqual(expecting, t.toStringTree()) t.sanityCheckParentAndChildIndexes() class TestTreeContext(unittest.TestCase): """Test the TreeParser.inContext() method""" tokenNames = [ "", "", "", "", "VEC", "ASSIGN", "PRINT", "PLUS", "MULT", "DOT", "ID", "INT", "WS", "'['", "','", "']'" ] def testSimpleParent(self): tree = "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3]))))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID (VEC INT %x:INT INT))))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC") self.assertEqual(expecting, found) def testNoParent(self): tree = "(PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(%x:PRINT (MULT ID (VEC INT INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = False found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC") 
self.assertEqual(expecting, found) def testParentWithWildcard(self): tree = "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3]))))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID (VEC INT %x:INT INT))))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC ...") self.assertEqual(expecting, found) def testWildcardAtStartIgnored(self): tree = "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3]))))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID (VEC INT %x:INT INT))))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "...VEC") self.assertEqual(expecting, found) def testWildcardInBetween(self): tree = "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3]))))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID (VEC INT %x:INT INT))))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT...VEC") self.assertEqual(expecting, found) def testLotsOfWildcards(self): tree = "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID[x] (VEC INT[1] INT[2] INT[3]))))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(nil (ASSIGN ID[x] INT[3]) (PRINT (MULT ID (VEC INT %x:INT INT))))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "... PRINT ... VEC ...") self.assertEqual(expecting, found) def testDeep(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC ...") self.assertEqual(expecting, found) def testDeepAndFindRoot(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT ...") self.assertEqual(expecting, found) def testDeepAndFindRoot2(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT ... 
VEC ...") self.assertEqual(expecting, found) def testChain(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = True found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT MULT VEC MULT") self.assertEqual(expecting, found) ## TEST INVALID CONTEXTS def testNotParent(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = False found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC") self.assertEqual(expecting, found) def testMismatch(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = False ## missing MULT found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT VEC MULT") self.assertEqual(expecting, found) def testMismatch2(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = False found = TreeParser._inContext(adaptor, self.tokenNames, node, "PRINT VEC ...") self.assertEqual(expecting, found) def testMismatch3(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") expecting = False found = TreeParser._inContext(adaptor, self.tokenNames, node, "VEC ... VEC MULT") self.assertEqual(expecting, found) def testDoubleEtc(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") self.assertRaisesRegex( ValueError, r'invalid syntax: \.\.\. \.\.\.', TreeParser._inContext, adaptor, self.tokenNames, node, "PRINT ... ... VEC") def testDotDot(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) labels = {} valid = wiz.parse( t, "(PRINT (MULT ID (VEC (MULT INT %x:INT) INT INT)))", labels) self.assertTrue(valid) node = labels.get("x") self.assertRaisesRegex( ValueError, r'invalid syntax: \.\.', TreeParser._inContext, adaptor, self.tokenNames, node, "PRINT .. 
VEC") class TestTreeVisitor(unittest.TestCase): """Test of the TreeVisitor class.""" tokenNames = [ "", "", "", "", "VEC", "ASSIGN", "PRINT", "PLUS", "MULT", "DOT", "ID", "INT", "WS", "'['", "','", "']'" ] def testTreeVisitor(self): tree = "(PRINT (MULT ID[x] (VEC (MULT INT[9] INT[1]) INT[2] INT[3])))" adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokenNames) t = wiz.create(tree) found = [] def pre(t): found.append("pre({})".format(t)) return t def post(t): found.append("post({})".format(t)) return t visitor = TreeVisitor(adaptor) visitor.visit(t, pre, post) expecting = [ "pre(PRINT)", "pre(MULT)", "pre(x)", "post(x)", "pre(VEC)", "pre(MULT)", "pre(9)", "post(9)", "pre(1)", "post(1)", "post(MULT)", "pre(2)", "post(2)", "pre(3)", "post(3)", "post(VEC)", "post(MULT)", "post(PRINT)" ] self.assertEqual(expecting, found) class TestTreeIterator(unittest.TestCase): tokens = [ "", "", "", "", "A", "B", "C", "D", "E", "F", "G" ] def testNode(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("A") it = TreeIterator(t) expecting = "A EOF" found = self.toString(it) self.assertEqual(expecting, found) def testFlatAB(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(nil A B)") it = TreeIterator(t) expecting = "nil DOWN A B UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def testAB(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(A B)") it = TreeIterator(t) expecting = "A DOWN B UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def testABC(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(A B C)") it = TreeIterator(t) expecting = "A DOWN B C UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def testVerticalList(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(A (B C))") it = TreeIterator(t) expecting = "A DOWN B DOWN C UP UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def testComplex(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(A (B (C D E) F) G)") it = TreeIterator(t) expecting = "A DOWN B DOWN C DOWN D E UP F UP G UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def testReset(self): adaptor = CommonTreeAdaptor() wiz = TreeWizard(adaptor, self.tokens) t = wiz.create("(A (B (C D E) F) G)") it = TreeIterator(t) expecting = "A DOWN B DOWN C DOWN D E UP F UP G UP EOF" found = self.toString(it) self.assertEqual(expecting, found) it.reset() expecting = "A DOWN B DOWN C DOWN D E UP F UP G UP EOF" found = self.toString(it) self.assertEqual(expecting, found) def toString(self, it): buf = [] for n in it: buf.append(str(n)) return ' '.join(buf) if __name__ == "__main__": unittest.main(testRunner=unittest.TextTestRunner(verbosity=2)) python3-antlr3-3.5.2/unittests/testtreewizard.py000066400000000000000000000477011324200532700220310ustar00rootroot00000000000000 from io import StringIO import os import unittest from antlr3.tree import CommonTreeAdaptor, CommonTree, INVALID_TOKEN_TYPE from antlr3.treewizard import TreeWizard, computeTokenTypes, \ TreePatternLexer, EOF, ID, BEGIN, END, PERCENT, COLON, DOT, ARG, \ TreePatternParser, \ TreePattern, WildcardTreePattern, TreePatternTreeAdaptor class TestComputeTokenTypes(unittest.TestCase): """Test case for the computeTokenTypes function.""" def testNone(self): """computeTokenTypes(None) -> {}""" 
typeMap = computeTokenTypes(None) self.assertIsInstance(typeMap, dict) self.assertEqual(typeMap, {}) def testList(self): """computeTokenTypes(['a', 'b']) -> { 'a': 0, 'b': 1 }""" typeMap = computeTokenTypes(['a', 'b']) self.assertIsInstance(typeMap, dict) self.assertEqual(typeMap, { 'a': 0, 'b': 1 }) class TestTreePatternLexer(unittest.TestCase): """Test case for the TreePatternLexer class.""" def testBegin(self): """TreePatternLexer(): '('""" lexer = TreePatternLexer('(') type = lexer.nextToken() self.assertEqual(type, BEGIN) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testEnd(self): """TreePatternLexer(): ')'""" lexer = TreePatternLexer(')') type = lexer.nextToken() self.assertEqual(type, END) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testPercent(self): """TreePatternLexer(): '%'""" lexer = TreePatternLexer('%') type = lexer.nextToken() self.assertEqual(type, PERCENT) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testDot(self): """TreePatternLexer(): '.'""" lexer = TreePatternLexer('.') type = lexer.nextToken() self.assertEqual(type, DOT) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testColon(self): """TreePatternLexer(): ':'""" lexer = TreePatternLexer(':') type = lexer.nextToken() self.assertEqual(type, COLON) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testEOF(self): """TreePatternLexer(): EOF""" lexer = TreePatternLexer(' \n \r \t ') type = lexer.nextToken() self.assertEqual(type, EOF) self.assertEqual(lexer.sval, '') self.assertFalse(lexer.error) def testID(self): """TreePatternLexer(): ID""" lexer = TreePatternLexer('_foo12_bar') type = lexer.nextToken() self.assertEqual(type, ID) self.assertEqual(lexer.sval, '_foo12_bar') self.assertFalse(lexer.error) def testARG(self): """TreePatternLexer(): ARG""" lexer = TreePatternLexer(r'[ \]bla\n]') type = lexer.nextToken() self.assertEqual(type, ARG) self.assertEqual(lexer.sval, r' ]bla\n') self.assertFalse(lexer.error) def testError(self): """TreePatternLexer(): error""" lexer = TreePatternLexer('1') type = lexer.nextToken() self.assertEqual(type, EOF) self.assertEqual(lexer.sval, '') self.assertTrue(lexer.error) class TestTreePatternParser(unittest.TestCase): """Test case for the TreePatternParser class.""" def setUp(self): """Setup text fixure We need a tree adaptor, use CommonTreeAdaptor. And a constant list of token names. 
""" self.adaptor = CommonTreeAdaptor() self.tokens = [ "", "", "", "", "", "A", "B", "C", "D", "E", "ID", "VAR" ] self.wizard = TreeWizard(self.adaptor, tokenNames=self.tokens) def testSingleNode(self): """TreePatternParser: 'ID'""" lexer = TreePatternLexer('ID') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsInstance(tree, CommonTree) self.assertEqual(tree.getType(), 10) self.assertEqual(tree.getText(), 'ID') def testSingleNodeWithArg(self): """TreePatternParser: 'ID[foo]'""" lexer = TreePatternLexer('ID[foo]') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsInstance(tree, CommonTree) self.assertEqual(tree.getType(), 10) self.assertEqual(tree.getText(), 'foo') def testSingleLevelTree(self): """TreePatternParser: '(A B)'""" lexer = TreePatternLexer('(A B)') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsInstance(tree, CommonTree) self.assertEqual(tree.getType(), 5) self.assertEqual(tree.getText(), 'A') self.assertEqual(tree.getChildCount(), 1) self.assertEqual(tree.getChild(0).getType(), 6) self.assertEqual(tree.getChild(0).getText(), 'B') def testNil(self): """TreePatternParser: 'nil'""" lexer = TreePatternLexer('nil') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsInstance(tree, CommonTree) self.assertEqual(tree.getType(), 0) self.assertIsNone(tree.getText()) def testWildcard(self): """TreePatternParser: '(.)'""" lexer = TreePatternLexer('(.)') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsInstance(tree, WildcardTreePattern) def testLabel(self): """TreePatternParser: '(%a:A)'""" lexer = TreePatternLexer('(%a:A)') parser = TreePatternParser(lexer, self.wizard, TreePatternTreeAdaptor()) tree = parser.pattern() self.assertIsInstance(tree, TreePattern) self.assertEqual(tree.label, 'a') def testError1(self): """TreePatternParser: ')'""" lexer = TreePatternLexer(')') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsNone(tree) def testError2(self): """TreePatternParser: '()'""" lexer = TreePatternLexer('()') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsNone(tree) def testError3(self): """TreePatternParser: '(A ])'""" lexer = TreePatternLexer('(A ])') parser = TreePatternParser(lexer, self.wizard, self.adaptor) tree = parser.pattern() self.assertIsNone(tree) class TestTreeWizard(unittest.TestCase): """Test case for the TreeWizard class.""" def setUp(self): """Setup text fixure We need a tree adaptor, use CommonTreeAdaptor. And a constant list of token names. 
""" self.adaptor = CommonTreeAdaptor() self.tokens = [ "", "", "", "", "", "A", "B", "C", "D", "E", "ID", "VAR" ] def testInit(self): """TreeWizard.__init__()""" wiz = TreeWizard( self.adaptor, tokenNames=['a', 'b'] ) self.assertIs(wiz.adaptor, self.adaptor) self.assertEqual( wiz.tokenNameToTypeMap, { 'a': 0, 'b': 1 } ) def testGetTokenType(self): """TreeWizard.getTokenType()""" wiz = TreeWizard( self.adaptor, tokenNames=self.tokens ) self.assertEqual( wiz.getTokenType('A'), 5 ) self.assertEqual( wiz.getTokenType('VAR'), 11 ) self.assertEqual( wiz.getTokenType('invalid'), INVALID_TOKEN_TYPE ) def testSingleNode(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("ID") found = t.toStringTree() expecting = "ID" self.assertEqual(expecting, found) def testSingleNodeWithArg(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("ID[foo]") found = t.toStringTree() expecting = "foo" self.assertEqual(expecting, found) def testSingleNodeTree(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("(A)") found = t.toStringTree() expecting = "A" self.assertEqual(expecting, found) def testSingleLevelTree(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("(A B C D)") found = t.toStringTree() expecting = "(A B C D)" self.assertEqual(expecting, found) def testListTree(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("(nil A B C)") found = t.toStringTree() expecting = "A B C" self.assertEqual(expecting, found) def testInvalidListTree(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("A B C") self.assertIsNone(t) def testDoubleLevelTree(self): wiz = TreeWizard(self.adaptor, self.tokens) t = wiz.create("(A (B C) (B D) E)") found = t.toStringTree() expecting = "(A (B C) (B D) E)" self.assertEqual(expecting, found) def __simplifyIndexMap(self, indexMap): return dict( # stringify nodes for easy comparing (ttype, [str(node) for node in nodes]) for ttype, nodes in indexMap.items() ) def testSingleNodeIndex(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("ID") indexMap = wiz.index(tree) found = self.__simplifyIndexMap(indexMap) expecting = { 10: ["ID"] } self.assertEqual(expecting, found) def testNoRepeatsIndex(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("(A B C D)") indexMap = wiz.index(tree) found = self.__simplifyIndexMap(indexMap) expecting = { 8:['D'], 6:['B'], 7:['C'], 5:['A'] } self.assertEqual(expecting, found) def testRepeatsIndex(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("(A B (A C B) B D D)") indexMap = wiz.index(tree) found = self.__simplifyIndexMap(indexMap) expecting = { 8: ['D', 'D'], 6: ['B', 'B', 'B'], 7: ['C'], 5: ['A', 'A'] } self.assertEqual(expecting, found) def testNoRepeatsVisit(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("(A B C D)") elements = [] def visitor(node, parent, childIndex, labels): elements.append(str(node)) wiz.visit(tree, wiz.getTokenType("B"), visitor) expecting = ['B'] self.assertEqual(expecting, elements) def testNoRepeatsVisit2(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("(A B (A C B) B D D)") elements = [] def visitor(node, parent, childIndex, labels): elements.append(str(node)) wiz.visit(tree, wiz.getTokenType("C"), visitor) expecting = ['C'] self.assertEqual(expecting, elements) def testRepeatsVisit(self): wiz = TreeWizard(self.adaptor, self.tokens) tree = wiz.create("(A B (A C B) B D D)") elements = [] def visitor(node, parent, childIndex, labels): 
            elements.append(str(node))

        wiz.visit(tree, wiz.getTokenType("B"), visitor)

        expecting = ['B', 'B', 'B']
        self.assertEqual(expecting, elements)

    def testRepeatsVisit2(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B (A C B) B D D)")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append(str(node))

        wiz.visit(tree, wiz.getTokenType("A"), visitor)

        expecting = ['A', 'A']
        self.assertEqual(expecting, elements)

    def testRepeatsVisitWithContext(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B (A C B) B D D)")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append('{}@{}[{}]'.format(node, parent, childIndex))

        wiz.visit(tree, wiz.getTokenType("B"), visitor)

        expecting = ['B@A[0]', 'B@A[1]', 'B@A[2]']
        self.assertEqual(expecting, elements)

    def testRepeatsVisitWithNullParentAndContext(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B (A C B) B D D)")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append(
                '{}@{}[{}]'.format(node, parent or 'nil', childIndex)
            )

        wiz.visit(tree, wiz.getTokenType("A"), visitor)

        expecting = ['A@nil[0]', 'A@A[1]']
        self.assertEqual(expecting, elements)

    def testVisitPattern(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B C (A B) D)")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append(str(node))

        wiz.visit(tree, '(A B)', visitor)

        expecting = ['A']  # shouldn't match overall root, just (A B)
        self.assertEqual(expecting, elements)

    def testVisitPatternMultiple(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B C (A B) (D (A B)))")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append(
                '{}@{}[{}]'.format(node, parent or 'nil', childIndex)
            )

        wiz.visit(tree, '(A B)', visitor)

        expecting = ['A@A[2]', 'A@D[0]']
        self.assertEqual(expecting, elements)

    def testVisitPatternMultipleWithLabels(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        tree = wiz.create("(A B C (A[foo] B[bar]) (D (A[big] B[dog])))")

        elements = []
        def visitor(node, parent, childIndex, labels):
            elements.append(
                '{}@{}[{}]{}&{}'.format(
                    node, parent or 'nil', childIndex,
                    labels['a'], labels['b'],
                )
            )

        wiz.visit(tree, '(%a:A %b:B)', visitor)

        expecting = ['foo@A[2]foo&bar', 'big@D[0]big&dog']
        self.assertEqual(expecting, elements)

    def testParse(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C)")
        valid = wiz.parse(t, "(A B C)")
        self.assertTrue(valid)

    def testParseSingleNode(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("A")
        valid = wiz.parse(t, "A")
        self.assertTrue(valid)

    def testParseSingleNodeFails(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("A")
        valid = wiz.parse(t, "B")
        self.assertFalse(valid)

    def testParseFlatTree(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(nil A B C)")
        valid = wiz.parse(t, "(nil A B C)")
        self.assertTrue(valid)

    def testParseFlatTreeFails(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(nil A B C)")
        valid = wiz.parse(t, "(nil A B)")
        self.assertFalse(valid)

    def testParseFlatTreeFails2(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(nil A B C)")
        valid = wiz.parse(t, "(nil A B A)")
        self.assertFalse(valid)

    def testWildcard(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C)")
        valid = wiz.parse(t, "(A . .)")
        self.assertTrue(valid)

    def testParseWithText(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B[foo] C[bar])")
        # C pattern has no text arg so despite [bar] in t, no need
        # to match text--check structure only.
        valid = wiz.parse(t, "(A B[foo] C)")
        self.assertTrue(valid)

    def testParseWithText2(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B[T__32] (C (D E[a])))")
        # The pattern's B[foo] does not match the tree's B[T__32], so the
        # parse fails; what matters here is that parse() leaves the tree
        # unchanged.
        valid = wiz.parse(t, "(A B[foo] C)")
        self.assertEqual("(A T__32 (C (D a)))", t.toStringTree())

    def testParseWithTextFails(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C)")
        valid = wiz.parse(t, "(A[foo] B C)")
        self.assertFalse(valid)  # fails

    def testParseLabels(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C)")
        labels = {}
        valid = wiz.parse(t, "(%a:A %b:B %c:C)", labels)
        self.assertTrue(valid)
        self.assertEqual("A", str(labels["a"]))
        self.assertEqual("B", str(labels["b"]))
        self.assertEqual("C", str(labels["c"]))

    def testParseWithWildcardLabels(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C)")
        labels = {}
        valid = wiz.parse(t, "(A %b:. %c:.)", labels)
        self.assertTrue(valid)
        self.assertEqual("B", str(labels["b"]))
        self.assertEqual("C", str(labels["c"]))

    def testParseLabelsAndTestText(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B[foo] C)")
        labels = {}
        valid = wiz.parse(t, "(%a:A %b:B[foo] %c:C)", labels)
        self.assertTrue(valid)
        self.assertEqual("A", str(labels["a"]))
        self.assertEqual("foo", str(labels["b"]))
        self.assertEqual("C", str(labels["c"]))

    def testParseLabelsInNestedTree(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A (B C) (D E))")
        labels = {}
        valid = wiz.parse(t, "(%a:A (%b:B %c:C) (%d:D %e:E) )", labels)
        self.assertTrue(valid)
        self.assertEqual("A", str(labels["a"]))
        self.assertEqual("B", str(labels["b"]))
        self.assertEqual("C", str(labels["c"]))
        self.assertEqual("D", str(labels["d"]))
        self.assertEqual("E", str(labels["e"]))

    def testEquals(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t1 = wiz.create("(A B C)")
        t2 = wiz.create("(A B C)")
        same = wiz.equals(t1, t2)
        self.assertTrue(same)

    def testEqualsWithText(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t1 = wiz.create("(A B[foo] C)")
        t2 = wiz.create("(A B[foo] C)")
        same = wiz.equals(t1, t2)
        self.assertTrue(same)

    def testEqualsWithMismatchedText(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t1 = wiz.create("(A B[foo] C)")
        t2 = wiz.create("(A B C)")
        same = wiz.equals(t1, t2)
        self.assertFalse(same)

    def testEqualsWithMismatchedList(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t1 = wiz.create("(A B C)")
        t2 = wiz.create("(A B A)")
        same = wiz.equals(t1, t2)
        self.assertFalse(same)

    def testEqualsWithMismatchedListLength(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t1 = wiz.create("(A B C)")
        t2 = wiz.create("(A B)")
        same = wiz.equals(t1, t2)
        self.assertFalse(same)

    def testFindPattern(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C (A[foo] B[bar]) (D (A[big] B[dog])))")
        subtrees = wiz.find(t, "(A B)")
        found = [str(node) for node in subtrees]
        expecting = ['foo', 'big']
        self.assertEqual(expecting, found)

    def testFindTokenType(self):
        wiz = TreeWizard(self.adaptor, self.tokens)
        t = wiz.create("(A B C (A[foo] B[bar]) (D (A[big] B[dog])))")
        subtrees = wiz.find(t, wiz.getTokenType('A'))
        found = [str(node) for node in subtrees]
        expecting = ['A', 'foo', 'big']
        self.assertEqual(expecting, found)


if __name__ == "__main__":
    unittest.main(testRunner=unittest.TextTestRunner(verbosity=2))
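
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original test suite: a minimal example
# of how the TreeWizard API exercised above might be used directly in
# application code. It assumes the same imports and token list as the tests;
# the function name `demo_tree_wizard` is made up for this illustration.
def demo_tree_wizard():
    adaptor = CommonTreeAdaptor()
    tokens = ["", "", "", "", "", "A", "B", "C", "D", "E", "ID", "VAR"]
    wiz = TreeWizard(adaptor, tokenNames=tokens)

    # Build a tree from a pattern string.
    tree = wiz.create("(A B[foo] (D (A[big] B[dog])))")

    # Collect every subtree rooted at token type A.
    a_nodes = wiz.find(tree, wiz.getTokenType("A"))

    # Match a labelled pattern against the (A[big] B[dog]) subtree and pull
    # out the labelled nodes.
    labels = {}
    if wiz.parse(tree.getChild(1).getChild(0), "(%a:A %b:B)", labels):
        print("matched:", labels["a"], labels["b"])

    return tree, a_nodes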