      1 ======================================================================
      2 
      3                        CHANGES_SUMMARY.TXT
      4 
      5         A QUICK overview of changes from 1.33 in reverse order
      6 
      7   A summary of additions rather than bug fixes and minor code changes.
      8 
      9           Numbers refer to items in CHANGES_FROM_133*.TXT
     10              which may contain additional information.
     11 
     12                           DISCLAIMER
     13 
 The software and these notes are provided "as is".  They may include
 typographical or technical errors and their authors disclaim all
 liability of any kind or nature for damages due to error, fault,
 defect, or deficiency regardless of cause.  All warranties of any
 kind, either express or implied, including, but not limited to, the
 implied warranties of merchantability and fitness for a particular
 purpose are disclaimed.
     21 
     22 ======================================================================
     23 
     24 #258. You can specify a user-defined base class for your parser
     25 
    The base class constructor must have a signature similar to
    that of ANTLRParser.
     28 
     29 #253. Generation of block preamble (-preamble and -preamble_first)
     30 
     31     The antlr option -preamble causes antlr to insert the code
     32     BLOCK_PREAMBLE at the start of each rule and block.
     33 
     34     The antlr option -preamble_first is similar, but inserts the
     35     code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol
     36     PreambleFirst_123 is equivalent to the first set defined by
     37     the #FirstSetSymbol described in Item #248.
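
    As a rough illustration of how a user-supplied definition might
    interact with -preamble, the sketch below defines BLOCK_PREAMBLE as
    a counter increment and imitates by hand what a generated rule
    function could look like.  The function rule_a, the counter
    blocksEntered, and the exact placement of the macro are assumptions
    made for illustration; the code antlr actually emits may differ.

```cpp
#include <cassert>

// Hypothetical counter updated every time a rule or block is entered.
static int blocksEntered = 0;

// User-supplied definition.  With -preamble, antlr is described as
// inserting the code BLOCK_PREAMBLE at the start of each rule and block.
#define BLOCK_PREAMBLE ++blocksEntered;

// Hand-written imitation of a generated rule function (not antlr output).
void rule_a() {
    BLOCK_PREAMBLE          // start of the rule
    {
        BLOCK_PREAMBLE      // start of a nested block inside the rule
    }
}
```

    Calling rule_a() once in this sketch bumps the counter twice, once
    for the rule and once for the nested block.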
     38 
     39 #248. Generate symbol for first set of an alternative
     40 
     41         rr : #FirstSetSymbol(rr_FirstSet)  ( Foo | Bar ) ;
     42 
     43 #216. Defer token fetch for C++ mode
     44 
     45     When the ANTLRParser class is built with the pre-processor option 
     46     ZZDEFER_FETCH defined, the fetch of new tokens by consume() is deferred
     47     until LA(i) or LT(i) is called. 
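
    The idea of deferring the fetch can be sketched outside of pccts as
    follows.  This is not the ANTLRParser implementation: the classes
    CountingSource and DeferredLookahead are hypothetical stand-ins that
    only show consume() recording a debt which is paid when LT(i) is
    next called.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token source that counts how many tokens were fetched.
class CountingSource {
public:
    explicit CountingSource(std::vector<std::string> toks)
        : toks_(std::move(toks)) {}
    std::string next() {
        ++fetched_;
        return pos_ < toks_.size() ? toks_[pos_++] : std::string("<EOF>");
    }
    int fetched() const { return fetched_; }
private:
    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
    int fetched_ = 0;
};

// Illustrative deferred-fetch lookahead buffer.
class DeferredLookahead {
public:
    explicit DeferredLookahead(CountingSource& src) : src_(src) {}

    // consume() touches no input; it only notes that one token is owed.
    void consume() { ++pending_; }

    // LT(i) first pays off deferred consumes, then buffers i tokens.
    const std::string& LT(std::size_t i) {
        assert(i >= 1);
        for (; pending_ > 0; --pending_) {
            fill(1);
            buf_.pop_front();
        }
        fill(i);
        return buf_[i - 1];
    }
private:
    void fill(std::size_t n) {
        while (buf_.size() < n) buf_.push_back(src_.next());
    }
    CountingSource& src_;
    std::deque<std::string> buf_;
    int pending_ = 0;
};
```

    In this sketch a call to consume() leaves the fetch count of the
    source unchanged; the count only rises on the following LT(1).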
     48 
     49 #215. Use reset() to reset DLGLexerBase
     50 #188. Added pccts/h/DLG_stream_input.h
     51 #180. Added ANTLRParser::getEofToken()
     52 #173. -glms for Microsoft style filenames with -gl
     53 #170. Suppression for predicates with lookahead depth >1
     54 
     55       Consider the following grammar with -ck 2 and the predicate in rule
     56       "a" with depth 2:
     57 
     58             r1  : (ab)* "@"
     59                 ;
     60 
     61             ab  : a
     62                 | b
     63                 ;
     64 
     65             a   : (A B)? => <<p(LATEXT(2))>>? A B C
     66                 ;
     67 
     68             b   : A B C
     69                 ;
     70 
     71       Normally, the predicate would be hoisted into rule r1 in order to
     72       determine whether to call rule "ab".  However it should *not* be
     73       hoisted because, even if p is false, there is a valid alternative
     74       in rule b.  With "-mrhoistk on" the predicate will be suppressed.
     75 
     76       If "-info p" command line option is present the following information
     77       will appear in the generated code:
     78 
     79                 while ( (LA(1)==A)
     80         #if 0
     81 
     82         Part (or all) of predicate with depth > 1 suppressed by alternative
     83             without predicate
     84 
     85         pred  <<  p(LATEXT(2))>>?
     86                   depth=k=2  ("=>" guard)  rule a  line 8  t1.g
     87           tree context:
     88             (root = A
     89                B
     90             )
     91 
     92         The token sequence which is suppressed: ( A B )
     93         The sequence of references which generate that sequence of tokens:
     94 
     95            1 to ab          r1/1       line 1     t1.g
     96            2 ab             ab/1       line 4     t1.g
     97            3 to b           ab/2       line 5     t1.g
     98            4 b              b/1        line 11    t1.g
     99            5 #token A       b/1        line 11    t1.g
    100            6 #token B       b/1        line 11    t1.g
    101 
    102         #endif
    103 
    104       A slightly more complicated example:
    105 
    106             r1  : (ab)* "@"
    107                 ;
    108 
    109             ab  : a
    110                 | b
    111                 ;
    112 
    113             a   : (A B)? => <<p(LATEXT(2))>>? (A  B | D E)
    114                 ;
    115 
    116             b   : <<q(LATEXT(2))>>? D E
    117                 ;
    118 
    119 
    120       In this case, the sequence (D E) in rule "a" which lies behind
    121       the guard is used to suppress the predicate with context (D E)
    122       in rule b.
    123 
    124                 while ( (LA(1)==A || LA(1)==D)
    125             #if 0
    126 
    127             Part (or all) of predicate with depth > 1 suppressed by alternative
    128                 without predicate
    129 
    130             pred  <<  q(LATEXT(2))>>?
    131                               depth=k=2  rule b  line 11  t2.g
    132               tree context:
    133                 (root = D
    134                    E
    135                 )
    136 
    137             The token sequence which is suppressed: ( D E )
    138             The sequence of references which generate that sequence of tokens:
    139 
    140                1 to ab          r1/1       line 1     t2.g
    141                2 ab             ab/1       line 4     t2.g
    142                3 to a           ab/1       line 4     t2.g
    143                4 a              a/1        line 8     t2.g
    144                5 #token D       a/1        line 8     t2.g
    145                6 #token E       a/1        line 8     t2.g
    146 
    147             #endif
    148             &&
    149             #if 0
    150 
    151             pred  <<  p(LATEXT(2))>>?
    152                               depth=k=2  ("=>" guard)  rule a  line 8  t2.g
    153               tree context:
    154                 (root = A
    155                    B
    156                 )
    157 
    158             #endif
    159 
    160             (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) )  {
    161                 ab();
    162                 ...
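
      The guarded test at the end of the generated code above has the
      shape "!guard || predicate": the predicate only matters when the
      lookahead matches the guard context (A B).  The sketch below
      restates that logic as ordinary functions.  The enum Tok and the
      helper names are hypothetical, and the result of p(LATEXT(2)) is
      passed in as a plain bool.

```cpp
#include <cassert>

enum Tok { A, B, D, E };

// The predicate result is consulted only when the lookahead matches
// the guard context (A B); for any other lookahead the test passes.
bool guardedPredicate(Tok la1, Tok la2, bool p) {
    return !(la1 == A && la2 == B) || p;
}

// Prediction for entering rule "ab" in the second example, with the
// predicate q on the sequence (D E) suppressed.
bool predictAb(Tok la1, Tok la2, bool p) {
    return (la1 == A || la1 == D) && guardedPredicate(la1, la2, p);
}
```

      With lookahead (D E) the sketch predicts "ab" whatever the value
      of p, while with lookahead (A B) the outcome follows p.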
    163 
    164 #165. (Changed in MR13) option -newAST
    165 
    166       To create ASTs from an ANTLRTokenPtr antlr usually calls
    167       "new AST(ANTLRTokenPtr)".  This option generates a call
    168       to "newAST(ANTLRTokenPtr)" instead.  This allows a user
    169       to define a parser member function to create an AST object.
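
      A minimal sketch of such a member function is shown below.  The
      types Token, TokenPtr and AST are simplified stand-ins rather than
      the real ANTLRTokenPtr and AST classes, and the pooling policy in
      MyParser is just one assumed reason for wanting control over node
      creation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the token and AST types.
struct Token { std::string text; };
using TokenPtr = const Token*;

struct AST {
    explicit AST(TokenPtr t) : text(t->text) {}
    std::string text;
};

class MyParser {
public:
    // With -newAST the generated code is described as calling
    // newAST(token) instead of "new AST(token)", so the allocation
    // policy can live in a parser member function such as this one.
    AST* newAST(TokenPtr t) {
        assert(t != nullptr);
        pool_.push_back(std::make_unique<AST>(t));
        return pool_.back().get();
    }
    std::size_t nodesCreated() const { return pool_.size(); }
private:
    std::vector<std::unique_ptr<AST>> pool_;  // the parser owns every node
};
```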
    170 
    171 #161. (Changed in MR13) Switch -gxt inhibits generation of tokens.h
    172 
    173 #158. (Changed in MR13) #header causes problem for pre-processors
    174 
      A user who runs the C pre-processor on antlr source suggested
      that another syntax be allowed.  With MR13, directives such as
      #header, #pragma, etc. may be written as "\#header",
      "\#pragma", etc.  For escaping pre-processor directives inside
      a #header use something like the following:
    180 
    181             \#header
    182             <<
    183                 \#include <stdio.h>
    184             >>
    185 
    186 #155. (Changed in MR13) Context behind predicates can suppress
    187 
    188       With -mrhoist enabled the context behind a guarded predicate can
    189       be used to suppress other predicates.  Consider the following grammar:
    190 
    191         r0 : (r1)+;
    192 
    193         r1  : rp
    194             | rq
    195             ;
    196         rp  : <<p LATEXT(1)>>? B ;
    197         rq : (A)? => <<q LATEXT(1)>>? (A|B);
    198 
    199       In earlier versions both predicates "p" and "q" would be hoisted into
    200       rule r0. With MR12c predicate p is suppressed because the context which
    201       follows predicate q includes "B" which can "cover" predicate "p".  In
    202       other words, in trying to decide in r0 whether to call r1, it doesn't
    203       really matter whether p is false or true because, either way, there is
    204       a valid choice within r1.
    205 
    206 #154. (Changed in MR13) Making hoist suppression explicit using <<nohoist>>
    207 
    208       A common error, even among experienced pccts users, is to code
    209       an init-action to inhibit hoisting rather than a leading action.
    210       An init-action does not inhibit hoisting.
    211 
    212       This was coded:
    213 
    214         rule1 : <<;>> rule2
    215 
    216       This is what was meant:
    217 
    218         rule1 : <<;>> <<;>> rule2
    219 
    220       With MR13, the user can code:
    221 
    222         rule1 : <<;>> <<nohoist>> rule2
    223 
    224       The following will give an error message:
    225 
    226         rule1 : <<nohoist>> rule2
    227 
    228       If the <<nohoist>> appears as an init-action rather than a leading
    229       action an error message is issued.  The meaning of an init-action
    230       containing "nohoist" is unclear: does it apply to just one
    231       alternative or to all alternatives ?
    232 
    233 #151a. Addition of ANTLRParser::getLexer(), ANTLRTokenStream::getLexer()
    234 
      You must manually cast the ANTLRTokenStream to your program's
      lexer class.  Because the name of the lexer's class is not fixed,
      it is impossible to incorporate it into the DLGLexerBase
      class.
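
      The cast might look something like the sketch below.  TokenStream,
      MyLexer and Parser are hypothetical stand-ins reduced to what the
      cast needs; the real getLexer() signatures should be checked in
      the pccts headers.

```cpp
#include <cassert>

// Stand-in for the token stream base class.
class TokenStream {
public:
    virtual ~TokenStream() = default;
};

// Stand-in for the user's lexer; its class name is chosen by the user.
class MyLexer : public TokenStream {
public:
    int line() const { return line_; }
private:
    int line_ = 1;
};

// Stand-in for the parser, which only knows the base type.
class Parser {
public:
    explicit Parser(TokenStream* in) : input_(in) {}
    TokenStream* getLexer() const { return input_; }
private:
    TokenStream* input_;
};

// The caller knows the concrete lexer class, so the caller does the cast.
inline MyLexer* lexerOf(const Parser& p) {
    return static_cast<MyLexer*>(p.getLexer());
}
```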
    239 
    240 #151b.(Changed in MR12) ParserBlackBox member getLexer()
    241 
    242 #150. (Changed in MR12) syntaxErrCount and lexErrCount now public
    243 
    244 #149. (Changed in MR12) antlr option -info o (letter o for orphan)
    245 
    246       If there is more than one rule which is not referenced by any
    247       other rule then all such rules are listed.  This is useful for
    248       alerting one to rules which are not used, but which can still
    249       contribute to ambiguity.
    250 
    251 #148. (Changed in MR11) #token names appearing in zztokens,token_tbl
    252 
    253       One can write:
            #token Plus ("+")             "\+"
            #token LP   ("(")             "\("
            #token COM  ("comment begin") "/\*"
    258 
      The string in parentheses will be used in syntax error messages.
    260 
    261 #146. (Changed in MR11) Option -treport for locating "difficult" alts
    262 
    263       It can be difficult to determine which alternatives are causing
    264       pccts to work hard to resolve an ambiguity.  In some cases the
    265       ambiguity is successfully resolved after much CPU time so there
    266       is no message at all.
    267 
      A rough measure of the amount of work being performed which is
      independent of the CPU speed and system load is the number of
      tnodes created.  Using "-info t" gives information about the
      total number of tnodes created and the peak number of tnodes.
    272 
    273         Tree Nodes:  peak 1300k  created 1416k  lost 0
    274 
    275       It also puts in the generated C or C++ file the number of tnodes
    276       created for a rule (at the end of the rule).  However this
    277       information is not sufficient to locate the alternatives within
    278       a rule which are causing the creation of tnodes.
    279 
    280       Using:
    281 
    282              antlr -treport 100000 ....
    283 
    284       causes antlr to list on stdout any alternatives which require the
    285       creation of more than 100,000 tnodes, along with the lookahead sets
    286       for those alternatives.
    287 
    288       The following is a trivial case from the ansi.g grammar which shows
    289       the format of the report.  This report might be of more interest
    290       in cases where 1,000,000 tuples were created to resolve the ambiguity.
    291 
    292       -------------------------------------------------------------------------
    293         There were 0 tuples whose ambiguity could not be resolved
    294              by full lookahead
    295         There were 157 tnodes created to resolve ambiguity between:
    296 
    297           Choice 1: statement/2  line 475  file ansi.g
    298           Choice 2: statement/3  line 476  file ansi.g
    299 
    300             Intersection of lookahead[1] sets:
    301 
    302                IDENTIFIER
    303 
    304             Intersection of lookahead[2] sets:
    305 
    306                LPARENTHESIS     COLON            AMPERSAND        MINUS
    307                STAR             PLUSPLUS         MINUSMINUS       ONESCOMPLEMENT
    308                NOT              SIZEOF           OCTALINT         DECIMALINT
    309                HEXADECIMALINT   FLOATONE         FLOATTWO         IDENTIFIER
    310                STRING           CHARACTER
    311       -------------------------------------------------------------------------
    312 
    313 #143. (Changed in MR11) Optional ";" at end of #token statement
    314 
    315       Fixes problem of:
    316 
    317             #token X "x"
    318 
    319             <<
    320                 parser action
    321             >>
    322 
    323       Being confused with:
    324 
    325             #token X "x" <<lexical action>>
    326 
    327 #142. (Changed in MR11) class BufFileInput subclass of DLGInputStream
    328 
      Alexey Demakov (demakov@kazbek.ispras.ru) has supplied class
      BufFileInput derived from DLGInputStream which provides a
      function lookahead(char *string) to test characters in the
      input stream more than one character ahead.
      The class is located in pccts/h/BufFileInput.* of the kit.
    334 
    335 #140. #pred to define predicates
    336 
    337       +---------------------------------------------------+
    338       | Note: Assume "-prc on" for this entire discussion |
    339       +---------------------------------------------------+
    340 
    341       A problem with predicates is that each one is regarded as
    342       unique and capable of disambiguating cases where two
    343       alternatives have identical lookahead.  For example:
    344 
    345         rule : <<pred(LATEXT(1))>>? A
    346              | <<pred(LATEXT(1))>>? A
    347              ;
    348 
    349       will not cause any error messages or warnings to be issued
    350       by earlier versions of pccts.  To compare the text of the
    351       predicates is an incomplete solution.
    352 
    353       In 1.33MR11 I am introducing the #pred statement in order to
    354       solve some problems with predicates.  The #pred statement allows
    355       one to give a symbolic name to a "predicate literal" or a
    356       "predicate expression" in order to refer to it in other predicate
    357       expressions or in the rules of the grammar.
    358 
    359       The predicate literal associated with a predicate symbol is C
    360       or C++ code which can be used to test the condition.  A
    361       predicate expression defines a predicate symbol in terms of other
    362       predicate symbols using "!", "&&", and "||".  A predicate symbol
    363       can be defined in terms of a predicate literal, a predicate
    364       expression, or *both*.
    365 
      When a predicate symbol is defined with both a predicate literal
      and a predicate expression, the predicate literal is used to generate
      code, but the predicate expression is used to check whether two
      alternatives have identical predicates.
    370 
    371       Here are some examples of #pred statements:
    372 
        #pred  IsLabel       <<isLabel(LATEXT(1))>>?
        #pred  IsLocalVar    <<isLocalVar(LATEXT(1))>>?
        #pred  IsGlobalVar   <<isGlobalVar(LATEXT(1))>>?
        #pred  IsVar         <<isVar(LATEXT(1))>>?       IsLocalVar || IsGlobalVar
        #pred  IsScoped      <<isScoped(LATEXT(1))>>?    IsLabel || IsLocalVar
    378 
    379       I hope that the use of EBNF notation to describe the syntax of the
    380       #pred statement will not cause problems for my readers (joke).
    381 
    382         predStatement : "#pred"
    383                             CapitalizedName
    384                               (
    385                                   "<<predicate_literal>>?"
    386                                 | "<<predicate_literal>>?"  predOrExpr
    387                                 | predOrExpr
    388                               )
    389                       ;
    390 
    391         predOrExpr    : predAndExpr ( "||" predAndExpr ) * ;
    392 
    393         predAndExpr   : predPrimary ( "&&" predPrimary ) * ;
    394 
    395         predPrimary   : CapitalizedName
    396                       | "!" predPrimary
    397                       | "(" predOrExpr ")"
    398                       ;
    399 
    400       What is the purpose of this nonsense ?
    401 
    402       To understand how predicate symbols help, you need to realize that
    403       predicate symbols are used in two different ways with two different
    404       goals.
    405 
    406         a. Allow simplification of predicates which have been combined
    407            during predicate hoisting.
    408 
    409         b. Allow recognition of identical predicates which can't disambiguate
    410            alternatives with common lookahead.
    411 
    412       First we will discuss goal (a).  Consider the following rule:
    413 
    414             rule0: rule1
    415                  | ID
    416                  | ...
    417                  ;
    418 
    419             rule1: rule2
    420                  | rule3
    421                  ;
    422 
            rule2: <<isX(LATEXT(1))>>? ID ;
            rule3: <<!isX(LATEXT(1))>>? ID ;
    425 
    426       When the predicates in rule2 and rule3 are combined by hoisting
    427       to create a prediction expression for rule1 the result is:
    428 
            if ( LA(1)==ID
                && ( isX(LATEXT(1)) || !isX(LATEXT(1)) ) ) { rule1(); ...

      This is inefficient, but more importantly, can lead to false
      assumptions that the predicate expression distinguishes the rule1
      alternative from some other alternative with lookahead ID.  In
      MR11 one can write:
    436 
    437             #pred IsX     <<isX(LATEXT(1))>>?
    438 
    439             ...
    440 
    441             rule2: <<IsX>>? ID  ;
    442             rule3: <<!IsX>>? ID ;
    443 
    444       During hoisting MR11 recognizes this as a special case and
    445       eliminates the predicates.  The result is a prediction
    446       expression like the following:
    447 
    448             if ( LA(1)==ID ) { rule1(); ...
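
      The special case recognized during hoisting, an OR which contains
      both a predicate symbol and its negation, can be sketched as
      follows.  The Term structure and the function orIsTautology are
      hypothetical and are not taken from the antlr sources; they only
      show why a shared symbol name makes the check straightforward.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A hoisted predicate term: a predicate symbol, possibly negated.
struct Term {
    std::string symbol;
    bool negated;
};

// Returns true when an OR of terms contains both S and !S for some
// symbol S.  Such an OR is always true, so it can be dropped from the
// prediction expression.
bool orIsTautology(const std::vector<Term>& terms) {
    for (const Term& a : terms)
        for (const Term& b : terms)
            if (a.symbol == b.symbol && a.negated != b.negated)
                return true;
    return false;
}
```

      The check compares symbol names only, so two differently named
      symbols are never paired up even if their C code happens to be
      complementary.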
    449 
    450       Please note that the following cases which appear to be equivalent
    451       *cannot* be simplified by MR11 during hoisting because the hoisting
    452       logic only checks for a "!" in the predicate action, not in the
    453       predicate expression for a predicate symbol.
    454 
    455         *Not* equivalent and is not simplified during hoisting:
    456 
    457             #pred IsX      <<isX(LATEXT(1))>>?
    458             #pred NotX     <<!isX(LATEXT(1))>>?
    459             ...
    460             rule2: <<IsX>>? ID  ;
    461             rule3: <<NotX>>? ID ;
    462 
    463         *Not* equivalent and is not simplified during hoisting:
    464 
    465             #pred IsX      <<isX(LATEXT(1))>>?
    466             #pred NotX     !IsX
    467             ...
    468             rule2: <<IsX>>? ID  ;
    469             rule3: <<NotX>>? ID ;
    470 
    471       Now we will discuss goal (b).
    472 
    473       When antlr discovers that there is a lookahead ambiguity between
    474       two alternatives it attempts to resolve the ambiguity by searching
    475       for predicates in both alternatives.  In the past any predicate
    476       would do, even if the same one appeared in both alternatives:
    477 
    478             rule: <<p(LATEXT(1))>>? X
    479                 | <<p(LATEXT(1))>>? X
    480                 ;
    481 
    482       The #pred statement is a start towards solving this problem.
    483       During ambiguity resolution (*not* predicate hoisting) the
    484       predicates for the two alternatives are expanded and compared.
    485       Consider the following example:
    486 
    487             #pred Upper     <<isUpper(LATEXT(1))>>?
    488             #pred Lower     <<isLower(LATEXT(1))>>?
    489             #pred Alpha     <<isAlpha(LATEXT(1))>>?  Upper || Lower
    490 
    491             rule0: rule1
    492                  | <<Alpha>>? ID
    493                  ;
    494 
            rule1: rule2
                 | rule3
                 ...
                 ;
    500 
    501             rule2: <<Upper>>? ID;
    502             rule3: <<Lower>>? ID;
    503 
    504       The definition of #pred Alpha expresses:
    505 
            a. to test the predicate use the C code "isAlpha(LATEXT(1))"

            b. to analyze the predicate use the information that
               Alpha is equivalent to the union of Upper and Lower.
    510 
    511       During ambiguity resolution the definition of Alpha is expanded
    512       into "Upper || Lower" and compared with the predicate in the other
    513       alternative, which is also "Upper || Lower".  Because they are
    514       identical MR11 will report a problem.
    515 
    516     -------------------------------------------------------------------------
    517       t10.g, line 5: warning: the predicates used to disambiguate rule rule0
    518              (file t10.g alt 1 line 5 and alt 2 line 6)
    519              are identical when compared without context and may have no
    520              resolving power for some lookahead sequences.
    521     -------------------------------------------------------------------------
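
      The expand-and-compare step described above can be sketched as
      set operations.  This is an illustration under simplifying
      assumptions, not the antlr algorithm: definitions are acyclic,
      only "||" is handled, and the names Defs, expand and expandOr are
      invented for the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Each symbol maps to the symbols in its "||" expression.  A symbol
// with no expression (a pure predicate literal) maps to an empty list.
using Defs = std::map<std::string, std::vector<std::string>>;

// Expand one symbol into the set of literal-only symbols it stands for.
std::set<std::string> expand(const Defs& defs, const std::string& sym) {
    std::set<std::string> out;
    auto it = defs.find(sym);
    if (it == defs.end() || it->second.empty()) {
        out.insert(sym);                      // nothing further to expand
        return out;
    }
    for (const std::string& part : it->second) {
        std::set<std::string> sub = expand(defs, part);
        out.insert(sub.begin(), sub.end());
    }
    return out;
}

// Expand an OR of symbols, such as predicates hoisted from several rules.
std::set<std::string> expandOr(const Defs& defs,
                               const std::vector<std::string>& syms) {
    std::set<std::string> out;
    for (const std::string& s : syms) {
        std::set<std::string> sub = expand(defs, s);
        out.insert(sub.begin(), sub.end());
    }
    return out;
}
```

      For the grammar above, expanding "Upper || Lower" for the first
      alternative and "Alpha" for the second yields the same set, which
      corresponds to the situation the warning reports.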
    522 
    523       If you use the "-info p" option the output file will contain:
    524 
    525     +----------------------------------------------------------------------+
    526     |#if 0                                                                 |
    527     |                                                                      |
    528     |The following predicates are identical when compared without          |
    529     |  lookahead context information.  For some ambiguous lookahead        |
    530     |  sequences they may not have any power to resolve the ambiguity.     |
    531     |                                                                      |
    532     |Choice 1: rule0/1  alt 1  line 5  file t10.g                          |
    533     |                                                                      |
    534     |  The original predicate for choice 1 with available context          |
    535     |    information:                                                      |
    536     |                                                                      |
    537     |    OR expr                                                           |
    538     |                                                                      |
    539     |      pred  <<  Upper>>?                                              |
    540     |                        depth=k=1  rule rule2  line 14  t10.g         |
    541     |        set context:                                                  |
    542     |           ID                                                         |
    543     |                                                                      |
    544     |      pred  <<  Lower>>?                                              |
    545     |                        depth=k=1  rule rule3  line 15  t10.g         |
    546     |        set context:                                                  |
    547     |           ID                                                         |
    548     |                                                                      |
    549     |  The predicate for choice 1 after expansion (but without context     |
    550     |    information):                                                     |
    551     |                                                                      |
    552     |    OR expr                                                           |
    553     |                                                                      |
    554     |      pred  <<  isUpper(LATEXT(1))>>?                                 |
    555     |                        depth=k=1  rule   line 1  t10.g               |
    556     |                                                                      |
    557     |      pred  <<  isLower(LATEXT(1))>>?                                 |
    558     |                        depth=k=1  rule   line 2  t10.g               |
    559     |                                                                      |
    560     |                                                                      |
    561     |Choice 2: rule0/2  alt 2  line 6  file t10.g                          |
    562     |                                                                      |
    563     |  The original predicate for choice 2 with available context          |
    564     |    information:                                                      |
    565     |                                                                      |
    566     |  pred  <<  Alpha>>?                                                  |
    567     |                    depth=k=1  rule rule0  line 6  t10.g              |
    568     |    set context:                                                      |
    569     |       ID                                                             |
    570     |                                                                      |
    571     |  The predicate for choice 2 after expansion (but without context     |
    572     |    information):                                                     |
    573     |                                                                      |
    574     |  OR expr                                                             |
    575     |                                                                      |
    576     |    pred  <<  isUpper(LATEXT(1))>>?                                   |
    577     |                      depth=k=1  rule   line 1  t10.g                 |
    578     |                                                                      |
    579     |    pred  <<  isLower(LATEXT(1))>>?                                   |
    580     |                      depth=k=1  rule   line 2  t10.g                 |
    581     |                                                                      |
    582     |                                                                      |
    583     |#endif                                                                |
    584     +----------------------------------------------------------------------+
    585 
    586       The comparison of the predicates for the two alternatives takes
    587       place without context information, which means that in some cases
    588       the predicates will be considered identical even though they operate
    589       on disjoint lookahead sets.  Consider:
    590 
    591             #pred Alpha
    592 
    593             rule1: <<Alpha>>? ID
    594                  | <<Alpha>>? Label
    595                  ;
    596 
    597       Because the comparison of predicates takes place without context
    598       these will be considered identical.  The reason for comparing
    599       without context is that otherwise it would be necessary to re-evaluate
    600       the entire predicate expression for each possible lookahead sequence.
    601       This would require more code to be written and more CPU time during
    602       grammar analysis, and it is not yet clear whether anyone will even make
    603       use of the new #pred facility.
    604 
    605       A temporary workaround might be to use different #pred statements
    606       for predicates you know have different context.  This would avoid
    607       extraneous warnings.
    608 
    609       The above example might be termed a "false positive".  Comparison
    610       without context will also lead to "false negatives".  Consider the
    611       following example:
    612 
    613             #pred Alpha
    614             #pred Beta
    615 
    616             rule1: <<Alpha>>? A
    617                  | rule2
    618                  ;
    619 
    620             rule2: <<Alpha>>? A
    621                  | <<Beta>>?  B
    622                  ;
    623 
      The predicate used for alt 2 of rule1 is (Alpha || Beta).  This
      appears to be different from the predicate Alpha used for alt 1.
      However, the context of Beta is B.  Thus, when the lookahead is A,
      Beta will have no resolving power and Alpha will be used for both
      alternatives.  Using the same predicate for both alternatives isn't
      very helpful, but this will not be detected with 1.33MR11.
    630 
    631       To properly handle this the predicate expression would have to be
    632       evaluated for each distinct lookahead context.
    633 
    634       To determine whether two predicate expressions are identical is
    635       difficult.  The routine may fail to identify identical predicates.
    636 
      The #pred feature also compares predicates to see whether a choice
      between alternatives is resolved by a predicate which makes the
      second choice unreachable.  Consider the following example:
    640 
            #pred A         <<A(LATEXT(1))>>?
            #pred B         <<B(LATEXT(1))>>?
            #pred A_or_B    A || B
    644 
    645             r   : s
    646                 | t
    647                 ;
    648             s   : <<A_or_B>>? ID
    649                 ;
    650             t   : <<A>>? ID
    651                 ;
    652 
    653         ----------------------------------------------------------------------------
    654         t11.g, line 5: warning: the predicate used to disambiguate the
    655                first choice of  rule r
    656              (file t11.g alt 1 line 5 and alt 2 line 6)
    657              appears to "cover" the second predicate when compared without context.
    658              The second predicate may have no resolving power for some lookahead
    659                sequences.
    660         ----------------------------------------------------------------------------
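
      When predicates are viewed as ORs of predicate literals, the
      "cover" relation in this warning amounts to a subset test.  The
      sketch below assumes that simplified view; PredSet and covers are
      hypothetical names, not part of antlr.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// A predicate written as an OR of literal names, e.g. {A, B} for A || B.
using PredSet = std::set<std::string>;

// True when every literal that can satisfy `second` also satisfies
// `first`, so an alternative tested with `second` after `first` is
// never selected.
bool covers(const PredSet& first, const PredSet& second) {
    return std::includes(first.begin(), first.end(),
                         second.begin(), second.end());
}
```

      In the example, A_or_B expands to {A, B} and covers {A}, while the
      reverse does not hold.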
    661 
    662 #132. (Changed in 1.33MR11) Recognition of identical predicates in alts
    663 
    664       Prior to 1.33MR11, there would be no ambiguity warning when the
    665       very same predicate was used to disambiguate both alternatives:
    666 
    667         test: ref B
    668             | ref C
    669             ;
    670 
    671         ref : <<pred(LATEXT(1))>>? A ;
    672 
    673       In 1.33MR11 this will cause the warning:
    674 
    675         warning: the predicates used to disambiguate rule test
    676             (file v98.g alt 1 line 1 and alt 2 line 2)
    677              are identical and have no resolving power
    678 
    679         -----------------  Note  -----------------
    680 
    681           This is different than the following case
    682 
    683                 test: <<pred(LATEXT(1))>>? A B
    684                     | <<pred(LATEXT(1))>>?  A C
    685                     ;
    686 
    687           In this case there are two distinct predicates
    688           which have exactly the same text.  In the first
    689           example there are two references to the same
    690           predicate.  The problem represented by this
    691           grammar will be addressed later.
    692 
    693 
    694 #127. (Changed in 1.33MR11)
    695 
    696                     Count Syntax Errors     Count DLG Errors
    697                     -------------------     ----------------
    698 
    699        C++ mode     ANTLRParser::           DLGLexerBase::
    700                       syntaxErrCount          lexErrCount
    701        C mode       zzSyntaxErrCount        zzLexErrCount
    702 
    703        The C mode variables are global and initialized to 0.
    704        They are *not* reset to 0 automatically when antlr is
    705        restarted.
    706 
    707        The C++ mode variables are public.  They are initialized
    708        to 0 by the constructors.  They are *not* reset to 0 by the
    709        ANTLRParser::init() method.
    710 
    711        Suggested by Reinier van den Born (reinier (a] vnet.ibm.com).
    712 
    713 #126. (Changed in 1.33MR11) Addition of #first <<...>>
    714 
    715        The #first <<...>> inserts the specified text in the output
    716        files before any other #include statements required by pccts.
    717        The only things before the #first text are comments and
    718        a #define ANTLR_VERSION.
    719 
    720        Requested by Esa Pulkkinen (esap (a] cs.tut.fi) and Alexin
    721        Zoltan (alexin (a] inf.u-szeged.hu).
    722 
    723 #124. A Note on the New "&&" Style Guarded Predicates
    724 
    725         I've been asked several times, "What is the difference between
    726         the old "=>" style guard predicates and the new style "&&" guard
    727         predicates, and how do you choose one over the other" ?
    728 
    729         The main difference is that the "=>" does not apply the
    730         predicate if the context guard doesn't match, whereas
    731         the && form always does.  What is the significance ?
    732 
    733         If you have a predicate which is not on the "leading edge"
    734         it cannot be hoisted.  Suppose you need a predicate that
    735         looks at LA(2).  You must introduce it manually.  The
    736         classic example is:
    737 
    738             castExpr :
    739                      LP typeName RP
    740                      | ....
    741                      ;
    742 
    743             typeName : <<isTypeName(LATEXT(1))>>?  ID
    744                      | STRUCT ID
    745                      ;
    746 
    747         The problem  is that isTypeName() isn't on the leading edge
    748         of typeName, so it won't be hoisted into castExpr to help
    749         make a decision on which production to choose.
    750 
    751         The *first* attempt to fix it is this:
    752 
    753             castExpr :
    754                      <<isTypeName(LATEXT(2))>>?
    755                                         LP typeName RP
    756                      | ....
    757                      ;
    758 
    759         Unfortunately, this won't work because it ignores
    760         the problem of STRUCT.  The solution is to apply
    761         isTypeName() in castExpr if LA(2) is an ID and
    762         don't apply it when LA(2) is STRUCT:
    763 
    764             castExpr :
    765                      (LP ID)? => <<isTypeName(LATEXT(2))>>?
    766                                         LP typeName RP
    767                      | ....
    768                      ;
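
      The effect of the "=>" guard on the prediction for castExpr can be
      sketched in plain C.  This is a simplified model, not code generated
      by antlr: the token enum, the k=2 lookahead test and the function
      names are assumptions made for illustration, and the result of
      isTypeName() is passed in as a boolean.  The guarded form follows
      the shape "lookahead && (!guard || pred)".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token types, for illustration only. */
enum CastTok { CT_LP, CT_RP, CT_ID, CT_STRUCT, CT_OTHER };

/* Assumed k=2 lookahead test for the alternative "LP typeName RP". */
static bool cast_lookahead(enum CastTok la1, enum CastTok la2)
{
    return la1 == CT_LP && (la2 == CT_ID || la2 == CT_STRUCT);
}

/* First attempt: the predicate is applied no matter what LA(2) is. */
bool predict_cast_unguarded(enum CastTok la1, enum CastTok la2,
                            bool is_type_name)
{
    return cast_lookahead(la1, la2) && is_type_name;
}

/* (LP ID)? => <<isTypeName(LATEXT(2))>>?
 * The predicate is consulted only when the guard (LP ID) matches. */
bool predict_cast_guarded(enum CastTok la1, enum CastTok la2,
                          bool is_type_name)
{
    bool guard = (la1 == CT_LP && la2 == CT_ID);
    return cast_lookahead(la1, la2) && (!guard || is_type_name);
}

/* Self-check of the behaviour described in the text. */
void demo_cast_guard(void)
{
    /* "( struct ...": isTypeName() says nothing useful about STRUCT,
     * so it is modelled here as returning false. */
    assert(!predict_cast_unguarded(CT_LP, CT_STRUCT, false)); /* lost  */
    assert(predict_cast_guarded(CT_LP, CT_STRUCT, false));    /* kept  */

    /* "( ID ...": the guarded form defers to the predicate. */
    assert(predict_cast_guarded(CT_LP, CT_ID, true));
    assert(!predict_cast_guarded(CT_LP, CT_ID, false));
}
```

      Calling demo_cast_guard() exercises both forms; the STRUCT case is
      the one that separates them.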
    769 
    770         In conclusion, the "=>" style guarded predicate is
    771         useful when:
    772 
    773             a. the tokens required for the predicate
    774                are not on the leading edge
    775             b. there are alternatives in the expression
    776                selected by the predicate for which the
    777                predicate is inappropriate
    778 
    779         If (b) were false, then one could use a simple
    780         predicate (assuming "-prc on"):
    781 
    782             castExpr :
    783                      <<isTypeName(LATEXT(2))>>?
    784                                         LP typeName RP
    785                      | ....
    786                      ;
    787 
    788             typeName : <<isTypeName(LATEXT(1))>>?  ID
    789                      ;
    790 
    791         So, when do you use the "&&" style guarded predicate ?
    792 
    793         The new-style "&&" predicate should always be used with
    794         predicate context.  The context guard is in ADDITION to
    795         the automatically computed context.  Thus it is useful for
    796         predicates which depend on the token type for reasons
    797         other than context.
    798 
    799         The following example is contributed by Reinier van den Born
    800         (reinier (a] vnet.ibm.com).
    801 
    802  +-------------------------------------------------------------------------+
    803  | This grammar has two ways to call functions:                            |
    804  |                                                                         |
    805  |  - a "standard" call syntax with parens and comma separated args        |
    806  |  - a shell command like syntax (no parens and spacing separated args)   |
    807  |                                                                         |
    808  | The former also allows a variable to hold the name of the function,     |
    809  | the latter can also be used to call external commands.                  |
    810  |                                                                         |
    811  | The grammar (simplified) looks like this:                               |
    812  |                                                                         |
    813  |   fun_call   :     ID "(" { expr ("," expr)* } ")"                      |
    814  |                                  /* ID is function name */              |
    815  |              | "@" ID "(" { expr ("," expr)* } ")"                      |
    816  |                                  /* ID is var containing fun name */    |
    817  |              ;                                                          |
    818  |                                                                         |
    819  |   command    : ID expr*          /* ID is function name */              |
    820  |              | path expr*        /* path is external command name */    |
    821  |              ;                                                          |
    822  |                                                                         |
    823  |   path       : ID                /* left out slashes and such */        |
    824  |              | "@" ID            /* ID is environment var */            |
    825  |              ;                                                          |
    826  |                                                                         |
    827  |   expr       : ....                                                     |
    828  |              | "(" expr ")";                                            |
    829  |                                                                         |
    830  |   call       : fun_call                                                 |
    831  |              | command                                                  |
    832  |              ;                                                          |
    833  |                                                                         |
    834  | Obviously the call is wildly ambiguous. This is more or less how this   |
    835  | is to be resolved:                                                      |
    836  |                                                                         |
    837  |    A call begins with an ID or an @ followed by an ID.                  |
    838  |                                                                         |
    839  |    If it is an ID and if it is an ext. command name  -> command         |
    840  |                       if followed by a paren         -> fun_call        |
    841  |                       otherwise                      -> command         |
    842  |                                                                         |
    843  |    If it is an @  and if the ID is a var name        -> fun_call        |
    844  |                       otherwise                      -> command         |
    845  |                                                                         |
    846  | One can implement these rules quite neatly using && predicates:         |
    847  |                                                                         |
    848  |   call       : ("@" ID)? && <<isVarName(LT(2))>>? fun_call              |
    849  |              | (ID)?     && <<isExtCmdName>>?     command               |
    850  |              | (ID "(")?                          fun_call              |
    851  |              |                                    command               |
    852  |              ;                                                          |
    853  |                                                                         |
    854  | This can be done better, so it is not an ideal example, but it          |
    855  | conveys the principle.                                                  |
    856  +-------------------------------------------------------------------------+
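
      The resolution rules in the box can be written out as an ordered
      series of tests.  The following C sketch is only a model of the
      decision, assuming the alternatives of "call" are tried top to
      bottom; the enum names and the function predict_call() are
      hypothetical, and the two semantic predicates are passed in as
      booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token types and outcomes, for illustration only. */
enum CallTok { TOK_ID, TOK_AT, TOK_LPAREN, TOK_OTHER };
enum CallChoice { FUN_CALL, COMMAND };

/* Mirrors the four alternatives of "call" in order.
 * la1 and la2 are the first two lookahead tokens; is_var_name and
 * is_ext_cmd_name stand in for the results of the predicates. */
enum CallChoice predict_call(enum CallTok la1, enum CallTok la2,
                             bool is_var_name, bool is_ext_cmd_name)
{
    /* A call begins with an ID or an "@". */
    assert(la1 == TOK_ID || la1 == TOK_AT);

    /* ("@" ID)? && <<isVarName(LT(2))>>? fun_call */
    if (la1 == TOK_AT && la2 == TOK_ID && is_var_name)
        return FUN_CALL;

    /* (ID)? && <<isExtCmdName>>? command */
    if (la1 == TOK_ID && is_ext_cmd_name)
        return COMMAND;

    /* (ID "(")? fun_call */
    if (la1 == TOK_ID && la2 == TOK_LPAREN)
        return FUN_CALL;

    /* Last alternative: command. */
    return COMMAND;
}
```

      For example, predict_call(TOK_ID, TOK_LPAREN, false, true) yields
      COMMAND because the external command test comes before the paren
      test, which matches the rule table in the box.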
    857 
    858 #122. (Changed in 1.33MR11)  Member functions to reset DLG in C++ mode
    859 
    860          void DLGFileReset(FILE *f) { input = f; found_eof = 0; }
    861          void DLGStringReset(DLGChar *s) { input = s; p = &input[0]; }
    862 
    863         Supplied by R.A. Nelson (cowboy (a] VNET.IBM.COM)
    864 
    865 #119. (Changed in 1.33MR11) Ambiguity aid for grammars
    866 
    867       The user can ask for additional information on ambiguities reported
    868       by antlr to stdout.  At the moment, only one ambiguity report can
    869       be created in an antlr run.
    870 
    871       This feature is enabled using the "-aa" (Ambiguity Aid)  option.
    872 
    873       The following options control the reporting of ambiguities:
    874 
    875           -aa ruleName       Selects reporting by name of rule
    876           -aa lineNumber     Selects reporting by line number
    877                                (file name not compared)
    878 
    879           -aam               Selects "multiple" reporting for a token
    880                              in the intersection set of the
    881                              alternatives.
    882 
    883                              For instance, the token ID may appear dozens
    884                              of times in various paths as the program
    885                              explores the rules which are reachable from
    886                              the point of an ambiguity. With option -aam
    887                              every possible path the search program
    888                              encounters is reported.
    889 
    890                              Without -aam only the first encounter is
    891                              reported.  This may result in incomplete
    892                              information, but the information may be
    893                              sufficient and much shorter.
    894 
    895           -aad depth         Selects the depth of the search.
    896                              The default value is 1.
    897 
    898                              The number of paths to be searched, and the
    899                              size of the report can grow geometrically
    900                              with the -ck value if a full search for all
    901                              contributions to the source of the ambiguity
    902                              is explored.
    903 
    904                              The depth represents the number of tokens
    905                              in the lookahead set which are matched against
    906                              the set of ambiguous tokens.  A depth of 1
    907                              means that the search stops when a lookahead
    908                              sequence of just one token is matched.
    909 
    910                              A k=1 ck=6 grammar might generate 5,000 items
    911                              in a report if a full depth 6 search is made
    912                              with the Ambiguity Aid.  The source of the
    913                              problem may be in the first token and obscured
    914                              by the volume of data - I hesitate to call
    915                              it information.
    916 
    917                              When the user selects a depth > 1, the search
    918                              is first performed at depth=1 for both
    919                              alternatives, then depth=2 for both alternatives,
    920                              etc.
    921 
    922       Sample output for rule grammar in antlr.g itself:
    923 
    924   +---------------------------------------------------------------------+
    925   | Ambiguity Aid                                                       |
    926   |                                                                     |
    927   |   Choice 1: grammar/70                 line 632  file a.g           |
    928   |   Choice 2: grammar/82                 line 644  file a.g           |
    929   |                                                                     |
    930   |   Intersection of lookahead[1] sets:                                |
    931   |                                                                     |
    932   |      "\}"             "class"          "#errclass"      "#tokclass" |
    933   |                                                                     |
    934   |    Choice:1  Depth:1  Group:1  ("#errclass")                        |
    935   |  1 in (...)* block                grammar/70       line 632   a.g   |
    936   |  2 to error                       grammar/73       line 635   a.g   |
    937   |  3 error                          error/1          line 894   a.g   |
    938   |  4 #token "#errclass"             error/2          line 895   a.g   |
    939   |                                                                     |
    940   |    Choice:1  Depth:1  Group:2  ("#tokclass")                        |
    941   |  2 to tclass                      grammar/74       line 636   a.g   |
    942   |  3 tclass                         tclass/1         line 937   a.g   |
    943   |  4 #token "#tokclass"             tclass/2         line 938   a.g   |
    944   |                                                                     |
    945   |    Choice:1  Depth:1  Group:3  ("class")                            |
    946   |  2 to class_def                   grammar/75       line 637   a.g   |
    947   |  3 class_def                      class_def/1      line 669   a.g   |
    948   |  4 #token "class"                 class_def/3      line 671   a.g   |
    949   |                                                                     |
    950   |    Choice:1  Depth:1  Group:4  ("\}")                               |
    951   |  2 #token "\}"                    grammar/76       line 638   a.g   |
    952   |                                                                     |
    953   |    Choice:2  Depth:1  Group:5  ("#errclass")                        |
    954   |  1 in (...)* block                grammar/83       line 645   a.g   |
    955   |  2 to error                       grammar/93       line 655   a.g   |
    956   |  3 error                          error/1          line 894   a.g   |
    957   |  4 #token "#errclass"             error/2          line 895   a.g   |
    958   |                                                                     |
    959   |    Choice:2  Depth:1  Group:6  ("#tokclass")                        |
    960   |  2 to tclass                      grammar/94       line 656   a.g   |
    961   |  3 tclass                         tclass/1         line 937   a.g   |
    962   |  4 #token "#tokclass"             tclass/2         line 938   a.g   |
    963   |                                                                     |
    964   |    Choice:2  Depth:1  Group:7  ("class")                            |
    965   |  2 to class_def                   grammar/95       line 657   a.g   |
    966   |  3 class_def                      class_def/1      line 669   a.g   |
    967   |  4 #token "class"                 class_def/3      line 671   a.g   |
    968   |                                                                     |
    969   |    Choice:2  Depth:1  Group:8  ("\}")                               |
    970   |  2 #token "\}"                    grammar/96       line 658   a.g   |
    971   +---------------------------------------------------------------------+
    972 
    973       For a linear lookahead set ambiguity (where k=1 or for k>1 but
    974       when the lookahead sets [i] with i<k all have degree one) the
    975       reports appear in the following order:
    976 
    977         for (depth=1 ; depth <= "-aad depth" ; depth++) {
    978           for (alternative=1; alternative <=2 ; alternative++) {
    979             while (matches-are-found) {
    980               group++;
    981               print-report
    982             };
    983           };
    984        };
    985 
    986       For reporting a k-tuple ambiguity, the reports appear in the
    987       following order:
    988 
    989         for (depth=1 ; depth <= "-aad depth" ; depth++) {
    990           while (matches-are-found) {
    991             for (alternative=1; alternative <=2 ; alternative++) {
    992               group++;
    993               print-report
    994             };
    995           };
    996        };
    997 
    998       This is because matches are generated in different ways for
    999       linear lookahead and k-tuples.
   1000 
   1001 #117. (Changed in 1.33MR10) new EXPERIMENTAL predicate hoisting code
   1002 
   1003       The hoisting of predicates into rules to create prediction
   1004       expressions is a problem in antlr.  Consider the following
   1005       example (k=1 with -prc on):
   1006 
   1007         start   : (a)* "@" ;
   1008         a       : b | c ;
   1009         b       : <<isUpper(LATEXT(1))>>? A ;
   1010         c       : A ;
   1011 
   1012       Prior to 1.33MR10 the code generated for "start" would resemble:
   1013 
   1014         while {
   1015             if (LA(1)==A &&
   1016                     (!(LA(1)==A) || isUpper())) {
   1017               a();
   1018             }
   1019         };
   1020 
   1021       This code is wrong because it makes rule "c" unreachable from
   1022       "start".  The essence of the problem is that antlr fails to
   1023       recognize that there can be a valid alternative within "a" even
   1024       when the predicate <<isUpper(LATEXT(1))>>? is false.
   1025 
   1026       In 1.33MR10 with -mrhoist the hoisting of the predicate into
   1027       "start" is suppressed because it recognizes that "c" can
   1028       cover all the cases where the predicate is false:
   1029 
   1030         while {
   1031             if (LA(1)==A) {
   1032               a();
   1033             }
   1034         };
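
      The difference between the two prediction expressions can be
      checked with a small C model.  This is a sketch, not antlr output:
      the token enum and function names are made up for illustration and
      the result of isUpper() is passed in as a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token types, for illustration only. */
enum HoistTok { HT_A, HT_AT };

/* Prediction for entering "a" as generated prior to 1.33MR10. */
bool enter_a_pre_mr10(enum HoistTok la1, bool is_upper)
{
    return la1 == HT_A && (!(la1 == HT_A) || is_upper);
}

/* Prediction with -mrhoist: the hoisted predicate is suppressed. */
bool enter_a_mrhoist(enum HoistTok la1)
{
    return la1 == HT_A;
}

/* Self-check: with lookahead A and isUpper() false, only the
 * -mrhoist form enters "a", so only it can reach rule "c". */
void demo_suppressed_hoist(void)
{
    assert(!enter_a_pre_mr10(HT_A, false));
    assert(enter_a_mrhoist(HT_A));
    assert(enter_a_pre_mr10(HT_A, true));
}
```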
   1035 
   1036       With the antlr "-info p" switch the user will receive information
   1037       about the predicate suppression in the generated file:
   1038 
   1039       --------------------------------------------------------------
   1040         #if 0
   1041 
   1042         Hoisting of predicate suppressed by alternative without predicate.
   1043         The alt without the predicate includes all cases where
   1044             the predicate is false.
   1045 
   1046            WITH predicate: line 7  v1.g
   1047            WITHOUT predicate: line 7  v1.g
   1048 
   1049         The context set for the predicate:
   1050 
   1051              A
   1052 
   1053         The lookahead set for the alt WITHOUT the semantic predicate:
   1054 
   1055              A
   1056 
   1057         The predicate:
   1058 
   1059           pred <<  isUpper(LATEXT(1))>>?
   1060                           depth=k=1  rule b  line 9  v1.g
   1061             set context:
   1062                A
   1063             tree context: null
   1064 
   1065         Chain of referenced rules:
   1066 
   1067             #0  in rule start (line 5 v1.g) to rule a
   1068             #1  in rule a (line 7 v1.g)
   1069 
   1070         #endif
   1071       --------------------------------------------------------------
   1072 
   1073       A predicate can be suppressed by a combination of alternatives
   1074       which, taken together, cover a predicate:
   1075 
   1076         start   : (a)* "@" ;
   1077 
   1078         a       : b | ca | cb | cc ;
   1079 
   1080         b       : <<isUpper(LATEXT(1))>>? ( A | B | C ) ;
   1081 
   1082         ca      : A ;
   1083         cb      : B ;
   1084         cc      : C ;
   1085 
   1086       Consider a more complex example in which "c" covers only part of
   1087       a predicate:
   1088 
   1089         start   : (a)* "@" ;
   1090 
   1091         a       : b
   1092                 | c
   1093                 ;
   1094 
   1095         b       : <<isUpper(LATEXT(1))>>?
   1096                     ( A
   1097                     | X
   1098                     );
   1099 
   1100         c       : A
   1101                 ;
   1102 
   1103       Prior to 1.33MR10 the code generated for "start" would resemble:
   1104 
   1105         while {
   1106             if ( (LA(1)==A || LA(1)==X) &&
   1107                     (! (LA(1)==A || LA(1)==X) || isUpper())) {
   1108               a();
   1109             }
   1110         };
   1111 
   1112       With 1.33MR10 and -mrhoist the predicate context is restricted to
   1113       the non-covered lookahead.  The code resembles:
   1114 
   1115         while {
   1116             if ( (LA(1)==A || LA(1)==X) &&
   1117                   (! (LA(1)==X) || isUpper())) {
   1118               a();
   1119             }
   1120         };
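
      A short C model of the restricted context, under the same caveats
      as any sketch: the token enum and function names are assumptions,
      and isUpper() is represented by a boolean argument.  It shows that
      after the restriction the predicate is consulted only for X, the
      part of the context not covered by rule "c".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token types, for illustration only. */
enum CtxTok { CX_A, CX_X, CX_AT };

/* Prior to 1.33MR10: predicate context is the full set { A, X }. */
bool enter_a_full_context(enum CtxTok la1, bool is_upper)
{
    bool in_set = (la1 == CX_A || la1 == CX_X);
    return in_set && (!in_set || is_upper);
}

/* With -mrhoist: predicate context is restricted to { X }. */
bool enter_a_restricted(enum CtxTok la1, bool is_upper)
{
    bool in_set = (la1 == CX_A || la1 == CX_X);
    return in_set && (!(la1 == CX_X) || is_upper);
}

/* Self-check of the two cases that matter. */
void demo_restricted_context(void)
{
    /* Lookahead A, predicate false: "c" must stay reachable. */
    assert(!enter_a_full_context(CX_A, false));
    assert(enter_a_restricted(CX_A, false));

    /* Lookahead X: only "b" can match, so the predicate still decides. */
    assert(!enter_a_restricted(CX_X, false));
    assert(enter_a_restricted(CX_X, true));
}
```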
   1121 
   1122       With the antlr "-info p" switch the user will receive information
   1123       about the predicate restriction in the generated file:
   1124 
   1125       --------------------------------------------------------------
   1126         #if 0
   1127 
   1128         Restricting the context of a predicate because of overlap
   1129           in the lookahead set between the alternative with the
   1130           semantic predicate and one without
   1131         Without this restriction the alternative without the predicate
   1132           could not be reached when input matched the context of the
   1133           predicate and the predicate was false.
   1134 
   1135            WITH predicate: line 11  v4.g
   1136            WITHOUT predicate: line 12  v4.g
   1137 
   1138         The original context set for the predicate:
   1139 
   1140              A                X
   1141 
   1142         The lookahead set for the alt WITHOUT the semantic predicate:
   1143 
   1144              A
   1145 
   1146         The intersection of the two sets
   1147 
   1148              A
   1149 
   1150         The original predicate:
   1151 
   1152           pred <<  isUpper(LATEXT(1))>>?
   1153                           depth=k=1  rule b  line 15  v4.g
   1154             set context:
   1155                A                X
   1156             tree context: null
   1157 
   1158         The new (modified) form of the predicate:
   1159 
   1160           pred <<  isUpper(LATEXT(1))>>?
   1161                           depth=k=1  rule b  line 15  v4.g
   1162             set context:
   1163                X
   1164             tree context: null
   1165 
   1166         #endif
   1167       --------------------------------------------------------------
   1168 
   1169       The bad news about -mrhoist:
   1170 
   1171         (a) -mrhoist does not analyze predicates with lookahead
   1172             depth > 1.
   1173 
   1174         (b) -mrhoist does not look past a guarded predicate to
   1175             find context which might cover other predicates.
   1176 
   1177       For these cases you might want to use syntactic predicates.
   1178       When a semantic predicate fails during guess mode the guess
   1179       fails and the next alternative is tried.
   1180 
   1181       Limitation (a) is illustrated by the following example:
   1182 
   1183         start    : (stmt)* EOF ;
   1184 
   1185         stmt     : cast
   1186                  | expr
   1187                  ;
   1188         cast     : <<isTypename(LATEXT(2))>>? LP ID RP ;
   1189 
   1190         expr     : LP ID RP ;
   1191 
   1192       This is not much different from the first example, except that
   1193       it requires two tokens of lookahead context to determine what
   1194       to do.  This predicate is NOT suppressed because the current version
   1195       is unable to handle predicates with depth > 1.
   1196 
   1197       A predicate can be combined with other predicates during hoisting.
   1198       In those cases the depth=1 predicates are still handled.  Thus,
   1199       in the following example the isUpper() predicate will be suppressed
   1200       by line #4 when hoisted from "bizarre" into "start", but will still
   1201       be present in "bizarre" in order to predict "stmt".
   1202 
   1203         start    : (bizarre)* EOF ;     // #1
   1204                                         // #2
   1205         bizarre  : stmt                 // #3
   1206                  | A                    // #4
   1207                  ;
   1208 
   1209         stmt     : cast
   1210                  | expr
   1211                  ;
   1212 
   1213         cast     : <<isTypename(LATEXT(2))>>? LP ID RP ;
   1214 
   1215         expr     : LP ID RP
   1216                  | <<isUpper(LATEXT(1))>>? A ;
   1217 
   1218       Limitation (b) is illustrated by the following example of a
   1219       context guarded predicate:
   1220 
   1221         rule : (A)? <<p>>?          // #1
   1222                      (A             // #2
   1223                      |B             // #3
   1224                      )              // #4
   1225              | <<q>>? B             // #5
   1226              ;
   1227 
   1228       Recall that this means that when the lookahead is NOT A then
   1229       the predicate "p" is ignored and it attempts to match "A|B".
   1230       Ideally, the "B" at line #3 should suppress predicate "q".
   1231       However, the current version does not attempt to look past
   1232       the guard predicate to find context which might suppress other
   1233       predicates.
   1234 
   1235       In some cases -mrhoist will lead to the reporting of ambiguities
   1236       which were not visible before:
   1237 
   1238         start   : (a)* "@";
   1239         a       : bc | d;
   1240         bc      : b  | c ;
   1241 
   1242         b       : <<isUpper(LATEXT(1))>>? A;
   1243         c       : A ;
   1244 
   1245         d       : A ;
   1246 
   1247       In this case there is a true ambiguity in "a" between "bc" and "d"
   1248       which can both match "A".  Without -mrhoist the predicate in "b"
   1249       is hoisted into "a" and there is no ambiguity reported.  However,
   1250       with -mrhoist, the predicate in "b" is suppressed by "c" (as it
   1251       should be) making the ambiguity in "a" apparent.
   1252 
   1253       The motivations for these changes were hoisting problems reported
   1254       by Reinier van den Born (reinier (a] vnet.ibm.com) and several others.
   1255 
   1256 #113. (Changed in 1.33MR10) new context guarded pred: (g)? && <<p>>? expr
   1257 
   1258       The existing context guarded predicate:
   1259 
   1260             rule : (guard)? => <<p>>? expr
   1261                  | next_alternative
   1262                  ;
   1263 
   1264       generates code which resembles:
   1265 
   1266             if (lookahead(expr) && (!guard || pred)) {
   1267               expr()
   1268             } else ....
   1269 
   1270       This is not suitable for some applications because it allows
   1271       expr() to be invoked when the predicate is false.  This is
   1272       intentional because it is meant to mimic automatically computed
   1273       predicate context.
   1274 
   1275       The new context guarded predicate uses the guard information
   1276       differently because it has a different goal.  Consider:
   1277 
   1278             rule : (guard)? && <<p>>? expr
   1279                  | next_alternative
   1280                  ;
   1281 
   1282       The new style of context guarded predicate is equivalent to:
   1283 
   1284             rule : <<guard==true && pred>>? expr
   1285                  | next_alternative
   1286                  ;
   1287 
   1288       It generates code which resembles:
   1289 
   1290             if (lookahead(expr) && guard && pred) {
   1291                 expr();
   1292             } else ...
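
      The two code shapes above differ in exactly one situation: the
      lookahead matches but the guard does not.  The following C sketch
      treats lookahead, guard and predicate as plain booleans (the
      function names are hypothetical) and checks all eight
      combinations.

```c
#include <assert.h>
#include <stdbool.h>

/* (guard)? => <<p>>? expr  :  lookahead && (!guard || pred) */
bool select_arrow_guard(bool lookahead, bool guard, bool pred)
{
    return lookahead && (!guard || pred);
}

/* (guard)? && <<p>>? expr  :  lookahead && guard && pred */
bool select_and_guard(bool lookahead, bool guard, bool pred)
{
    return lookahead && guard && pred;
}

/* Enumerate every combination and confirm that the two forms
 * disagree only when the lookahead matches and the guard fails. */
void demo_guard_forms(void)
{
    for (int bits = 0; bits < 8; ++bits) {
        bool la    = (bits & 1) != 0;
        bool guard = (bits & 2) != 0;
        bool pred  = (bits & 4) != 0;
        bool differ = select_arrow_guard(la, guard, pred)
                   != select_and_guard(la, guard, pred);
        assert(differ == (la && !guard));
    }
}
```

      In that one situation the "=>" form selects expr and the "&&" form
      rejects it, whatever the value of the predicate.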
   1293 
   1294       Both forms of guarded predicates severely restrict the form of
   1295       the context guard: it can contain no rule references, no
   1296       (...)*, no (...)+, and no {...}.  It may contain token and
   1297       token class references, and alternation ("|").
   1298 
   1299       Addition for 1.33MR11: in the token expression all tokens must
   1300       be at the same height of the token tree:
   1301 
   1302             (A ( B | C))? && ...            is ok (all height 2)
   1303             (A ( B |  ))? && ...            is not ok (some 1, some 2)
   1304             (A B C D | E F G H)? && ...     is ok (all height 4)
   1305             (A B C D | E )? && ...          is not ok (some 4, some 1)
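
      One way to picture the height rule is to flatten the guard into
      its alternative token sequences and compare their lengths.  The C
      sketch below does only that; it is not how antlr represents token
      trees, and the string encoding and function names are assumptions
      chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Count whitespace-separated tokens in one flattened alternative. */
static size_t count_tokens(const char *seq)
{
    size_t n = 0;
    bool in_token = false;
    for (; *seq != '\0'; ++seq) {
        if (isspace((unsigned char)*seq)) {
            in_token = false;
        } else if (!in_token) {
            in_token = true;
            ++n;
        }
    }
    return n;
}

/* True when every flattened alternative has the same height. */
bool guard_heights_match(const char *const *alts, size_t n_alts)
{
    assert(n_alts > 0);
    size_t height = count_tokens(alts[0]);
    for (size_t i = 1; i < n_alts; ++i) {
        if (count_tokens(alts[i]) != height)
            return false;
    }
    return true;
}
```

      Under this encoding (A ( B | C)) becomes { "A B", "A C" } and
      passes, while (A ( B |  )) becomes { "A B", "A" } and fails.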
   1306 
   1307       This restriction is required in order to properly compute the lookahead
   1308       set for expressions like:
   1309 
   1310             rule1 : (A B C)? && <<pred>>? rule2 ;
   1311             rule2 : (A|X) (B|Y) (C|Z);
   1312 
   1313       This addition was suggested by Reinier van den Born (reinier (a] vnet.ibm.com).
   1314 
   1315 #109. (Changed in 1.33MR10) improved trace information
   1316 
   1317       The quality of the trace information provided by the "-gd"
   1318       switch has been improved significantly.  Here is an example
   1319       of the output from a test program.  It shows the rule name,
   1320       the first token of lookahead, the call depth, and the guess
   1321       status:
   1322 
   1323         exit rule gusxx {"?"} depth 2
   1324         enter rule gusxx {"?"} depth 2
   1325         enter rule gus1 {"o"} depth 3 guessing
   1326         guess done - returning to rule gus1 {"o"} at depth 3
   1327                     (guess mode continues - an enclosing guess is still active)
   1328         guess done - returning to rule gus1 {"Z"} at depth 3
   1329                     (guess mode continues - an enclosing guess is still active)
   1330         exit rule gus1 {"Z"} depth 3 guessing
   1331         guess done - returning to rule gusxx {"o"} at depth 2 (guess mode ends)
   1332         enter rule gus1 {"o"} depth 3
   1333         guess done - returning to rule gus1 {"o"} at depth 3 (guess mode ends)
   1334         guess done - returning to rule gus1 {"Z"} at depth 3 (guess mode ends)
   1335         exit rule gus1 {"Z"} depth 3
   1336         line 1: syntax error at "Z" missing SC
   1337             ...
   1338 
   1339       Rule trace reporting is controlled by the value of the integer
   1340       [zz]traceOptionValue:  when it is positive tracing is enabled,
   1341       otherwise it is disabled.  Tracing during guess mode is controlled
   1342       by the value of the integer [zz]traceGuessOptionValue.  When
   1343       it is positive AND [zz]traceOptionValue is positive rule trace
   1344       is reported in guess mode.
   1345 
   1346       The values of [zz]traceOptionValue and [zz]traceGuessOptionValue
   1347       can be adjusted by subroutine calls listed below.
   1348 
   1349       Depending on the presence or absence of the antlr -gd switch
   1350       the variable [zz]traceOptionValueDefault is set to 0 or 1.  When
   1351       the parser is initialized or [zz]traceReset() is called the
   1352       value of [zz]traceOptionValueDefault is copied to [zz]traceOptionValue.
   1353       The value of [zz]traceGuessOptionValue is always initialized to 1,
   1354       but, as noted earlier, nothing will be reported unless
   1355       [zz]traceOptionValue is also positive.
   1356 
   1357       When the parser state is saved/restored the values of the trace
   1358       variables are also saved/restored.  If a restore causes a change in
   1359       reporting behavior from on to off or vice versa this will be reported.
   1360 
   1361       When the -gd option is selected, the macro "#define zzTRACE_RULES"
   1362       is added to appropriate output files.
   1363 
   1364         C++ mode
   1365         --------
   1366         int     traceOption(int delta)
   1367         int     traceGuessOption(int delta)
   1368         void    traceReset()
   1369         int     traceOptionValueDefault
   1370 
   1371         C mode
   1372         --------
   1373         int     zzTraceOption(int delta)
   1374         int     zzTraceGuessOption(int delta)
   1375         void    zzTraceReset()
   1376         int     zzTraceOptionValueDefault
   1377 
   1378       The argument "delta" is added to the traceOptionValue.  To
   1379       turn on tracing inside a particular rule one can write:
   1380 
   1381         rule : <<traceOption(+1);>>
   1382                (
   1383                 rest-of-rule
   1384                )
   1385                <<traceOption(-1);>>
   1386        ;  /* fail clause */ <<traceOption(-1);>>
   1387 
   1388       One can use the same idea to turn *off* tracing within a
   1389       rule by using a delta of (-1).
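
              The interaction of the two option values can be modeled
              in a few lines of C.  This is a sketch of the rules stated
              in this item, not the PCCTS implementation: the TraceModel
              type and the trace_model_* names are hypothetical, and it
              assumes that the guess option is adjusted by a delta in
              the same way as the main option.

```c
#include <stdbool.h>

typedef struct {
    int option_default; /* models [zz]traceOptionValueDefault */
    int option_value;   /* models [zz]traceOptionValue */
    int guess_value;    /* models [zz]traceGuessOptionValue */
} TraceModel;

/* Parser initialization: the default depends on whether -gd was given. */
void trace_model_init(TraceModel *m, bool gd_switch) {
    m->option_default = gd_switch ? 1 : 0;
    m->option_value = m->option_default;
    m->guess_value = 1;
}

/* Models [zz]traceReset(): copy the default back to the option value. */
void trace_model_reset(TraceModel *m) {
    m->option_value = m->option_default;
}

/* Models the effect of [zz]traceOption(delta) on the stored value. */
void trace_model_adjust(TraceModel *m, int delta) {
    m->option_value += delta;
}

/* Assumed to work the same way for [zz]traceGuessOption(delta). */
void trace_model_adjust_guess(TraceModel *m, int delta) {
    m->guess_value += delta;
}

/* Would a rule entry or exit be reported right now? */
bool trace_model_reports(const TraceModel *m, bool guessing) {
    if (m->option_value <= 0) return false;
    return !guessing || m->guess_value > 0;
}
```

              Because the values are counters rather than flags, paired
              +1/-1 adjustments in nested rules restore the previous
              state when the inner rule finishes.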
   1390 
   1391       An improvement in the rule trace was suggested by Sramji
   1392       Ramanathan (ps@kumaran.com).
   1393 
   1394 #108. A Note on Deallocation of Variables Allocated in Guess Mode
   1395 
   1396                             NOTE
   1397         ------------------------------------------------------
   1398         This mechanism only works for heap allocated variables
   1399         ------------------------------------------------------
   1400 
   1401       The rewrite of the trace provides the machinery necessary
   1402       to properly free variables or undo actions following a
   1403       failed guess.
   1404 
   1405       The macro zzUSER_GUESS_HOOK(guessSeq,zzrv) is expanded
   1406       as part of the zzGUESS macro.  When a guess is opened
   1407       the value of zzrv is 0.  When a longjmp() is executed to
   1408       undo the guess, the value of zzrv will be 1.
   1409 
   1410       The macro zzUSER_GUESS_DONE_HOOK(guessSeq) is expanded
   1411       as part of the zzGUESS_DONE macro.  This is executed
   1412       whether the guess succeeds or fails as part of closing
   1413       the guess.
   1414 
   1415       The guessSeq is a sequence number which is assigned to each
   1416       guess and is incremented by 1 for each guess which becomes
   1417       active.  It is needed by the user to associate the start of
   1418       a guess with the failure and/or completion (closing) of a
   1419       guess.
   1420 
   1421       Guesses are nested.  They must be closed in the reverse
   1422       of the order that they are opened.
   1423 
   1424       In order to free memory used by a variable during a guess
   1425       a user must write a routine which can be called to
   1426       register the variable along with the current guess sequence
   1427       number provided by the zzUSER_GUESS_HOOK macro. If the guess
   1428       fails, all variables tagged with the corresponding guess
   1429       sequence number should be released.  This is ugly, but
   1430       it would require a major rewrite of antlr 1.33 to use
   1431       some mechanism other than setjmp()/longjmp().
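
              One possible shape for such a registration routine is
              sketched below.  Everything here is an assumption made
              for illustration: the fixed-size table, the names
              guess_register() and guess_release(), and the choice to
              free blocks with free().

```c
#include <stdlib.h>

#define MAX_TRACKED 64

typedef struct {
    int   guess_seq; /* sequence number of the guess active at allocation */
    void *ptr;       /* heap block to free if that guess fails */
} Tracked;

static Tracked tracked[MAX_TRACKED];
static int tracked_count = 0;

/* Remember ptr under guess_seq.  Returns 0, or -1 if the table is full. */
int guess_register(int guess_seq, void *ptr) {
    if (tracked_count >= MAX_TRACKED) return -1;
    tracked[tracked_count].guess_seq = guess_seq;
    tracked[tracked_count].ptr = ptr;
    tracked_count++;
    return 0;
}

/* Free every block tagged with guess_seq; returns how many were freed. */
int guess_release(int guess_seq) {
    int freed = 0, kept = 0;
    for (int i = 0; i < tracked_count; i++) {
        if (tracked[i].guess_seq == guess_seq) {
            free(tracked[i].ptr);
            freed++;
        } else {
            tracked[kept++] = tracked[i];
        }
    }
    tracked_count = kept;
    return freed;
}

int guess_tracked_count(void) { return tracked_count; }

/* Could be called from a zzUSER_GUESS_HOOK definition:
   zzrv == 1 means the guess failed. */
void release_on_failed_guess(int guess_seq, int zzrv) {
    if (zzrv == 1) guess_release(guess_seq);
}
```

              The sketch leaves open what to do with entries whose guess
              succeeded; a real routine would have to drop or re-tag
              them when the guess is closed.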
   1432 
   1433       The order of calls for a *successful* guess would be:
   1434 
   1435         zzUSER_GUESS_HOOK(guessSeq,0);
   1436         zzUSER_GUESS_DONE_HOOK(guessSeq);
   1437 
   1438       The order of calls for a *failed* guess would be:
   1439 
   1440         zzUSER_GUESS_HOOK(guessSeq,0);
   1441         zzUSER_GUESS_HOOK(guessSeq,1);
   1442         zzUSER_GUESS_DONE_HOOK(guessSeq);
   1443 
   1444       The default definitions of these macros are empty strings.
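
              The two call sequences follow from the way setjmp() and
              longjmp() work, which the following self-contained C
              sketch imitates.  It is not the expansion of zzGUESS: the
              function run_guess() and the event log are hypothetical,
              and code 2 stands in for the "done" hook.

```c
#include <setjmp.h>

#define MAX_EVENTS 16

static jmp_buf guess_env;
static int events[MAX_EVENTS][2];  /* {guessSeq, code} pairs */
static int event_count = 0;

/* Stand-in for both user hooks: code 0 = start, 1 = failed, 2 = done. */
static void record_hook(int guess_seq, int code) {
    if (event_count < MAX_EVENTS) {
        events[event_count][0] = guess_seq;
        events[event_count][1] = code;
        event_count++;
    }
}

/* Models one guess.  Returns 1 if the guess succeeded, 0 if it failed. */
int run_guess(int guess_seq, int should_fail) {
    int succeeded;
    if (setjmp(guess_env) == 0) {
        record_hook(guess_seq, 0);       /* guess opened: zzrv is 0 */
        if (should_fail) longjmp(guess_env, 1);
        succeeded = 1;
    } else {
        record_hook(guess_seq, 1);       /* back via longjmp: zzrv is 1 */
        succeeded = 0;
    }
    record_hook(guess_seq, 2);           /* closing the guess */
    return succeeded;
}

int hook_event_count(void) { return event_count; }
int hook_event_code(int i) { return events[i][1]; }
```

              A successful guess records the codes 0, 2; a failed guess
              records 0, 1, 2, matching the two sequences listed above.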
   1445 
   1446       Here is an example in C++ mode.  The zzUSER_GUESS_HOOK and
   1447       zzUSER_GUESS_DONE_HOOK macros and myGuessHook() routine
   1448       can be used without change in both C and C++ versions.
   1449 
   1450       ----------------------------------------------------------------------
   1451         <<
   1452 
   1453         #include "AToken.h"
   1454 
   1455         typedef ANTLRCommonToken ANTLRToken;
   1456 
   1457         #include "DLGLexer.h"
   1458 
   1459         int main() {
   1460 
   1461           {
   1462             DLGFileInput     in(stdin);
   1463             DLGLexer         lexer(&in,2000);
   1464             ANTLRTokenBuffer pipe(&lexer,1);
   1465             ANTLRCommonToken aToken;
   1466             P                parser(&pipe);
   1467 
   1468             lexer.setToken(&aToken);
   1469             parser.init();
   1470             parser.start();
   1471           };
   1472 
   1473           fclose(stdin);
   1474           fclose(stdout);
   1475           return 0;
   1476         }
   1477 
   1478         >>
   1479 
   1480         <<
   1481         char *s=NULL;
   1482 
   1483         #undef zzUSER_GUESS_HOOK
   1484         #define zzUSER_GUESS_HOOK(guessSeq,zzrv) myGuessHook(guessSeq,zzrv);
   1485         #undef zzUSER_GUESS_DONE_HOOK
   1486         #define zzUSER_GUESS_DONE_HOOK(guessSeq)   myGuessHook(guessSeq,2);
   1487 
   1488         void myGuessHook(int guessSeq,int zzrv) {
   1489           if (zzrv == 0) {
   1490             fprintf(stderr,"User hook: starting guess #%d\n",guessSeq);
   1491           } else if (zzrv == 1) {
   1492             free (s);
   1493             s=NULL;
   1494             fprintf(stderr,"User hook: failed guess #%d\n",guessSeq);
   1495           } else if (zzrv == 2) {
   1496             free (s);
   1497             s=NULL;
   1498             fprintf(stderr,"User hook: ending guess #%d\n",guessSeq);
   1499           };
   1500         }
   1501 
   1502         >>
   1503 
   1504         #token A    "a"
   1505         #token      "[\t \ \n]"     <<skip();>>
   1506 
   1507         class P {
   1508 
   1509         start : (top)+
   1510               ;
   1511 
   1512         top   : (which) ?   <<fprintf(stderr,"%s is a which\n",s); free(s); s=NULL; >>
   1513               | other       <<fprintf(stderr,"%s is an other\n",s); free(s); s=NULL; >>
   1514               ; <<if (s != NULL) free(s); s=NULL; >>
   1515 
   1516         which : which2
   1517               ;
   1518 
   1519         which2 : which3
   1520               ;
   1521         which3
   1522               : (label)?         <<fprintf(stderr,"%s is a label\n",s);>>
   1523               | (global)?        <<fprintf(stderr,"%s is a global\n",s);>>
   1524               | (exclamation)?   <<fprintf(stderr,"%s is an exclamation\n",s);>>
   1525               ;
   1526 
   1527         label :       <<s=strdup(LT(1)->getText());>> A ":" ;
   1528 
   1529         global :      <<s=strdup(LT(1)->getText());>> A "::" ;
   1530 
   1531         exclamation : <<s=strdup(LT(1)->getText());>> A "!" ;
   1532 
   1533         other :       <<s=strdup(LT(1)->getText());>> "other" ;
   1534 
   1535         }
   1536       ----------------------------------------------------------------------
   1537 
   1538       This is a silly example, but illustrates the idea.  For the input
   1539       "a ::" with tracing enabled the output begins:
   1540 
   1541       ----------------------------------------------------------------------
   1542         enter rule "start" depth 1
   1543         enter rule "top" depth 2
   1544         User hook: starting guess #1
   1545         enter rule "which" depth 3 guessing
   1546         enter rule "which2" depth 4 guessing
   1547         enter rule "which3" depth 5 guessing
   1548         User hook: starting guess #2
   1549         enter rule "label" depth 6 guessing
   1550         guess failed
   1551         User hook: failed guess #2
   1552         guess done - returning to rule "which3" at depth 5 (guess mode continues
   1553                                                  - an enclosing guess is still active)
   1554         User hook: ending guess #2
   1555         User hook: starting guess #3
   1556         enter rule "global" depth 6 guessing
   1557         exit rule "global" depth 6 guessing
   1558         guess done - returning to rule "which3" at depth 5 (guess mode continues
   1559                                                  - an enclosing guess is still active)
   1560         User hook: ending guess #3
   1561         enter rule "global" depth 6 guessing
   1562         exit rule "global" depth 6 guessing
   1563         exit rule "which3" depth 5 guessing
   1564         exit rule "which2" depth 4 guessing
   1565         exit rule "which" depth 3 guessing
   1566         guess done - returning to rule "top" at depth 2 (guess mode ends)
   1567         User hook: ending guess #1
   1568         enter rule "which" depth 3
   1569         .....
   1570       ----------------------------------------------------------------------
   1571 
   1572       Remember:
   1573 
   1574         (a) Only init-actions are executed during guess mode.
   1575         (b) A rule can be invoked multiple times during guess mode.
   1576         (c) If the guess succeeds the rule will be called once more
   1577             without guess mode so that normal actions will be executed.
   1578             This means that the init-action might need to distinguish
   1579             between guess mode and non-guess mode using the variable
   1580             [zz]guessing.
   1581 
   1582 #101. (Changed in 1.33MR10) antlr -info command line switch
   1583 
   1584         -info
   1585 
   1586             p   - extra predicate information in generated file
   1587 
   1588             t   - information about tnode use:
   1589                     at the end of each rule in generated file
   1590                     summary on stderr at end of program
   1591 
   1592             m   - monitor progress
   1593                     prints name of each rule as it is started
   1594                     flushes output at start of each rule
   1595 
   1596             f   - first/follow set information to stdout
   1597 
   1598             0   - no operation (added in 1.33MR11)
   1599 
   1600       The options may be combined and may appear in any order.
   1601       For example:
   1602 
   1603         antlr -info ptm -CC -gt -mrhoist on mygrammar.g
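
              To show what "combined in any order" means, here is a
              small C sketch that turns the option letters into a bit
              mask.  It is not antlr's own option parser: the INFO_*
              names, the function name, and the treatment of unknown
              letters as an error are assumptions.

```c
#include <stddef.h>

enum {
    INFO_PRED    = 1 << 0,  /* p */
    INFO_TNODE   = 1 << 1,  /* t */
    INFO_MONITOR = 1 << 2,  /* m */
    INFO_FIRST   = 1 << 3   /* f */
};

/* Returns a bit mask for the letters in arg, or -1 for an unknown letter. */
int parse_info_letters(const char *arg) {
    int flags = 0;
    for (size_t i = 0; arg[i] != '\0'; i++) {
        switch (arg[i]) {
        case 'p': flags |= INFO_PRED;    break;
        case 't': flags |= INFO_TNODE;   break;
        case 'm': flags |= INFO_MONITOR; break;
        case 'f': flags |= INFO_FIRST;   break;
        case '0': break;                 /* no operation */
        default:  return -1;
        }
    }
    return flags;
}
```

              With this model "ptm" and "mtp" produce the same mask.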
   1604 
   1605 #100a. (Changed in 1.33MR10) Predicate tree simplification
   1606 
   1607       When the same predicates can be referenced in more than one
   1608       alternative of a block, large predicate trees can be formed.
   1609 
   1610       The difference that these optimizations make is so dramatic
   1611       that I have decided to use them even when -mrhoist is not selected.
   1612 
   1613       Consider the following grammar:
   1614 
   1615         start : ( all )* ;
   1616 
   1617         all   : a
   1618               | d
   1619               | e
   1620               | f
   1621               ;
   1622 
   1623         a     : c A B
   1624               | c A C
   1625               ;
   1626 
   1627         c     : <<AAA(LATEXT(2))>>?
   1628               ;
   1629 
   1630         d     : <<BBB(LATEXT(2))>>? B C
   1631               ;
   1632 
   1633         e     : <<CCC(LATEXT(2))>>? B C
   1634               ;
   1635 
   1636         f     : e X Y
   1637               ;
   1638 
   1639       In rule "a" there is a reference to rule "c" in both alternatives.
   1640       The length of the predicate AAA is k=2 and it can be followed in
   1641       alternative 1 only by (A B) while in alternative 2 it can be
   1642       followed only by (A C).  Thus they do not have identical context.
   1643 
   1644       In rule "all" the alternatives which refer to rules "e" and "f" allow
   1645       elimination of the duplicate reference to predicate CCC.
   1646 
   1647       The table below summarizes the kinds of simplification performed by
   1648       1.33MR10.  In the table, X, Y, and Z stand for single predicates
   1649       (not trees).
   1650 
   1651         (OR X (OR Y (OR Z)))  => (OR X Y Z)
   1652         (AND X (AND Y (AND Z)))  => (AND X Y Z)
   1653 
   1654         (OR X  (... (OR  X Y) ... ))     => (OR X (... Y ... ))
   1655         (AND X (... (AND X Y) ... ))     => (AND X (... Y ... ))
   1656         (OR X  (... (AND X Y) ... ))     => (OR X (...  ... ))
   1657         (AND X (... (OR  X Y) ... ))     => (AND X (...  ... ))
   1658 
   1659         (AND X)               => X
   1660         (OR X)                => X
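
              Two of the rewrites in the table, flattening nested nodes
              of the same kind and collapsing a node with a single
              child, can be sketched in C as follows.  The Pred type and
              both functions are hypothetical and much simpler than
              antlr's predicate trees; the rows that remove duplicate
              predicates are not implemented here.

```c
#include <assert.h>
#include <string.h>

enum Kind { LEAF, AND, OR };
#define MAX_KIDS 8

typedef struct Pred {
    enum Kind kind;
    char name;                    /* meaningful only for LEAF */
    int nkids;
    struct Pred *kids[MAX_KIDS];
} Pred;

/* Simplifies p in place and returns the node that should replace it. */
Pred *simplify(Pred *p) {
    Pred *flat[MAX_KIDS];
    int n = 0;
    if (p->kind == LEAF) return p;
    for (int i = 0; i < p->nkids; i++) {
        Pred *k = simplify(p->kids[i]);
        if (k->kind == p->kind) {
            /* (OR X (OR Y Z)) => (OR X Y Z), likewise for AND */
            for (int j = 0; j < k->nkids; j++) {
                assert(n < MAX_KIDS);
                flat[n++] = k->kids[j];
            }
        } else {
            assert(n < MAX_KIDS);
            flat[n++] = k;
        }
    }
    memcpy(p->kids, flat, (size_t)n * sizeof flat[0]);
    p->nkids = n;
    return n == 1 ? p->kids[0] : p;   /* (AND X) => X, (OR X) => X */
}

/* Appends p to out in the "(OR X Y Z)" notation; out must be large enough. */
void render(const Pred *p, char *out) {
    if (p->kind == LEAF) {
        size_t len = strlen(out);
        out[len] = p->name;
        out[len + 1] = '\0';
        return;
    }
    strcat(out, p->kind == AND ? "(AND" : "(OR");
    for (int i = 0; i < p->nkids; i++) {
        strcat(out, " ");
        render(p->kids[i], out);
    }
    strcat(out, ")");
}
```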
   1661 
   1662       In a test with a complex grammar for a real application, a predicate
   1663       tree with six OR nodes and 12 leaves was reduced to "(OR X Y Z)".
   1664 
   1665       In 1.33MR10 there is a greater effort to release memory used
   1666       by predicates once they are no longer in use.
   1667 
   1668 #100b. (Changed in 1.33MR10) Suppression of extra predicate tests
   1669 
   1670       The following optimizations require that -mrhoist be selected.
   1671 
   1672       It is relatively easy to optimize the code generated for predicate
   1673       gates when they are of the form:
   1674 
   1675             (AND X Y Z ...)
   1676         or  (OR  X Y Z ...)
   1677 
   1678       where X, Y, Z, and "..." represent individual predicates (leaves) not
   1679       predicate trees.
   1680 
   1681       If the predicate is an AND the contexts of the X, Y, Z, etc. are
   1682       ANDed together to create a single Tree context for the group and
   1683       context tests for the individual predicates are suppressed:
   1684 
   1685             --------------------------------------------------
   1686             Note: This was incorrect.  The contexts should be
   1687             ORed together.  This has been fixed.  A more 
   1688             complete description is available in item #152.
   1689             ---------------------------------------------------
   1690 
   1691       Optimization 1:  (AND X Y Z ...)
   1692 
   1693         Suppose the context for Xtest is LA(1)==LP and the context for
   1694         Ytest is LA(1)==LP && LA(2)==ID.
   1695 
   1696             Without the optimization the code would resemble:
   1697 
   1698                 if (lookaheadContext &&
   1699                     (!(LA(1)==LP && LA(1)==LP && LA(2)==ID) ||
   1700                         ( (!(LA(1)==LP) || Xtest) &&
   1701                           (!(LA(1)==LP && LA(2)==ID) || Ytest)
   1702                         ))) {...
   1703 
   1704             With the -mrhoist optimization the code would resemble:
   1705 
   1706                 if (lookaheadContext &&
   1707                     (!(LA(1)==LP && LA(2)==ID) || (Xtest && Ytest))) {...
   1708 
   1709       Optimization 2: (OR X Y Z ...) with identical contexts
   1710 
   1711         Suppose the context for Xtest is LA(1)==ID and for Ytest
   1712         the context is also LA(1)==ID.
   1713 
   1714             Without the optimization the code would resemble:
   1715 
   1716                 if (lookaheadContext &&
   1717                     (!(LA(1)==ID || LA(1)==ID) ||
   1718                         (LA(1)==ID && Xtest) ||
   1719                         (LA(1)==ID && Ytest))) {...
   1720 
   1721             With the -mrhoist optimization the code would resemble:
   1722 
   1723                 if (lookaheadContext &&
   1724                     (!(LA(1)==ID) || (Xtest || Ytest))) {...
   1725 
   1726       Optimization 3: (OR X Y Z ...) with distinct contexts
   1727 
   1728         Suppose the context for Xtest is LA(1)==ID and for Ytest
   1729         the context is LA(1)==LP.
   1730 
   1731             Without the optimization the code would resemble:
   1732 
   1733                 if (lookaheadContext &&
   1734                     (!(LA(1)==ID || LA(1)==LP) ||
   1735                         (LA(1)==ID && Xtest) ||
   1736                         (LA(1)==LP && Ytest))) {...
   1737 
   1738             With the -mrhoist optimization the code would resemble:
   1739 
   1740                 if (lookaheadContext &&
   1741                         (zzpf=0,
   1742                             (LA(1)==ID && (zzpf=1) && Xtest) ||
   1743                             (LA(1)==LP && (zzpf=1) && Ytest) ||
   1744                             !zzpf)) {
   1745 
   1746             These may appear to be of similar complexity at first,
   1747             but the non-optimized version contains two tests of each
   1748             context while the optimized version contains only one
   1749             such test, as well as eliminating some of the inverted
   1750             logic (" !(...) || ").
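
              That the flag-based form of Optimization 3 accepts exactly
              the same inputs as the plain form can be checked with a
              short C sketch.  The token codes and function names are
              hypothetical, and lookaheadContext is left out because it
              is the same in both versions.

```c
#include <stdbool.h>

enum { TOK_ID = 1, TOK_LP = 2, TOK_OTHER = 3 };  /* hypothetical codes */

/* Gate without the optimization: each context is tested twice. */
bool gate_plain(int la1, bool xtest, bool ytest) {
    return !(la1 == TOK_ID || la1 == TOK_LP) ||
           (la1 == TOK_ID && xtest) ||
           (la1 == TOK_LP && ytest);
}

/* Gate with the zzpf flag: each context is tested once, and the flag
   records whether any context matched. */
bool gate_with_flag(int la1, bool xtest, bool ytest) {
    int zzpf = 0;
    return (la1 == TOK_ID && (zzpf = 1) && xtest) ||
           (la1 == TOK_LP && (zzpf = 1) && ytest) ||
           !zzpf;
}

/* Returns true if both gates agree for every combination of inputs. */
bool gates_agree(void) {
    const int tokens[] = { TOK_ID, TOK_LP, TOK_OTHER };
    for (int t = 0; t < 3; t++)
        for (int x = 0; x <= 1; x++)
            for (int y = 0; y <= 1; y++)
                if (gate_plain(tokens[t], x, y) !=
                    gate_with_flag(tokens[t], x, y))
                    return false;
    return true;
}
```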
   1751 
   1752       Optimization 4: Computation of predicate gate trees
   1753 
   1754         When generating code for the gates of predicate expressions
   1755         antlr 1.33 vanilla uses a recursive procedure to generate
   1756         "&&" and "||" expressions for testing the lookahead. As each
   1757         layer of the predicate tree is exposed a new set of "&&" and
   1758         "||" expressions on the lookahead are generated.  In many
   1759         cases the lookahead being tested has already been tested.
   1760 
   1761         With -mrhoist a lookahead tree is computed for the entire
   1762         lookahead expression.  This means that predicates with identical
   1763         context or context which is a subset of another predicate's
   1764         context disappear.
   1765 
   1766         This is especially important for predicates formed by rules
   1767         like the following:
   1768 
   1769             upperCaseVowel : <<isUpperCase(LATEXT(1))>>?  vowel;
   1770             vowel          : <<isVowel(LATEXT(1))>>? LETTERS;
   1771 
   1772         These predicates are combined using AND since both must be
   1773         satisfied for rule upperCaseVowel.  They have identical
   1774         context which makes this optimization very effective.
   1775 
   1776       The effect of Items #100a and #100b together can be dramatic.  In
   1777       a very large (but real world) grammar one particular predicate
   1778       expression was reduced from an (unreadable) 50 predicate leaves,
   1779       195 LA(1) terms, and 5500 characters to an (easily comprehensible)
   1780       3 predicate leaves (all different) and a *single* LA(1) term.
   1781 
   1782 #98.  (Changed in 1.33MR10) Option "-info p"
   1783 
   1784       When the user selects option "-info p" the program will generate
   1785       detailed information about predicates.  If the user selects
   1786       "-mrhoist on" additional detail will be provided explaining
   1787       the promotion and suppression of predicates.  The output is part
   1788       of the generated file and sandwiched between #if 0/#endif statements.
   1789 
   1790       Consider the following k=1 grammar:
   1791 
   1792         start : ( all ) * ;
   1793 
   1794         all   : ( a
   1795                 | b
   1796                 )
   1797                 ;
   1798 
   1799         a     : c B
   1800               ;
   1801 
   1802         c     : <<LATEXT(1)>>?
   1803               | B
   1804               ;
   1805 
   1806         b     : <<LATEXT(1)>>? X
   1807               ;
   1808 
   1809       Below is an excerpt of the output for rule "start" for the three
   1810       predicate options (off, on, and maintenance release style hoisting).
   1811 
   1812       For those who do not wish to use the "-mrhoist on" option for code
   1813       generation the option can be used in a "diagnostic" mode to provide
   1814       valuable information:
   1815 
   1816             a. where one should insert null actions to inhibit hoisting
   1817             b. a chain of rule references which shows where predicates are
   1818                being hoisted
   1819 
   1820       ======================================================================
   1821       Example of "-info p" with "-mrhoist on"
   1822       ======================================================================
   1823         #if 0
   1824 
   1825         Hoisting of predicate suppressed by alternative without predicate.
   1826         The alt without the predicate includes all cases where the
   1827            predicate is false.
   1828 
   1829            WITH predicate: line 11  v36.g
   1830            WITHOUT predicate: line 12  v36.g
   1831 
   1832         The context set for the predicate:
   1833 
   1834              B
   1835 
   1836         The lookahead set for alt WITHOUT the semantic predicate:
   1837 
   1838              B
   1839 
   1840         The predicate:
   1841 
   1842           pred <<  LATEXT(1)>>?  depth=k=1  rule c  line 11  v36.g
   1843 
   1844             set context:
   1845                B
   1846             tree context: null
   1847 
   1848         Chain of referenced rules:
   1849 
   1850             #0  in rule start (line 1 v36.g) to rule all
   1851             #1  in rule all (line 3 v36.g) to rule a
   1852             #2  in rule a (line 8 v36.g) to rule c
   1853             #3  in rule c (line 11 v36.g)
   1854 
   1855         #endif
   1856         &&
   1857         #if 0
   1858 
   1859         pred <<  LATEXT(1)>>?  depth=k=1  rule b  line 15  v36.g
   1860 
   1861           set context:
   1862              X
   1863           tree context: null
   1864 
   1865         #endif
   1866       ======================================================================
   1867       Example of "-info p"  with the default -prc setting ( "-prc off")
   1868       ======================================================================
   1869         #if 0
   1870 
   1871         OR
   1872           pred <<  LATEXT(1)>>?  depth=k=1  rule c  line 11  v36.g
   1873 
   1874             set context:
   1875               nil
   1876             tree context: null
   1877 
   1878           pred <<  LATEXT(1)>>?  depth=k=1  rule b  line 15  v36.g
   1879 
   1880             set context:
   1881               nil
   1882             tree context: null
   1883 
   1884         #endif
   1885       ======================================================================
   1886       Example of "-info p" with "-prc on" and "-mrhoist off"
   1887       ======================================================================
   1888         #if 0
   1889 
   1890         OR
   1891           pred <<  LATEXT(1)>>?  depth=k=1  rule c  line 11  v36.g
   1892 
   1893             set context:
   1894                B
   1895             tree context: null
   1896 
   1897           pred <<  LATEXT(1)>>?  depth=k=1  rule b  line 15  v36.g
   1898 
   1899             set context:
   1900                X
   1901             tree context: null
   1902 
   1903         #endif
   1904       ======================================================================
   1905 
   1906 #60.  (Changed in 1.33MR7) Major changes to exception handling
   1907 
   1908         There were significant problems in the handling of exceptions
   1909         in 1.33 vanilla.  The general problem is that it could only
   1910         process one level of exception handler.  For example, a named
   1911         exception handler, an exception handler for an alternative, or
   1912         an exception handler for a subrule always went to the rule's exception
   1913         handler if there was no "catch" which matched the exception.
   1914 
   1915         In 1.33MR7 the exception handlers properly "nest".  If an
   1916         exception handler does not have a matching "catch" then the
   1917         nextmost outer exception handler is checked for an appropriate
   1918         "catch" clause, and so on until an exception handler with an
   1919         appropriate "catch" is found.
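
                The outward search can be pictured with a short C
                sketch.  The Handler structure, its fields, and the
                signal numbers used to exercise it are hypothetical;
                the generated parser does not contain such a table.

```c
#include <stddef.h>

#define MAX_CATCHES 4

typedef struct Handler {
    const char *name;               /* label used only for illustration */
    int catches[MAX_CATCHES];       /* signal values with a "catch" clause */
    int ncatches;
    const struct Handler *outer;    /* next enclosing handler, or NULL */
} Handler;

/* Walks outward from the innermost handler and returns the first one
   with a matching "catch", or NULL if no handler catches the signal. */
const Handler *find_handler(const Handler *innermost, int signal) {
    for (const Handler *h = innermost; h != NULL; h = h->outer) {
        for (int i = 0; i < h->ncatches; i++) {
            if (h->catches[i] == signal) return h;
        }
    }
    return NULL;
}
```

                In 1.33 vanilla the search would in effect have stopped
                after one step and gone straight to the rule's handler.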
   1920 
   1921         There are still undesirable features in the way exception
   1922         handlers are implemented, but I do not have time to fix them
   1923         at the moment:
   1924 
   1925             The exception handlers for alternatives are outside the
   1926             block containing the alternative.  This makes it impossible
   1927             to access variables declared in a block or to resume the
   1928             parse by "falling through".  The parse can still be easily
   1929             resumed in other ways, but not in the most natural fashion.
   1930 
   1931             This results in an inconsistency between named exception
   1932             handlers and exception handlers for alternatives.  When
   1933             an exception handler for an alternative "falls through"
   1934             it goes to the nextmost outer handler - not the "normal
   1935             action".
   1936 
   1937         A major difference between 1.33MR7 and 1.33 vanilla is
   1938         the default action after an exception is caught:
   1939 
   1940             1.33 Vanilla
   1941             ------------
   1942             In 1.33 vanilla the signal value is set to zero ("NoSignal")
   1943             and the code drops through to the code following the exception.
   1944             For named exception handlers this is the "normal action".
   1945             For alternative exception handlers this is the rule's handler.
   1946 
   1947             1.33MR7
   1948             -------
   1949             In 1.33MR7 the signal value is NOT automatically set to zero.
   1950 
   1951             There are two cases:
   1952 
   1953                 For named exception handlers: if the signal value has been
   1954                 set to zero the code drops through to the "normal action".
   1955 
   1956                 For all other cases the code branches to the nextmost outer
   1957                 exception handler until it reaches the handler for the rule.
   1958 
   1959         The following macros have been defined for convenience:
   1960 
   1961             C/C++ Mode Name
   1962             --------------------
   1963             (zz)suppressSignal
   1964                   set signal & return signal arg to 0 ("NoSignal")
   1965             (zz)setSignal(intValue)
   1966                   set signal & return signal arg to some value
   1967             (zz)exportSignal
   1968                   copy the signal value to the return signal arg
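
                The effect of the three macros on the two values can be
                modeled as follows.  This is a sketch only: SignalState
                and the model_* functions are hypothetical stand-ins,
                not the macro definitions shipped with PCCTS.

```c
/* Hypothetical stand-ins for the two values the text distinguishes. */
typedef struct {
    int signal;      /* the local signal value */
    int *retsignal;  /* where the caller receives the return signal arg */
} SignalState;

/* Models (zz)suppressSignal: both values become 0 ("NoSignal"). */
void model_suppress_signal(SignalState *s) {
    s->signal = 0;
    *s->retsignal = 0;
}

/* Models (zz)setSignal(intValue): both values become intValue. */
void model_set_signal(SignalState *s, int value) {
    s->signal = value;
    *s->retsignal = value;
}

/* Models (zz)exportSignal: copy the local value to the return argument. */
void model_export_signal(SignalState *s) {
    *s->retsignal = s->signal;
}
```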
   1969 
   1970         I'm not sure why PCCTS makes a distinction between the local
   1971         signal value and the return signal argument, but I'm loath
   1972         to change the code. The burden of copying the local signal
   1973         value to the return signal argument can be given to the
   1974         default signal handler, I suppose.
   1975 
   1976 #53.  (Explanation for 1.33MR6) What happens after an exception is caught?
   1977 
   1978         The Book is silent about what happens after an exception
   1979         is caught.
   1980 
   1981         The following code fragment prints "Error Action" followed
   1982         by "Normal Action".
   1983 
   1984         test : Word ex:Number <<printf("Normal Action\n");>>
   1985                 exception[ex]
   1986                    catch NoViableAlt:
   1987                         <<printf("Error Action\n");>>
   1988         ;
   1989 
   1990         The reason for "Normal Action" is that the normal flow of the
   1991         program after a user-written exception handler is to "drop through".
   1992         In the case of an exception handler for a rule this results in
   1993         the execution of a "return" statement.  In the case of an
   1994         exception handler attached to an alternative, rule, or token
   1995         this is the code that would have executed had there been no
   1996         exception.
   1997 
   1998         The user can skip the normal action by ending the handler
   1999         with a "return" statement:
   2000 
   2001         test : Word ex:Number <<printf("Normal Action\n");>>
   2002                 exception[ex]
   2003                    catch NoViableAlt:
   2004                         <<printf("Error Action\n"); return;>>
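
                The control flow of the two variants of rule "test", as
                described in this item, can be imitated in plain C.
                This is a model, not the code antlr generates; all
                names are hypothetical.

```c
#define MAX_LOG 4

static const char *action_log[MAX_LOG];
static int action_count = 0;

static void log_action(const char *text) {
    if (action_count < MAX_LOG) action_log[action_count++] = text;
}

/* Models rule "test": number_ok says whether Number matched, and
   handler_returns selects the variant whose handler ends in "return". */
void model_test_rule(int number_ok, int handler_returns) {
    action_count = 0;
    if (!number_ok) {
        log_action("Error Action");
        if (handler_returns) return;   /* skip the normal action */
    }
    log_action("Normal Action");       /* reached by dropping through */
}

int logged_action_count(void) { return action_count; }
const char *logged_action(int i) { return action_log[i]; }
```

                Without the "return" a failed match logs both actions;
                with it, only the error action is logged.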
   2005         ;
   2006 
   2007         The most powerful mechanism for recovery from parse errors
   2008         in pccts is syntactic predicates because they provide
   2009         backtracking.  Exceptions allow "return", "break",
   2010         "consumeUntil(...)", "goto _handler", "goto _fail", and
   2011         changing the _signal value.
   2012 
   2013 #41.  (Added in 1.33MR6) antlr -stdout
   2014 
   2015         Using "antlr -stdout ..." forces the text that would
   2016         normally go to the grammar.c or grammar.cpp file to
   2017         stdout.
   2018 
   2019 #40.  (Added in 1.33MR6) antlr -tab to change tab stops
   2020 
   2021         Using "antlr -tab number ..." changes the tab stops
   2022         for the grammar.c or grammar.cpp file.  The number
   2023         must be between 0 and 8.  Using 0 gives tab characters,
   2024         values between 1 and 8 give the appropriate number of
   2025         space characters.
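
                The rule for the number can be illustrated with a small
                C function that produces one level of indentation.  The
                function name and its error handling are assumptions;
                antlr's own output routine is not shown here.

```c
#include <stddef.h>

/* Writes the indentation for one level into out (capacity cap, including
   the terminating NUL).  Returns the number of characters written, or -1
   if tab is outside 0..8 or the buffer is too small. */
int write_indent(char *out, size_t cap, int tab) {
    if (tab < 0 || tab > 8) return -1;
    if (tab == 0) {
        if (cap < 2) return -1;
        out[0] = '\t';                 /* 0 selects a real tab character */
        out[1] = '\0';
        return 1;
    }
    if (cap < (size_t)tab + 1) return -1;
    for (int i = 0; i < tab; i++) out[i] = ' ';
    out[tab] = '\0';
    return tab;
}
```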
   2026 
   2027 #34.  (Added to 1.33MR1) Add public DLGLexerBase::set_line(int newValue)
   2028 
   2029         Previously there was no public function for changing the line
   2030         number maintained by the lexer.
   2031 
   2032 #28.   (Added to 1.33MR1) More control over DLG header
   2033 
   2034         Version 1.33MR1 adds the following directives to PCCTS
   2035         for C++ mode:
   2036 
   2037           #lexprefix  <<source code>>
   2038 
   2039                 Adds source code to the DLGLexer.h file
   2040                 after the #include "DLexerBase.h" but
   2041                 before the start of the class definition.
   2042 
   2043           #lexmember  <<source code>>
   2044 
   2045                 Adds source code to the DLGLexer.h file
   2046                 as part of the DLGLexer class body.  It
   2047                 appears immediately after the start of
   2048                 the class and a "public:" statement.
   2049 
   2050