      1 =======================================================================
      2 List of Implemented Fixes and Changes for Maintenance Releases of PCCTS
      3 =======================================================================
      4 
      5                                DISCLAIMER
      6 
      7  The software and these notes are provided "as is".  They may include
      8  typographical or technical errors and their authors disclaim all
      9  liability of any kind or nature for damages due to error, fault,
     10  defect, or deficiency regardless of cause.  All warranties of any
     11  kind, either express or implied, including, but not limited to, the
     12  implied  warranties of merchantability and fitness for a particular
     13  purpose are disclaimed.
     14 
     15 
     16         -------------------------------------------------------
     17         Note:  Items #153 to #1 are now in a separate file named
     18                 CHANGES_FROM_133_BEFORE_MR13.txt
     19         -------------------------------------------------------
     20         
     21 #312. (Changed in MR33) Bug caused by change #299.
     22 
     23 	In change #299 a warning message was suppressed when there was
     24 	no LT(1) in a semantic predicate and max(k,ck) was 1.  The 
     25 	change caused the code which set a default predicate depth for
     26 	the semantic predicate to be left as 0 rather than set to 1.
     27 	
     28 	This manifested as an error at line #1559 of mrhost.c
     29 	
     30 	Reported by Peter Dulimov.
     31 	    
     32 #311. (Changed in MR33) Added sorcer/lib to Makefile.
     33 
     34     Reported by Dale Martin.
     35             
     36 #310. (Changed in MR32) In C mode zzastPush was spelled zzastpush in one case.
     37 
     38     Reported by Jean-Claude Durand
     39     
     40 #309. (Changed in MR32) Renamed baseName because of VMS name conflict
     41 
     42     Renamed baseName to pcctsBaseName to avoid library name conflict with
     43     VMS library routine.  Reported by Jean-François PIRONNE.
     44     
     45 #308. (Changed in MR32) Used "template" as name of formal in C routine
     46 
     47 	In astlib.h routine ast_scan a formal was named "template".  This caused
     48 	problems when the C code was compiled with a C++ compiler.  Reported by
     49 	Sabyasachi Dey.
     50             
     51 #307. (Changed in MR31) Compiler dependent bug in function prototype generation
     52     
     53     The code which generated function prototypes contained a bug which
     54     was compiler/optimization dependent.  Under some circumstances an
     55     extra character would be included in portions of a function prototype.
     56     
     57     Reported by David Cook.
     58     
     59 #306. (Changed in MR30) Validating predicate following a token
     60 
     61     A validating predicate which immediately followed a token match 
     62     consumed the token after the predicate rather than before.  Prior
     63     to this fix (in the following example) isValidTimeScaleValue() in
     64     the predicate would test the text for TIMESCALE rather than for
     65     NUMBER:
     66      
     67 		time_scale :
     68     		TIMESCALE
     69     		<<isValidTimeScaleValue(LT(1)->getText())>>?
     70     		ts:NUMBER
     71     		( us:MICROSECOND << tVal = ...>>
     72     		| ns:NANOSECOND << tVal = ...  >>
     73     		)
     74 	
     75 	Reported by Adalbert Perbandt.
     76 	
     77 #305. (Changed in MR30) Alternatives with guess blocks inside (...)* blocks.
     78 
     79 	In MR14 change #175 fixed a bug in the prediction expressions for guess
     80 	blocks which were of the form (alpha)? beta.  Unfortunately, this
     81 	resulted in a new bug as exemplified by the example below, which computed
     82 	the first set for r as {B} rather than {B C}:
     83 	
     84 					r : ( (A)? B
     85 					    | C
     86 						)*
     87   
     88     This example doesn't make any sense as A is not a prefix of B, but it
     89     illustrates the problem.  This bug did not appear for:
     90     
     91     				r : ( (A)?
     92     				    | C
     93     				    )*
     94 
     95 	because it does not use the (alpha)? beta form.
     96 
     97 	Item #175 fixed an asymmetry in ambiguity messages for the following
     98 	constructs which appear to have identical ambiguities (between repeating
     99 	the loop vs. exiting the loop).  MR30 retains this fix, but the implementation
    100 	is slightly different.
    101 	
    102 	          r_star : ( (A B)? )* A ;
    103 	          r_plus : ( (A B)? )+ A ;
    104 
    105     Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).
    106     
    107 #304. (Changed in MR30) Crash when mismatch between output value counts.
    108 
    109 	For a rule such as:
    110 	
    111 		r1 : r2>[i,j];
    112 		r2 >[int i, int j] : A;
    113 		
    114 	If there were extra actuals for the reference to rule r2 from rule r1
    115 	then antlr would crash.  This bug was introduced by change #276.
    116 
    117 	Reported by Sinan Karasu.
    118 	
    119 #303. (Changed in MR30) DLGLexerBase::replchar
    120 
    121 	DLGLexerBase::replchar and the C mode routine zzreplchar did not work 
    122 	properly when the new character was 0.
    123       
    124     Reported with fix by Philippe Laporte
    125 
    126 #302. (Changed in MR28) Fix significant problems in initial release of MR27.
    127 
    128 #301. (Changed in MR27) Default tab stops set to 2 spaces.
    129 
    130     To have antlr generate true tabs rather than spaces, use "antlr -tab 0".
    131     To generate 4 spaces per tab stop use "antlr -tab 4"
    132     
    133 #300. (Changed in MR27)
    134 
    135 	Consider the following methods of constructing an AST from ID:
    136 	
    137         rule1!
    138                 : id:ID << #0 = #[id]; >> ;
    139         
    140         rule2!
    141                 : id:ID << #0 = #id; >> ;
    142         
    143         rule3
    144                 : ID ;
    145         
    146         rule4
    147                 : id:ID << #0 = #id; >> ;
    148         
    149     For rule2, the AST corresponding to id would always be NULL.  This
    150     is because the user explicitly suppressed AST construction using the
    151     "!" operator on the rule.  In MR27 the use of an AST expression
    152     such as #id overrides the "!" operator and forces construction of
    153     the AST.
    154     
    155     This fix does not apply to C mode ASTs when the ASTs are referenced
    156     using numbers rather than symbols.
    157 
    158 	For C mode, this requires that the (optional) function/macro zzmk_ast
    159 	be defined.  This function copies information from an attribute into
    160 	a previously allocated AST.
    161 
    162     Reported by Jan Langer (jan langernetz.de)
    163 
    164 #299. (Changed in MR27) Don't warn if k=1 and semantic predicate missing LT(i)
    165 
    166     If a semantic predicate does not reference LT(i) (or, in C mode, LATEXT(i))
    167     then pccts doesn't know how many lookahead tokens to use for context.
    168     However, if max(k,ck) is 1 then there is really only one choice and
    169     the warning is unnecessary.
    170     
    171 #298. (Changed in MR27) Removed "register" for lastpos in dlgauto.c zzgettok
    172 
    173 #297. (Changed in MR27) Incorrect prototypes when used with classic C
    174 
    175     There were a number of errors in function headers when antlr was
    176     built with compilers that do not have __STDC__ or __cplusplus set.
    177     
    178     The functions which have variable length argument lists now use
    179     PCCTS_USE_STDARG rather than __USE_PROTOTYPES__ to determine
    180     whether to use stdargs or varargs.
    181 
    182 #296. (Changed in MR27) Complex return types in rules.
    183 
    184     The following return type was not properly handled when 
    185     unpacking a struct containing multiple return values:
    186     
    187       rule > [int i, IIR_Bool (IIR_Decl::*constraint)()] : ...    
    188 
    189     Instead of using "constraint", the program got lost and used
    190     an empty string.
    191     
    192     Reported by P.A. Wilsey.
    193 
    194 #295. (Changed in MR27) Extra ";" following zzGUESS_DONE sometimes.
    195 
    196     Certain constructs with guess blocks in MR23 led to extra ";"
    197     preceding the "else" clause of an "if".
    198 
    199     Reported by P.A. Wilsey.
    200     
    201 #294. (Changed in MR27) Infinite loop in antlr for nested blocks
    202 
    203     An oversight in detecting an empty alternative sometimes led
    204     to an infinite loop in antlr when it encountered a rule with
    205     nested blocks and guess blocks.
    206     
    207     Reported by P.A. Wilsey.
    208     
    209 #293. (Changed in MR27) Sorcerer optimization of _t->type()
    210 
    211     Sorcerer generated code may contain many calls to _t->type() in a
    212     single statement.  This change introduces a temporary variable
    213     to eliminate unnecessary function calls.
    214 
    215     Change implemented by Tom Molteno (tim videoscript.com).
    216 
    217 #292. (Changed in MR27)
    218 
    219     WARNING:  Item #267 changes the signature of methods in the AST class.
    220 
    221     **** Be sure to revise your AST functions of the same name  ***
    222 
    223 #291. (Changed in MR24)
    224 
    225     Fix to serious code generation error in MR23 for (...)+ block.
    226 
    227 #290. (Changed in MR23) 
    228 
    229     Item #247 describes a change in the way {...} blocks handled
    230     an error.  Consider:
    231 
    232             r1 : {A} b ;
    233             b  : B;
    234                 
    235                 with input "C".
    236 
    237     Prior to change #247, the error would resemble "expected B -
    238     found C".  This is correct but incomplete, and therefore
    239     misleading.  In #247 it was changed to "expected A, B - found
    240     C".  This was fine, except for users of parser exception
    241     handling because the exception was generated in the epilogue 
    242     for {...} block rather than in rule b.  This made it difficult
    243     for users of parser exception handling because B was not
    244     expected in that context. Those not using parser exception
    245     handling didn't notice the difference.
    246 
    247     The current change restores the behavior prior to #247 when
    248     parser exceptions are present, but retains the revised behavior
    249     otherwise.  This change should be visible only when exceptions
    250     are in use and only for {...} blocks and sub-blocks of the form
    251     (something|something | something | epsilon) where epsilon represents
    252     an empty production and it is the last alternative of a sub-block.
    253     In contrast, (something | epsilon | something) should generate the
    254     same code as before, even when exceptions are used.
    255     
    256     Reported by Philippe Laporte (philippe at transvirtual.com).
    257 
    258 #289. (Changed in MR23) Bug in matching complement of a #tokclass
    259 
    260     Prior to MR23 when a #tokclass was matched in both its complemented form
    261     and uncomplemented form, the bit set generated for its first use was used
    262     for both cases.  However, the prediction expression was correctly computed
    263     in both cases.  This meant that the second case would never be matched
    264     because, for the second appearance, the prediction expression and the 
    265     set to be matched would be complements of each other.
    266         
    267     Consider:
    268         
    269                 #token A "a"
    270                 #token B "b"
    271                 #token C "c"
    272                 #tokclass AB {A B}
    273                 
    274                 r1 : AB    /* alt 1x */
    275                    | ~AB   /* alt 1y */
    276                    ;
    277         
    278     Prior to MR23, this resulted in alternative 1y being unreachable.  Had it
    279     been written:
    280         
    281                 r2 : ~AB  /* alt 2x */
    282                    | AB   /* alt 2y */
    283                    
    284     then alternative 2y would have become unreachable.        
    285         
    286     This bug was only for the case of complemented #tokclass.  For complemented
    287     #token the proper code was generated.           
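
            The effect can be modeled with two set membership tests per
            alternative: one against the prediction set and one against the
            set actually matched.  The sketch below is a simplified model
            under that assumption, not the code antlr generates; the names
            TokenSet, alternativeAccepts, alt1yBeforeFix and alt1yAfterFix
            are invented for illustration.

```cpp
#include <bitset>

enum Token { A, B, C, TokenCount };
typedef std::bitset<TokenCount> TokenSet;

// Models "#tokclass AB {A B}".
TokenSet classAB() {
    TokenSet s;
    s.set(A);
    s.set(B);
    return s;
}

// An alternative is taken only if the prediction set contains the token and
// the token is then found in the set used for the match.
bool alternativeAccepts(const TokenSet &predict, const TokenSet &match, Token t) {
    return predict.test(t) && match.test(t);
}

// Before MR23, alt 1y (~AB): the prediction was computed correctly, but the
// match reused the bit set generated for the first use (AB).
bool alt1yBeforeFix(Token t) {
    return alternativeAccepts(~classAB(), classAB(), t);
}

// After the fix each form has its own match set.
bool alt1yAfterFix(Token t) {
    return alternativeAccepts(~classAB(), ~classAB(), t);
}
```

            Because the prediction set and the match set are complements in
            alt1yBeforeFix, no token can satisfy both tests, which is why the
            second alternative was unreachable.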
    288         
    289 #288. (Changed in MR23) #errclass not restricted to choice points
    290 
    291     The #errclass directive is supposed to allow a programmer to define
    292     print strings which should appear in syntax error messages as a replacement
    293     for some combinations of tokens. For instance:
    294     
    295             #errclass Operator {PLUS MINUS TIMES DIVIDE}
    296             
    297     If a syntax message includes all four of these tokens, and there is no
    298     "better" choice of error class, the word "Operator" will be used rather
    299     than a list of the four token names.
    300         
    301     Prior to MR23 the #errclass definitions were used only at choice points
    302     (which call the FAIL macro). In other cases where there was no choice
    303     (e.g. where a single token or token class were matched) the #errclass
    304     information was not used.
    305 
    306     With MR23 the #errclass declarations are used for syntax error messages
    307     when matching a #tokclass, a wildcard (i.e. "*"), or the complement of a
    308     #token or #tokclass (e.g. ~Operator).
    309 
    310     Please note that #errclass may now be defined using #tokclass names 
    311     (see Item #284).
    312 
    313     Reported by Philip A. Wilsey.
    314 
    315 #287. (Changed in MR23) Print name for #tokclass
    316 
    317     Item #148 describes how to give a print name to a #token so that, for
    318     example, #token ID could have the expression "identifier" in syntax
    319     error messages.  This has been extended to #tokclass:
    320     
    321             #token ID("identifier")  "[a-zA-Z]+"
    322             #tokclass Primitive("primitive type") 
    323                                     {INT, FLOAT, CHAR, FLOAT, DOUBLE, BOOL} 
    324 
    325     This is really a cosmetic change, since #tokclass names do not appear
    326     in any error messages.
    327         
    328 #286. (Changed in MR23) Makefile change to use of cd
    329 
    330     In cases where a pccts subdirectory name matched a directory identified
    331     in a $CDPATH environment variable the build would fail.  All makefile
    332     cd commands have been changed from "cd xyz" to "cd ./xyz" in order
    333     to avoid this problem.
    334         
    335 #285. (Changed in MR23) Check for null pointers in some dlg structures
    336 
    337     An invalid regular expression can cause dlg to build an invalid
    338     structure to represent the regular expression even while it issues 
    339     error messages.  Additional pointer checks were added.
    340 
    341     Reported by Robert Sherry.
    342 
    343 #284. (Changed in MR23) Allow #tokclass in #errclass definitions
    344 
    345     Previously, a #tokclass reference in the definition of an
    346     #errclass was not handled properly. Instead of being expanded
    347     into the set of tokens represented by the #tokclass it was
    348     treated somewhat like an #errclass.  However, in a later phase
    349     when all #errclass were expanded into the corresponding tokens
    350     the #tokclass reference was not expanded (because it wasn't an
    351     #errclass).  In effect the reference was ignored.
    352 
    353     This has been fixed.
    354 
    355     Problem reported by Mike Dimmick (mike dimmick.demon.co.uk).
    356 
    357 #283. (Changed in MR23) Option -tmake invokes parser's tmake
    358 
    359     When the string #(...) appears in an action antlr replaces it with
    360     a call to ASTBase::tmake(...) to construct an AST.  It is sometimes
    361     useful to change the tmake routine so that it has access to information
    362     in the parser - something which is not possible with a static method
    363     in an application where there may be multiple parsers active.
    364 
    365     The antlr option -tmake replaces the call to ASTBase::tmake with a call
    366     to a user supplied tmake routine.
    367    
    368 #282. (Changed in MR23) Initialization error for DBG_REFCOUNTTOKEN
    369 
    370     When the pre-processor symbol DBG_REFCOUNTTOKEN is defined 
    371     incorrect code is generated to initialize ANTLRRefCountToken::ctor and
    372     dtor.
    373 
    374     Fix reported by Sven Kuehn (sven sevenkuehn.de).
    375    
    376 #281. (Changed in MR23) Addition of -noctor option for Sorcerer
    377 
    378     Added a -noctor option to suppress generation of the blank ctor
    379     for users who wish to define their own ctor.
    380 
    381     Contributed by Jan Langer (jan langernetz.de).
    382 
    383 #280. (Changed in MR23) Syntax error message for EOF token
    384 
    385     The EOF token now receives special treatment in syntax error messages
    386     because there is no text matched by the eof token.  The token name
    387     of the eof token is used unless it is "@" - in which case the string
    388     "<eof>" is used.
    389 
    390     Problem reported by Erwin Achermann (erwin.achermann switzerland.org).
    391 
    392 #279. (Changed in MR23) Exception groups
    393 
    394     There was a bug in the way that exception groups were attached to
    395     alternatives which caused problems when there was a block contained
    396     in an alternative.  For instance, in the following rule:
    397 
    398         statement : IF S { ELSE S } 
    399                         exception ....
    400         ;
    401 
    402     the exception would be attached to the {...} block instead of the 
    403     entire alternative because it was attached, in error, to the last
    404     alternative instead of the last OPEN alternative.
    405 
    406     Reported by Ty Mordane (tymordane hotmail.com).
    407     
    408 #278. (Changed in MR23) makefile changes
    409 
    410     Contributed by Tomasz Babczynski (faster lab05-7.ict.pwr.wroc.pl).
    411 
    412     The -cfile option is not absolutely needed: when extension of
    413     source file is one of the well-known C/C++ extensions it is 
    414     treated as C/C++ source
    415 
    416     The gnu make defines the CXX variable as the default C++ compiler
    417     name, so I added a line to copy this (if defined) to the CCC var.
    418 
    419     Added a -sor option: after it any -class command defines the class
    420     name for sorcerer, not for ANTLR.  A file extended with .sor is 
    421     treated as sorcerer input.  Because sorcerer can be called multiple
    422     times, -sor option can be repeated.  Any files and classes (one class
    423     per group) after each -sor makes one tree parser.
    424 
    425     Not implemented:
    426 
    427         1. Generate dependences for user c/c++ files.
    428         2. Support for -sor in C mode.
    429 
    430     I have left the old genmk program in the directory as genmk_old.c.
    431 
    432 #277. (Changed in MR23) Change in macro for failed semantic predicates
    433 
    434     In the past, a semantic predicate that failed generated a call to
    435     the macro zzfailed_pred:
    436 
    437         #ifndef zzfailed_pred
    438         #define zzfailed_pred(_p) \
    439           if (guessing) { \
    440             zzGUESS_FAIL; \
    441           } else { \
    442             something(_p) \
    443           }
    444         #endif
    445 
    446     If a user wished to use the failed action option for semantic predicates:
    447 
    448         rule : <<my_predicate>>? [my_fail_action] A
    449              | ...
    450 
    451            
    452     the code for my_fail_action would have to contain logic for handling
    453     the guess part of the zzfailed_pred macro.  The user should not have
    454     to be aware of the guess logic in writing the fail action.
    455 
    456     The zzfailed_pred has been rewritten to have three arguments:
    457 
    458             arg 1: the stringized predicate of the semantic predicate
    459             arg 2: 0 => there is no user-defined fail action
    460                    1 => there is a user-defined fail action
    461             arg 3: the user-defined fail action (if defined)
    462                    otherwise a no-operation
    463 
    464     The zzfailed_pred macro is now defined as:
    465 
    466         #ifndef zzfailed_pred
    467         #define zzfailed_pred(_p,_hasuseraction,_useraction) \
    468           if (guessing) { \
    469             zzGUESS_FAIL; \
    470           } else { \
    471             zzfailed_pred_action(_p,_hasuseraction,_useraction) \
    472           }
    473         #endif
    474 
    475 
    476     With zzfailed_pred_action defined as:
    477 
    478         #ifndef zzfailed_pred_action
    479         #define zzfailed_pred_action(_p,_hasuseraction,_useraction) \
    480             if (_hasuseraction) { _useraction } else { failedSemanticPredicate(_p); }
    481         #endif
    482 
    483     In C++ mode failedSemanticPredicate() is a virtual function.
    484     In C mode the default action is a fprintf statement.
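
            The interaction of the two macros can be exercised in isolation.
            The sketch below copies the macro bodies from above and supplies
            stand-ins for everything else: the guessing flag, zzGUESS_FAIL,
            failedSemanticPredicate() and the events log are assumptions made
            for this example and do not match the definitions in the pccts
            headers.

```cpp
#include <string>
#include <vector>

// Stand-ins for parser state (assumptions for this sketch only).
static bool guessing = false;
static std::vector<std::string> events;

#define zzGUESS_FAIL events.push_back("guess-fail")

// Stand-in for the default action taken when no user action is supplied.
void failedSemanticPredicate(const char *pred) {
    events.push_back(std::string("default:") + pred);
}

#define zzfailed_pred_action(_p, _hasuseraction, _useraction) \
    if (_hasuseraction) { _useraction } else { failedSemanticPredicate(_p); }

#define zzfailed_pred(_p, _hasuseraction, _useraction) \
    if (guessing) { \
        zzGUESS_FAIL; \
    } else { \
        zzfailed_pred_action(_p, _hasuseraction, _useraction) \
    }

// Simulates one failed predicate and returns the first event recorded.
std::string runFailedPredicate(bool isGuessing, bool withUserAction) {
    events.clear();
    guessing = isGuessing;
    if (withUserAction) {
        // The user action contains no guess logic of its own.
        zzfailed_pred("my_predicate", 1, events.push_back("user-action");)
    } else {
        zzfailed_pred("my_predicate", 0, ;)
    }
    return events.empty() ? std::string() : events.front();
}
```

            In guess mode the user action is never reached, so the fail action
            can be written without any knowledge of guessing.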
    485 
    486     Suggested by Erwin Achermann (erwin.achermann switzerland.org).
    487 
    488 #276. (Changed in MR23) Addition of return value initialization syntax
    489 
    490     In an attempt to reduce the problems caused by the PURIFY macro I have
    491     added new syntax for initializing the return value of rules and the
    492     antlr option "-nopurify".
    493 
    494     A rule with a single return argument:
    495 
    496         r1 > [Foo f = expr] :
    497 
    498     now generates code that resembles:
    499 
    500         Foo r1(void) {
    501           Foo _retv = expr;
    502           ...
    503         }
    504   
    505     A rule with more than one return argument:
    506 
    507         r2 > [Foo f = expr1, Bar b = expr2 ] :
    508 
    509     generates code that resembles:
    510 
    511         struct _rv1 {
    512             Foo f;
    513             Bar b;
    514         };
    515 
    516         _rv1 r2(void) {
    517           struct _rv1 _retv;
    518           _retv.f = expr1;
    519           _retv.b = expr2;
    520           ...
    521         }
    522 
    523     C++ style comments appearing in the initialization list may cause problems.
    524 
    525 #275. (Changed in MR23) Addition of -nopurify option to antlr
    526 
    527     A long time ago the PURIFY macro was introduced to initialize
    528     return value arguments and get rid of annoying messages from programs
    529     that checked for uninitialized variables.
    530 
    531     This has caused significant annoyance for C++ users that had
    532     classes with virtual functions or non-trivial constructors because
    533     it would zero the object, including the pointer to the virtual
    534     function table.  This could be defeated by redefining
    535     the PURIFY macro to be empty, but it was a constant surprise to
    536     new C++ users of pccts.
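
            The distinction between types that tolerate a byte-wise zero fill
            and types that do not can be expressed with type traits.  The
            helper purifyIfSafe below is hypothetical and is not part of
            pccts; it only illustrates which return types the problem
            described above applies to, and it assumes a C++11 compiler.

```cpp
#include <cstring>
#include <type_traits>

// A plain struct: zero-filling its bytes is well defined.
struct PlainResult {
    int code;
    int count;
};

// A type with a virtual function carries hidden state (typically a pointer
// to the virtual function table) that a byte-wise zero fill would destroy.
struct VirtualResult {
    int code;
    VirtualResult() : code(7) {}
    virtual ~VirtualResult() {}
    virtual int get() const { return code; }
};

// Hypothetical helper: zero-fills trivial types only.
template <typename T>
typename std::enable_if<std::is_trivial<T>::value, bool>::type
purifyIfSafe(T &value) {
    std::memset(&value, 0, sizeof(T));
    return true;   // the object was zeroed
}

// Every other type is left exactly as its constructor built it.
template <typename T>
typename std::enable_if<!std::is_trivial<T>::value, bool>::type
purifyIfSafe(T &) {
    return false;  // the object was left untouched
}
```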
    537 
    538     I would like to remove it, but I fear that some existing programs
    539     depend on it and would break.  My temporary solution is to add
    540     an antlr option -nopurify which disables generation of the PURIFY
    541     macro call.
    542 
    543     The PURIFY macro should be avoided in favor of the new syntax
    544     for initializing return arguments described in item #276.
    545 
    546     To avoid name clash, the PURIFY macro has been renamed PCCTS_PURIFY.
    547 
    548 #274. (Changed in MR23) DLexer.cpp renamed to DLexer.h
    549       (Changed in MR23) ATokPtr.cpp renamed to ATokPtrImpl.h
    550 
    551     These two files had .cpp extensions but acted like .h files because
    552     they were included in other files.  This caused problems for many IDEs.
    553     I have renamed them.  The ATokPtrImpl.h was necessary because there was
    554     already an ATokPtr.h.
    555 
    556 #273. (Changed in MR23) Default win32 library changed to multi-threaded DLL
    557 
    558     The model used for building the Win32 debug and release libraries has changed
    559     to multi-threaded DLL.
    560 
    561     To make this change in your MSVC 6 project:
    562 
    563         Project -> Settings
    564         Select the C++ tab in the right pane of the dialog box
    565         Select "Category: Code Generation"
    566         Under "Use run-time library" select one of the following:
    567 
    568             Multi-threaded DLL
    569             Debug Multi-threaded DLL
    570            
    571     Suggested by Bill Menees (bill.menees gogallagher.com) 
    572     
    573 #272. (Changed in MR23) Failed semantic predicate reported via virtual function
    574 
    575     In the past, a failed semantic predicate reported the problem via a
    576     macro which used fprintf().  The macro now expands into a call on 
    577     the virtual function ANTLRParser::failedSemanticPredicate().
    578 
    579 #271. (Changed in MR23) Warning for LT(i), LATEXT(i) in token match actions
    580 
    581     A bug (or at least an oddity) is that a reference to LT(1), LA(1),
    582     or LATEXT(1) in an action which immediately follows a token match
    583     in a rule refers to the token matched, not the token which is in
    584     the lookahead buffer.  Consider:
    585 
    586         r : abc <<action alpha>> D <<action beta>> E;
    587 
    588     In this case LT(1) in action alpha will refer to the next token in
    589     the lookahead buffer ("D"), but LT(1) in action beta will refer to
    590     the token matched by D - the preceding token.
    591 
    592     A warning has been added for users about this when an action
    593     following a token match contains a reference to LT(1), LA(1), or LATEXT(1).
    594 
    595     This behavior should be changed, but it appears in too many programs
    596     now.  Another problem, perhaps more significant, is that the obvious
    597     fix (moving the consume() call to before the action) could change the 
    598     order in which input is requested and output appears in existing programs.
    599 
    600     This problem was reported, along with a fix by Benjamin Mandel
    601     (beny sd.co.il).  However, I felt that changing the behavior was too
    602     dangerous for existing code.
    603 
    604 #270. (Changed in MR23) Removed static objects from PCCTSAST.cpp
    605 
    606     There were some statically allocated objects in PCCTSAST.cpp.
    607     These were changed to non-static.
    608 
    609 #269. (Changed in MR23) dlg output for initializing static array
    610 
    611     The output from dlg contains a construct similar to the
    612     following:
    613    
    614         struct XXX {
    615           static const int size;
    616           static int array1[5];
    617         };
    618 
    619         const int XXX::size = 4;
    620         int XXX::array1[size+1];
    621 
    622     
    623     The problem is that although the expression "size+1" used in
    624     the definition of array1 is equal to 5 (the expression used to
    625     declare array1), it is not considered equivalent by some compilers.
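
            One portable way to avoid the mismatch is to spell the bound with
            the same compile-time constant in both places.  The sketch below
            shows that approach; it is not necessarily the fix that was
            applied to dlg, and the names arraySize and array1Length are
            assumptions added for the example.

```cpp
#include <cstddef>

// Both the declaration and the definition use arraySize, so no compiler has
// to decide whether "5" and "size+1" denote the same bound.
struct XXX {
    enum { size = 4, arraySize = size + 1 };
    static int array1[arraySize];
};

int XXX::array1[XXX::arraySize];

// Number of elements in XXX::array1.
std::size_t array1Length() {
    return sizeof(XXX::array1) / sizeof(XXX::array1[0]);
}
```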
    626 
    627     Reported with fix by Volker H. Simonis (simonis informatik.uni-tuebingen.de)
    628 
    629 #268. (Changed in MR23) syn() routine output when k > 1
    630 
    631     The syn() routine is supposed to print out the text of the
    632     token causing the syntax error.  It appears that it always
    633     used the text from the first lookahead token rather than the
    634     appropriate one.  The appropriate one is computed by comparing
    635     the token codes of lookahead token i (for i = 1 to k) with
    636     the FIRST(i) set.
    637     
    638     This has been corrected in ANTLRParser::syn().
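
            One reading of the selection rule described above is sketched
            below: walk the lookahead tokens in order and report the first
            one whose token code is not in the corresponding FIRST(i) set.
            The function offendingLookahead is hypothetical, and the actual
            logic in ANTLRParser::syn() may differ.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Returns the 1-based index of the first lookahead token whose code is not
// in the matching FIRST(i) set, or 0 if every token is acceptable.
// lookahead[i-1] holds the token code of lookahead token i;
// first[i-1] holds the FIRST(i) set.
std::size_t offendingLookahead(const std::vector<int> &lookahead,
                               const std::vector<std::set<int> > &first) {
    for (std::size_t i = 0; i < lookahead.size() && i < first.size(); ++i) {
        if (first[i].count(lookahead[i]) == 0) {
            return i + 1;
        }
    }
    return 0;
}
```

            With k = 2, a first token that is acceptable and a second that is
            not, the function selects token 2, so the message would show the
            text of the second lookahead token rather than the first.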
    639 
    640     Reported by Bill Menees (bill.menees gogallagher.com) 
    641 
    642 #267. (Changed in MR23) AST traversal functions client data argument
    643 
    644     The AST traversal functions now take an extra (optional) parameter
    645     which can point to client data:
    646 
    647         preorder_action(void* pData = NULL)
    648         preorder_before_action(void* pData = NULL)
    649         preorder_after_action(void* pData = NULL)
    650 
    651     ****       Warning: this changes the AST signature.         ***
    652     **** Be sure to revise your AST functions of the same name  ***
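
            The reason for the warning is that a C++ member function declared
            with a different parameter list does not override the base class
            version.  The sketch below uses stand-in classes (NodeBase,
            NamedNode and the walk routine are assumptions, not the pccts
            classes) to show an override with the new parameter list
            receiving client data.

```cpp
#include <cstddef>
#include <string>

// Stand-in for an AST base class; only the client-data parameter is modeled.
class NodeBase {
public:
    NodeBase() : child(NULL), sibling(NULL) {}
    virtual ~NodeBase() {}
    virtual void preorder_action(void * /* pData */ = NULL) {}

    // Hypothetical walker: visits a node, then its children, then siblings.
    void walk(void *pData = NULL) {
        for (NodeBase *n = this; n != NULL; n = n->sibling) {
            n->preorder_action(pData);
            if (n->child != NULL) n->child->walk(pData);
        }
    }

    NodeBase *child;
    NodeBase *sibling;
};

class NamedNode : public NodeBase {
public:
    explicit NamedNode(const std::string &n) : name(n) {}

    // Same parameter list as the base class, so this is a true override.
    // A version declared without the void* would no longer be called.
    virtual void preorder_action(void *pData = NULL) {
        if (pData != NULL) {
            static_cast<std::string *>(pData)->append(name);
        }
    }

    std::string name;
};

// Builds the tree +(a b) and returns the node names in visiting order.
std::string collectNames() {
    NamedNode plus("+"), a("a"), b("b");
    plus.child = &a;
    a.sibling = &b;
    std::string out;
    plus.walk(&out);   // the string is passed through as client data
    return out;
}
```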
    653 
    654     Bill Menees (bill.menees gogallagher.com) 
    655     
    656 #266. (Changed in MR23) virtual function printMessage()
    657 
    658     Bill Menees (bill.menees gogallagher.com) has completed the
    659     tedious task of replacing all calls to fprintf() with calls
    660     to the virtual function printMessage().  For classes which
    661     have a pointer to the parser it forwards the printMessage()
    662     call to the parser's printMessage() routine.
    663 
    664     This should make it significantly easier to redirect pccts
    665     error and warning messages.
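
            Redirection then amounts to overriding one virtual function.  The
            sketch below uses a stand-in base class; the fprintf-style
            signature of printMessage shown here is an assumption, so check
            the pccts headers for the actual declaration.  ParserLike,
            CapturingParser and reportError are invented for the example,
            which assumes a C++11 compiler for vsnprintf.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Stand-in for a parser base class (not the pccts class).
class ParserLike {
public:
    virtual ~ParserLike() {}

    // Default behavior: forward to vfprintf.
    virtual int printMessage(FILE *pFile, const char *pFormat, ...) {
        va_list args;
        va_start(args, pFormat);
        int n = std::vfprintf(pFile, pFormat, args);
        va_end(args);
        return n;
    }

    // Every diagnostic goes through the virtual function.
    void reportError(int line, const char *text) {
        printMessage(stderr, "line %d: syntax error at \"%s\"\n", line, text);
    }
};

// Collects all messages in a string instead of writing to a FILE*.
class CapturingParser : public ParserLike {
public:
    virtual int printMessage(FILE * /* pFile */, const char *pFormat, ...) {
        char buffer[256];
        va_list args;
        va_start(args, pFormat);
        int n = std::vsnprintf(buffer, sizeof(buffer), pFormat, args);
        va_end(args);
        log += buffer;
        return n;
    }

    std::string log;
};
```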
    666 
    667 #265. (Changed in MR23) Remove "labase++" in C++ mode
    668 
    669     In C++ mode labase++ is called when a token is matched.
    670     It appears that labase is not used in C++ mode at all, so
    671     this code has been commented out.
    672     
    673 #264. (Changed in MR23) Complete rewrite of ParserBlackBox.h
    674 
    675     The parser black box (PBlackBox.h) was completely rewritten
    676     by Chris Uzdavinis (chris atdesk.com) to improve its robustness.
    677 
    678 #263. (Changed in MR23) -preamble and -preamble_first rescinded
    679 
    680     Changes for item #253 have been rescinded.
    681 
    682 #262. (Changed in MR23) Crash with -alpha option during traceback
    683 
    684     Under some circumstances a -alpha traceback was started at the
    685     "wrong" time.  As a result, internal data structures were not
    686     initialized.
    687 
    688     Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).
    689 
    690 #261. (Changed in MR23) Defer token fetch for C++ mode
    691 
    692     Item #216 has been revised to indicate that use of the defer fetch
    693     option (ZZDEFER_FETCH) requires dlg option -i.
    694 
    695 #260. (MR22) Raise default lex buffer size from 8,000 to 32,000 bytes.
    696 
    697     ZZLEXBUFSIZE is the size (in bytes) of the buffer used by dlg 
    698     generated lexers.  The default value has been raised to 32,000 and
    699     the value used by antlr, dlg, and sorcerer has also been raised to
    700     32,000.
    701 
    702 #259. (MR22) Default function arguments in C++ mode.
    703 
    704     If a rule is declared:
    705 
    706             rr [int i = 0] : ....
    707 
    708     then the declaration generated by pccts resembles:
    709 
    710             void rr(int i = 0);
    711 
    712     however, the definition must omit the default argument:
    713 
    714             void rr(int i) {...}
    715 
    716     In the past the default value was not omitted.  In MR22
    717     the generated code resembles:
    718 
    719             void rr(int i /* = 0 */ ) {...}
    720 
    721     Implemented by Volker H. Simonis (simonis informatik.uni-tuebingen.de)
    722 
    723 
    724     Note: In MR23 this was changed so that nested C style comments
    725     ("/* ... */") would not cause problems.
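
    The rule behind this item can be sketched in plain C++.  The class
    DemoParser below is a hypothetical stand-in for a generated parser,
    and rr returns int (rather than void) only so that the effect of the
    default can be observed:

```cpp
#include <cassert>

// Hypothetical stand-in for a generated parser class; only the shape of
// the declaration and the definition matters here.
class DemoParser {
public:
    int rr(int i = 0);              // the declaration carries the default
};

// The definition repeats the parameter but not the default value, which
// survives only as a comment (as in the MR22 output).
int DemoParser::rr(int i /* = 0 */) {
    return i;
}
```

    Calling rr() with no argument uses the default from the declaration.
    Repeating "= 0" in the definition would be rejected by the compiler.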
    726 
    727 #258. (MR22)  Using a base class for your parser
    728 
    729     In item #102 (MR10) the class statement was extended to allow one
    730     to specify a base class other than ANTLRParser for the generated
    731     parser.  It turned out that this was less than useful because
    732     the constructor still specified ANTLRParser as the base class.
    733 
    734     The class statement now uses the first identifier appearing after
    735     the ":" as the name of the base class.  For example:
    736 
    737         class MyParser : public FooParser {
    738 
    739     Generates in MyParser.h:
    740 
    741             class MyParser : public FooParser {
    742 
    743     Generates in MyParser.cpp something that resembles:
    744 
    745             MyParser::MyParser(ANTLRTokenBuffer *input) :
    746                                          FooParser(input,1,0,0,4)
    747             {
    748                 token_tbl = _token_tbl;
    749                 traceOptionValueDefault=1;    // MR10 turn trace ON
    750             }
    751 
    752     The base class constructor must have a signature similar to
    753     that of ANTLRParser.
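
    A minimal sketch of what "a signature similar to that of ANTLRParser"
    implies, assuming only the five-argument call shown above.  The
    classes TokenBufferStub and ParserBaseStub and all parameter names
    are hypothetical; the real classes are in the pccts headers:

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real pccts classes.
struct TokenBufferStub {};

class ParserBaseStub {                  // plays the role of ANTLRParser
public:
    ParserBaseStub(TokenBufferStub *input, int k, int useInf,
                   int demandLook, int setSize)
        : input_(input), k_(k), setSize_(setSize) {
        (void)useInf;                   // unused in this sketch
        (void)demandLook;               // unused in this sketch
    }
    virtual ~ParserBaseStub() {}
    int k() const { return k_; }
    int setSize() const { return setSize_; }
protected:
    TokenBufferStub *input_;
    int k_;
    int setSize_;
};

// User-written intermediate base class: it accepts and forwards the same
// five arguments that the generated constructor passes.
class FooParser : public ParserBaseStub {
public:
    FooParser(TokenBufferStub *input, int k, int useInf,
              int demandLook, int setSize)
        : ParserBaseStub(input, k, useInf, demandLook, setSize) {}
};

// Shape of the generated parser: it names FooParser in the initializer.
class MyParser : public FooParser {
public:
    explicit MyParser(TokenBufferStub *input)
        : FooParser(input, 1, 0, 0, 4) {}
};
```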
    754 
    755 #257. (MR21a) Removed dlg statement that -i has no effect in C++ mode.
    756 
    757     This was incorrect.
    758 
    759 #256. (MR21a) Malformed syntax graph causes crash after error message.
    760 
    761     In the past, certain kinds of errors in the very first grammar
    762     element could cause the construction of a malformed graph 
    763     representing the grammar.  This would eventually result in a
    764     fatal internal error.  The code has been changed to be more
    765     resistant to this particular error.
    766 
    767 #255. (MR21a) ParserBlackBox(FILE* f) 
    768 
    769     This constructor set openByBlackBox to the wrong value.
    770 
    771     Reported by Kees Bakker (kees_bakker tasking.nl).
    772 
    773 #254. (MR21a) Reporting syntax error at end-of-file
    774 
    775     When there was a syntax error at the end-of-file the syntax
    776     error routine would substitute "<eof>" for the programmer's
    777     end-of-file symbol.  This substitution is now done only when
    778     the programmer does not define his own end-of-file symbol
    779     or the symbol begins with the character "@".
    780 
    781     Reported by Kees Bakker (kees_bakker tasking.nl).
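
    The substitution rule can be sketched as follows.  The function name
    eofDisplayName is hypothetical, and representing "no symbol defined"
    as an empty string is an assumption of this sketch:

```cpp
#include <cassert>
#include <string>

// Returns the text to show for the end-of-file token in a syntax error.
// userEofName is the programmer's end-of-file symbol, or empty when none
// was defined (an assumption of this sketch).
std::string eofDisplayName(const std::string &userEofName) {
    if (userEofName.empty() || userEofName[0] == '@') {
        return "<eof>";                 // fall back to the generic name
    }
    return userEofName;                 // keep the programmer's own symbol
}
```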
    782 
    783 #253. (MR21) Generation of block preamble (-preamble and -preamble_first)
    784 
    785         *** This change was rescinded by item #263 ***
    786 
    787     The antlr option -preamble causes antlr to insert the code
    788     BLOCK_PREAMBLE at the start of each rule and block.  It does
    789     not insert code before rule references, token references, or
    790     actions.  By properly defining the macro BLOCK_PREAMBLE the
    791     user can generate code which is specific to the start of blocks.
    792 
    793     The antlr option -preamble_first is similar, but inserts the
    794     code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol
    795     PreambleFirst_123 is equivalent to the first set defined by
    796     the #FirstSetSymbol described in Item #248.
    797 
    798     I have not investigated how these options interact with guess
    799     mode (syntactic predicates).
    800 
    801 #252. (MR21) Check for null pointer in trace routine
    802 
    803     When some trace options are used when the parser is generated
    804     without the trace enabled, the current rule name may be a
    805     NULL pointer.  A guard was added to check for this in
    806     restoreState.
    807 
    808     Reported by Douglas E. Forester (dougf projtech.com).
    809 
    810 #251. (MR21) Changes to #define zzTRACE_RULES
    811 
    812     The macro zzTRACE_RULES was being used to pass information to
    813     AParser.h.  If this preprocessor symbol was not properly
    814     set the first time AParser.h was #included, the declaration
    815     of zzTRACEdata would be omitted (it is used by the -gd option).
    816     Subsequent #includes of AParser.h would be skipped because of 
    817     the #ifdef guard, so the declaration of zzTracePrevRuleName would
    818     never be made.  The result was that proper compilation was very 
    819     order dependent.
    820 
    821     The declaration of zzTRACEdata was made unconditional and the
    822     problem of removing unused declarations will be left to optimizers.
    823     
    824     Diagnosed by Douglas E. Forester (dougf projtech.com).
    825 
    826 #250. (MR21) Option for EXPERIMENTAL change to error sets for blocks
    827 
    828     The antlr option -mrblkerr turns on an experimental feature
    829     which is supposed to provide more accurate syntax error messages
    830     for k=1, ck=1 grammars.  When used with k>1 or ck>1 grammars the
    831     behavior should be no worse than the current behavior.
    832 
    833     There is no problem with the matching of elements or the computation
    834     of prediction expressions in pccts.  The task is only one of listing
    835     the most appropriate tokens in the error message.  The error sets used
    836     in pccts error messages are approximations of the exact error set when
    837     optional elements in (...)* or (...)+ are involved.  While entirely
    838     correct, the error messages are sometimes not 100% accurate.  
    839 
    840     There is also a minor philosophical issue.  For example, suppose the
    841     grammar expects the token to be an optional A followed by Z, and it 
    842     is X.  X, of course, is neither A nor Z, so an error message is appropriate.
    843     Is it appropriate to say "Expected Z" ?  It is correct, it is accurate,
    844     but it is not complete.  
    845 
    846     When k>1 or ck>1 the problem of providing the exactly correct
    847     list of tokens for the syntax error messages ends up becoming
    848     equivalent to evaluating the prediction expression for the
    849     alternatives twice. However, for k=1 ck=1 grammars the prediction
    850     expression can be computed easily and evaluated cheaply, so I
    851     decided to try implementing it to satisfy a particular application.
    852     This application uses the error set in an interactive command language
    853     to provide prompts which list the alternatives available at that
    854     point in the parser.  The user can then enter additional tokens to
    855     complete the command line.  To do this required more accurate error 
    856     sets than previously provided by pccts.
    857 
    858     In some cases the default pccts behavior may lead to more robust error
    859     recovery or clearer error messages than having the exact set of tokens.
    860     This is because (a) features like -ge allow the use of symbolic names for
    861     certain sets of tokens, so having extra tokens may simply obscure things
    862     and (b) the error set is used to resynchronize the parser, so a good
    863     choice is sometimes more important than having the exact set.
    864 
    865     Consider the following example:
    866 
    867             Note:  All example code has been abbreviated
    868             to the absolute minimum in order to make the
    869             examples concise.
    870 
    871         star1 : (A)* Z;
    872 
    873     The generated code resembles:
    874 
    875            old                new (with -mrblkerr)
    876         -------------           --------------------
    877         for (;;) {            for (;;) {
    878             match(A);           match(A);
    879         }                     }
    880         match(Z);             if (! A and ! Z) then
    881                                 FAIL(...{A,Z}...);
    882                               }
    883                               match(Z);
    884 
    885 
    886         With input X
    887             old message: Found X, expected Z
    888             new message: Found X, expected A, Z
    889 
    890     For the example:
    891 
    892         star2 : (A|B)* Z;
    893 
    894            old                      new (with -mrblkerr)
    895         -------------               --------------------
    896         for (;;) {                  for (;;) {
    897           if (!A and !B) break;       if (!A and !B) break;
    898           if (...) {                  if (...) {
    899             <same ...>                  <same ...>
    900           }                           }
    901           else {                      else {
    902             FAIL(...{A,B,Z}...)         FAIL(...{A,B}...);
    903           }                           }
    904         }                           }
    905         match(Z);                   if (! A and ! B and !Z) then
    906                                         FAIL(...{A,B,Z}...);
    907                                     }
    908                                     match(Z);
    909 
    910         With input X
    911             old message: Found X, expected Z
    912             new message: Found X, expected A, B, Z
    913         With input A X
    914             old message: Found X, expected Z
    915             new message: Found X, expected A, B, Z
    916 
    917             This includes the choice of looping back to the
    918             star block.
    919 
    920     The code for plus blocks:
    921 
    922         plus1 : (A)+ Z;
    923 
    924     The generated code resembles:
    925 
    926            old                  new (with -mrblkerr)
    927         -------------           --------------------
    928         do {                    do {
    929           match(A);               match(A);
    930         } while (A)             } while (A)
    931         match(Z);               if (! A and ! Z) then
    932                                   FAIL(...{A,Z}...);
    933                                 }
    934                                 match(Z);
    935 
    936         With input A X
    937             old message: Found X, expected Z
    938             new message: Found X, expected A, Z
    939 
    940             This includes the choice of looping back to the
    941             plus block.
    942 
    943     For the example:
    944 
    945         plus2 : (A|B)+ Z;
    946 
    947            old                    new (with -mrblkerr)
    948         -------------             --------------------
    949         do {                        do {
    950           if (A) {                    <same>
    951             match(A);                 <same>
    952           } else if (B) {             <same>
    953             match(B);                 <same>
    954           } else {                    <same>
    955             if (cnt > 1) break;       <same>
    956             FAIL(...{A,B,Z}...)         FAIL(...{A,B}...);
    957           }                           }
    958           cnt++;                      <same>
    959         }                           }
    960 
    961         match(Z);                   if (! A and ! B and !Z) then
    962                                         FAIL(...{A,B,Z}...);
    963                                     }
    964                                     match(Z);
    965 
    966         With input X
    967             old message: Found X, expected A, B, Z
    968             new message: Found X, expected A, B
    969         With input A X
    970             old message: Found X, expected Z
    971             new message: Found X, expected A, B, Z
    972 
    973             This includes the choice of looping back to the
    974             plus block.
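
    The difference for star2 can be modeled with a small hand-written
    recognizer.  This is only a sketch of the idea, not the code antlr
    generates; the names Tok, END and parseStar2 are hypothetical, and
    the function returns the "expected" list instead of calling FAIL:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum Tok { A, B, Z, X, END };

// Recognizes "(A|B)* Z".  Returns the expected-token list reported on
// failure, or an empty string when the input is accepted.
std::string parseStar2(const std::vector<Tok> &in, bool blkerr) {
    std::size_t i = 0;
    auto la = [&]() { return i < in.size() ? in[i] : END; };

    while (la() == A || la() == B) {
        ++i;                            // match(A) or match(B)
    }
    if (blkerr && la() != A && la() != B && la() != Z) {
        return "A, B, Z";               // includes looping back to the block
    }
    if (la() != Z) {
        return "Z";                     // plain match(Z) failure (old style)
    }
    return "";
}
```

    With blkerr set, the input "A X" reports A, B, Z because another
    iteration of the star block was still a legal choice at X.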
    975     
    976 #249. (MR21) Changes for DEC/VMS systems
    977 
    978     Jean-François Piéronne (jfp altavista.net) has updated some
    979     VMS related command files and fixed some minor problems related
    980     to building pccts under the DEC/VMS operating system.  For DEC/VMS
    981     users the most important differences are:
    982 
    983         a.  Revised makefile.vms
    984         b.  Revised genMMS for generating VMS style makefiles.
    985 
    986 #248. (MR21) Generate symbol for first set of an alternative
    987 
    988     pccts can generate a symbol which represents the tokens which may
    989     appear at the start of a block:
    990 
    991         rr : #FirstSetSymbol(rr_FirstSet)  ( Foo | Bar ) ;
    992 
    993     This will generate the symbol rr_FirstSet of type SetWordType with
    994     elements Foo and Bar set. The bits can be tested using code similar 
    995     to the following:
    996 
    997         if (set_el(Foo, &rr_FirstSet)) { ...
    998 
    999     This can be combined with the C array zztokens[] or the C++ routine
   1000     tokenName() to get the print name of the token in the first set.
   1001 
   1002     The size of the set is given by the newly added enum SET_SIZE, a 
   1003     protected member of the generated parser's class.  The number of
   1004     elements in the generated set will not be exactly equal to the 
   1005     value of SET_SIZE because of synthetic tokens created by #tokclass,
   1006     #errclass, the -ge option, and meta-tokens such as epsilon, and
   1007     end-of-file.
   1008 
   1009     The #FirstSetSymbol must appear immediately before a block
   1010     such as (...)+, (...)*, {...}, or (...).  It may not appear
   1011     immediately before a token, a rule reference, or an action.  However,
   1012     a token or rule reference can be enclosed in a (...) in order to
   1013     make the use of #FirstSetSymbol legal.
   1014 
   1015             rr_bad : #FirstSetSymbol(rr_bad_FirstSet) Foo;   //  Illegal
   1016 
   1017             rr_ok :  #FirstSetSymbol(rr_ok_FirstSet) (Foo);  //  Legal
   1018     
   1019     Do not confuse FirstSetSymbol sets with the sets used for testing
   1020     lookahead. The sets used for FirstSetSymbol have one element per bit,
   1021     so the number of bytes  is approximately the largest token number
   1022     divided by 8.  The sets used for testing lookahead store 8 lookahead 
   1023     sets per byte, so the length of the array is approximately the largest
   1024     token number.
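
    The two layouts can be contrasted with a short sketch.  The bit
    order within a byte, the 8-bit SetWord type, and all function names
    below are assumptions made for illustration; the real routines are
    set_el() and the generated setwd arrays:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char SetWord;   // stand-in for SetWordType (assumed 8 bits)

// FirstSetSymbol style: one bit per token, so token t is kept in
// byte t/8, bit t%8, and the array needs about maxToken/8 bytes.
std::size_t firstSetBytes(unsigned maxToken) { return maxToken / 8 + 1; }

void firstSetAdd(std::vector<SetWord> &set, unsigned tok) {
    set[tok / 8] = static_cast<SetWord>(set[tok / 8] | (1u << (tok % 8)));
}

bool firstSetHas(const std::vector<SetWord> &set, unsigned tok) {
    return ((set[tok / 8] >> (tok % 8)) & 1u) != 0;
}

// Lookahead style: one byte per token, and bit s of byte t records
// whether token t belongs to lookahead set number s (8 sets per byte).
bool lookaheadHas(const std::vector<SetWord> &setwd, unsigned tok, unsigned s) {
    return ((setwd[tok] >> s) & 1u) != 0;
}
```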
   1025 
   1026     If there is demand, a similar routine for follow sets can be added.
   1027 
   1028 #247. (MR21) Misleading error message on syntax error for optional elements.
   1029 
   1030         ===================================================
   1031         The behavior has been revised when parser exception
   1032         handling is used.  See Item #290
   1033         ===================================================
   1034 
   1035     Prior to MR21, tokens which were optional did not appear in syntax
   1036     error messages if the block which immediately followed detected a 
   1037     syntax error.
   1038 
   1039     Consider the following grammar which accepts Number, Word, and Other:
   1040 
   1041             rr : {Number} Word;
   1042 
   1043     For this rule the code resembles:
   1044 
   1045             if (LA(1) == Number) {
   1046                 match(Number);
   1047                 consume();
   1048             }
   1049             match(Word);
   1050 
   1051     Prior to MR21, the error message for input "$ a" would be:
   1052 
   1053             line 1: syntax error at "$" missing Word
   1054 
   1055     With MR21 the message will be:
   1056 
   1057             line 1: syntax error at "$" expecting Word, Number.
   1058 
   1059     The generated code resembles:
   1060 
   1061             if ( (LA(1)==Number) ) {
   1062                 zzmatch(Number);
   1063                 consume();
   1064             }
   1065             else {
   1066                 if ( (LA(1)==Word) ) {
   1067                     /* nothing */
   1068                 }
   1069                 else {
   1070                     FAIL(... message for both Number and Word ...);
   1071                 }
   1072             }
   1073             match(Word);
   1074         
   1075     The code generated for optional blocks in MR21 is slightly longer
   1076     than the previous versions, but it should give better error messages.
   1077 
   1078     The code generated for:
   1079 
   1080             { a | b | c }
   1081 
   1082     should now be *identical* to:
   1083 
   1084             ( a | b | c | )
   1085 
   1086     which was not the case prior to MR21.
   1087 
   1088     Reported by Sue Marvin (sue siara.com).
   1089 
   1090 #246. (Changed in MR21) Use of $(MAKE) for calls to make
   1091 
   1092     Calls to make from the makefiles were replaced with $(MAKE)
   1093     because of problems when using gmake.
   1094 
   1095     Reported with fix by Sunil K.Vallamkonda (sunil siara.com).
   1096 
   1097 #245. (Changed in MR21) Changes to genmk
   1098 
   1099     The following command line options have been added to genmk:
   1100 
   1101         -cfiles ... 
   1102             
   1103             To add a user's C or C++ files into makefile automatically.
   1104             The list of files must be enclosed in apostrophes.  This
   1105             option may be specified multiple times.
   1106 
   1107         -compiler ...
   1108     
   1109             The name of the compiler to use for $(CCC) or $(CC).  The
   1110             default in C++ mode is "CC".  The default in C mode is "cc".
   1111 
   1112         -pccts_path ...
   1113 
   1114             The value for $(PCCTS), the pccts directory.  The default
   1115             is /usr/local/pccts.
   1116 
   1117     Contributed by Tomasz Babczynski (t.babczynski ict.pwr.wroc.pl).
   1118 
   1119 #244. (Changed in MR21) Rename variable "not" in antlr.g
   1120 
   1121     When antlr.g is compiled with a C++ compiler, a variable named
   1122     "not" causes problems.  Reported by Sinan Karasu
   1123     (sinan.karasu boeing.com).
   1124 
   1125 #243  (Changed in MR21) Replace recursion with iteration in zzfree_ast
   1126 
   1127     Another refinement to zzfree_ast in ast.c to limit recursion.
   1128 
   1129     NAKAJIMA Mutsuki (muc isr.co.jp).
   1130 
   1131 
   1132 #242.  (Changed in MR21) LineInfoFormatStr
   1133 
   1134     Added an #ifndef/#endif around LineInfoFormatStr in pcctscfg.h.
   1135 
   1136 #241. (Changed in MR21) Changed macro PURIFY to a no-op
   1137 
   1138                 ***********************
   1139                 *** NOT IMPLEMENTED ***
   1140                 ***********************
   1141 
   1142         The PURIFY macro was changed to a no-op because it was causing 
   1143         problems when passing C++ objects.
   1144     
   1145         The old definition:
   1146     
   1147             #define PURIFY(r,s)     memset((char *) &(r),'\\0',(s));
   1148     
   1149         The new definition:
   1150     
   1151             #define PURIFY(r,s)     /* nothing */
   1152 #endif
   1153 
   1154 #240. (Changed in MR21) sorcerer/h/sorcerer.h _MATCH and _MATCHRANGE
   1155 
   1156     Added test for NULL token pointer.
   1157 
   1158     Suggested by Peter Keller (keller ebi.ac.uk)
   1159 
   1160 #239. (Changed in MR21) C++ mode AParser::traceGuessFail
   1161 
   1162     If tracing is turned on when the code has been generated
   1163     without trace code, a failed guess generates a trace report
   1164     even though there are no other trace reports.  This change
   1165     makes the behavior consistent with other parts of the
   1166     trace system.
   1167 
   1168     Reported by David Wigg (wiggjd sbu.ac.uk).
   1169 
   1170 #238. (Changed in MR21) Namespace version #include files
   1171 
   1172     Changed reference from CStdio to cstdio (and other
   1173     #include file names) in the namespace version of pccts.
   1174     Should have known better.
   1175 
   1176 #237. (Changed in MR21) ParserBlackBox(FILE*)
   1177     
   1178     In the past, ParserBlackBox would close the FILE in the dtor
   1179     even though it was not opened by ParserBlackBox.  The problem
   1180     is that there were two constructors, one which accepted a file   
   1181     name and did an fopen, the other which accepted a FILE and did
   1182     not do an fopen.  There is now an extra member variable which
   1183     remembers whether ParserBlackBox did the open or not.
   1184 
   1185     Suggested by Mike Percy (mpercy scires.com).
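
    The ownership rule can be sketched with a small wrapper class.
    FileHolder and its members are hypothetical names, not the actual
    ParserBlackBox code:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical wrapper illustrating "close only what you opened".
class FileHolder {
public:
    // Opens the file itself, so it is responsible for closing it.
    explicit FileHolder(const char *name)
        : file_(std::fopen(name, "r")), openedHere_(true) {}

    // Borrows a FILE opened by the caller; it must not close it.
    explicit FileHolder(FILE *f)
        : file_(f), openedHere_(false) {}

    ~FileHolder() {
        if (openedHere_ && file_ != nullptr) {
            std::fclose(file_);
        }
    }

    bool ownsFile() const { return openedHere_; }
    FILE *file() const { return file_; }

private:
    FileHolder(const FileHolder &);             // copying is not supported
    FileHolder &operator=(const FileHolder &);

    FILE *file_;
    bool openedHere_;
};
```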
   1186 
   1187 #236. (Changed in MR21) tmake now reports down pointer problem
   1188 
   1189     When ASTBase::tmake attempts to update the down pointer of 
   1190     an AST it checks to see if the down pointer is NULL.  If it
   1191     is not NULL it does not do the update and returns NULL.
   1192     An attempt to update the down pointer is almost always a
   1193     result of a user error.  This can lead to difficult to find
   1194     problems during tree construction.
   1195 
   1196     With this change, the routine calls a virtual function
   1197     reportOverwriteOfDownPointer() which calls panic to
   1198     report the problem.  Users who want the old behavior can
   1199     redefine the virtual function in their AST class.
   1200 
   1201     Suggested by Sinan Karasu (sinan.karasu boeing.com)
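
    A sketch of the pattern, using hypothetical classes Node and
    QuietNode rather than ASTBase itself.  The default hook is fatal
    (std::abort stands in for panic), and a derived class restores the
    quiet behavior by overriding it:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical node type; the real class is ASTBase.
class Node {
public:
    Node() : down_(nullptr) {}
    virtual ~Node() {}

    // Attaches child as the first child, refusing to overwrite an
    // existing down pointer.
    bool setDown(Node *child) {
        if (down_ != nullptr) {
            reportOverwriteOfDownPointer();
            return false;
        }
        down_ = child;
        return true;
    }
    Node *down() const { return down_; }

protected:
    // Default: treat the overwrite attempt as a fatal user error.
    virtual void reportOverwriteOfDownPointer() { std::abort(); }

private:
    Node *down_;
};

// Restores the old, silent behavior by overriding the hook.
class QuietNode : public Node {
public:
    QuietNode() : overwriteAttempts(0) {}
    int overwriteAttempts;
protected:
    void reportOverwriteOfDownPointer() override { ++overwriteAttempts; }
};
```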
   1202 
   1203 #235. (Changed in MR21) Made ANTLRParser::resynch() virtual
   1204 
   1205     Suggested by Jerry Evans (jerry swsl.co.uk).
   1206 
   1207 #234. (Changed in MR21) Implicit int for function return value
   1208 
   1209     ATokenBuffer:bufferSize() did not specify a type for the
   1210     return value.
   1211 
   1212     Reported by Hai Vo-Ba (hai fc.hp.com).
   1213 
   1214 #233. (Changed in MR20) Converted to MSVC 6.0
   1215 
   1216     Due to external circumstances I have had to convert to MSVC 6.0.
   1217     The MSVC 5.0 project files (.dsw and .dsp) have been retained as
   1218     xxx50.dsp and xxx50.dsw.  The MSVC 6.0 files are named xxx60.dsp
   1219     and xxx60.dsw (where xxx is related to the directory/project).
   1220 
   1221 #232. (Changed in MR20) Make setwd bit vectors protected in parser.h
   1222 
   1223     The access for the setwd array in the parser header was not
   1224     specified.  As a result, it would depend on the code which 
   1225     preceded it.  In MR20 it will always have access "protected".
   1226 
   1227     Reported by Piotr Eljasiak (eljasiak zt.gdansk.tpsa.pl).
   1228 
   1229 #231. (Changed in MR20) Error in token buffer debug code.
   1230 
   1231     When token buffer debugging is selected via the pre-processor
   1232     symbol DEBUG_TOKENBUFFER there is an erroneous check in
   1233     AParser.cpp:
   1234 
   1235         #ifdef DEBUG_TOKENBUFFER
   1236             if (i >= inputTokens->bufferSize() ||
   1237                 inputTokens->minTokens() < LLk )     /* MR20 Was "<=" */
   1238         ...
   1239         #endif
   1240 
   1241     Reported by David Wigg (wiggjd sbu.ac.uk).
   1242 
   1243 #230. (Changed in MR20) Fixed problem with #define for -gd option
   1244 
   1245     There was an error in setting zzTRACE_RULES for the -gd (trace) option.
   1246 
   1247     Reported by Gary Funck (gary intrepid.com).
   1248 
   1249 #229. (Changed in MR20) Additional "const" for literals
   1250 
   1251     "const" was added to the token name literal table.
   1252     "const" was added to some panic() and similar routines.
   1253 
   1254 #228. (Changed in MR20) dlg crashes on "()"
   1255 
   1256     The following token definition will cause DLG to crash.
   1257 
   1258         #token "()"
   1259 
   1260     When there is a syntax error in a regular expression
   1261     many of the dlg routines return a structure which has
   1262     null pointers.  When this is accessed by callers it
   1263     generates the crash.
   1264 
   1265     I have attempted to fix the more common cases.
   1266 
   1267     Reported by  Mengue Olivier (dolmen bigfoot.com).
   1268 
   1269 #227. (Changed in MR20) Array overwrite
   1270 
   1271     Steveh Hand (sassth unx.sas.com) reported a problem which
   1272     was traced to a temporary array which was not properly
   1273     resized for deeply nested blocks.  This has been fixed.
   1274 
   1275 #226. (Changed in MR20) -pedantic conformance
   1276    
   1277     G. Hobbelt (i_a mbh.org) and THM made many, many minor 
   1278     changes to create prototypes for all the functions and
   1279     bring antlr, dlg, and sorcerer into conformance with
   1280     the gcc -pedantic option.
   1281 
   1282     This may require users to add pccts/h/pcctscfg.h to some
   1283     files or makefiles in order to have __USE_PROTOS defined.
   1284 
   1285 #225  (Changed in MR20) AST stack adjustment in C mode
   1286 
   1287     The fix in #214 for AST stack adjustment in C mode missed 
   1288     some cases.
   1289 
   1290     Reported with fix by Ger Hobbelt (i_a mbh.org).
   1291 
   1292 #224  (Changed in MR20) LL(1) and LL(2) with #pragma approx
   1293 
   1294     This may set a record for the oldest, most trivial, lexical
   1295     error in pccts.  The regular expressions for LL(1) and LL(2)
   1296     lacked an escape for the left and right parenthesis.
   1297 
   1298     Reported by Ger Hobbelt (i_a mbh.org).
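
    The effect of a missing escape can be shown with std::regex.  This
    is only an analogy: dlg has its own regular expression syntax, and
    the ECMAScript grammar of std::regex is used here as a stand-in.
    The helper name matchesWhole is hypothetical:

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole of text matches pattern.
bool matchesWhole(const std::string &pattern, const std::string &text) {
    return std::regex_match(text, std::regex(pattern));
}
```

    Without the escapes the parentheses form a group, so the pattern
    LL(1) matches the text LL1 and not the text LL(1).  Written as
    LL\(1\) it matches the literal text.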
   1299 
   1300 #223  (Changed in MR20) Addition of IBM_VISUAL_AGE directory
   1301 
   1302     Build files for antlr, dlg, and sorcerer under IBM Visual Age 
   1303     have been contributed by Anton Sergeev (ags mlc.ru).  They have
   1304     been placed in the pccts/IBM_VISUAL_AGE directory.
   1305 
   1306 #222  (Changed in MR20) Replace __STDC__ with __USE_PROTOS
   1307 
   1308     Most occurrences of __STDC__ were replaced with __USE_PROTOS due to
   1309     complaints from several users.
   1310 
   1311 #221  (Changed in MR20) Added #include for DLexerBase.h to PBlackBox.
   1312 
   1313     Added #include for DLexerBase.h to PBlackBox.
   1314 
   1315 #220  (Changed in MR19) strcat arguments reversed in #pred parse
   1316 
   1317     The arguments to strcat are reversed when creating a print
   1318     name for a hash table entry for use with #pred feature.
   1319 
   1320     Problem diagnosed and fix reported by Scott Harrington 
   1321     (seh4 ix.netcom.com).
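
    For reference, strcat takes the destination first and the source
    second.  The helper below is a hypothetical sketch of building a
    print name safely; the prefix used by antlr for #pred entries is
    not shown in this note, so the strings are placeholders:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Builds "<prefix><name>" in out, which must hold outSize bytes.
// Returns false (leaving out untouched) if the result would not fit.
bool buildPrintName(char *out, std::size_t outSize,
                    const char *prefix, const char *name) {
    if (std::strlen(prefix) + std::strlen(name) + 1 > outSize) {
        return false;
    }
    std::strcpy(out, prefix);
    std::strcat(out, name);         // destination first, source second
    return true;
}
```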
   1322 
   1323 #219. (Changed in MR19) C Mode routine zzfree_ast
   1324 
   1325     Changes to reduce use of recursion for AST trees with only right
   1326     links or only left links in the C mode routine zzfree_ast.
   1327 
   1328     Implemented by SAKAI Kiyotaka (ksakai isr.co.jp).
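
    The idea can be sketched as follows, assuming a node with only
    "right" and "down" links.  Ast and freeAst are hypothetical names
    and this is not the zzfree_ast source; it recurses only when a node
    has both a sibling and a child, so pure chains are freed in a loop:

```cpp
#include <cassert>

struct Ast {
    Ast *right;     // next sibling
    Ast *down;      // first child
    explicit Ast(Ast *r = nullptr, Ast *d = nullptr) : right(r), down(d) {}
};

// Frees a tree and returns the number of nodes released.
int freeAst(Ast *tree) {
    int freed = 0;
    while (tree != nullptr) {
        Ast *next;
        if (tree->down == nullptr) {
            next = tree->right;             // only a sibling: iterate
        } else if (tree->right == nullptr) {
            next = tree->down;              // only a child: iterate
        } else {
            freed += freeAst(tree->down);   // both links: recurse once
            next = tree->right;
        }
        delete tree;
        ++freed;
        tree = next;
    }
    return freed;
}
```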
   1329 
   1330 #218. (Changed in MR19) Changes to support unsigned char in C mode
   1331 
   1332     Changes to antlr.h and err.h to fix omissions in use of zzchar_t
   1333 
   1334     Implemented by SAKAI Kiyotaka (ksakai isr.co.jp).
   1335 
   1336 #217. (Changed in MR19) Error message when dlg -i and -CC options selected
   1337     
   1338     *** This change was rescinded by item #257 ***
   1339 
   1340     The parsers generated by pccts in C++ mode are not able to support the
   1341     interactive lexer option (except, perhaps, when using the deferred fetch
   1342     parser option; see Item #216).
   1343 
   1344     DLG now warns when both -i and -CC are selected.
   1345 
   1346     This warning was suggested by David Venditti (07751870267-0001 t-online.de).
   1347 
   1348 #216. (Changed in MR19) Defer token fetch for C++ mode
   1349 
   1350     Implemented by Volker H. Simonis (simonis informatik.uni-tuebingen.de)
   1351 
   1352     Normally, pccts keeps the lookahead token buffer completely filled.
   1353     This requires max(k,ck) tokens of lookahead.  For some applications
   1354     this can cause deadlock problems.  For example, there may be cases
   1355     when the parser can't tell when the input has been completely consumed
   1356     until the parse is complete, but the parse can't be completed because 
   1357     the input routines are waiting for additional tokens to fill the
   1358     lookahead buffer.
   1359     
   1360     When the ANTLRParser class is built with the pre-processor option 
   1361     ZZDEFER_FETCH defined, the fetch of new tokens by consume() is deferred
   1362     until LA(i) or LT(i) is called. 
   1363 
   1364     To test whether this option has been built into the ANTLRParser class
   1365     use "isDeferFetchEnabled()".
   1366 
   1367     Using the -gd trace option with the default tracein() and traceout()
   1368     routines will defeat the effort to defer the fetch because the
   1369     trace routines print out information about the lookahead token at
   1370     the start of the rule.
   1371     
   1372     Because the tracein and traceout routines are virtual it is 
   1373     easy to redefine them in your parser:
   1374 
   1375         class MyParser {
   1376         <<
   1377             virtual void tracein(ANTLRChar * ruleName)
   1378                 { fprintf(stderr,"Entering: %s\n", ruleName); }
   1379             virtual void traceout(ANTLRChar * ruleName)
   1380                 { fprintf(stderr,"Leaving: %s\n", ruleName); }
   1381         >>
   1382  
   1383     The originals for those routines are in pccts/h/AParser.cpp
   1384  
   1385     This requires use of the dlg option -i (interactive lexer).
   1386 
   1387     This is implemented only for C++ mode.
   1388 
   1389     This is experimental.  The interaction with guess mode (syntactic
   1390     predicates) is not known.
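
    The deferral can be modeled with a toy lookahead window.  This is a
    sketch of the idea only, not the ANTLRParser implementation; the
    class DeferredLookahead, the function countingSource and the use of
    int as a token are all assumptions:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical token source: returns 1, 2, 3, ... on successive calls.
int countingSource() {
    static int next = 0;
    return ++next;
}

// Toy model of a lookahead window that fetches lazily.
class DeferredLookahead {
public:
    explicit DeferredLookahead(int (*source)())
        : source_(source), pending_(0), fetched_(0) {}

    // consume() only records that one token should be dropped later.
    void consume() { ++pending_; }

    // LA(i) settles the pending consumes, then fetches just enough
    // tokens to answer the request.
    int LA(int i) {
        while (pending_ > 0) {
            if (window_.empty()) fetch();
            window_.pop_front();
            --pending_;
        }
        while (window_.size() < static_cast<std::size_t>(i)) fetch();
        return window_[static_cast<std::size_t>(i - 1)];
    }

    int fetched() const { return fetched_; }

private:
    void fetch() { window_.push_back(source_()); ++fetched_; }

    int (*source_)();
    std::deque<int> window_;
    int pending_;
    int fetched_;
};
```

    In this model consume() never touches the input, so a parser that
    finishes right after its last consume() does not wait for a token
    that may never arrive.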
   1391 
   1392 #215. (Changed in MR19) Addition of reset() to DLGLexerBase
   1393 
   1394     There was no obvious way to reset the lexer for reuse.  The
   1395     reset() method now does this.
   1396 
   1397     Suggested by David Venditti (07751870267-0001 t-online.de).
   1398 
   1399 #214. (Changed in MR19)  C mode: Adjust AST stack pointer at exit
   1400 
   1401     In C mode the AST stack pointer needs to be reset if there will
   1402     be multiple calls to the ANTLRx macros.
   1403 
   1404     Reported with fix by Paul D. Smith (psmith baynetworks.com).
   1405 
   1406 #213. (Changed in MR18)  Fatal error with -mrhoistk (k>1 hoisting)
   1407 
   1408     When rearranging code I forgot to un-comment a critical line of
   1409     code that handles hoisting of predicates with k>1 lookahead.  This
   1410     is now fixed.
   1411 
   1412     Reported by Reinier van den Born (reinier vnet.ibm.com).
   1413 
   1414 #212. (Changed in MR17)  Mac related changes by Kenji Tanaka
   1415 
   1416     Kenji Tanaka (kentar osa.att.ne.jp) has made a number of changes for
   1417     Macintosh users.
   1418 
   1419     a.  The following Macintosh MPW files aid in installing pccts on Mac:
   1420 
   1421             pccts/MPW_Read_Me
   1422 
   1423             pccts/install68K.mpw
   1424             pccts/installPPC.mpw
   1425 
   1426             pccts/antlr/antlr.r
   1427             pccts/antlr/antlr68K.make
   1428             pccts/antlr/antlrPPC.make
   1429 
   1430             pccts/dlg/dlg.r
   1431             pccts/dlg/dlg68K.make
   1432             pccts/dlg/dlgPPC.make
   1433 
   1434             pccts/sorcerer/sor.r
   1435             pccts/sorcerer/sor68K.make
   1436             pccts/sorcerer/sorPPC.make
   1437     
   1438        They completely replace the previous Mac installation files.
   1439             
   1440     b. The most significant is a change in the MAC_FILE_CREATOR symbol
   1441        in pcctscfg.h:
   1442 
   1443         old: #define MAC_FILE_CREATOR 'MMCC'   /* Metrowerks C/C++ Text files */
   1444         new: #define MAC_FILE_CREATOR 'CWIE'   /* Metrowerks C/C++ Text files */
   1445 
   1446     c.  Added calls to special_fopen_actions() where necessary.
   1447 
   1448 #211. (Changed in MR16a)  C++ style comment in dlg
   1449 
   1450     This has been fixed.
   1451 
   1452 #210. (Changed in MR16a)  Sor accepts \r\n, \r, or \n for end-of-line
   1453 
   1454     A user requested that Sorcerer be changed to accept other forms
   1455     of end-of-line.
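
A minimal sketch of what accepting all three end-of-line forms can look like when counting lines. The function name countLineEnds is hypothetical and this is not the Sorcerer source; it only illustrates treating "\r\n" as a single line end while still accepting a lone "\r" or "\n".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Sorcerer): counts line ends in a
// NUL-terminated buffer, accepting "\r\n", "\r", or "\n".
inline int countLineEnds(const char *text) {
    assert(text != 0);
    int n = 0;
    for (std::size_t i = 0; text[i] != '\0'; ++i) {
        if (text[i] == '\r') {
            if (text[i + 1] == '\n') ++i;   // "\r\n" is one line end, not two
            ++n;
        } else if (text[i] == '\n') {
            ++n;
        }
    }
    return n;
}
```

Looking one character ahead after a '\r' is safe here because the buffer is NUL-terminated, so text[i + 1] is at worst the terminator.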
   1456 
   1457 #209. (Changed in MR16) Name of files changed.
   1458 
   1459         Old:  CHANGES_FROM_1.33
   1460         New:  CHANGES_FROM_133.txt
   1461 
   1462         Old:  KNOWN_PROBLEMS
   1463         New:  KNOWN_PROBLEMS.txt
   1464 
   1465 #208. (Changed in MR16) Change in use of pccts #include files
   1466 
   1467     There were problems with MS DevStudio when mixing Sorcerer and
   1468     PCCTS in the same source file.  The problem is caused by the
   1469     redefinition of setjmp in the MS header file setjmp.h.  In
   1470     setjmp.h the pre-processor symbol setjmp was redefined to be
   1471     _setjmp.  A later effort to execute #include <setjmp.h> resulted 
   1472     in an effort to #include <_setjmp.h>.  I'm not sure whether this
   1473     is a bug or a feature.  In any case, I decided to fix it by
   1474     avoiding the use of pre-processor symbols in #include statements
   1475     altogether.  This has the added benefit of making pre-compiled
   1476     headers work again.
   1477 
   1478     I've replaced statements:
   1479 
   1480         old: #include PCCTS_SETJMP_H
   1481         new: #include "pccts_setjmp.h"
   1482 
   1483     Where pccts_setjmp.h contains:
   1484 
   1485             #ifndef __PCCTS_SETJMP_H__
   1486             #define __PCCTS_SETJMP_H__
   1487     
   1488             #ifdef PCCTS_USE_NAMESPACE_STD
   1489             #include <Csetjmp>
   1490             #else
   1491             #include <setjmp.h>
   1492             #endif
   1493 
   1494             #endif
   1495         
   1496     A similar change has been made for other standard header files
   1497     required by pccts and sorcerer: stdlib.h, stdarg.h, stdio.h, etc.
   1498 
   1499     Reported by Jeff Vincent (JVincent novell.com) and Dale Davis
   1500     (DalDavis spectrace.com).
   1501 
   1502 #207. (Changed in MR16) dlg reports an invalid range for: [\0x00-\0xff]
   1503 
   1504      -----------------------------------------------------------------
   1505      Note from MR23:  This fix does not work.  I am investigating why.
   1506      -----------------------------------------------------------------
   1507 
   1508     dlg will report that this is an invalid range.
   1509 
   1510     Diagnosed by Piotr Eljasiak (eljasiak no-spam.zt.gdansk.tpsa.pl):
   1511 
   1512         I think this problem is not specific to unsigned chars
   1513         because dlg reports no error for the range [\0x00-\0xfe].
   1514 
   1515         I've found that information on range is kept in field
   1516         letter (unsigned char) of Attrib struct. Unfortunately
   1517         the letter value internally is for some reasons increased
   1518         by 1, so \0xff is represented here as 0.
   1519 
   1520         That's why dlg complains about the range [\0x00-\0xff] in
   1521         dlg_p.g:
   1522 
   1523         if ($$.letter > $2.letter) {
   1524           error("invalid range  ", zzline);
   1525         } 
   1526 
   1527     The fix is:
   1528 
   1529         if ($$.letter > $2.letter && 255 != $2.letter) {
   1530           error("invalid range  ", zzline);
   1531         } 
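
The diagnosis above can be reproduced in isolation. The sketch below assumes only what the note says: the character code is stored in an unsigned char after being increased by 1. The names storedLetter and rangeRejected are hypothetical and are not taken from dlg.

```cpp
#include <cassert>

// Hypothetical model of the "letter" field: the code is stored plus one
// in an unsigned char, so 0xff wraps around to 0.
inline unsigned char storedLetter(int code) {
    assert(code >= 0 && code <= 0xff);
    return static_cast<unsigned char>(code + 1);
}

// Models the original check: the range is rejected when the stored low
// end compares greater than the stored high end.
inline bool rangeRejected(int lo, int hi) {
    return storedLetter(lo) > storedLetter(hi);
}
```

Under these assumptions rangeRejected(0x00, 0xff) is true while rangeRejected(0x00, 0xfe) is false, which matches the reported behaviour.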
   1532 
   1533 #206. (Changed in MR16) Free zzFAILtext in ANTLRParser destructor
   1534 
   1535     The ANTLRParser destructor now frees zzFAILtext.
   1536 
   1537     Problem and fix reported by Manfred Kogler (km cast.uni-linz.ac.at).
   1538 
   1539 #205. (Changed in MR16) DLGStringReset argument now const
   1540 
   1541     Changed: void DLGStringReset(DLGChar *s) {...}
   1542     To:      void DLGStringReset(const DLGChar *s) {...}
   1543 
   1544     Suggested by Dale Davis (daldavis spectrace.com)
   1545 
   1546 #204. (Changed in MR15a) Change __WATCOM__ to __WATCOMC__ in pcctscfg.h
   1547     
   1548     Reported by Oleg Dashevskii (olegdash my-dejanews.com).
   1549 
   1550 #203. (Changed in MR15) Addition of sorcerer to distribution kit
   1551 
   1552     I have finally caved in to popular demand.  The pccts 1.33mr15
   1553     kit will include sorcerer.  The separate sorcerer kit will be
   1554     discontinued.
   1555 
   1556 #202. (Changed in MR15) Organization of MS Dev Studio Projects in Kit
   1557 
   1558     Previously there was one workspace that contained projects for
   1559     all three parts of pccts: antlr, dlg, and sorcerer.  Now each
   1560     part (and directory) has its own workspace/project and there
   1561     is an additional workspace/project to build a library from the
   1562     .cpp files in the pccts/h directory.
   1563 
   1564     The library build will create pccts_debug.lib or pccts_release.lib
   1565     according to the configuration selected.  
   1566 
   1567     If you don't want to build pccts 1.33MR15 you can download a
   1568     ready-to-run kit for win32 from http://www.polhode.com/win32.zip.
   1569     The ready-to-run for win32 includes executables, a pre-built static
   1570     library for the .cpp files in the pccts/h directory, and a sample
   1571     application.
   1572 
   1573     You will need to define the environment variable PCCTS to point to
   1574     the root of the pccts directory hierarchy.
   1575 
   1576 #201. (Changed in MR15) Several fixes by K.J. Cummings (cummings peritus.com)
   1577 
   1578       Generation of SETJMP rather than SETJMP_H in gen.c.
   1579 
   1580       (Sor B19) Declaration of ref_vars_inits for ref_var_inits in
   1581       pccts/sorcerer/sorcerer.h.
   1582 
   1583 #200. (Changed in MR15) Remove operator=() in AToken.h
   1584 
   1585       A user reported that Watcom couldn't handle use of an
   1586       explicit operator =().  It has been replaced with an equivalent
   1587       using a cast operator.
   1588 
   1589 #199. (Changed in MR15) Don't allow use of empty #tokclass
   1590 
   1591       Change antlr.g to disallow empty #tokclass sets.
   1592 
   1593       Reported by Manfred Kogler (km cast.uni-linz.ac.at).
   1594 
   1595 #198. Revised ANSI C grammar due to efforts by Manuel Kessler
   1596 
   1597       Manuel Kessler (mlkessler cip.physik.uni-wuerzburg.de)
   1598 
   1599           Allow trailing ... in function parameter lists.
   1600           Add bit fields.
   1601           Allow old-style function declarations.
   1602           Support cv-qualified pointers.
   1603           Better checking of combinations of type specifiers.
   1604           Release of memory for local symbols on scope exit.
   1605           Allow input file name on command line as well as by redirection.
   1606 
   1607               and other miscellaneous tweaks.
   1608 
   1609       This is not part of the pccts distribution kit. It must be
   1610       downloaded separately from:
   1611 
   1612             http://www.polhode.com/ansi_mr15.zip
   1613 
   1614 #197. (Changed in MR14) Resetting the lookahead buffer of the parser
   1615 
   1616       Explanation and fix by Sinan Karasu (sinan.karasu boeing.com)
   1617 
   1618       Consider the code used to prime the lookahead buffer LA(i)
   1619       of the parser when init() is called:
   1620 
   1621         void
   1622         ANTLRParser::
   1623         prime_lookahead()
   1624         {
   1625             int i;
   1626             for(i=1;i<=LLk; i++) consume();
   1627             dirty=0;
   1628             //lap = 0;      // MR14 - Sinan Karasu (sinan.karasu boeing.com)
   1629             //labase = 0;   // MR14
   1630             labase=lap;     // MR14
   1631         }
   1632 
   1633       When the parser is instantiated, lap=0,labase=0 is set.
   1634 
   1635       The "for" loop runs LLk times. In consume(), lap = lap +1 (mod LLk) is
   1636       computed.  Therefore, lap(before the loop) == lap (after the loop).
   1637 
   1638       Now the only problem comes in when one does an init() of the parser
   1639       after an Eof has been seen. At that time, lap could be non zero.
   1640       Assume it was lap==1. Now we do a prime_lookahead(). If LLk is 2,
   1641       then
   1642 
   1643         consume()
   1644         {
   1645             NLA = inputTokens->getToken()->getType();
   1646             dirty--;
   1647             lap = (lap+1)&(LLk-1);
   1648         }
   1649 
   1650       or expanding NLA,
   1651 
   1652         token_type[lap&(LLk-1)] = inputTokens->getToken()->getType();
   1653         dirty--;
   1654         lap = (lap+1)&(LLk-1);
   1655 
   1656       so now we prime locations 1 and 2.  In prime_lookahead it used to set
   1657       lap=0 and labase=0.  Now, the next token will be read from location 0,
   1658       NOT 1 as it should have been.
   1659 
   1660       This was never caught before because, when a parser has just been
   1661       instantiated, lap and labase are 0 and the offending assignment lines
   1662       are basically no-ops, since the for loop wraps around back to 0.
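
The effect can be seen with a toy model of the circular buffer. This is only a sketch: ToyLookahead is a hypothetical stand-in, not the ANTLRParser class, and it assumes LLk is a power of two so that the "& (LLk-1)" mask acts as a modulo.

```cpp
#include <cassert>

// Hypothetical toy model of the lookahead ring buffer (not PCCTS code).
struct ToyLookahead {
    int LLk;        // buffer size, assumed to be a power of two
    int lap;        // next slot to be filled by consume()
    int labase;     // slot that holds LA(1)
    int slot[8];    // token types held in the ring
    int nextToken;  // stand-in for the token stream: 100, 101, ...

    explicit ToyLookahead(int k)
        : LLk(k), lap(0), labase(0), slot(), nextToken(100) {
        assert(k > 0 && k <= 8 && (k & (k - 1)) == 0);
    }
    void consume() {
        slot[lap] = nextToken++;
        lap = (lap + 1) & (LLk - 1);
    }
    // resetToZero == true models the behaviour before MR14,
    // resetToZero == false models "labase = lap".
    void prime(bool resetToZero) {
        for (int i = 1; i <= LLk; i++) consume();
        if (resetToZero) { lap = 0; labase = 0; }
        else             { labase = lap; }
    }
    int LA1() const { return slot[labase]; }
};
```

With LLk == 2 and lap == 1 before priming, the first token (100) lands in slot 1. Resetting labase to 0 makes LA1() return the second token (101), while labase = lap returns the first one.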
   1663 
   1664 #196. (Changed in MR14) Problems with "(alpha)? beta" guess
   1665 
   1666     Consider the following syntactic predicate in a grammar
   1667     with 2 tokens of lookahead (k=2 or ck=2):
   1668 
   1669         rule  : ( alpha )? beta ;
   1670         alpha : S t ;
   1671         t     : T U
   1672               | T
   1673               ;
   1674         beta  : S t Z ;
   1675 
   1676     When antlr computes the prediction expression with one token
   1677     of lookahead for alts 1 and 2 of rule t it finds an ambiguity.
   1678 
   1679     Because the grammar has a lookahead of 2 it tries to compute
   1680     two tokens of lookahead for alts 1 and 2 of t.  Alt 1 clearly
   1681     has a lookahead of (T U).  Alt 2 is one token long so antlr
   1682     tries to compute the follow set of alt 2, which means finding
   1683     the things which can follow rule t in the context of (alpha)?.
   1684     This cannot be computed, because alpha is only part of a rule,
   1685     and antlr can't tell what part of beta is matched by alpha and
   1686     what part remains to be matched.  Thus it is impossible for antlr
   1687     to properly determine the follow set of rule t.
   1688 
   1689     Prior to 1.33MR14, the follow of (alpha)? was computed as
   1690     FIRST(beta) as a result of the internal representation of
   1691     guess blocks.
   1692 
   1693     With MR14 the follow set will be the empty set for that context.
   1694 
   1695     Normally, one expects a rule appearing in a guess block to also
   1696     appear elsewhere.  When the follow context for this other use
   1697     is "ored" with the empty set, the context from the other use
   1698     results, and a reasonable follow context results.  However if
   1699     there is *no* other use of the rule, or it is used in a different
   1700     manner then the follow context will be inaccurate - it was
   1701     inaccurate even before MR14, but it will be inaccurate in a
   1702     different way.
   1703 
   1704     For the example given earlier, a reasonable way to rewrite the
   1705     grammar is:
   1706 
   1707         rule  : ( alpha )? beta ;
   1708         alpha : S t ;
   1709         t     : T U
   1710               | T
   1711               ;
   1712         beta  : alpha Z ;
   1713 
   1714     If there are no other uses of the rule appearing in the guess
   1715     block it will generate a test for EOF - a workaround for
   1716     representing a null set in the lookahead tests.
   1717 
   1718     If you encounter such a problem you can use the -alpha option
   1719     to get additional information:
   1720 
   1721     line 2: error: not possible to compute follow set for alpha
   1722               in an "(alpha)? beta" block.
   1723 
   1724     With the antlr -alpha command line option the following information
   1725     is inserted into the generated file:
   1726 
   1727     #if 0
   1728 
   1729       Trace of references leading to attempt to compute the follow set of
   1730       alpha in an "(alpha)? beta" block. It is not possible for antlr to
   1731       compute this follow set because it is not known what part of beta has
   1732       already been matched by alpha and what part remains to be matched.
   1733 
   1734       Rules which make use of the incorrect follow set will also be incorrect
   1735 
   1736          1 #token T              alpha/2   line 7     brief.g
   1737          2 end alpha             alpha/3   line 8     brief.g
   1738          2 end (...)? block at   start/1   line 2     brief.g
   1739 
   1740     #endif
   1741 
   1742     At the moment, with the -alpha option selected the program marks
   1743     any rules which appear in the trace back chain (above) as rules with
   1744     possible problems computing follow set.
   1745 
   1746     Reported by Greg Knapen (gregory.knapen bell.ca).
   1747 
   1748 #195. (Changed in MR14) #line directive not at column 1
   1749 
   1750       Under certain circunstances a predicate test could generate
   1751       a #line directive which was not at column 1.
   1752 
   1753       Reported with fix by David Kågedal (davidk lysator.liu.se)
   1754       (http://www.lysator.liu.se/~davidk/).
   1755 
   1756 #194. (Changed in MR14) (C Mode only) Demand lookahead with #tokclass
   1757 
   1758       In C mode with the demand lookahead option there is a bug in the
   1759       code which handles matches for #tokclass (zzsetmatch and
   1760       zzsetmatch_wsig).
   1761 
   1762       The bug causes the lookahead pointer to get out of synchronization
   1763       with the current token pointer.
   1764 
   1765       The problem was reported with a fix by Ger Hobbelt (hobbelt axa.nl).
   1766 
   1767 #193. (Changed in MR14) Use of PCCTS_USE_NAMESPACE_STD
   1768 
   1769       The pcctscfg.h now contains the following definitions:
   1770 
   1771         #ifdef PCCTS_USE_NAMESPACE_STD
   1772         #define PCCTS_STDIO_H     <Cstdio>
   1773         #define PCCTS_STDLIB_H    <Cstdlib>
   1774         #define PCCTS_STDARG_H    <Cstdarg>
   1775         #define PCCTS_SETJMP_H    <Csetjmp>
   1776         #define PCCTS_STRING_H    <Cstring>
   1777         #define PCCTS_ASSERT_H    <Cassert>
   1778         #define PCCTS_ISTREAM_H   <istream>
   1779         #define PCCTS_IOSTREAM_H  <iostream>
   1780         #define PCCTS_NAMESPACE_STD     namespace std {}; using namespace std;
   1781         #else
   1782         #define PCCTS_STDIO_H     <stdio.h>
   1783         #define PCCTS_STDLIB_H    <stdlib.h>
   1784         #define PCCTS_STDARG_H    <stdarg.h>
   1785         #define PCCTS_SETJMP_H    <setjmp.h>
   1786         #define PCCTS_STRING_H    <string.h>
   1787         #define PCCTS_ASSERT_H    <assert.h>
   1788         #define PCCTS_ISTREAM_H   <istream.h>
   1789         #define PCCTS_IOSTREAM_H  <iostream.h>
   1790         #define PCCTS_NAMESPACE_STD
   1791         #endif
   1792 
   1793       The runtime support in pccts/h uses these pre-processor symbols
   1794       consistently.
   1795 
   1796       Also, antlr and dlg have been changed to generate code which uses
   1797       these pre-processor symbols rather than having the names of the
   1798       #include files hard-coded in the generated code.
   1799 
   1800       This required the addition of #include "pcctscfg.h" to a number of
   1801       files in pccts/h.
   1802 
   1803       It appears that this sometimes causes problems for MSVC 5 in
   1804       combination with the "automatic" option for pre-compiled headers.
   1805       In such cases disable the "automatic" pre-compiled headers option.
   1806 
   1807       Suggested by Hubert Holin (Hubert.Holin Bigfoot.com).
   1808 
   1809 #192. (Changed in MR14) Change setText() to accept "const ANTLRChar *"
   1810 
   1811       Changed ANTLRToken::setText(ANTLRChar *) to setText(const ANTLRChar *).
   1812       This allows literal strings to be used to initialize tokens.  Since
   1813       the usual token implementation (ANTLRCommonToken)  makes a copy of the
   1814       input string, this was an unnecessary limitation.
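
A small sketch of why the const qualifier matters. ToyToken is a hypothetical stand-in for a token class that copies its text; it is not ANTLRCommonToken.

```cpp
#include <cassert>
#include <string>

// Hypothetical token class (not PCCTS code) that copies the text it is given.
class ToyToken {
public:
    // A pointer-to-const parameter lets callers pass string literals
    // without a cast.
    void setText(const char *s) {
        assert(s != 0);
        text_ = s;      // the token keeps its own copy of the characters
    }
    const std::string &getText() const { return text_; }
private:
    std::string text_;
};
```

Because the copy is made inside setText, the token never needs write access to the caller's buffer, so a call such as t.setText("begin") is safe.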
   1815 
   1816       Suggested by Bob McWhirter (bob netwrench.com).
   1817 
   1818 #191. (Changed in MR14) HP/UX aCC compiler compatibility problem
   1819 
   1820       Needed to explicitly declare zzINF_DEF_TOKEN_BUFFER_SIZE and
   1821       zzINF_BUFFER_TOKEN_CHUNK_SIZE as ints in pccts/h/AParser.cpp.
   1822 
   1823       Reported by David Cook (dcook bmc.com).
   1824 
   1825 #190. (Changed in MR14) IBM OS/2 CSet compiler compatibility problem
   1826 
   1827       Name conflict with "_cs" in pccts/h/ATokenBuffer.cpp
   1828 
   1829       Reported by David Cook (dcook bmc.com).
   1830 
   1831 #189. (Changed in MR14) -gxt switch in C mode
   1832 
   1833       The -gxt switch in C mode didn't work because of incorrect
   1834       initialization.
   1835 
   1836       Reported by Sinan Karasu (sinan boeing.com).
   1837 
   1838 #188. (Changed in MR14) Added pccts/h/DLG_stream_input.h
   1839 
   1840       This is a DLG stream class based on C++ istreams.
   1841 
   1842       Contributed by Hubert Holin (Hubert.Holin Bigfoot.com).
   1843 
   1844 #187. (Changed in MR14) Rename config.h to pcctscfg.h
   1845 
   1846       The PCCTS configuration file has been renamed from config.h to
   1847       pcctscfg.h.  The problem with the original name is that it led
   1848       to name collisions when pccts parsers were combined with other
   1849       software.
   1850 
   1851       All of the runtime support routines in pccts/h/* have been
   1852       changed to use the new name.  Existing software can continue
   1853       to use pccts/h/config.h.  The only content of pccts/h/config.h
   1854       is now the line: #include "pcctscfg.h"
   1855 
   1856       I don't have a record of the user who suggested this.
   1857 
   1858 #186. (Changed in MR14) Pre-processor symbol DllExportPCCTS class modifier
   1859 
   1860       Classes in the C++ runtime support routines are now declared:
   1861 
   1862         class DllExportPCCTS className ....
   1863 
   1864       By default, the pre-processor symbol is defined as the empty
   1865       string.  This is for use by MSVC++ users to create DLL classes.
   1866 
   1867       Suggested by Manfred Kogler (km cast.uni-linz.ac.at).
   1868 
   1869 #185. (Changed in MR14) Option to not use PCCTS_AST base class for ASTBase
   1870 
   1871       Normally, the ASTBase class is derived from PCCTS_AST which contains
   1872       functions useful to Sorcerer.  If these are not necessary then the
   1873       user can define the pre-processor symbol "PCCTS_NOT_USING_SOR" which
   1874       will cause the ASTBase class to replace references to PCCTS_AST with
   1875       references to ASTBase where necessary.
   1876 
   1877       The class ASTDoublyLinkedBase will contain a pure virtual function
   1878       shallowCopy() that was formerly defined in class PCCTS_AST.
   1879 
   1880       Suggested by Bob McWhirter (bob netwrench.com).
   1881 
   1882 #184. (Changed in MR14) Grammars with no tokens generate invalid tokens.h
   1883 
   1884       Reported by Hubert Holin (Hubert.Holin bigfoot.com).
   1885 
   1886 #183. (Changed in MR14) -f to specify file with names of grammar files
   1887 
   1888       In DEC/VMS it is difficult to specify very long command lines.
   1889       The -f option allows one to place the names of the grammar files
   1890       in a data file in order to bypass limitations of the DEC/VMS
   1891       command language interpreter.
   1892 
   1893       Addition supplied by Bernard Giroud (b_giroud decus.ch).
   1894 
   1895 #182. (Changed in MR14) Output directory option for DEC/VMS
   1896 
   1897       Fix some problems with the -o option under DEC/VMS.
   1898 
   1899       Fix supplied by Bernard Giroud (b_giroud decus.ch).
   1900 
   1901 #181. (Changed in MR14) Allow chars > 127 in DLGStringInput::nextChar()
   1902 
   1903       Changed DLGStringInput to cast the character using (unsigned char)
   1904       so that languages with character codes greater than 127 work
   1905       without changes.
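
The cast matters on platforms where plain char is signed. The sketch below uses a hypothetical helper named charCode (not the DLGStringInput code) to show the conversion; it assumes 8-bit chars.

```cpp
#include <cassert>

// Hypothetical helper (not PCCTS code): converts a char to its code in
// the range 0..255, independent of whether plain char is signed.
inline int charCode(char c) {
    int code = static_cast<unsigned char>(c);
    assert(code >= 0 && code <= 255);   // holds for 8-bit chars
    return code;
}
```

Without the cast, a character such as '\xE9' may convert to a negative int where char is signed, which a lexer could then mistake for an end-of-input marker or use as an invalid table index.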
   1906 
   1907       Suggested by Manfred Kogler (km cast.uni-linz.ac.at).
   1908 
   1909 #180. (Added in MR14) ANTLRParser::getEofToken()
   1910 
   1911       Added "ANTLRToken ANTLRParser::getEofToken() const" to match the
   1912       setEofToken routine.
   1913 
   1914       Requested by Manfred Kogler (km cast.uni-linz.ac.at).
   1915 
   1916 #179. (Fixed in MR14) Memory leak for BufFileInput subclass of DLGInputStream
   1917 
   1918       The BufFileInput class described in Item #142 neglected to release
   1919       the allocated buffer when an instance was destroyed.
   1920 
   1921       Reported by Manfred Kogler (km cast.uni-linz.ac.at).
   1922 
   1923 #178. (Fixed in MR14) Bug in "(alpha)? beta" guess blocks first sets
   1924 
   1925       In 1.33 vanilla, and all maintenance releases prior to MR14
   1926       there is a bug in the handling of guess blocks which use the
   1927       "long" form:
   1928 
   1929                   (alpha)? beta
   1930 
   1931       inside a (...)*, (...)+, or {...} block.
   1932 
   1933       This problem does *not* apply to the case where beta is omitted
   1934       or when the syntactic predicate is on the leading edge of an
   1935       alternative.
   1936 
   1937       The problem is that both alpha and beta are stored in the
   1938       syntax diagram, and that some analysis routines would fail
   1939       to skip the alpha portion when it was not on the leading edge.
   1940       Consider the following grammar with -ck 2:
   1941 
   1942                 r : ( (A)? B )* C D
   1943 
   1944                   | A B      /* forces -ck 2 computation for old antlr    */
   1945                              /*              reports ambig for alts 1 & 2 */
   1946 
   1947                   | B C      /* forces -ck 2 computation for new antlr    */
   1948                              /*              reports ambig for alts 1 & 3 */
   1949                   ;
   1950 
   1951       The prediction expression for the first alternative should be
   1952       LA(1)={B C} LA(2)={B C D}, but previous versions of antlr
   1953       would compute the prediction expression as LA(1)={A C} LA(2)={B D}
   1954 
   1955       Reported by Arpad Beszedes (beszedes inf.u-szeged.hu) who provided
   1956       a very clear example of the problem and identified the probable cause.
   1957 
   1958 #177. (Changed in MR14) #tokdefs and #token with regular expression
   1959 
   1960       In MR13 the change described by Item #162 caused an existing
   1961       feature of antlr to fail.  Prior to the change it was possible
   1962       to give regular expression definitions and actions to tokens
   1963       which were defined via the #tokdefs directive.
   1964 
   1965       This now works again.
   1966 
   1967       Reported by Manfred Kogler (km cast.uni-linz.ac.at).
   1968 
   1969 #176. (Changed in MR14) Support for #line in antlr source code
   1970 
   1971       Note: this was implemented by Arpad Beszedes (beszedes inf.u-szeged.hu).
   1972 
   1973       In 1.33MR14 it is possible for a pre-processor to generate #line
   1974       directives in the antlr source and have those line numbers and file
   1975       names used in antlr error messages and in the #line directives
   1976       generated by antlr.
   1977 
   1978       The #line directive may appear in the following form:
   1979 
   1980             #line ll "sss" xx xx ...
   1981 
   1982       where ll represents a line number, "sss" represents the name of a file
   1983       enclosed in quotation marks, and xx are arbitrary integers.
   1984 
   1985       The following form (without "line") is not supported at the moment:
   1986 
   1987             # ll "sss" xx xx ...
   1988 
   1989       The result:
   1990 
   1991         zzline
   1992 
   1993             is replaced with ll from the # or #line directive
   1994 
   1995         FileStr[CurFile]
   1996 
   1997             is updated with the contents of the string (if any)
   1998             following the line number
   1999 
   2000       Note
   2001       ----
   2002       The file-name string following the line number can be a complete
   2003       name with a directory-path. Antlr generates the output files from
   2004       the input file name (by replacing the extension from the file-name
   2005       with .c or .cpp).
   2006 
   2007       If the input file (or the file-name from the line-info) contains
   2008       a path:
   2009 
   2010         "../grammar.g"
   2011 
   2012       the generated source code will be placed in "../grammar.cpp" (i.e.
   2013       in the parent directory).  This is inconvenient in some cases
   2014       (even the -o switch cannot be used) so the path information is
   2015       removed from the #line directive.  Thus, if the line-info was
   2016 
   2017         #line 2 "../grammar.g"
   2018 
   2019       then the current file-name will become "grammar.g"
   2020 
   2021       In this way, the source code generated from the grammar file will
   2022       always be placed in the current directory, except when the -o
   2023       switch is used.
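
The path removal described in the note can be sketched as follows. The function name stripDirectory is hypothetical, and treating both "/" and "\" as separators is an assumption; the note does not say which separators antlr recognizes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not antlr code): drops any leading directory part
// from a file name taken from a #line directive.
inline std::string stripDirectory(const std::string &fileName) {
    // Assumption: both '/' and '\\' count as directory separators.
    std::string::size_type cut = fileName.find_last_of("/\\");
    std::string base =
        (cut == std::string::npos) ? fileName : fileName.substr(cut + 1);
    assert(base.size() <= fileName.size());
    return base;
}
```

With this sketch "../grammar.g" becomes "grammar.g", and a name without a directory part is returned unchanged.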
   2024 
   2025 #175. (Changed in MR14) Bug when guess block appears at start of (...)*
   2026 
   2027       In 1.33 vanilla and all maintenance releases prior to 1.33MR14
   2028       there is a bug when a guess block appears at the start of a (...)*.
   2029       Consider the following k=1 (ck=1) grammar:
   2030 
   2031             rule :
   2032                   ( (STAR)? ZIP )* ID ;
   2033 
   2034       Prior to 1.33MR14, the generated code resembled:
   2035 
   2036         ...
   2037         zzGUESS_BLOCK
   2038         while ( 1 ) {
   2039             if ( !(LA(1)==STAR) ) break;
   2040             zzGUESS
   2041             if ( !zzrv ) {
   2042                 zzmatch(STAR);
   2043                 zzCONSUME;
   2044                 zzGUESS_DONE
   2045                 zzmatch(ZIP);
   2046                 zzCONSUME;
   2047             ...
   2048 
   2049       Note that the routine uses STAR for the prediction expression
   2050       rather than ZIP.  With 1.33MR14 the generated code resembles:
   2051 
   2052         ...
   2053         while ( 1 ) {
   2054             if ( !(LA(1)==ZIP) ) break;
   2055         ...
   2056 
   2057       This problem existed only with (...)* blocks and was caused
   2058       by the slightly more complicated graph which represents (...)*
   2059       blocks.  This caused the analysis routine to compute the first
   2060       set for the alpha part of the "(alpha)? beta" rather than the
   2061       beta part.
   2062 
   2063       Both (...)+ and {...} blocks handled the guess block correctly.
   2064 
   2065       Reported by Arpad Beszedes (beszedes inf.u-szeged.hu) who provided
   2066       a very clear example of the problem and identified the probable cause.
   2067 
   2068 #174. (Changed in MR14) Bug when action precedes syntactic predicate
   2069 
   2070       In 1.33 vanilla, and all maintenance releases prior to MR14,
   2071       there was a bug when a syntactic predicate was immediately
   2072       preceded by an action.  Consider the following -ck 2 grammar:
   2073 
   2074             rule :
   2075                    <<int i;>>
   2076                    (alpha)? beta C
   2077                  | A B
   2078                  ;
   2079 
   2080             alpha : A ;
   2081             beta  : A B;
   2082 
   2083       Prior to MR14, the code generated for the first alternative
   2084       resembled:
   2085 
   2086         ...
   2087         zzGUESS
   2088         if ( !zzrv && LA(1)==A && LA(2)==A) {
   2089             alpha();
   2090             zzGUESS_DONE
   2091             beta();
   2092             zzmatch(C);
   2093             zzCONSUME;
   2094         } else {
   2095         ...
   2096 
   2097       The prediction expression (i.e. LA(1)==A && LA(2)==A) is clearly
   2098       wrong because LA(2) should be matched to B (first[2] of beta is {B}).
   2099 
   2100       With 1.33MR14 the prediction expression is:
   2101 
   2102         ...
   2103         if ( !zzrv && LA(1)==A && LA(2)==B) {
   2104             alpha();
   2105             zzGUESS_DONE
   2106             beta();
   2107             zzmatch(C);
   2108             zzCONSUME;
   2109         } else {
   2110         ...
   2111 
   2112       This will only affect grammars in which alpha is shorter than
   2113       max(k,ck) and there is an action immediately preceding
   2114       the syntactic predicate.
   2115 
   2116       This problem was reported by Arpad Beszedes
   2117       (beszedes inf.u-szeged.hu) who provided a very clear example
   2118       of the problem and identified the presence of the init-action
   2119       as the likely culprit.
   2120 
   2121 #173. (Changed in MR13a) -glms for Microsoft style filenames with -gl
   2122 
   2123       With the -gl option antlr generates #line directives using the
   2124       exact name of the input files specified on the command line.
   2125       An oddity of the Microsoft C and C++ compilers is that they
   2126       don't accept file names in #line directives containing "\"
   2127       even though these are names from the native file system.
   2128 
   2129       With -glms option, the "\" in file names appearing in #line
   2130       directives is replaced with a "/" in order to conform to
   2131       Microsoft compiler requirements.
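
The substitution itself is simple. A minimal sketch, using a hypothetical function name (toForwardSlashes) rather than anything from the antlr source:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not antlr code): returns a copy of a file name
// with every backslash replaced by a forward slash.
inline std::string toForwardSlashes(const std::string &name) {
    std::string out(name);
    for (std::string::size_type i = 0; i < out.size(); ++i) {
        if (out[i] == '\\') out[i] = '/';
    }
    assert(out.size() == name.size());  // replacement never changes the length
    return out;
}
```

A likely reason a raw "\" is troublesome is that the file name in a #line directive is written as a string literal, where a backslash begins an escape sequence.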
   2132 
   2133       Reported by Erwin Achermann (erwin.achermann switzerland.org).
   2134 
   2135 #172. (Changed in MR13) \r\n in antlr source counted as one line
   2136 
   2137       Some MS software uses \r\n to indicate a new line.  Antlr
   2138       now recognizes this in counting lines.
   2139 
   2140       Reported by Edward L. Hepler (elh ece.vill.edu).
   2141 
   2142 #171. (Changed in MR13) #tokclass L..U now allowed
   2143 
   2144       The following is now allowed:
   2145 
   2146             #tokclass ABC { A..B C }
   2147 
   2148       Reported by Dave Watola (dwatola amtsun.jpl.nasa.gov)
   2149 
   2150 #170. (Changed in MR13) Suppression for predicates with lookahead depth >1
   2151 
   2152       In MR12 the capability for suppression of predicates with lookahead
   2153       depth=1 was introduced.  With MR13 this has been extended to
   2154       predicates with lookahead depth > 1 and released for use by users
   2155       on an experimental basis.
   2156 
   2157       Consider the following grammar with -ck 2 and the predicate in rule
   2158       "a" with depth 2:
   2159 
   2160             r1  : (ab)* "@"
   2161                 ;
   2162 
   2163             ab  : a
   2164                 | b
   2165                 ;
   2166 
   2167             a   : (A B)? => <<p(LATEXT(2))>>? A B C
   2168                 ;
   2169 
   2170             b   : A B C
   2171                 ;
   2172 
   2173       Normally, the predicate would be hoisted into rule r1 in order to
   2174       determine whether to call rule "ab".  However it should *not* be
   2175       hoisted because, even if p is false, there is a valid alternative
   2176       in rule b.  With "-mrhoistk on" the predicate will be suppressed.
   2177 
   2178       If the "-info p" command line option is present, the following
   2179       information will appear in the generated code:
   2180 
   2181                 while ( (LA(1)==A)
   2182         #if 0
   2183 
   2184         Part (or all) of predicate with depth > 1 suppressed by alternative
   2185             without predicate
   2186 
   2187         pred  <<  p(LATEXT(2))>>?
   2188                   depth=k=2  ("=>" guard)  rule a  line 8  t1.g
   2189           tree context:
   2190             (root = A
   2191                B
   2192             )
   2193 
   2194         The token sequence which is suppressed: ( A B )
   2195         The sequence of references which generate that sequence of tokens:
   2196 
   2197            1 to ab          r1/1       line 1     t1.g
   2198            2 ab             ab/1       line 4     t1.g
   2199            3 to b           ab/2       line 5     t1.g
   2200            4 b              b/1        line 11    t1.g
   2201            5 #token A       b/1        line 11    t1.g
   2202            6 #token B       b/1        line 11    t1.g
   2203 
   2204         #endif
   2205 
   2206       A slightly more complicated example:
   2207 
   2208             r1  : (ab)* "@"
   2209                 ;
   2210 
   2211             ab  : a
   2212                 | b
   2213                 ;
   2214 
   2215             a   : (A B)? => <<p(LATEXT(2))>>? (A  B | D E)
   2216                 ;
   2217 
   2218             b   : <<q(LATEXT(2))>>? D E
   2219                 ;
   2220 
   2221 
   2222       In this case, the sequence (D E) in rule "a" which lies behind
   2223       the guard is used to suppress the predicate with context (D E)
   2224       in rule b.
   2225 
   2226                 while ( (LA(1)==A || LA(1)==D)
   2227             #if 0
   2228 
   2229             Part (or all) of predicate with depth > 1 suppressed by alternative
   2230                 without predicate
   2231 
   2232             pred  <<  q(LATEXT(2))>>?
   2233                               depth=k=2  rule b  line 11  t2.g
   2234               tree context:
   2235                 (root = D
   2236                    E
   2237                 )
   2238 
   2239             The token sequence which is suppressed: ( D E )
   2240             The sequence of references which generate that sequence of tokens:
   2241 
   2242                1 to ab          r1/1       line 1     t2.g
   2243                2 ab             ab/1       line 4     t2.g
   2244                3 to a           ab/1       line 4     t2.g
   2245                4 a              a/1        line 8     t2.g
   2246                5 #token D       a/1        line 8     t2.g
   2247                6 #token E       a/1        line 8     t2.g
   2248 
   2249             #endif
   2250             &&
   2251             #if 0
   2252 
   2253             pred  <<  p(LATEXT(2))>>?
   2254                               depth=k=2  ("=>" guard)  rule a  line 8  t2.g
   2255               tree context:
   2256                 (root = A
   2257                    B
   2258                 )
   2259 
   2260             #endif
   2261 
   2262             (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) )  {
   2263                 ab();
   2264                 ...
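
      The final test in the generated code above has the shape "the
      lookahead is outside the guard context, or the predicate is true".
      A minimal C sketch of that shape follows.  The token codes and the
      function name guarded_pred_ok are hypothetical; this is not code
      produced by antlr.

```c
#include <assert.h>

/* Hypothetical token codes for the sketch. */
enum { TOK_A = 1, TOK_B = 2, TOK_D = 3, TOK_E = 4 };

/* Same shape as the generated test
       (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) )
   The predicate result matters only when the lookahead matches the
   guard context (A B); any other lookahead passes unconditionally. */
int guarded_pred_ok(int la1, int la2, int pred_result)
{
    int in_guard_context;
    /* token codes are assumed non-negative in this sketch */
    assert(la1 >= 0 && la2 >= 0);
    in_guard_context = (la1 == TOK_A && la2 == TOK_B);
    return !in_guard_context || pred_result != 0;
}
```

      With lookahead (D E) the test succeeds whether or not p holds,
      while with lookahead (A B) it succeeds only when p holds.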
   2265 
   2266 #169. (Changed in MR13) Predicate test optimization for depth=1 predicates
   2267 
   2268       When MR12 generated a test of a predicate which had depth 1
   2269       it would use the depth >1 routines, resulting in correct but
   2270       inefficient behavior.  In MR13, a bit test is used.
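
      A minimal sketch of what a bit test for a depth-1 context can look
      like, in C.  The type Depth1Context, the function name and the
      limit of 32 token numbers are assumptions of the sketch; the set
      representation actually used by antlr is not shown here.

```c
#include <assert.h>

/* Hypothetical depth-1 context: bit t is set when token number t
   belongs to the predicate's context.  Limited to token numbers 0..31
   because unsigned long is only guaranteed to hold 32 bits. */
typedef unsigned long Depth1Context;

/* Returns 1 when tok is in the context, 0 otherwise. */
int context_has_token(Depth1Context context, int tok)
{
    assert(tok >= 0 && tok < 32);
    return (int)((context >> tok) & 1UL);
}
```

      A single shift and mask replaces the walk over a token tree that
      depth > 1 contexts need.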
   2271 
   2272 #168. (Changed in MR13) Token expressions in context guards
   2273 
   2274       The token expressions appearing in context guards such as:
   2275 
   2276             (A B)? => <<test(LT(1))>>?  someRule
   2277 
   2278       are computed during an early phase of antlr processing.  As
   2279       a result, prior to MR13, complex expressions such as:
   2280 
   2281             ~B
   2282             L..U
   2283             ~L..U
   2284             TokClassName
   2285             ~TokClassName
   2286 
   2287       were not computed properly.  This resulted in incorrect
   2288       context being computed for such expressions.
   2289 
   2290       In MR13 these context guards are verified for proper semantics
   2291       in the initial phase and then re-evaluated after complex token
   2292       expressions have been computed in order to produce the correct
   2293       behavior.
   2294 
   2295       Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).
   2296 
   2297 #167. (Changed in MR13) ~L..U
   2298 
   2299       Prior to MR13, the complement of a token range was
   2300       not properly computed.
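
      A minimal C sketch of the intended meaning of ~L..U: every token
      of the token universe that is not in the range.  The function
      names and the choice of 1..max_token as the universe are
      assumptions of the sketch.

```c
#include <assert.h>

/* Returns 1 when tok lies in the range lower..upper (inclusive). */
int in_token_range(int tok, int lower, int upper)
{
    assert(lower <= upper);
    return lower <= tok && tok <= upper;
}

/* Returns 1 when tok matches ~lower..upper, that is, tok is a token of
   the universe 1..max_token (an assumption) and is not in the range. */
int in_range_complement(int tok, int lower, int upper, int max_token)
{
    assert(max_token >= upper);
    if (tok < 1 || tok > max_token) {
        return 0;
    }
    return !in_token_range(tok, lower, upper);
}
```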
   2301 
   2302 #166. (Changed in MR13) token expression L..U
   2303 
   2304       The token U was represented as an unsigned char, restricting
   2305       the use of L..U to cases where U was assigned a token number
   2306       less than 256.  This is corrected in MR13.
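
      A minimal C sketch of why an unsigned char upper bound breaks
      L..U for token numbers of 256 or more.  The function names are
      hypothetical, and 8-bit chars are assumed.

```c
#include <assert.h>

/* Hypothetical illustration: what remains of a token number after it
   is stored in an unsigned char.  With 8-bit chars only the low 8 bits
   survive. */
int token_as_uchar(int token_number)
{
    unsigned char narrow;
    assert(token_number >= 0);
    narrow = (unsigned char)token_number;
    return (int)narrow;
}

/* Range test for L..U with the bounds kept as ints. */
int in_token_range(int tok, int lower, int upper)
{
    return lower <= tok && tok <= upper;
}
```

      With U = 300 the stored bound becomes 300 - 256 = 44, so a token
      numbered 280 no longer falls inside 260..U.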
   2307 
   2308 #165. (Changed in MR13) option -newAST
   2309 
   2310       To create ASTs from an ANTLRTokenPtr antlr usually calls
   2311       "new AST(ANTLRTokenPtr)".  This option generates a call
   2312       to "newAST(ANTLRTokenPtr)" instead.  This allows a user
   2313       to define a parser member function to create an AST object.
   2314 
   2315       Similar changes for ASTBase::tmake and ASTBase::link were not
   2316       thought necessary since they do not create AST objects, only
   2317       use existing ones.
   2318 
   2319 #164. (Changed in MR13) Unused variable _astp
   2320 
   2321       For many compilations, we have lived with warnings about
   2322       the unused variable _astp.  It turns out that this variable
   2323       can *never* be used because the code which references it was
   2324       commented out.
   2325 
   2326       This investigation was sparked by a note from Erwin Achermann
   2327       (erwin.achermann switzerland.org).
   2328 
   2329 #163. (Changed in MR13) Incorrect makefiles for testcpp examples
   2330 
   2331       All the examples in pccts/testcpp/* had incorrect definitions
   2332       in the makefiles for the symbol "CCC".  Instead of CCC=CC they
   2333       had CC=$(CCC).
   2334 
   2335       There was an additional problem in testcpp/1/test.g due to the
   2336       change in ANTLRToken::getText() to a const member function
   2337       (Item #137).
   2338 
   2339       Reported by Maurice Mass (maas cuci.nl).
   2340 
   2341 #162. (Changed in MR13) Combining #token with #tokdefs
   2342 
   2343       When it became possible to change the print-name of a
   2344       #token (Item #148) it became useful to give a #token
   2345       statement whose only purpose was to give a print name
   2346       to the #token.  Prior to this change this could not be
   2347       combined with the #tokdefs feature.
   2348 
   2349 #161. (Changed in MR13) Switch -gxt inhibits generation of tokens.h
   2350 
   2351 #160. (Changed in MR13) Omissions in list of names for remap.h
   2352 
   2353       When a user selects the -gp option antlr creates a list
   2354       of macros in remap.h to rename some of the standard
   2355       antlr routines from zzXXX to userprefixXXX.
   2356 
   2357       There were a number of omissions from the remap.h name
   2358       list related to the new trace facility.  This was reported,
   2359       along with a fix, by Bernie Solomon (bernard ug.eds.com).
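
      A minimal C sketch of the renaming mechanism described above.  The
      prefix "my" and the routine name zzdemoTrace are hypothetical and
      are not entries from an actual remap.h.

```c
#include <assert.h>

/* Hypothetical remap.h-style entry for a user prefix "my": every later
   use of the name zzdemoTrace is rewritten by the preprocessor. */
#define zzdemoTrace mydemoTrace

/* Written with the standard name, compiled under the remapped name. */
int zzdemoTrace(int depth)
{
    assert(depth >= 0);
    return depth + 1;
}
```

      A routine missing from the list has no such macro, so it keeps
      its zz name in the generated parser.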
   2360 
   2361 #159. (Changed in MR13) Violations of classic C rules
   2362 
   2363       There were a number of violations of classic C style in
   2364       the distribution kit.  This was reported, along with fixes,
   2365       by Bernie Solomon (bernard ug.eds.com).
   2366 
   2367 #158. (Changed in MR13) #header causes problem for pre-processors
   2368 
   2369       A user who runs the C pre-processor on antlr source suggested
   2370       that another syntax be allowed.  With MR13, directives
   2371       such as #header, #pragma, etc. may be written as "\#header",
   2372       "\#pragma", etc.  For escaping pre-processor directives inside
   2373       a #header use something like the following:
   2374 
   2375             \#header
   2376             <<
   2377                 \#include <stdio.h>
   2378             >>
   2379 
   2380 #157. (Fixed in MR13) empty error sets for rules with infinite recursion
   2381 
   2382       When the first set for a rule cannot be computed due to infinite
   2383       left recursion and it is the only alternative for a block then
   2384       the error set for the block would be empty.  This would result
   2385       in a fatal error.
   2386 
   2387       Reported by Darin Creason (creason genedax.com)
   2388 
   2389 #156. (Changed in MR13) DLGLexerBase::getToken() now public
   2390 
   2391 #155. (Changed in MR13) Context behind predicates can suppress
   2392 
   2393       With -mrhoist enabled the context behind a guarded predicate can
   2394       be used to suppress other predicates.  Consider the following grammar:
   2395 
   2396         r0 : (r1)+;
   2397 
   2398         r1  : rp
   2399             | rq
   2400             ;
   2401         rp  : <<p LATEXT(1)>>? B ;
   2402         rq : (A)? => <<q LATEXT(1)>>? (A|B);
   2403 
   2404       In earlier versions both predicates "p" and "q" would be hoisted into
   2405       rule r0. With MR12c predicate p is suppressed because the context which
   2406       follows predicate q includes "B" which can "cover" predicate "p".  In
   2407       other words, in trying to decide in r0 whether to call r1, it doesn't
   2408       really matter whether p is false or true because, either way, there is
   2409       a valid choice within r1.
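
      The "cover" idea can be sketched in C for the depth-1 case.  This
      is a simplification under stated assumptions, not antlr's
      algorithm: contexts are modelled as bit masks, and the token
      numbers and function name are hypothetical.

```c
#include <assert.h>

enum { TOK_A = 0, TOK_B = 1 };   /* hypothetical token numbers */
#define BIT(tok) (1UL << (tok))

/* Depth-1 simplification (an assumption of this sketch): a predicate
   may be suppressed when every token in its context can also be
   matched by some alternative without consulting a predicate. */
int predicate_is_covered(unsigned long pred_context,
                         unsigned long unpredicated_context)
{
    /* a predicate with an empty context is not modelled here */
    assert(pred_context != 0UL);
    return (pred_context & ~unpredicated_context) == 0UL;
}
```

      In the grammar above, the "B" in rule rq lies outside the guard
      (A), so it is unpredicated context.  Predicate p has context "B"
      and is covered; predicate q has context "A" and is not.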
   2410 
   2411 #154. (Changed in MR13) Making hoist suppression explicit using <<nohoist>>
   2412 
   2413       A common error, even among experienced pccts users, is to code
   2414       an init-action to inhibit hoisting rather than a leading action.
   2415       An init-action does not inhibit hoisting.
   2416 
   2417       This was coded:
   2418 
   2419         rule1 : <<;>> rule2
   2420 
   2421       This is what was meant:
   2422 
   2423         rule1 : <<;>> <<;>> rule2
   2424 
   2425       With MR13, the user can code:
   2426 
   2427         rule1 : <<;>> <<nohoist>> rule2
   2428 
   2429       The following will give an error message:
   2430 
   2431         rule1 : <<nohoist>> rule2
   2432 
   2433       If the <<nohoist>> appears as an init-action rather than a leading
   2434       action an error message is issued.  The meaning of an init-action
   2435       containing "nohoist" is unclear: does it apply to just one
   2436       alternative or to all alternatives?
   2437 
   2438 
   2439 
   2440 
   2441 
   2442 
   2443 
   2444 
   2445         -------------------------------------------------------
   2446         Note:  Items #153 to #1 are now in a separate file named
   2447                 CHANGES_FROM_133_BEFORE_MR13.txt
   2448         -------------------------------------------------------
   2449