ANTLR(1)               PCCTS Manual Pages                ANTLR(1)

NAME
     antlr - ANother Tool for Language Recognition

SYNTAX
     antlr [options] grammar_files

DESCRIPTION
     Antlr converts an extended form of context-free grammar into
     a set of C functions that directly implement an efficient
     form of deterministic recursive-descent LL(k) parser.
     Context-free grammars may be augmented with predicates to
     allow semantics to influence parsing; this allows a form of
     context-sensitive parsing.  Selective backtracking is also
     available to handle non-LL(k) and even non-LALR(k)
     constructs.  Antlr also produces a definition of a lexer that
     can be automatically converted into C code for a DFA-based
     lexer by dlg.  Hence, antlr serves a function much like that
     of yacc; however, it is notably more flexible and is more
     integrated with a lexer generator (antlr directly generates
     dlg code, whereas yacc and lex are given independent
     descriptions).  Unlike yacc, which accepts LALR(1) grammars,
     antlr accepts LL(k) grammars in an extended BNF notation,
     which eliminates the need for precedence rules.

     Like yacc grammars, antlr grammars can use automatically
     maintained symbol attribute values referenced as dollar
     variables.  Further, because antlr generates top-down
     parsers, arbitrary values may be inherited from parent rules
     (passed like function parameters).  Antlr also has a
     mechanism for creating and manipulating abstract syntax
     trees.
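
     As a rough illustration of the two paragraphs above, the
     hand-written C sketch below shows the general shape of a
     recursive-descent parser in which each rule is a C function
     and a value is inherited from the parent rule as a function
     parameter.  It is not antlr output; the grammar, the
     function names (parse_list, parse_item) and the
     string-based input are all assumptions made for the example.

```c
#include <ctype.h>

/* Hypothetical grammar, for illustration only:
 *     list[scale] : item[scale] ( ',' item[scale] )* ;
 *     item[scale] : DIGIT ;
 * Each rule becomes one C function.  `scale` is inherited from the
 * parent rule as an ordinary parameter; the rule's result is returned
 * through `*value`.  Functions return 1 on success, 0 on syntax error.
 */

static const char *cursor;          /* next unread input character */

static int parse_item(int scale, int *value)
{
    if (!isdigit((unsigned char)*cursor))
        return 0;                   /* one symbol of lookahead failed */
    *value = (*cursor - '0') * scale;
    cursor++;                       /* consume the DIGIT */
    return 1;
}

int parse_list(const char *input, int scale, int *value)
{
    int item_value;

    cursor = input;
    if (!parse_item(scale, &item_value))
        return 0;
    *value = item_value;
    while (*cursor == ',') {        /* LL(1) decision: another item? */
        cursor++;
        if (!parse_item(scale, &item_value))
            return 0;
        *value += item_value;
    }
    return *cursor == '\0';         /* all input must be consumed */
}
```

     With these definitions, parse_list("1,2,3", 10, &v) returns
     1 and leaves 60 in v, while parse_list("1,,2", 1, &v)
     returns 0.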

     There are various other niceties in antlr, including the
     ability to spread one grammar over multiple files or even
     multiple grammars in a single file, the ability to generate
     a version of the grammar with actions stripped out (for
     documentation purposes), and lots more.

OPTIONS
     -ck n
          Use up to n symbols of lookahead when using compressed
          (linear approximation) lookahead.  This type of
          lookahead is very cheap to compute and is attempted
          before full LL(k) lookahead, which is of exponential
          complexity in the worst case.  In general, the
          compressed lookahead can be much deeper (e.g., -ck 10)
          than the full lookahead (which usually must be less
          than 4).
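
          The difference between full and compressed lookahead
          can be sketched in C as follows.  Assume, purely for
          illustration, an alternative whose full LL(2) lookahead
          consists of the token pairs (A,B) and (C,D).  The full
          test stores the pairs themselves, while the linear
          approximation stores one token set per lookahead depth,
          so it also accepts (A,D) and (C,B).  The token codes
          and function names are hypothetical and are not part of
          antlr.

```c
enum { TOK_A = 1, TOK_B, TOK_C, TOK_D };   /* made-up token codes */

/* Full LL(2) lookahead: the exact token sequences that predict the
 * alternative are stored as pairs. */
static const int full_pairs[][2] = { { TOK_A, TOK_B }, { TOK_C, TOK_D } };

int full_match(int la1, int la2)
{
    unsigned i;

    for (i = 0; i < sizeof full_pairs / sizeof full_pairs[0]; i++)
        if (full_pairs[i][0] == la1 && full_pairs[i][1] == la2)
            return 1;
    return 0;
}

/* Compressed (linear approximation) lookahead: one set per depth,
 * stored here as bit masks (token codes must be below 32).  Storage
 * grows linearly with the depth, at the price of accepting
 * combinations that the full test would reject. */
#define BIT(t) (1u << (t))

static const unsigned depth_set[2] = {
    BIT(TOK_A) | BIT(TOK_C),        /* tokens possible at depth 1 */
    BIT(TOK_B) | BIT(TOK_D)         /* tokens possible at depth 2 */
};

int compressed_match(int la1, int la2)
{
    return (depth_set[0] & BIT(la1)) != 0 &&
           (depth_set[1] & BIT(la2)) != 0;
}
```

          Here full_match(TOK_A, TOK_D) is 0 but
          compressed_match(TOK_A, TOK_D) is 1, which is the
          imprecision the approximation trades for cheapness.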

     -CC  Generate C++ output from both ANTLR and DLG.

     -cr  Generate a cross-reference for all rules.  For each
          rule, print a list of all other rules that reference
          it.

     -e1  Ambiguities/errors shown in low detail (default).

     -e2  Ambiguities/errors shown in more detail.

     -e3  Ambiguities/errors shown in excruciating detail.

     -fe file
          Rename err.c to file.

     -fh file
          Rename stdpccts.h header (turns on -gh) to file.

     -fl file
          Rename lexical output, parser.dlg, to file.

     -fm file
          Rename file with lexical mode definitions, mode.h, to
          file.

     -fr file
          Rename file which remaps globally visible symbols,
          remap.h, to file.

     -ft file
          Rename tokens.h to file.

     -ga  Generate ANSI-compatible code (default case).  This has
          not been rigorously tested to be ANSI X3J11 C
          compliant, but it is close.  The normal output of antlr
          currently compiles as K&R C, ANSI C, and C++, so this
          option does nothing: antlr generates a set of #ifdef's
          to do the right thing depending on the language.

     -gc  Indicates that antlr should generate no C code, i.e.,
          only perform analysis on the grammar.

     -gd  C code is inserted in each of the antlr-generated
          parsing functions to provide for user-defined handling
          of a detailed parse trace.  The inserted code consists
          of calls to the user-supplied macros or functions
          called zzTRACEIN and zzTRACEOUT.  The only argument is
          a char * pointing to a C-style string which is the
          grammar rule recognized by the current parsing
          function.  If no definition is given for the trace
          functions, a message will be printed upon rule entry
          and exit indicating that a particular rule has been
          entered or exited.
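
          A minimal sketch of user-supplied trace macros follows.
          The macro names and the single char * argument come
          from the description above; everything else (the
          trace_depth counter, the indentation scheme and the
          stand-in rule function parse_expr) is hypothetical.
          How such definitions are made visible to the generated
          parser is not covered here.

```c
#include <stdio.h>

static int trace_depth = 0;         /* hypothetical nesting counter */

/* User-supplied trace hooks; the argument is the rule name.  Each
 * message is indented by two spaces per level of rule nesting. */
#define zzTRACEIN(rule) \
    fprintf(stderr, "%*senter %s\n", 2 * trace_depth++, "", (rule))
#define zzTRACEOUT(rule) \
    fprintf(stderr, "%*sexit  %s\n", 2 * --trace_depth, "", (rule))

/* Stand-in for a generated parsing function, written by hand to show
 * where the calls would sit; real antlr output looks different. */
int parse_expr(void)
{
    zzTRACEIN("expr");
    /* ... the body of the rule would go here ... */
    zzTRACEOUT("expr");
    return trace_depth;             /* 0 when entries and exits balance */
}
```

          Calling parse_expr() writes one "enter expr" line and
          one "exit  expr" line to stderr.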

     -ge  Generate an error class for each non-terminal.

     -gh  Generate stdpccts.h for non-ANTLR-generated files to
          include.  This file contains all defines needed to
          describe the type of parser generated by antlr (e.g.,
          how much lookahead is used and whether or not trees are
          constructed) and contains the header action specified
          by the user.

     -gk  Generate parsers that delay lookahead fetches until
          needed.  Without this option, antlr generates parsers
          which always have k tokens of lookahead available.
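
          The distinction can be pictured with the small C sketch
          below, which counts how many tokens are pulled from the
          input under each policy.  It is only a model of the
          idea: the buffer layout and all names (eager_start,
          lazy_start, lazy_LA, tokens_fetched) are assumptions
          and do not reflect the antlr runtime.

```c
#define K 2                         /* lookahead depth, as set by -k */

static const int *source;           /* hypothetical token stream */
static int fetched;                 /* tokens pulled from the stream */
static int buffer[K];               /* lookahead buffer */
static int filled;                  /* valid entries in buffer */

static int next_token(void)
{
    fetched++;
    return *source++;
}

/* Eager policy: all K lookahead tokens are fetched up front, so the
 * stream must hold at least K tokens. */
void eager_start(const int *tokens)
{
    source = tokens;
    fetched = 0;
    for (filled = 0; filled < K; filled++)
        buffer[filled] = next_token();
}

/* Delayed policy: start with an empty buffer ... */
void lazy_start(const int *tokens)
{
    source = tokens;
    fetched = 0;
    filled = 0;
}

/* ... and fetch only when the i-th lookahead token (1 <= i <= K) is
 * actually inspected. */
int lazy_LA(int i)
{
    while (filled < i)
        buffer[filled++] = next_token();
    return buffer[i - 1];
}

int tokens_fetched(void)
{
    return fetched;
}
```

          After eager_start() two tokens have already been read,
          whereas after lazy_start() none are read until
          lazy_LA() asks for them.  A delayed policy of this kind
          matters most when tokens arrive slowly, as in
          interactive input.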

     -gl  Generate line information about grammar actions in the
          C parser, in the form # line "file".  Error messages
          from the C/C++ compiler then make more sense, because
          they point into the grammar file rather than the
          resulting C file.  Debugging is easier as well, because
          you step through the grammar rather than the C file.
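
          The effect of such a directive can be seen in plain C,
          independent of antlr.  In the sketch below the file
          name "expr.g" and the line number 42 are made up; the
          standard #line directive makes the compiler treat the
          following line as line 42 of that file.

```c
/* Pretend the code below was copied from line 42 of a grammar file
 * named "expr.g" (both values are invented for this example). */
#line 42 "expr.g"
int action_line(void) { return __LINE__; }          /* reported as line 42 */
const char *action_file(void) { return __FILE__; }  /* reported as line 43 */
```

          Here action_line() returns 42 and action_file() returns
          "expr.g", so any diagnostic or debugger position for
          this code refers to the grammar file.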

     -gs  Do not generate sets for token expression lists;
          instead generate a ||-separated sequence of
          LA(1)==token_number.  The default is to generate sets.
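
          The two styles of test can be compared with the C
          sketch below.  The token numbers, the function names
          and the bit-set layout are assumptions chosen for the
          example; the code that antlr actually emits may be
          organized differently.

```c
/* Made-up token numbers; TOK_MAX bounds the set size. */
enum { TOK_PLUS = 3, TOK_MINUS = 5, TOK_STAR = 9, TOK_MAX = 16 };

/* Style selected by -gs: a ||-separated sequence of comparisons
 * against the first lookahead token. */
int is_operator_chain(int la1)
{
    return la1 == TOK_PLUS || la1 == TOK_MINUS || la1 == TOK_STAR;
}

/* Default style, sketched as a bit set with one bit per token
 * number, eight token numbers per byte. */
static const unsigned char operator_set[TOK_MAX / 8] = {
    (1u << TOK_PLUS) | (1u << TOK_MINUS),   /* tokens 0..7  */
    (1u << (TOK_STAR - 8))                  /* tokens 8..15 */
};

int is_operator_set(int la1)
{
    if (la1 < 0 || la1 >= TOK_MAX)
        return 0;                   /* outside the set's range */
    return (operator_set[la1 / 8] >> (la1 % 8)) & 1;
}
```

          Both functions give the same answer for every token
          number; the set form keeps the test at constant cost as
          the list of tokens grows, while the chained form is
          easier to read in the generated code.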

     -gt  Generate code for Abstract-Syntax Trees.

     -gx  Do not create the lexical analyzer files (dlg-related).
          This option should be given when the user wishes to
          provide a customized lexical analyzer.  It may also be
          used in make scripts to cause only the parser to be
          rebuilt when a change not affecting the lexical
          structure is made to the input grammars.

     -k n Set k of LL(k) to n; i.e., set the number of tokens of
          lookahead (default==1).

     -o dir
          Directory where output files should go (default=".").
          This is very nice for keeping the source directory
          clear of ANTLR and DLG spawn.

     -p   The complete grammar, collected from all input grammar
          files and stripped of all comments and embedded
          actions, is listed to stdout.  This is intended to aid
          in viewing the entire grammar as a whole and to
          eliminate the need to keep actions concisely stated so
          that the grammar is easier to read.  Hence, it is
          preferable to embed even complex actions directly in
          the grammar, rather than to call them as subroutines,
          since the subroutine call overhead will be saved.

     -pa  This option is the same as -p except that the output is
          annotated with the first sets determined from grammar
          analysis.

     -prc on
          Turn on the computation and hoisting of predicate
          context.

     -prc off
          Turn off the computation and hoisting of predicate
          context.  This option makes 1.10 behave like the 1.06
          release with option -pr on.  Context computation is off
          by default.

     -rl n
          Limit the maximum number of tree nodes used by grammar
          analysis to n.  Occasionally, antlr is unable to
          analyze a grammar submitted by the user.  This rare
          situation can only occur when the grammar is large and
          the amount of lookahead is greater than one.  A
          non-linear analysis algorithm is used by PCCTS to
          handle the general case of LL(k) parsing.  The average
          complexity of analysis, however, is near linear due to
          some fancy footwork in the implementation which reduces
          the number of calls to the full LL(k) algorithm.  If
          this limit is reached, an error message is displayed
          indicating the grammar construct being analyzed when
          antlr hit a non-linearity.  Use this option if antlr
          seems to go out to lunch and your disk starts
          thrashing; try n=10000 to start.  Once the offending
          construct has been identified, try to remove the
          ambiguity that antlr was trying to overcome with large
          lookahead analysis.  The introduction of (...)?
          backtracking blocks eliminates some of these problems:
          antlr does not analyze alternatives that begin with
          (...)? (it simply backtracks, if necessary, at run
          time).

     -w1  Set low warning level.  Do not warn if semantic
          predicates and/or (...)? blocks are assumed to cover
          ambiguous alternatives.

     -w2  Ambiguous parsing decisions yield warnings even if
          semantic predicates or (...)? blocks are used.  Warn if
          predicate context is computed and semantic predicates
          incompletely disambiguate alternative productions.

     -    Read grammar from standard input and generate stdin.c
          as the parser file.

SPECIAL CONSIDERATIONS
     Antlr works...  we think.  There is no implicit guarantee of
     anything.  We reserve no legal rights to the software known
     as the Purdue Compiler Construction Tool Set (PCCTS); PCCTS
     is in the public domain.  An individual or company may do
     whatever they wish with source code distributed with PCCTS
     or the code generated by PCCTS, including the incorporation
     of PCCTS, or its output, into commercial software.  We
     encourage users to develop software with PCCTS.  However, we
     do ask that credit be given to us for developing PCCTS.  By
     "credit", we mean that if you incorporate our source code
     into one of your programs (commercial product, research
     project, or otherwise), you acknowledge this fact somewhere
     in the documentation, research report, etc.  If you like
     PCCTS and have developed a nice tool with the output, please
     mention that you developed it using PCCTS.  As long as these
     guidelines are followed, we expect to continue enhancing
     this system and expect to make other tools available as they
     are completed.

FILES
     *.c  output C parser.

     *.cpp
          output C++ parser when C++ mode is used.

     parser.dlg
          output dlg lexical analyzer.

     err.c
          token string array, error sets and error support
          routines.  Not used in C++ mode.

     remap.h
          file that redefines all globally visible parser
          symbols.  The use of the #parser directive creates this
          file.  Not used in C++ mode.

     stdpccts.h
          list of definitions needed by C files, not generated by
          PCCTS, that reference PCCTS objects.  This is not
          generated by default.  Not used in C++ mode.

     tokens.h
          output #defines for tokens used and function prototypes
          for functions generated for rules.

SEE ALSO
     dlg(1), pccts(1)