Cross Reference: /external/bison/doc/bison.texinfo

Lines Matching full:token
122 * Semantic Values::   Each token or syntactic grouping can have
217 * Token Decl::        Declaring terminal symbols.
241 * Token Values::      How @code{yylex} must return the semantic value
242                         of the token it has read.
243 * Token Locations::   How @code{yylex} must return the text location
244                         (line number, etc.) of the token, if the
251 * Look-Ahead::        Parser looks one token ahead when deciding what to do.
270 * Semantic Tokens::   Token parsing can depend on the semantic context.
271 * Lexical Tie-ins::   Token parsing can depend on the syntactic context.
406 * Semantic Values::   Each token or syntactic grouping can have
448 token of look-ahead.  Strictly speaking, that is a description of an
476 @cindex token
483 @dfn{terminal symbols} or @dfn{token types}.  We call a piece of input
484 corresponding to a single terminal symbol a @dfn{token}, and a piece
557 that the entire token sequence reduces to a single grouping whose symbol is
576 The Bison representation for a terminal symbol is also called a @dfn{token
577 type}.  Token types as well can be represented as C-like identifiers.  By
586 a C character constant.  You should do this whenever a token is just a
588 a literal as the terminal symbol for that token.
595 quotes is a literal character token, representing part of the C syntax for
621 3989 as constants in the program!  Therefore, each token in a Bison grammar
622 has both a token type and a @dfn{semantic value}.  @xref{Semantics,
626 The token type is a terminal symbol defined in the grammar, such as
628 you need to know to decide where the token may validly appear and how to
633 meaning of the token, such as the value of an integer, or the name of an
634 identifier.  (A token such as @code{','} which is just punctuation doesn't
637 For example, an input token might be classified as token type
638 @code{INTEGER} and have the semantic value 4.  Another input token might
639 have the same token type @code{INTEGER} but value 3989.  When a grammar
642 token, it keeps track of the token's semantic value.
788 These two declarations look identical until the @samp{..} token.
789 With normal @acronym{LALR}(1) one-token look-ahead it is not
816 error.  If there is a @samp{..} token before the next
819 fails since it requires a @samp{..} token.  So one of the branches
846 %token TYPE DOTDOT ID
950 %token TYPENAME ID
1109 the look-ahead token present at the time of the associated reduction.
1112 look-ahead token's semantic value and location, if any.
1182 Each token has a semantic value.  In a similar fashion, each token has an
1221 parser calls the lexical analyzer each time it wants a new token.  It
1236 Aside from the token type names and the symbols in the actions you
1414 %token NUM
1427 @code{double}, each token and each expression has an associated value,
1440 about the token types (@pxref{Bison Declarations, ,The Bison
1445 only terminal symbol that needs to be declared is @code{NUM}, the token
1544 The first alternative is a token which is a newline character; this means
1632 that isn't part of a number is a separate token.  Note that the token-code
1633 for such a single-character token is the character itself.
1636 represents a token type.  The same text used in Bison rules to stand for
1637 this token type is also a C expression for the numeric code for the type.
1638 This works in two ways.  If the token
1641 token type is an identifier, that identifier is defined by Bison as a C
1645 The semantic value of the token (if it has one) is stored into the
1651 A token type code of zero is returned if the end-of-input is encountered.
1659    number on the stack and the token NUM, or the numeric code
1841 %token NUM
1874 In the second section (Bison declarations), @code{%left} declares token
1877 @code{%token} which is used to declare a token type name without
1985 %token NUM
2071 able to feed the parser with the token locations, as it already does for
2132 @code{YYLTYPE}) containing the token's location.
2134 Now, each time this function returns a token, the parser has its number
2227 %token <val>  NUM        /* Simple double precision number.  */
2228 %token <tptr> VAR FNCT   /* Variable and Function.  */
2257 symbols, just as @code{%token} is used for declaring token types.  We
2543   /* Any other character is a token by itself.        */
2717 @cindex token type
2723 A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
2725 rules to mean that a token in that class is allowed.  The symbol is
2727 function returns a token type code to indicate what kind of token has
2742 A @dfn{named token type} is written with an identifier, like an
2745 @code{%token}.  @xref{Token Decl, ,Token Type Names}.
2748 @cindex character token
2749 @cindex literal token
2751 A @dfn{character token type} (or @dfn{literal character token}) is
2753 constants; for example, @code{'+'} is a character token type.  A
2754 character token type doesn't need to be declared unless you need to
2759 By convention, a character token type is used only to represent a
2760 token that consists of that particular character.  Thus, the token
2762 token.  Nothing enforces this convention, but if you depart from it,
2774 @cindex string token
2775 @cindex literal string token
2777 A @dfn{literal string token} is written like a C string constant; for
2778 example, @code{"<="} is a literal string token.  A literal string token
2783 You can associate the literal string token with a symbolic name as an
2784 alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
2786 retrieve the token number for the literal string token from the
2791 By convention, a literal string token is used only to represent a token
2792 that consists of that particular string.  Thus, you should use the token
2793 type @code{"<="} to represent the string @samp{<=} as a token.  Bison
2801 literal string token must contain two or more characters; for a token
2802 containing just one character, use a character token (see above).
2811 Whichever way you write the token type in the grammar rules, you write
2813 for a character token type is simply the positive numeric code of the
2817 Each named token type becomes a C macro in
2823 token-type macro definitions to be available there.  Use the @samp{-d}
2853 value of the error token is 256, unless you explicitly assigned 256 to
2854 one of your tokens with a @code{%token} declaration.
2886 says that two groupings of type @code{exp}, with a @samp{+} token in between,
3096 @code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names})
3139 connected by a plus-sign token.  In the action, @code{$1} and @code{$3}
3144 useful semantic value associated with the @samp{+} token, it could be
3192 It is also possible to access the semantic value of the look-ahead token, if
3350 token and look at what follows before deciding whether there is a
3379 information to do it correctly.  (The open-brace token is what is called
3380 the @dfn{look-ahead} token at this time, since the parser is still
3401 statement by the first token (which is true in C), then one solution which
3414 Now the first token of the following declaration or statement,
3554 It is also possible to access the location of the look-ahead token, if any,
3645 All token type names (but not single-character literal tokens such as
3657 * Token Decl::        Declaring terminal symbols.
3683 @node Token Decl
3684 @subsection Token Type Names
3685 @cindex declaring token type names
3686 @cindex token type names, declaring
3688 @findex %token
3690 The basic way to declare a token type name (terminal symbol) is as follows:
3693 %token @var{name}
3698 can use the name @var{name} to stand for this token type's code.
3701 @code{%nonassoc} instead of @code{%token}, if you wish to specify
3705 You can explicitly specify the numeric code for a token type by appending
3707 following the token name:
3710 %token NUM 300
3711 %token XNUM 0x12d // a GNU extension
3716 all token types.  Bison will automatically select codes that don't conflict
3720 @code{%token} or other token declaration to include the data type
3732 %token <val> NUM      /* define token NUM and its type */
3736 You can associate a literal string token with a token type name by
3737 writing the literal string at the end of a @code{%token}
3741 %token arrow "=>"
3749 %token  <operator>  OR      "||"
3750 %token  <operator>  LE 134  "<="
3755 Once you equate the literal string and the token name, you can use them
3757 @code{yylex} function can use the token name or the literal string to
3758 obtain the token type code number (@pxref{Calling Convention}).
3767 declare a token and specify its precedence and associativity, all at
3773 @code{%token}: either
3786 And indeed any of these declarations serves the purposes of @code{%token}.
3835 in the @code{%token} and @code{%type} declarations to pick one of the types
3887 terminal symbol.  All kinds of token declarations allow
3954 %token <string> STRING
4118 @deffn {Directive} %token
4119 Declare a terminal symbol (token type name) with no precedence
4120 or associativity specified (@pxref{Token Decl, ,Token Type Names}).
4124 Declare a terminal symbol (token type name) that is right-associative
4129 Declare a terminal symbol (token
4134 Declare a terminal symbol (token type name) that is nonassociative
4174 Write a header file containing macro definitions for the token type
4200 and to the token type codes.  @xref{Token Values, ,Semantic Values of
4276 @deffn {Directive} %token-table
4277 Generate an array of token names in the parser file.  The name of the
4279 token whose internal Bison token code number is @var{i}.  The first
4286 the token in Bison.  For single-character literals and literal
4294 When you specify @code{%token-table}, Bison also generates macro
4300 The highest token number, plus one.
4312 parser states and what is done for each type of look-ahead token in
4464 need to arrange for the token-type macro definitions to be available there.
4472 * Token Values::      How @code{yylex} must return the semantic value
4473                         of the token it has read.
4474 * Token Locations::   How @code{yylex} must return the text location
4475                         (line number, etc.) of the token, if the
4485 for the type of token it has just found; a zero or negative value
4488 When a token is referred to in the grammar rules by a name, that name
4490 numeric code for that token type.  So @code{yylex} can use the name
4493 When a token is referred to in the grammar rules by a character literal,
4494 the numeric code for that character is also the code for the token type.
4511     return c;      /* Assume token type for `+' is '+'.  */
4513   return INT;      /* Return the type of the token.  */
4523 @code{yylex} can determine the token type codes for them:
4527 If the grammar defines symbolic token names as aliases for the
4533 @code{yylex} can find the multicharacter token in the @code{yytname}
4534 table.  The index of the token in the table is the token type's code.
4535 The name of a multicharacter token is recorded in @code{yytname} with a
4536 double-quote, the token's characters, and another double-quote.  The
4537 token's characters are escaped as necessary to be suitable as input
4540 Here's code for looking up a multicharacter token in @code{yytname},
4541 assuming that the characters of the token are stored in
4542 @code{token_buffer}, and assuming that the token does not contain any
4559 @code{%token-table} declaration.  @xref{Decl Summary}.
4562 @node Token Values
4566 In an ordinary (nonreentrant) parser, the semantic value of the token must
4576   return INT;      /* Return the type of the token.  */
4583 Collection of Value Types}).  So when you store a token's value, you
4604   return INT;            /* Return the type of the token.  */
4609 @node Token Locations
4617 location of a token just parsed in the global variable @code{yylloc}.
4646   return INT;      /* Return the type of the token.  */
4706 whenever it reads a token which cannot satisfy any syntax rule.  An
4857 @deffn {Macro} YYBACKUP (@var{token}, @var{value});
4859 Unshift a token.  This macro is allowed only for rules that reduce
4860 a single value, and only when there is no look-ahead token.
4862 It installs a look-ahead token with token type @var{token} and
4867 a look-ahead token already, then it reports a syntax error with
4876 Value stored in @code{yychar} when there is no look-ahead token.
4902 Variable containing either the look-ahead token, or @code{YYEOF} when the
4904 has been performed so the next token is not yet known.
4911 Discard the current look-ahead token.  This is useful primarily in
4925 Variable containing the look-ahead token location when @code{yychar} is not set
4933 Variable containing the look-ahead token semantic value when @code{yychar} is
4959 @c you must make @code{yylex} supply this information about each token.
5068 token is traditionally called @dfn{shifting}.
5071 @samp{3} to come.  The stack will have four elements, one for each token
5074 But the stack does not always have an element for each token read.  When
5089 and the next input token is a newline character, then the last three
5105 16.  Then the newline token can be shifted.
5114 * Look-Ahead::        Parser looks one token ahead when deciding what to do.
5127 @cindex look-ahead token
5133 token in order to decide what to do.
5135 When a token is read, it is not immediately shifted; first it becomes the
5136 @dfn{look-ahead token}, which is not on the stack.  Now the parser can
5138 the look-ahead token remains off to the side.  When no more reductions
5139 should take place, the look-ahead token is shifted onto the stack.  This
5141 token type of the look-ahead token, some rules may choose to delay their
5164 should be done?  If the following token is @samp{)}, then the first three
5169 If the following token is @samp{!}, then it must be shifted immediately so
5179 The look-ahead token is stored in the variable @code{yychar}.
5207 When the @code{ELSE} token is read and becomes the look-ahead token, the
5260 %token IF THEN ELSE variable
5317 depends on the next token.  Of course, if the next token is @samp{)}, we
5319 token sequence @w{@samp{- 2 )}} or anything starting with that.  But if
5320 the next token is @samp{*} or @samp{<}, we have a choice: either
5325 the next operator token @var{op} is shifted, then it must be reduced
5341 contains @w{@samp{1 - 2}} and the look-ahead token is @samp{-}: shifting
5402 of the rule being considered with that of the look-ahead token.  If the
5403 token's precedence is higher, the choice is to shift.  If the rule's
5411 the look-ahead token has no precedence, then the default is to shift.
5427 @code{%nonassoc}, can only be used once for a given token; so a token has
5500 The values pushed on the parser stack are not simply token type codes; they
5505 Each time a look-ahead token is read, the current parser state together
5506 with the type of look-ahead token are looked up in a table.  This table
5507 entry can say, ``Shift the look-ahead token.''  In this case, it also
5515 There is one other alternative: the table can say that the look-ahead token
5642 %token ID
5672 It would seem that this grammar can be parsed with only a single token
5706 %token BOGUS
5723 As long as the token @code{BOGUS} is never generated by @code{yylex},
5760 based on a summary of the preceding input and on one extra token of look-ahead.
5797 grammar symbol that produces the same segment of the input token
5911 recognize the special token @code{error}.  This is a terminal symbol that
5913 handling.  The Bison parser generates an @code{error} token whenever a
5914 syntax error happens; if you have provided a rule to recognize this token
5940 @code{error} token is acceptable.  (This means that the subexpressions
5942 At this point the @code{error} token can be shifted.  Then, if the old
5943 look-ahead token is not acceptable to be shifted next, the parser reads
5944 tokens and discards them until it finds a token which is acceptable.  In
5984 Note that rules which accept the @code{error} token may have actions, just
5994 The previous look-ahead token is reanalyzed immediately after an error.  If
5996 this token.  Write the statement @samp{yyclearin;} in the error rule's
6003 probably correct.  The previous look-ahead token ought to be discarded
6016 syntactic units.  In many languages, the meaning of a token is affected by
6022 * Semantic Tokens::   Token parsing can depend on the semantic context.
6023 * Lexical Tie-ins::   Token parsing can depend on the syntactic context.
6032 @section Semantic Info in Token Types
6045 The method used in @acronym{GNU} C is to have two different token types,
6048 to decide which token type to return: @code{TYPENAME} if the identifier is
6052 token type to recognize.  @code{IDENTIFIER} is accepted as an expression,
6057 accepted---there is one rule for each of the two token types.
6121 particular, the token @samp{a1b} must be treated as an integer rather than
6258 %token NUM STR
6290 Conflict in state 8 between rule 2 and token '+' resolved as reduce.
6291 Conflict in state 8 between rule 2 and token '-' resolved as reduce.
6292 Conflict in state 8 between rule 2 and token '*' resolved as shift.
6307 @cindex token, useless
6308 @cindex useless token
6391 symbol, and the look-ahead is a @code{NUM}, then this token is shifted on
6400 report lists @code{NUM} as a look-ahead token because @code{NUM} can be
6434 the rule 5, @samp{exp: NUM;}, is completed.  Whatever the look-ahead token
6460 '+' . exp}.  Since there is no default action, any other token than
6549 shifting the next token and going to the corresponding state, or
6556 look-ahead token is @samp{*}, since we specified that @samp{*} has higher
6689 Each time the parser calls @code{yylex}, what kind of token was read.
6692 Each time a token is shifted, the depth and complete contents of the
6704 possible input token.  As you read the successive trace messages, you
6717 The debugging information normally gives the token type of each token
6721 standard I/O stream, the numeric code for the token type, and the token
6885 @itemx --token-table
6886 Pretend that @code{%token-table} was specified.  @xref{Decl Summary}.
6896 file containing macro definitions for the token type names defined in
6970 @item @option{--token-table}                @tab @option{-k}
7220 Return the next token.  Its type is the return value, its semantic
7509 The token numbered as 0 corresponds to end of file; the following line
7517 %token        END      0 "end of file"
7518 %token        ASSIGN     ":="
7519 %token <sval> IDENTIFIER "identifier"
7520 %token <ival> NUMBER     "number"
7602 #define yyterminate() return token::END
7652 @code{yy::calcxx_parser::token::identifier} into
7653 @code{token::identifier} for instance.
7658   typedef yy::calcxx_parser::token token;
7662 ":="       return token::ASSIGN;
7669   return token::NUMBER;
7671 @{id@}       yylval->sval = new std::string (yytext); return token::IDENTIFIER;
7801   /* One token only.  */
7956 %token START_FOO START_BAR;
8201 The predefined token marking the end of the token stream.  It cannot be
8206 A token name reserved for error recovery.  This token may be used in
8210 token @code{error} becomes the current look-ahead token.  Actions
8212 token is reset to the token that originally caused the violation.
8236 Bison declaration to assign left associativity to token(s).
8271 Bison declaration to assign nonassociativity to token(s).
8302 Bison declaration to assign right associativity to token(s).
8311 @deffn {Directive} %token
8312 Bison declaration to declare token(s) without specifying precedence.
8313 @xref{Token Decl, ,Token Type Names}.
8316 @deffn {Directive} %token-table
8317 Bison declaration to include a token name table in the parser file.
8327 The predefined token onto which all undefined values returned by
8352 token.  @xref{Action Features, ,Special Features for Use in Actions}.
8357 look-ahead token.  (In a pure parser, it is a local variable within
8364 look-ahead token.  @xref{Error Recovery}.
8411 the next token.  @xref{Lexical, ,The Lexical Analyzer Function
8424 numbers associated with a token.  (In a pure parser, it is a local
8429 @xref{Token Locations, ,Textual Locations of Tokens}.
8430 In semantic actions, it stores the location of the look-ahead token.
8441 value associated with a token.  (In a pure parser, it is a local
8444 @xref{Token Values, ,Semantic Values of Tokens}.
8445 In semantic actions, it stores the semantic value of the look-ahead token.
8571 Parsing a sentence of a language by analyzing it token by token from
8582 @item Literal string token
8583 A token which consists of two or more fixed characters.  @xref{Symbols}.
8585 @item Look-ahead token
8586 A token already read but not yet shifted.  @xref{Look-Ahead, ,Look-Ahead
8595 The class of context-free grammars in which at most one token of
8601 words, a construct that is not a token.  @xref{Symbols}.
8659 @item Token
8661 that describes a token in the grammar is a terminal symbol.
8667 grammatically indivisible.  The piece of text it represents is a token.