Lines Matching full:token
44 written with the preprocessing token as the fundamental unit; the
60 * Token Spacing:: Spacing and paste avoidance issues.
107 interface for obtaining the next token, `cpp_get_token', takes care of
110 clients of the library can easily spell a given token, such as
114 Lexing a token
117 Lexing of an individual token is handled by `_cpp_lex_direct' and its
125 The job of `_cpp_lex_direct' is simply to lex a token. It is not
132 The lexer places the token it lexes into storage pointed to by the
136 `line' and `col' values of the token just before the location that
139 The lexer does not consider whitespace to be a token in its own
140 right. If whitespace (other than a new line) precedes a token, it sets
141 the `PREV_WHITE' bit in the token's flags. Each token has its `line'
143 of the token. This line number is the line number in the translation
147 The first token on a logical, i.e. unescaped, line has the flag
152 Clients cannot reliably determine this for themselves: the first token
155 token on the line certainly won't have the `BOL' flag set.
161 `in_directive' is set, the lexer returns a `CPP_EOF' token, which is
163 In a directive a `CPP_EOF' token never means end-of-file.
176 `PREV_WHITE' flag of a token if it meets a new line when `parsing_args'
189 This is a good example of the subtlety of getting token spacing
219 anywhere--between the `+' and `=' of the `+=' token, within the
235 `+=' token; it needs to be prepared for an escaped newline of some
254 diagnostic is appropriate. Since we change state on a per-token basis,
258 lexing header names. Normally, a `<' would be lexed as a single token.
260 token as far as the nearest `>' character. Note that we don't allow
266 force. For example, `::' is a single token in C++, but in C it is two
271 Once a token has been lexed, it leads an independent existence. The
273 storage from the original input buffer, so a token remains valid and
288 token stream. For example, after the name of a function-like macro, it
289 wants to check the next token to see if it is an opening parenthesis.
293 pragmas to access the full `#pragma' token stream. The stand-alone
294 preprocessor wants to be able to test the current token with the
297 be sure the pointer to the previous token is still valid. The
299 parsing arbitrarily far ahead in the token stream, and then to be able
303 preprocessor lex all tokens on a line consecutively into a token buffer,
304 which I call a "token run", and when meeting an unescaped new line
320 the token run?
325 expanded by chaining a new token run on to the end of the existing one.
334 stringification and token pasting. I handled this by allocating space
335 for these tokens from the lexer's token run chain. This means they
339 Lexing into a line of tokens solves some of the token memory
343 current line. So cpplib only moves back to the start of the token run
350 share full control over the lifetime of token pointers too.
352 The routine `_cpp_lex_token' handles moving to new token runs,
354 previously-lexed tokens if we stepped back in the token stream. It also
355 checks each token for the `BOL' flag, which might indicate a directive
359 multiple-include optimization if a token was successfully lexed outside
407 token matters they are spelt differently. This spelling
414 token, after lexing, contains a pointer to its hash node, this is used
425 File: cppinternals.info, Node: Macro Expansion, Next: Token Spacing, Prev: Hash Nodes, Up: Top
437 of how things like nested macro expansion, stringification and token
446 contiguously in memory, so a pointer to the first one and a token count
453 special token of type `CPP_MACRO_ARG'. Each such token holds the index
474 contiguous list of tokens delimited by a starting and ending token.
475 When not in base context, cpplib obtains the next token from the list
493 Although these macros expand to a single token which cannot contain any
494 further macros, for reasons of token spacing (*note Token Spacing::)
496 by pushing a context containing just that one token.
520 later scan. This occurs when the identifier is the last token of an
523 parameter in the macro's replacement list, the subsequent token happens
524 to be an opening parenthesis (itself possibly the first token of an
527 It is important to note that when cpplib reads the last token of a
529 looking for the _next_ token do we pop it off the stack and drop to a
530 lower context. This makes backing up by one token easy, but more
532 is still disabled when we are considering the last token of its
543 the macro invocation]. This still leaves the argument token `foo'
545 replacement, the token `foo' is rejected for expansion, and marked
561 read the next token. Unfortunately, because of spacing issues (*note
562 Token Spacing::), there can be fake padding tokens in-between, and if
563 the next real token is not a parenthesis cpplib needs to be able to
564 back up that one token as well as retain the information in any
567 Backing up more than one token when macros are involved is not
572 as it reads tokens. If the next real token is not an opening
573 parenthesis, it backs up that one token, and then pushes an extra
581 have been lexed, it instead makes a copy of the token and adds the flag
586 from the lexer's current token run (*note Lexing a line::) using the
592 list, and cpplib only wants to back up more than one lexer token in
597 File: cppinternals.info, Node: Token Spacing, Next: Line Numbering, Prev: Macro Expansion, Up: Top
599 Token Spacing
604 preprocessed output results in an identical token stream. Without
623 shown by the `EMPTY' example) from the original lexed token stream, we
624 need to check for accidental token pasting. We call this "paste
625 avoidance". Token addition and removal can only occur because of macro
628 additionally each token created by the `#' and `##' operators.
633 indicates that the token was preceded by whitespace of some form other
655 token, which I call a
656 "padding token", into the token stream to indicate that spacing of the
657 subsequent token is special. The preprocessor inserts padding tokens
659 point to a "source token" from which the subsequent real token should
665 example if a macro's first replacement token expands straight into
673 Here, two padding tokens are generated with sources the `foo' token
674 between the brackets, and the `bar' token from foo's replacement list,
675 respectively. Clearly the first padding token is the one to use, so
676 the output code should contain a rule that the first padding token in a
692 tokens, one per macro invocation, before the token `baz'. We would
694 source token `foo' with no leading space.
702 cpplib insert a padding token with a `NULL' source token when leaving
706 rule so that, if we see a padding token with a `NULL' source token,
707 _and_ that source token has no leading space, then we behave as if we
725 File: cppinternals.info, Node: Line Numbering, Next: Guard Macros, Prev: Token Spacing, Up: Top
734 the line number of a token passed to it:
748 * If the token results from a macro expansion, the line of the macro
754 the token. Consequently, but maybe unexpectedly, a token from the
755 replacement list of a macro expansion carries the location of the token
760 token. This is because they are allocated from the lexer's token
765 recently _lexed_ token, unless they are passed a specific line and
768 original location in the macro definition that the token came from.
769 Since that is exactly the information each token carries, such an
773 the correct line to output the token on: the position attached to a
774 token is fairly useless if the token came from a macro expansion. All
776 the token's reported location is also wrong if it is part of a physical
780 whenever it lexes a preprocessing token that starts a new logical line
781 other than a directive. It passes this token (which may be a `CPP_EOF'
782 token indicating the end of the translation unit) to the callback
783 routine, which can then use the line and column of this token to
789 As mentioned above, cpplib stores with each token the line number that
873 When about to return a token that is not part of a directive,
901 at `EOF' without returning a token, if the `#endif' directive was not
1014 * paste avoidance: Token Spacing. (line 6)
1015 * spacing: Token Spacing. (line 6)
1016 * token run: Lexer. (line 192)
1017 * token spacing: Token Spacing. (line 6)
1029 Node: Token Spacing30070