Lines Matching full:character
16 states consume a single character, which may have various side-effects,
18 same character, or switches it to a new state (to consume the next
19 character), or repeats the same state (to consume the next character).
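The three outcomes described in lines 16 to 19 (reconsume the same character in a new state, move to a new state for the next character, or stay in the same state) can be sketched as a small state machine. This is a toy illustration only: the class name `Tokenizer`, the method names, and the `("tag-letter", ...)` token are hypothetical and cover just two simplified states, not the specification's full algorithm.

```python
EOF = None  # stand-in for the end-of-file marker (an assumption of this sketch)


class Tokenizer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.tokens = []
        self.state = self.data_state

    def consume(self):
        # Return the next input character, or EOF past the end of input.
        ch = self.text[self.pos] if self.pos < len(self.text) else EOF
        self.pos += 1
        return ch

    def reconsume(self):
        # Step back so the next state sees the same character again.
        self.pos -= 1

    def data_state(self):
        ch = self.consume()
        if ch == "<":
            # New state, which will consume the next character.
            self.state = self.tag_open_state
        elif ch is EOF:
            self.tokens.append(("eof",))
            self.state = None
        else:
            # Emit a character token and repeat the same state.
            self.tokens.append(("character", ch))

    def tag_open_state(self):
        ch = self.consume()
        if ch is not EOF and ("a" <= ch <= "z" or "A" <= ch <= "Z"):
            # Toy stand-in for starting a tag token.
            self.tokens.append(("tag-letter", ch))
            self.state = self.data_state
        else:
            # Emit "<" and reconsume the same character in the data state.
            self.tokens.append(("character", "<"))
            self.reconsume()
            self.state = self.data_state

    def run(self):
        while self.state is not None:
            self.state()
        return self.tokens
```

Running `Tokenizer("a<1").run()` exercises all three outcomes: `a` is emitted while staying in the data state, `<` switches state, and `1` is reconsumed in the data state after a `<` character token is emitted.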
32 following tokens: DOCTYPE, start tag, end tag, comment, character,
42 Comment and character tokens have data.
78 Consume the next input character:
83 character reference data state.
95 In any case, emit the input character as a character token. Stay
113 In any case, emit the input character as a character token. Stay
120 Emit the input character as a character token. Stay in the data
123 8.2.4.2 Character reference data state
128 Attempt to consume a character reference, with no additional allowed
129 character.
131 If nothing is returned, emit a U+0026 AMPERSAND character token.
133 Otherwise, emit the character token that was returned.
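The behavior in lines 128 to 133 can be sketched as follows. The parameters `consume_character_reference` and `emit` are hypothetical callables supplied by the caller; the sketch assumes the former returns `None` when nothing is returned and a character token otherwise.

```python
AMPERSAND = "\u0026"


def character_reference_data_state(consume_character_reference, emit):
    # "with no additional allowed character" is modeled by passing None.
    returned = consume_character_reference(None)
    if returned is None:
        # Nothing was returned: the ampersand is emitted as plain text.
        emit(("character", AMPERSAND))
    else:
        emit(returned)
```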
142 Consume the next input character. If it is a U+002F SOLIDUS (/)
143 character, switch to the close tag open state. Otherwise, emit a
144 U+003C LESS-THAN SIGN character token and reconsume the current
145 input character in the data state.
148 Consume the next input character:
159 lowercase version of the input character (add 0x0020 to
160 the character's code point), then switch to the tag name
166 input character, then switch to the tag name state. (Don't
171 Parse error. Emit a U+003C LESS-THAN SIGN character token
172 and a U+003E GREATER-THAN SIGN character token. Switch to
179 Parse error. Emit a U+003C LESS-THAN SIGN character token
180 and reconsume the current input character in the data
192 * U+0009 CHARACTER TABULATION
200 ...then emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
201 character token, and switch to the data state to process the next input
202 character.
206 character:
210 version of the input character (add 0x0020 to the character's
217 character, then switch to the tag name state. (Don't emit the
225 Parse error. Emit a U+003C LESS-THAN SIGN character token and a
226 U+002F SOLIDUS character token. Reconsume the EOF character in
234 Consume the next input character:
236 U+0009 CHARACTER TABULATION
249 Append the lowercase version of the current input character (add
250 0x0020 to the character's code point) to the current tag token's
255 character in the data state.
258 Append the current input character to the current tag token's
263 Consume the next input character:
265 U+0009 CHARACTER TABULATION
280 character (add 0x0020 to the character's code point), and its
290 character in the data state.
294 attribute's name to the current input character, and its value
299 Consume the next input character:
301 U+0009 CHARACTER TABULATION
317 Append the lowercase version of the current input character (add
318 0x0020 to the character's code point) to the current attribute's
327 character in the data state.
330 Append the current input character to the current attribute's
342 Consume the next input character:
344 U+0009 CHARACTER TABULATION
362 character (add 0x0020 to the character's code point), and its
371 character in the data state.
375 attribute's name to the current input character, and its value
380 Consume the next input character:
382 U+0009 CHARACTER TABULATION
393 this input character.
406 Parse error. Emit the current tag token. Reconsume the character
410 Append the current input character to the current attribute's
415 Consume the next input character:
421 Switch to the character reference in attribute value state, with
422 the additional allowed character being U+0022 QUOTATION MARK (").
426 Parse error. Emit the current tag token. Reconsume the character
430 Append the current input character to the current attribute's
435 Consume the next input character:
441 Switch to the character reference in attribute value state, with
442 the additional allowed character being U+0027 APOSTROPHE (').
445 Parse error. Emit the current tag token. Reconsume the character
449 Append the current input character to the current attribute's
454 Consume the next input character:
456 U+0009 CHARACTER TABULATION
463 Switch to the character reference in attribute value state, with
464 no additional allowed character.
475 Parse error. Emit the current tag token. Reconsume the character
479 Append the current input character to the current attribute's
482 8.2.4.13 Character reference in attribute value state
484 Attempt to consume a character reference.
486 If nothing is returned, append a U+0026 AMPERSAND character to the
489 Otherwise, append the returned character token to the current
497 Consume the next input character:
499 U+0009 CHARACTER TABULATION
513 character in the data state.
516 Parse error. Reconsume the character in the before attribute
521 Consume the next input character:
529 character in the data state.
532 Parse error. Reconsume the character in the before attribute
540 Consume every character up to and including the first U+003E
541 GREATER-THAN SIGN character (>) or the end of the file (EOF), whichever
543 all the characters starting from and including the character that
545 and including the character immediately before the last consumed
546 character (i.e. up to the character just before the U+003E or EOF
547 character). (If the comment was started by the end of the file (EOF),
552 If the end of the file was reached, reconsume the EOF character.
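A minimal sketch of the consumption rule in lines 540 to 552, assuming the input is held in a string and `start` is the index of the character that caused the switch into this state. The function name and return shape are hypothetical.

```python
def bogus_comment(text, start):
    """Return (comment_token, next_position) for input beginning at start."""
    end = text.find(">", start)
    if end == -1:
        # EOF came first: everything up to EOF is comment data, and the
        # position is left at EOF so the EOF character can be reconsumed.
        return ("comment", text[start:]), len(text)
    # The ">" itself is consumed but is not part of the comment data.
    return ("comment", text[start:end]), end + 1
```

If `start` already equals `len(text)`, meaning the comment was started by EOF, the slice is empty and the token carries no data.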
571 character before and after), then consume those characters and switch
576 The next character that is consumed, if any, is the first character
581 Consume the next input character:
590 Parse error. Emit the comment token. Reconsume the EOF character
594 Append the input character to the comment token's data. Switch
599 Consume the next input character:
608 Parse error. Emit the comment token. Reconsume the EOF character
612 Append a U+002D HYPHEN-MINUS (-) character and the input
613 character to the comment token's data. Switch to the comment
618 Consume the next input character:
624 Parse error. Emit the comment token. Reconsume the EOF character
628 Append the input character to the comment token's data. Stay in
633 Consume the next input character:
639 Parse error. Emit the comment token. Reconsume the EOF character
643 Append a U+002D HYPHEN-MINUS (-) character and the input
644 character to the comment token's data. Switch to the comment
649 Consume the next input character:
655 Parse error. Append a U+002D HYPHEN-MINUS (-) character to the
659 Parse error. Emit the comment token. Reconsume the EOF character
664 the input character to the comment token's data. Switch to the
669 Consume the next input character:
671 U+0009 CHARACTER TABULATION
678 Parse error. Reconsume the current character in the before
683 Consume the next input character:
685 U+0009 CHARACTER TABULATION
697 lowercase version of the input character (add 0x0020 to the
698 character's code point). Switch to the DOCTYPE name state.
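The "add 0x0020 to the character's code point" step that recurs in these states can be sketched as below. The helper name is hypothetical, and the range guard reflects that the offset only makes sense for the uppercase ASCII letters A to Z.

```python
def ascii_lowercase(ch):
    # Only U+0041..U+005A are shifted; every other character is unchanged.
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 0x0020)
    return ch
```

For example, `ascii_lowercase("D")` gives `"d"` because 0x44 + 0x20 = 0x64.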
702 flag to on. Emit the token. Reconsume the EOF character in the
707 input character. Switch to the DOCTYPE name state.
711 Consume the next input character:
713 U+0009 CHARACTER TABULATION
723 Append the lowercase version of the input character (add 0x0020
724 to the character's code point) to the current DOCTYPE token's
729 Emit that DOCTYPE token. Reconsume the EOF character in the data
733 Append the current input character to the current DOCTYPE
738 Consume the next input character:
740 U+0009 CHARACTER TABULATION
751 Emit that DOCTYPE token. Reconsume the EOF character in the data
755 If the six characters starting from the current input character
761 character are an ASCII case-insensitive match for the word
770 Consume the next input character:
772 U+0009 CHARACTER TABULATION
794 Emit that DOCTYPE token. Reconsume the EOF character in the data
803 Consume the next input character:
814 Emit that DOCTYPE token. Reconsume the EOF character in the data
818 Append the current input character to the current DOCTYPE
824 Consume the next input character:
835 Emit that DOCTYPE token. Reconsume the EOF character in the data
839 Append the current input character to the current DOCTYPE
845 Consume the next input character:
847 U+0009 CHARACTER TABULATION
868 Emit that DOCTYPE token. Reconsume the EOF character in the data
877 Consume the next input character:
879 U+0009 CHARACTER TABULATION
901 Emit that DOCTYPE token. Reconsume the EOF character in the data
910 Consume the next input character:
921 Emit that DOCTYPE token. Reconsume the EOF character in the data
925 Append the current input character to the current DOCTYPE
931 Consume the next input character:
942 Emit that DOCTYPE token. Reconsume the EOF character in the data
946 Append the current input character to the current DOCTYPE
952 Consume the next input character:
954 U+0009 CHARACTER TABULATION
965 Emit that DOCTYPE token. Reconsume the EOF character in the data
974 Consume the next input character:
980 Emit the DOCTYPE token. Reconsume the EOF character in the data
991 Consume every character up to the next occurrence of the three
992 character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
994 whichever comes first. Emit a series of character tokens consisting of
995 all the characters consumed except the matching three character
1000 If the end of the file was reached, reconsume the EOF character.
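The rule in lines 991 to 1000 can be sketched like this, assuming the input is a string and `pos` is the index where the state begins. The function name and the tuple form of the tokens are hypothetical.

```python
def cdata_section(text, pos):
    """Return (character_tokens, next_position)."""
    end = text.find("]]>", pos)
    if end == -1:
        # EOF came first: emit everything and leave EOF to be reconsumed.
        return [("character", c) for c in text[pos:]], len(text)
    # The three-character "]]>" sequence is consumed but not emitted.
    return [("character", c) for c in text[pos:end]], end + 3
```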
1002 8.2.4.37 Tokenizing character references
1004 This section defines how to consume a character reference. This
1005 definition is used when parsing character references in text and in
1008 The behavior depends on the identity of the next character (the one
1009 immediately after the U+0026 AMPERSAND character):
1011 U+0009 CHARACTER TABULATION
1018 The additional allowed character, if there is one
1019 Not a character reference. No characters are consumed, and
1025 The behavior further depends on the character after the U+0023
1053 characters (and unconsume the U+0023 NUMBER SIGN character and,
1054 if appropriate, the X character). This is a parse error; nothing
1057 Otherwise, if the next character is a U+003B SEMICOLON, consume
1066 that number in the first column, and return a character token
1067 for the Unicode character given in the second column of that
1070 Number Unicode character
1073 0x81 U+FFFD REPLACEMENT CHARACTER
1085 0x8D U+FFFD REPLACEMENT CHARACTER
1087 0x8F U+FFFD REPLACEMENT CHARACTER
1088 0x90 U+FFFD REPLACEMENT CHARACTER
1101 0x9D U+FFFD REPLACEMENT CHARACTER
1113 a parse error; return a character token for the U+FFFD
1114 REPLACEMENT CHARACTER character instead.
1116 Otherwise, return a character token for the Unicode character
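A sketch of how a parsed number might be turned into a character, under two stated assumptions. First, the override table below holds only the five rows visible in this listing; the rows that do not appear here are left out rather than guessed. Second, the condition guarding the U+FFFD fallback (zero, above 0x10FFFF, or a surrogate) is not shown in the listing and is assumed here.

```python
REPLACEMENT = "\uFFFD"

# Only the rows visible in the listing; the remaining rows are omitted.
OVERRIDES = {
    0x81: REPLACEMENT,
    0x8D: REPLACEMENT,
    0x8F: REPLACEMENT,
    0x90: REPLACEMENT,
    0x9D: REPLACEMENT,
}


def numeric_reference_to_char(number):
    if number in OVERRIDES:
        return OVERRIDES[number]
    # Assumed parse-error condition (not shown in the listing).
    if number == 0 or number > 0x10FFFF or 0xD800 <= number <= 0xDFFF:
        return REPLACEMENT
    # Otherwise the character is the one with that code point.
    return chr(number)
```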
1122 column of the named character references table (in a
1128 If the last character matched is not a U+003B SEMICOLON (;),
1131 If the character reference is being consumed as part of an
1132 attribute, and the last character matched is not a U+003B
1133 SEMICOLON (;), and the next character is in the range U+0030
1140 Otherwise, return a character token for the character
1141 corresponding to the character reference name (as given by the
1142 second column of the named character references table).
1144 If the markup contains I'm &notit; I tell you, the character
1146 the markup was I'm &notin; I tell you, the character reference
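The two example inputs, `I'm &notit; I tell you` and `I'm &notin; I tell you`, differ in which name is the longest match. A minimal sketch, assuming a hypothetical three-row excerpt of the named character references table and a hypothetical function name; the attribute-specific rule from lines 1131 to 1133 is not modeled.

```python
# Hypothetical excerpt; the real table has many more rows.
NAMED = {"not": "\u00AC", "not;": "\u00AC", "notin;": "\u2209"}


def consume_named_reference(text, pos):
    """pos is the index just after '&'.

    Returns (character, next_position, missing_semicolon) or None.
    """
    best = None
    for name in NAMED:
        # Keep the longest table entry that matches at this position.
        if text.startswith(name, pos) and (best is None or len(name) > len(best)):
            best = name
    if best is None:
        return None
    return NAMED[best], pos + len(best), not best.endswith(";")
```

With the first input only `not` matches, so the result is U+00AC followed by the literal text `it;`, and the missing semicolon is flagged. With the second input `notin;` matches in full and yields U+2209.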