Lines Matching defs:character

103 // In a 3-character pattern you can maximally step forwards 3 characters
750 // Examples of elements include character classes, plain strings
815 // * Choice nodes have 1-character lookahead.
816 // A choice node looks at the following character and eliminates some of
817 // the choices immediately based on that character. This is not yet
822 // implementation of this would push each character position onto the
929 void CountCharacter(int character) {
930 int index = (character & RegExpMacroAssembler::kTableMask);
949 explicit CharacterFrequency(int character)
950 : counter_(0), character_(character) { }
954 int character() { return character_; }
981 // Lookarounds to match lone surrogates for unicode character class matches
1448 // of the negative submatch and restore the character position.
1612 static int GetCaseIndependentLetters(Isolate* isolate, uc16 character,
1616 isolate->jsregexp_uncanonicalize()->get(character, '\0', letters);
1620 letters[0] = character;
1674 // character. We do not need to do anything since the one-byte pass
1718 // subtract the difference from the found character, then do the or
1755 // if this character lies before a character that matched.
1941 // heuristics are complicated a little by the fact that any 128-character
1944 // 128-character space can take up a lot of space in the ranges array if,
1945 // for example, we only want to match every second character (eg. the lower
1950 // character range (even non-Latin1 charset-based text has spaces and
1982 // Gets a series of segment boundaries representing a character class. If the
1983 // character is in the range between an even and an odd boundary (counting from
1985 // know that the character is in the range of min_char to max_char inclusive.
2000 // Just need to test if the character is before or on-or-after
2001 // a particular character.
2016 // character class.
2045 // determine whether the character is inside or outside the character class.
2502 // For 2-character preloads in one-byte mode or 1-character preloads in
2535 // We iterate along the text object, building up for each character a
2538 // machine word for the current character width in order to be used in
2578 // a match at this character position.
2592 // whether we have a match at this character position. Otherwise
2603 // determine definitely whether we have a match at this character
2626 // A quick check uses multi-character mask and compare. There is no
2665 // so the chances of a false positive rise. A character class
2824 // Character is outside Latin-1 completely
3024 // Emit the code to check for a ^ in multiline mode (1-character lookbehind
3030 // We will be loading the previous character into the current character
3042 // OK to load the previous character.
3095 // Next character is not a word character.
3131 character, so the question is
3136 // OK to load the previous character.
3216 // second pass and the character class in the last pass.
3224 // A slight complication involves the fact that the first character may already
3226 // do the test for that character first. We do this in separate passes. The
3229 // first_element_checked to indicate that that character does not need to be
3239 // or obviate the need for further checks at some character positions.
3354 // straight character sequences (possibly to be matched in a case-independent
3355 // way) and character classes. For efficiency we do not do this in a single
3357 // emitting code for some character positions every time. See the comment on
3378 // If a character is preloaded into the current character register then
3422 // We don't have an instruction for shifting the current character register
3445 // None of the standard character classes is different in the case
3636 void BoyerMoorePositionInfo::Set(int character) {
3637 SetInterval(Interval(character, character));
3709 // Find the highest-points range between 0 and length_ where the character
3733 // Add 1 to the frequency to give a small per-character boost for
3766 // character at max_lookahead offset is not one of these characters, then we
3875 * character register. R nodes do this preloading. Vertices are marked
4123 // any character one at a time. Any non-anchored regexp has such a
4126 // and step forwards 3 if the character is not one of abc. Abc need
4127 // not be atoms, they can be any reasonably limited character class or
4260 // Reload the current character, since the next quick check expects that.
4899 // The unicode range splitter categorizes given character ranges into:
5106 // Advance any character. If the character happens to be a lead surrogate and
5259 // independent character classes for comparison.
5318 // character. The sorting function above did not sort on more than one
5319 // character for reasons of correctness, but there may still be a longer
5397 // Found non-trivial run of single-character alternatives.
5865 // This is not a character range as defined by the spec but a
5866 // convenient shorthand for a character class that matches any
5867 // character.
5908 // If this is a singleton we just expand the one character.
5919 // follows. For a given start character we look up the remainder of the
5921 // find 'z' if the character is 'c'. A block is characterized by the
6395 // hard, so we just say that any character can match.
6440 uc16 character = atom->data()[j];
6444 isolate, character, bm->max_char() == String::kMaxOneByteCharCode,
6450 if (character <= max_char) bm->Set(offset, character);