Cross Reference: /external/icu/icu4c/source/i18n/usearch.cpp

Lines Matching defs:match
516 * The canonical match will only be performed after the default match fails.
519 * a number of characters in the text and tries to match the pattern from that
525 * Canonical match will be performed slightly differently. We'll split the
528 * will be done on MS first, and only when we match MS then some processing
530 * they match PA and EA. Hence the default shift values
531 * for the canonical match will take the size of either end's accent into
561 * Check to make sure that the match length is at the end of the character by
646                 // extra collation elements at the end of the match
715 * a following match. If the last character is an unsafe character, we'll only
720 * @param ce the text ce which failed the match.
722 *        failed the match
749     //   an initial match is found
758 * sets match not found
764     // this method resets the match result regardless of the error status.
800 * This checks for accents in the potential match started with a .
803 * have any extra accents. We have to normalize the potential match and find
804 * the immediate decomposed character before the match.
809 * determine that the potential match has extra non-ignorable preceding
821 *              of the match, FALSE otherwise.
894 * Used by exact matches, checks if there are accents before the match.
898 * the first pattern ce does not match the first ce of the character, we bail.
900 * character and find the immediate decomposed character before the match to
903 * that when the match is passed in here with extra beginning ces, the
904 * first or last ce that match has to occur within the first character.
911 * @return TRUE if there are accents on either side of the match,
980 * Used by exact matches, checks if there are accents bounding the match.
981 * Note this is the initial boundary check. If the potential match
990 * @param start offset of match
991 * @param end end offset of the match
992 * @return TRUE if there are accents on either side of the match,
1062 * Checks for identical match
1064 * @param start offset of possible match
1065 * @param end offset of possible match
1066 * @return TRUE if identical match is found
1090 * Checks to see if the match is repeated
1092 * @param start new match start index
1093 * @param end new match end index
1094 * @return TRUE if the match is repeated, FALSE otherwise
1142 * Checks match for contraction.
1143 * If the match ends with a partial contraction we fail.
1144 * If the match starts too far off (because of backwards iteration) we try to
1150 * @param start offset of potential match, to be modified if necessary
1151 * @param end offset of potential match, to be modified if necessary
1153 * @return TRUE if match passes the contraction test, FALSE otherwise
1166     // This part checks if either ends of the match contains potential
1224 * Checks and sets the match information if found.
1227 * <li> the potential match does not repeat the previous match
1231 * <li> potential match does not end in the middle of a contraction
1238 *        will be the truncated end offset of the match or the new start
1241 * @return TRUE if the match is valid, FALSE otherwise
1271     // totally match, we will get rid of the ending ignorables.
1397 * @return TRUE if a match is found, FALSE otherwise
1424 * match with the pattern.
1432 * @param strsrch string search match
1436 * @return USEARCH_DONE if a match is not found, otherwise return the starting
1437 *         offset of the match. Note this start includes all preceding accents.
1489         UChar   *match     = addToUCharArray(buffer, &matchsize,
1497         // run the collator iterator through this match
1498         ucol_setText(coleiter, match, matchsize, status);
1501                 if (match != buffer) {
1502                     uprv_free(match);
1556 * Take the rearranged end accents and tries matching. If match failed at
1560 * We allow skipping of the ends of the accent set if the ces do not match.
1567 * @return USEARCH_DONE if a match is not found, otherwise return the starting
1568 *         offset of the match. Note this start includes all preceding accents.
1684 * Trying out the substring and sees if it can be a canonical match.
1689 * match with the pattern.
1701 * @return TRUE if the match is valid, FALSE otherwise
1762             return TRUE; // match found
1793 * Checks match for contraction.
1794 * If the match ends with a partial contraction we fail.
1795 * If the match starts too far off (because of backwards iteration) we try to
1800 * @param start offset of potential match, to be modified if necessary
1801 * @param end offset of potential match, to be modified if necessary
1803 * @return TRUE if match passes the contraction test, FALSE otherwise
1816     // This part checks if either ends of the match contains potential
1885 * Checks and sets the match information if found.
1888 * <li> the potential match does not repeat the previous match
1890 * <li> potential match does not end in the middle of a contraction
1898 *        will be the truncated end offset of the match or the new start
1901 * @return TRUE if the match is valid, FALSE otherwise
1910     // if we have a canonical accent match
1947 * a preceding match. If the first character is an unsafe character, we'll only
1953 * @param ce the text ce which failed the match.
1955 *        failed the match
1993 * Checks match for contraction.
1994 * If the match starts with a partial contraction we fail.
1998 * @param start offset of potential match, to be modified if necessary
1999 * @param end offset of potential match, to be modified if necessary
2001 * @return TRUE if match passes the contraction test, FALSE otherwise
2013     // This part checks if either if the start of the match contains potential
2016     // match, this guarantees that our end will not be a partial contraction,
2068 * Checks and sets the match information if found.
2071 * <li> the current match does not repeat the last match
2084 *        will be the truncated start offset of the match or the new start
2087 * @return TRUE if the match is valid, FALSE otherwise
2102     // the old match
2128 * match with the pattern.
2136 * @param strsrch string search match
2140 * @return USEARCH_DONE if a match is not found, otherwise return the ending
2141 *         offset of the match. Note this start includes all following accents.
2191             UChar   *match     = addToUCharArray(buffer, &matchsize,
2198             // run the collator iterator through this match
2200             ucol_setText(coleiter, match, matchsize, status);
2203                     if (match != buffer) {
2204                         uprv_free(match);
2216 * Take the rearranged start accents and tries matching. If match failed at
2220 * We allow skipping of the ends of the accent set if the ces do not match.
2227 * @return USEARCH_DONE if a match is not found, otherwise return the ending
2228 *         offset of the match. Note this start includes all following accents.
2347 * Trying out the substring and sees if it can be a canonical match.
2352 * match with the pattern.
2364 * @return TRUE if the match is valid, FALSE otherwise
2425             return TRUE; // match found
2433 * Checks match for contraction.
2434 * If the match starts with a partial contraction we fail.
2438 * @param start offset of potential match, to be modified if necessary
2439 * @param end offset of potential match, to be modified if necessary
2441 * @return TRUE if match passes the contraction test, FALSE otherwise
2453     // This part checks if either if the start of the match contains potential
2456     // match, this guarantees that our end will not be a partial contraction,
2523 * Checks and sets the match information if found.
2526 * <li> the potential match does not repeat the previous match
2528 * <li> potential match does not end in the middle of a contraction
2536 *        will be the truncated start offset of the match or the new start
2539 * @return TRUE if the match is valid, FALSE otherwise
2548     // if we have a canonical accent match
3169 * shifted to the start of the match. If a match is not found, the offset would
3175 * should not confuse the caller by returning the second match within the
3176 * same normalization buffer. If we do, the 2 results will have the same match
3177 * offsets, and that'll be confusing. I'll return the next match that doesn't
3188         // note offset is either equivalent to the start of the previous match
3201                 // not enough characters to match
3210                     // not enough characters to match
3221             // match is not found.
3259                     // next match will not precede the current offset
3314             // a match is not found.
3327                 // not enough characters to match
3861     // Outer loop moves over match starting positions in the
3878         //  Inner loop checks for a match beginning at each
3913         targetIxOffset += strsrch->pattern.pcesLength; // this is now the offset in target CE space to end of the match so far
3916             // No match at this targetIx.  Try again at the next.
3921             // No match at all, we have run off the end of the target text.
3926         // We have found a match in CE space.
3928         //  There still is a chance of match failure if the CE range does not correspond to
3936         // Look at the CE following the match.  If it is UCOL_NULLORDER the match
3937         //   extended to the end of input, and the match is good.
3939         // Look at the high and low indices of the CE following the match. If
3941         //    1. The match extended to the last CE from the target text, which is OK, or
3942         //    2. The last CE that was part of the match is in an expansion that extends
3943         //       to the first CE after the match. In this case, we reject the match.
3955                 // If we are at the end of the target too, match succeeds
3961                 // make sure it can be part of a match with the last patCE
3969                 // target element, but it has non-zero primary weight => match fails
3973                 // Else the target CE is not part of an expansion of the last matched element, match succeeds
3981         // Check for the start of the match being within a combining sequence.
3983         //   the match found combining marks in the target text that were attached
3985         //   This type of match should be rejected for not completely consuming a
3991         // Check for the start of the match being within a Collation Element Expansion,
3992         //   meaning that the first char of the match is only partially matched.
4001         //  Advance the match end position to the first acceptable match boundary.
4009             // incorrect match length when there are ignorable characters exist between
4028         //   advanced us beyond the end of the match in CE space, reject this match.
4053         printf("\n%s\n", found? "match found" : "no match");
4057     // All Done.  Store back the match bounds to the caller.
4118      * we can look at the CE following the match when we
4119      * check the match boundaries.
4122      * consider for the match.
4152     // Outer loop moves over match starting positions in the
4170         //  Inner loop checks for a match beginning at each
4197             // No match at this targetIx.  Try again at the next.
4202             // No match at all, we have run off the end of the target text.
4207         // We have found a match in CE space.
4209         //  There still is a chance of match failure if the CE range does not correspond to
4215         // Check for the start of the match being within a combining sequence.
4217         //   the match found combining marks in the target text that were attached
4219         //   This type of match should be rejected for not completely consuming a
4225         // Look at the high index of the first CE in the match. If it's the same as the
4226         // low index, the first CE in the match is in the middle of an expansion.
4235             // Look at the CE following the match.  If it is UCOL_NULLORDER the match
4236             //   extended to the end of input, and the match is good.
4238             // Look at the high and low indices of the CE following the match. If
4240             //    1. The match extended to the last CE from the target text, which is OK, or
4241             //    2. The last CE that was part of the match is in an expansion that extends
4242             //       to the first CE after the match. In this case, we reject the match.
4251             //  Advance the match end position to the first acceptable match boundary.
4262             //   advanced us beyond the end of the match in CE space, reject this match.
4267             // Make sure the end of the match is on a break boundary
4304         printf("\n%s\n", found? "match found" : "no match");
4308     // All Done.  Store back the match bounds to the caller.
4357             // finding the last pattern ce match, imagine composite characters
4473             // finding the last pattern ce match, imagine composite characters
4571     // if setOffset is called previously or there was no previous match, we
4591             // finding the first pattern ce match, imagine composite
4665             // move the start position at the end of possible match
4719     // if setOffset is called previously or there was no previous match, we
4739             // finding the first pattern ce match, imagine composite
4818             // move the start position at the end of possible match