/*
*******************************************************************************
*
*   Copyright (C) 2002-2008, International Business Machines
*   Corporation and others.  All Rights Reserved.
*
*******************************************************************************
*   file name:  utf_old.h
*   encoding:   US-ASCII
*   tab size:   8 (not used)
*   indentation:4
*
*   created on: 2002sep21
*   created by: Markus W. Scherer
*/

/**
 * \file
 * \brief C API: Deprecated macros for Unicode string handling
 */

/**
 *
 * The macros in utf_old.h are all deprecated and their use discouraged.
 * Some of the design principles behind the set of UTF macros
 * have changed or proved impractical.
 * Almost all of the old "UTF macros" are at least renamed.
 * If you are looking for a new equivalent to an old macro, please see the
 * comment at the old one.
 *
 * utf_old.h is included by utf.h after unicode/umachine.h
 * and some common definitions, so as not to break old code.
 *
 * Brief summary of reasons for deprecation:
 * - Switch on UTF_SIZE (selection of UTF-8/16/32 default string processing)
 *   was impractical.
 * - Switch on UTF_SAFE etc. (selection of unsafe/safe/strict default string processing)
 *   was of little use and impractical.
 * - Whole classes of macros became obsolete outside of the UTF_SIZE/UTF_SAFE
 *   selection framework: UTF32_ macros (all trivial)
 *   and UTF_ default and intermediate macros (all aliases).
 * - The selection framework also caused many macro aliases.
 * - Change in Unicode standard: "irregular" sequences (3.0) became illegal (3.2).
 * - Change of language in Unicode standard:
 *   Growing distinction between internal x-bit Unicode strings and external UTF-x
 *   forms, with the former more lenient.
 *   Suggests renaming of UTF16_ macros to U16_.
 * - The prefix "UTF_" without a width number confused some users.
 * - "Safe" append macros needed the addition of an error indicator output.
 * - "Safe" UTF-8 macros used legitimate (if rarely used) code point values
 *   to indicate error conditions.
 * - The use of the "_CHAR" infix for code point operations confused some users.
 *
 * More details:
 *
 * Until ICU 2.2, utf.h theoretically allowed a choice among UTF-8/16/32
 * for string processing, and among unsafe/safe/strict default macros for that.
 *
 * It proved nearly impossible to write non-trivial, high-performance code
 * that is UTF-generic.
 * Unsafe default macros would be dangerous for default string processing,
 * and the main reason for the "strict" versions disappeared:
 * Between Unicode 3.0 and 3.2 all "irregular" UTF-8 sequences became illegal.
 * The only other conditions that "strict" checked for were non-characters,
 * which are valid during processing. Only during text input/output should they
 * be checked, and at that time other well-formedness checks may be
 * necessary or useful as well.
 * This can still be done by using U16_NEXT and U_IS_UNICODE_NONCHAR
 * or U_IS_UNICODE_CHAR.
 *
 * The old UTF8_..._SAFE macros also used some normal Unicode code points
 * to indicate malformed sequences.
 * The new U8_ macros without suffix use negative values instead.
 *
 * The entire contents of utf32.h were moved here without replacement
 * because all those macros were trivial and
 * were meaningful only in the framework of choosing the UTF size.
 *
 * See Jitterbug 2150 and its discussion on the ICU mailing list
 * in September 2002.
 *
 * <hr>
 *
 * <em>Obsolete part</em> of pre-ICU 2.4 utf.h file documentation:
 *
 * <p>The original concept for these files was for ICU to allow,
 * in principle, selecting which UTF (UTF-8/16/32) is used internally
 * by defining UTF_SIZE to either 8, 16, or 32. utf.h would then define the UChar type
 * accordingly. UTF-16 was the default.</p>
 *
 * <p>This concept has been abandoned.
 * A lot of the ICU source code assumes UChar strings are in UTF-16.
 * This is especially true for low-level code like
 * conversion, normalization, and collation.
 * The utf.h header enforces the default of UTF-16.
 * The UTF-8 and UTF-32 macros remain for now for completeness and backward compatibility.</p>
 *
 * <p>Accordingly, utf.h defines UChar to be an unsigned 16-bit integer. If this matches wchar_t, then
 * UChar is defined to be exactly wchar_t, otherwise uint16_t.</p>
 *
 * <p>UChar32 is defined to be a signed 32-bit integer (int32_t), large enough for a 21-bit
 * Unicode code point (Unicode scalar value, 0..0x10ffff).
 * Before ICU 2.4, the definition of UChar32 was platform-dependent in a similar way to
 * the definition of UChar. For details see the documentation for UChar32 itself.</p>
 *
 * <p>utf.h also defines a number of C macros for handling single Unicode code points and
 * for using UTF Unicode strings. It includes utf8.h, utf16.h, and utf32.h for the actual
 * implementations of those macros and then aliases one set of them (for UTF-16) for general use.
 * The UTF-specific macros have the UTF size in the macro name prefixes (UTF16_...), while
 * the general alias macros always begin with UTF_...</p>
 *
 * <p>Many string operations can be done with or without error checking.
 * Where such a distinction is useful, there are two versions of the macros, "unsafe" and "safe"
 * ones with ..._UNSAFE and ..._SAFE suffixes. The unsafe macros are fast but may cause
 * program failures if the strings are not well-formed. The safe macros have an additional, boolean
 * parameter "strict". If strict is FALSE, then only illegal sequences are detected.
 * Otherwise, irregular sequences and non-characters are detected as well (like single surrogates).
 * Safe macros return special error code points for illegal/irregular sequences:
 * Typically, U+ffff, or values that would result in a code unit sequence of the same length
 * as the erroneous input sequence.<br>
 * Note that _UNSAFE macros have fewer parameters: They do not have the strictness parameter, and
 * they do not have start/length parameters for boundary checking.</p>
 *
 * <p>Here, the macros are aliased in two steps:
 * In the first step, the UTF-specific macros with UTF16_ prefix and _UNSAFE and _SAFE suffixes are
 * aliased according to the UTF_SIZE to macros with UTF_ prefix and the same suffixes and signatures.
 * Then, in a second step, the default, general alias macros are set to use either the unsafe or
 * the safe/not strict (default) or the safe/strict macro;
 * these general macros do not have a strictness parameter.</p>
 *
 * <p>It is possible to change the default choice for the general alias macros to be unsafe, safe/not strict or safe/strict.
 * The default is safe/not strict. It is not recommended to select the unsafe macros as the basis for
 * Unicode string handling in ICU! To select this, define UTF_SAFE, UTF_STRICT, or UTF_UNSAFE.</p>
 *
 * <p>For general use, one should use the default, general macros with UTF_ prefix and no _SAFE/_UNSAFE suffix.
 * Only in some cases may it be necessary to control the choice of macro directly and use a less generic alias.
 * For example, if it can be assumed that a string is well-formed and the index will stay within the bounds,
 * then the _UNSAFE version may be used.
 * If a UTF-8 string is to be processed, then the macros with UTF8_ prefixes need to be used.</p>
 *
 * <hr>
 *
 * @deprecated ICU 2.4. Use the macros in utf.h, utf16.h, utf8.h instead.
 */

#ifndef __UTF_OLD_H__
#define __UTF_OLD_H__

#ifndef U_HIDE_DEPRECATED_API

/* utf.h must be included first. */
#ifndef __UTF_H__
#   include "unicode/utf.h"
#endif

/* Formerly utf.h, part 1 --------------------------------------------------- */

#ifdef U_USE_UTF_DEPRECATES
/**
 * Unicode string and array offset and index type.
 * ICU always counts Unicode code units (UChars) for
 * string offsets, indexes, and lengths, not Unicode code points.
 *
 * @obsolete ICU 2.6. Use int32_t directly instead since this API will be removed in that release.
 */
typedef int32_t UTextOffset;
#endif

/** Number of bits in a Unicode string code unit - ICU uses 16-bit Unicode. @deprecated ICU 2.4. Obsolete, see utf_old.h. */
#define UTF_SIZE 16

/**
 * The default choice for general Unicode string macros is to use the ..._SAFE macro implementations
 * with strict=FALSE.
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF_SAFE
/** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
#undef UTF_UNSAFE
/** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
#undef UTF_STRICT

/**
 * UTF8_ERROR_VALUE_1 and UTF8_ERROR_VALUE_2 are special error values for UTF-8,
 * which need 1 or 2 bytes in UTF-8:
 * \code
 * U+0015 = NAK = Negative Acknowledge, C0 control character
 * U+009f = highest C1 control character
 * \endcode
 *
 * These are used by UTF8_..._SAFE macros so that they can return an error value
 * that needs the same number of code units (bytes) as were seen by
 * a macro. They should be tested with UTF_IS_ERROR() or UTF_IS_VALID().
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF8_ERROR_VALUE_1 0x15

/**
 * See documentation on UTF8_ERROR_VALUE_1 for details.
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF8_ERROR_VALUE_2 0x9f

/**
 * Error value for all UTFs. This code point value will be set by macros with error
 * checking if an error is detected.
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF_ERROR_VALUE 0xffff

/**
 * Is a given 32-bit code an error value
 * as returned by one of the macros for any UTF?
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF_IS_ERROR(c) \
    (((c)&0xfffe)==0xfffe || (c)==UTF8_ERROR_VALUE_1 || (c)==UTF8_ERROR_VALUE_2)

/**
 * This is a combined macro: Is c a valid Unicode value _and_ not an error code?
 *
 * @deprecated ICU 2.4. Obsolete, see utf_old.h.
 */
#define UTF_IS_VALID(c) \
    (UTF_IS_UNICODE_CHAR(c) && \
     (c)!=UTF8_ERROR_VALUE_1 && (c)!=UTF8_ERROR_VALUE_2)
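
/*
 * Illustration only, not part of the original header: a minimal standalone
 * sketch of why in-band error values are ambiguous. The SKETCH_ constants
 * and sketch_old_is_error() are hypothetical names; the test itself is
 * copied from UTF_IS_ERROR() above. U+0015 and U+009F are ordinary control
 * characters, so a caller cannot tell "decoded a NAK" from "hit a malformed
 * sequence" by looking at the returned value alone.
 */
```c
#include <stdint.h>

#define SKETCH_UTF8_ERROR_VALUE_1 0x15  /* U+0015 NAK, a real C0 control */
#define SKETCH_UTF8_ERROR_VALUE_2 0x9f  /* U+009F, a real C1 control */

/* Same test as UTF_IS_ERROR(c), written as a function for readability. */
static inline int sketch_old_is_error(int32_t c) {
    return (c & 0xfffe) == 0xfffe ||
           c == SKETCH_UTF8_ERROR_VALUE_1 ||
           c == SKETCH_UTF8_ERROR_VALUE_2;
}
```
/*
 * With this test, sketch_old_is_error(0x15) is true even when the input
 * text really contained U+0015, which is one of the deprecation reasons
 * listed in the file documentation.
 */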

/**
 * Is this code unit or code point a surrogate (U+d800..U+dfff)?
 * @deprecated ICU 2.4. Renamed to U_IS_SURROGATE and U16_IS_SURROGATE, see utf_old.h.
 */
#define UTF_IS_SURROGATE(uchar) (((uchar)&0xfffff800)==0xd800)

/**
 * Is a given 32-bit code point a Unicode noncharacter?
 *
 * @deprecated ICU 2.4. Renamed to U_IS_UNICODE_NONCHAR, see utf_old.h.
 */
#define UTF_IS_UNICODE_NONCHAR(c) \
    ((c)>=0xfdd0 && \
     ((uint32_t)(c)<=0xfdef || ((c)&0xfffe)==0xfffe) && \
     (uint32_t)(c)<=0x10ffff)

/**
 * Is a given 32-bit value a Unicode code point value (0..U+10ffff)
 * that can be assigned a character?
 *
 * Code points that are not characters include:
 * - single surrogate code points (U+d800..U+dfff, 2048 code points)
 * - the last two code points on each plane (U+__fffe and U+__ffff, 34 code points)
 * - U+fdd0..U+fdef (new with Unicode 3.1, 32 code points)
 * - the highest Unicode code point value is U+10ffff
 *
 * This means that all code points below U+d800 are character code points,
 * and that boundary is tested first for performance.
 *
 * @deprecated ICU 2.4. Renamed to U_IS_UNICODE_CHAR, see utf_old.h.
 */
#define UTF_IS_UNICODE_CHAR(c) \
    ((uint32_t)(c)<0xd800 || \
        ((uint32_t)(c)>0xdfff && \
         (uint32_t)(c)<=0x10ffff && \
         !UTF_IS_UNICODE_NONCHAR(c)))
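
/*
 * Illustration only, not part of the original header: a standalone sketch
 * that restates UTF_IS_UNICODE_NONCHAR() and UTF_IS_UNICODE_CHAR() as
 * functions and counts how many code points pass the second test. All
 * sketch_ names are hypothetical. If the counts in the comment above are
 * right, the result should be 0x110000 - 2048 - 34 - 32 = 1111998.
 */
```c
#include <stdint.h>

/* Same logic as UTF_IS_UNICODE_NONCHAR(c). */
static inline int sketch_is_nonchar(int32_t c) {
    return c >= 0xfdd0 &&
           ((uint32_t)c <= 0xfdef || (c & 0xfffe) == 0xfffe) &&
           (uint32_t)c <= 0x10ffff;
}

/* Same logic as UTF_IS_UNICODE_CHAR(c); the uint32_t casts make
   negative inputs compare as very large values and fail the range test. */
static inline int sketch_is_unicode_char(int32_t c) {
    return (uint32_t)c < 0xd800 ||
           ((uint32_t)c > 0xdfff &&
            (uint32_t)c <= 0x10ffff &&
            !sketch_is_nonchar(c));
}

/* Count the code points in 0..0x10ffff accepted by sketch_is_unicode_char(). */
static inline int32_t sketch_count_chars(void) {
    int32_t c, n = 0;
    for (c = 0; c <= 0x10ffff; ++c) {
        n += sketch_is_unicode_char(c);
    }
    return n;
}
```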

/* Formerly utf8.h ---------------------------------------------------------- */

/**
 * Count the trail bytes for a UTF-8 lead byte.
 * @deprecated ICU 2.4. Renamed to U8_COUNT_TRAIL_BYTES, see utf_old.h.
 */
#define UTF8_COUNT_TRAIL_BYTES(leadByte) (utf8_countTrailBytes[(uint8_t)(leadByte)])

/**
 * Mask a UTF-8 lead byte, leave only the lower bits that form part of the code point value.
 * @deprecated ICU 2.4. Renamed to U8_MASK_LEAD_BYTE, see utf_old.h.
 */
#define UTF8_MASK_LEAD_BYTE(leadByte, countTrailBytes) ((leadByte)&=(1<<(6-(countTrailBytes)))-1)

/** Is this code point a single code unit (byte)? @deprecated ICU 2.4. Renamed to U8_IS_SINGLE, see utf_old.h. */
#define UTF8_IS_SINGLE(uchar) (((uchar)&0x80)==0)
/** Is this code unit the lead code unit (byte) of a code point? @deprecated ICU 2.4. Renamed to U8_IS_LEAD, see utf_old.h. */
#define UTF8_IS_LEAD(uchar) ((uint8_t)((uchar)-0xc0)<0x3e)
/** Is this code unit a trailing code unit (byte) of a code point? @deprecated ICU 2.4. Renamed to U8_IS_TRAIL, see utf_old.h. */
#define UTF8_IS_TRAIL(uchar) (((uchar)&0xc0)==0x80)

/** Does this scalar Unicode value need multiple code units for storage? @deprecated ICU 2.4. Use U8_LENGTH or test ((uint32_t)(c)>0x7f) instead, see utf_old.h. */
#define UTF8_NEED_MULTIPLE_UCHAR(c) ((uint32_t)(c)>0x7f)

/**
 * Given the lead character, how many bytes are taken by this code point.
 * ICU does not deal with code points >0x10ffff
 * unless necessary for advancing in the byte stream.
 *
 * These length macros take into account that for values >0x10ffff
 * the UTF8_APPEND_CHAR_SAFE macros would write the error code point 0xffff
 * with 3 bytes.
 * Code point comparisons need to be in uint32_t because UChar32
 * may be a signed type, and negative values must be recognized.
 *
 * @deprecated ICU 2.4. Use U8_LENGTH instead, see utf_old.h.
 */
#if 1
#   define UTF8_CHAR_LENGTH(c) \
        ((uint32_t)(c)<=0x7f ? 1 : \
            ((uint32_t)(c)<=0x7ff ? 2 : \
                ((uint32_t)((c)-0x10000)>0xfffff ? 3 : 4) \
            ) \
        )
#else
#   define UTF8_CHAR_LENGTH(c) \
        ((uint32_t)(c)<=0x7f ? 1 : \
            ((uint32_t)(c)<=0x7ff ? 2 : \
                ((uint32_t)(c)<=0xffff ? 3 : \
                    ((uint32_t)(c)<=0x10ffff ? 4 : \
                        ((uint32_t)(c)<=0x3ffffff ? 5 : \
                            ((uint32_t)(c)<=0x7fffffff ? 6 : 3) \
                        ) \
                    ) \
                ) \
            ) \
        )
#endif
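
/*
 * Illustration only, not part of the original header: a standalone sketch
 * of the active (#if 1) branch of UTF8_CHAR_LENGTH() as a function, to show
 * how the single unsigned comparison handles both BMP code points and
 * out-of-range values. sketch_utf8_char_length() is a hypothetical name.
 */
```c
#include <stdint.h>

static inline int sketch_utf8_char_length(int32_t c) {
    uint32_t u = (uint32_t)c;  /* negative values become very large */
    if (u <= 0x7f) {
        return 1;
    }
    if (u <= 0x7ff) {
        return 2;
    }
    /* u-0x10000 wraps around for u<0x10000, so "greater than 0xfffff"
       is true both for 0x800..0xffff (3 bytes) and for values above
       0x10ffff (reported as 3, the length of the error value U+ffff). */
    return (u - 0x10000u > 0xfffffu) ? 3 : 4;
}
```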

/** The maximum number of bytes per code point. @deprecated ICU 2.4. Renamed to U8_MAX_LENGTH, see utf_old.h. */
#define UTF8_MAX_CHAR_LENGTH 4

/** Average number of code units compared to UTF-16. @deprecated ICU 2.4. Obsolete, see utf_old.h. */
#define UTF8_ARRAY_SIZE(size) ((5*(size))/2)

/** @deprecated ICU 2.4. Renamed to U8_GET_UNSAFE, see utf_old.h. */
#define UTF8_GET_CHAR_UNSAFE(s, i, c) { \
    int32_t _utf8_get_char_unsafe_index=(int32_t)(i); \
    UTF8_SET_CHAR_START_UNSAFE(s, _utf8_get_char_unsafe_index); \
    UTF8_NEXT_CHAR_UNSAFE(s, _utf8_get_char_unsafe_index, c); \
}

/** @deprecated ICU 2.4. Use U8_GET instead, see utf_old.h. */
#define UTF8_GET_CHAR_SAFE(s, start, i, length, c, strict) { \
    int32_t _utf8_get_char_safe_index=(int32_t)(i); \
    UTF8_SET_CHAR_START_SAFE(s, start, _utf8_get_char_safe_index); \
    UTF8_NEXT_CHAR_SAFE(s, _utf8_get_char_safe_index, length, c, strict); \
}

/** @deprecated ICU 2.4. Renamed to U8_NEXT_UNSAFE, see utf_old.h. */
#define UTF8_NEXT_CHAR_UNSAFE(s, i, c) { \
    (c)=(s)[(i)++]; \
    if((uint8_t)((c)-0xc0)<0x35) { \
        uint8_t __count=UTF8_COUNT_TRAIL_BYTES(c); \
        UTF8_MASK_LEAD_BYTE(c, __count); \
        switch(__count) { \
        /* each following branch falls through to the next one */ \
        case 3: \
            (c)=((c)<<6)|((s)[(i)++]&0x3f); \
        case 2: \
            (c)=((c)<<6)|((s)[(i)++]&0x3f); \
        case 1: \
            (c)=((c)<<6)|((s)[(i)++]&0x3f); \
        /* no other branches to optimize switch() */ \
            break; \
        } \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_APPEND_UNSAFE, see utf_old.h. */
#define UTF8_APPEND_CHAR_UNSAFE(s, i, c) { \
    if((uint32_t)(c)<=0x7f) { \
        (s)[(i)++]=(uint8_t)(c); \
    } else { \
        if((uint32_t)(c)<=0x7ff) { \
            (s)[(i)++]=(uint8_t)(((c)>>6)|0xc0); \
        } else { \
            if((uint32_t)(c)<=0xffff) { \
                (s)[(i)++]=(uint8_t)(((c)>>12)|0xe0); \
            } else { \
                (s)[(i)++]=(uint8_t)(((c)>>18)|0xf0); \
                (s)[(i)++]=(uint8_t)((((c)>>12)&0x3f)|0x80); \
            } \
            (s)[(i)++]=(uint8_t)((((c)>>6)&0x3f)|0x80); \
        } \
        (s)[(i)++]=(uint8_t)(((c)&0x3f)|0x80); \
    } \
}
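
/*
 * Illustration only, not part of the original header: a standalone sketch
 * of an encode/decode round trip in the style of UTF8_APPEND_CHAR_UNSAFE()
 * and UTF8_NEXT_CHAR_UNSAFE(). The sketch_ function names are hypothetical.
 * The decoder assumes well-formed input and derives the trail-byte count
 * from the lead byte directly instead of using the utf8_countTrailBytes
 * table, so it is a simplification, not the ICU implementation.
 */
```c
#include <stdint.h>

/* Append code point c at s[i]; returns the new index. No bounds checks. */
static inline int32_t sketch_u8_append_unsafe(uint8_t *s, int32_t i, int32_t c) {
    uint32_t u = (uint32_t)c;
    if (u <= 0x7f) {
        s[i++] = (uint8_t)u;
    } else {
        if (u <= 0x7ff) {
            s[i++] = (uint8_t)((u >> 6) | 0xc0);
        } else {
            if (u <= 0xffff) {
                s[i++] = (uint8_t)((u >> 12) | 0xe0);
            } else {
                s[i++] = (uint8_t)((u >> 18) | 0xf0);
                s[i++] = (uint8_t)(((u >> 12) & 0x3f) | 0x80);
            }
            s[i++] = (uint8_t)(((u >> 6) & 0x3f) | 0x80);
        }
        s[i++] = (uint8_t)((u & 0x3f) | 0x80);
    }
    return i;
}

/* Read one code point at s[*i] and advance *i. Input must be well-formed. */
static inline int32_t sketch_u8_next_unsafe(const uint8_t *s, int32_t *i) {
    int32_t c = s[(*i)++];
    if (c >= 0xc0) {
        int count = c < 0xe0 ? 1 : (c < 0xf0 ? 2 : 3);
        c &= (1 << (6 - count)) - 1;  /* same mask as UTF8_MASK_LEAD_BYTE */
        while (count-- > 0) {
            c = (c << 6) | (s[(*i)++] & 0x3f);
        }
    }
    return c;
}
```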

/** @deprecated ICU 2.4. Renamed to U8_FWD_1_UNSAFE, see utf_old.h. */
#define UTF8_FWD_1_UNSAFE(s, i) { \
    (i)+=1+UTF8_COUNT_TRAIL_BYTES((s)[i]); \
}

/** @deprecated ICU 2.4. Renamed to U8_FWD_N_UNSAFE, see utf_old.h. */
#define UTF8_FWD_N_UNSAFE(s, i, n) { \
    int32_t __N=(n); \
    while(__N>0) { \
        UTF8_FWD_1_UNSAFE(s, i); \
        --__N; \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_SET_CP_START_UNSAFE, see utf_old.h. */
#define UTF8_SET_CHAR_START_UNSAFE(s, i) { \
    while(UTF8_IS_TRAIL((s)[i])) { --(i); } \
}

/** @deprecated ICU 2.4. Use U8_NEXT instead, see utf_old.h. */
#define UTF8_NEXT_CHAR_SAFE(s, i, length, c, strict) { \
    (c)=(s)[(i)++]; \
    if((c)>=0x80) { \
        if(UTF8_IS_LEAD(c)) { \
            (c)=utf8_nextCharSafeBody(s, &(i), (int32_t)(length), c, strict); \
        } else { \
            (c)=UTF8_ERROR_VALUE_1; \
        } \
    } \
}

/** @deprecated ICU 2.4. Use U8_APPEND instead, see utf_old.h. */
#define UTF8_APPEND_CHAR_SAFE(s, i, length, c)  { \
    if((uint32_t)(c)<=0x7f) { \
        (s)[(i)++]=(uint8_t)(c); \
    } else { \
        (i)=utf8_appendCharSafeBody(s, (int32_t)(i), (int32_t)(length), c, NULL); \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_FWD_1, see utf_old.h. */
#define UTF8_FWD_1_SAFE(s, i, length) U8_FWD_1(s, i, length)

/** @deprecated ICU 2.4. Renamed to U8_FWD_N, see utf_old.h. */
#define UTF8_FWD_N_SAFE(s, i, length, n) U8_FWD_N(s, i, length, n)

/** @deprecated ICU 2.4. Renamed to U8_SET_CP_START, see utf_old.h. */
#define UTF8_SET_CHAR_START_SAFE(s, start, i) U8_SET_CP_START(s, start, i)

/** @deprecated ICU 2.4. Renamed to U8_PREV_UNSAFE, see utf_old.h. */
#define UTF8_PREV_CHAR_UNSAFE(s, i, c) { \
    (c)=(s)[--(i)]; \
    if(UTF8_IS_TRAIL(c)) { \
        uint8_t __b, __count=1, __shift=6; \
\
        /* c is a trail byte */ \
        (c)&=0x3f; \
        for(;;) { \
            __b=(s)[--(i)]; \
            if(__b>=0xc0) { \
                UTF8_MASK_LEAD_BYTE(__b, __count); \
                (c)|=(UChar32)__b<<__shift; \
                break; \
            } else { \
                (c)|=(UChar32)(__b&0x3f)<<__shift; \
                ++__count; \
                __shift+=6; \
            } \
        } \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_BACK_1_UNSAFE, see utf_old.h. */
#define UTF8_BACK_1_UNSAFE(s, i) { \
    while(UTF8_IS_TRAIL((s)[--(i)])) {} \
}

/** @deprecated ICU 2.4. Renamed to U8_BACK_N_UNSAFE, see utf_old.h. */
#define UTF8_BACK_N_UNSAFE(s, i, n) { \
    int32_t __N=(n); \
    while(__N>0) { \
        UTF8_BACK_1_UNSAFE(s, i); \
        --__N; \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_SET_CP_LIMIT_UNSAFE, see utf_old.h. */
#define UTF8_SET_CHAR_LIMIT_UNSAFE(s, i) { \
    UTF8_BACK_1_UNSAFE(s, i); \
    UTF8_FWD_1_UNSAFE(s, i); \
}

/** @deprecated ICU 2.4. Use U8_PREV instead, see utf_old.h. */
#define UTF8_PREV_CHAR_SAFE(s, start, i, c, strict) { \
    (c)=(s)[--(i)]; \
    if((c)>=0x80) { \
        if((c)<=0xbf) { \
            (c)=utf8_prevCharSafeBody(s, start, &(i), c, strict); \
        } else { \
            (c)=UTF8_ERROR_VALUE_1; \
        } \
    } \
}

/** @deprecated ICU 2.4. Renamed to U8_BACK_1, see utf_old.h. */
#define UTF8_BACK_1_SAFE(s, start, i) U8_BACK_1(s, start, i)

/** @deprecated ICU 2.4. Renamed to U8_BACK_N, see utf_old.h. */
#define UTF8_BACK_N_SAFE(s, start, i, n) U8_BACK_N(s, start, i, n)

/** @deprecated ICU 2.4. Renamed to U8_SET_CP_LIMIT, see utf_old.h. */
#define UTF8_SET_CHAR_LIMIT_SAFE(s, start, i, length) U8_SET_CP_LIMIT(s, start, i, length)

/* Formerly utf16.h --------------------------------------------------------- */

/** Is uchar a first/lead surrogate? @deprecated ICU 2.4. Renamed to U_IS_LEAD and U16_IS_LEAD, see utf_old.h. */
#define UTF_IS_FIRST_SURROGATE(uchar) (((uchar)&0xfffffc00)==0xd800)

/** Is uchar a second/trail surrogate? @deprecated ICU 2.4. Renamed to U_IS_TRAIL and U16_IS_TRAIL, see utf_old.h. */
#define UTF_IS_SECOND_SURROGATE(uchar) (((uchar)&0xfffffc00)==0xdc00)

/** Assuming c is a surrogate, is it a first/lead surrogate? @deprecated ICU 2.4. Renamed to U_IS_SURROGATE_LEAD and U16_IS_SURROGATE_LEAD, see utf_old.h. */
#define UTF_IS_SURROGATE_FIRST(c) (((c)&0x400)==0)

/** Helper constant for UTF16_GET_PAIR_VALUE. @deprecated ICU 2.4. Renamed to U16_SURROGATE_OFFSET, see utf_old.h. */
#define UTF_SURROGATE_OFFSET ((0xd800<<10UL)+0xdc00-0x10000)

/** Get the UTF-32 value from the surrogate code units. @deprecated ICU 2.4. Renamed to U16_GET_SUPPLEMENTARY, see utf_old.h. */
#define UTF16_GET_PAIR_VALUE(first, second) \
    (((first)<<10UL)+(second)-UTF_SURROGATE_OFFSET)

/** @deprecated ICU 2.4. Renamed to U16_LEAD, see utf_old.h. */
#define UTF_FIRST_SURROGATE(supplementary) (UChar)(((supplementary)>>10)+0xd7c0)

/** @deprecated ICU 2.4. Renamed to U16_TRAIL, see utf_old.h. */
#define UTF_SECOND_SURROGATE(supplementary) (UChar)(((supplementary)&0x3ff)|0xdc00)

/** @deprecated ICU 2.4. Renamed to U16_LEAD, see utf_old.h. */
#define UTF16_LEAD(supplementary) UTF_FIRST_SURROGATE(supplementary)

/** @deprecated ICU 2.4. Renamed to U16_TRAIL, see utf_old.h. */
#define UTF16_TRAIL(supplementary) UTF_SECOND_SURROGATE(supplementary)
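
/*
 * Illustration only, not part of the original header: a standalone sketch
 * of the surrogate arithmetic behind UTF_FIRST_SURROGATE(),
 * UTF_SECOND_SURROGATE() and UTF16_GET_PAIR_VALUE(). All sketch_/SKETCH_
 * names are hypothetical. The constant 0xd7c0 is 0xd800-(0x10000>>10), so
 * adding it to (c>>10) subtracts the 0x10000 bias and adds the lead
 * surrogate base in one step.
 */
```c
#include <stdint.h>

/* Same value as UTF_SURROGATE_OFFSET. */
#define SKETCH_SURROGATE_OFFSET ((0xd800 << 10) + 0xdc00 - 0x10000)

/* Lead (first) surrogate for a supplementary code point. */
static inline uint16_t sketch_lead(int32_t supplementary) {
    return (uint16_t)((supplementary >> 10) + 0xd7c0);
}

/* Trail (second) surrogate for a supplementary code point. */
static inline uint16_t sketch_trail(int32_t supplementary) {
    return (uint16_t)((supplementary & 0x3ff) | 0xdc00);
}

/* Recombine a surrogate pair into the supplementary code point. */
static inline int32_t sketch_pair_value(uint16_t first, uint16_t second) {
    return ((int32_t)first << 10) + second - SKETCH_SURROGATE_OFFSET;
}
```
/*
 * For example, U+1F600 splits into the pair 0xd83d 0xde00, and
 * sketch_pair_value(0xd83d, 0xde00) gives back 0x1f600.
 */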

/** @deprecated ICU 2.4. Renamed to U16_IS_SINGLE, see utf_old.h. */
#define UTF16_IS_SINGLE(uchar) (!UTF_IS_SURROGATE(uchar))

/** @deprecated ICU 2.4. Renamed to U16_IS_LEAD, see utf_old.h. */
#define UTF16_IS_LEAD(uchar) UTF_IS_FIRST_SURROGATE(uchar)

/** @deprecated ICU 2.4. Renamed to U16_IS_TRAIL, see utf_old.h. */
#define UTF16_IS_TRAIL(uchar) UTF_IS_SECOND_SURROGATE(uchar)

/** Does this scalar Unicode value need multiple code units for storage? @deprecated ICU 2.4. Use U16_LENGTH or test ((uint32_t)(c)>0xffff) instead, see utf_old.h. */
#define UTF16_NEED_MULTIPLE_UCHAR(c) ((uint32_t)(c)>0xffff)

/** @deprecated ICU 2.4. Renamed to U16_LENGTH, see utf_old.h. */
#define UTF16_CHAR_LENGTH(c) ((uint32_t)(c)<=0xffff ? 1 : 2)

/** @deprecated ICU 2.4. Renamed to U16_MAX_LENGTH, see utf_old.h. */
#define UTF16_MAX_CHAR_LENGTH 2

/** Average number of code units compared to UTF-16. @deprecated ICU 2.4. Obsolete, see utf_old.h. */
#define UTF16_ARRAY_SIZE(size) (size)

/**
 * Get a single code point from an offset that points to any
 * of the code units that belong to that code point.
 * Assume 0<=i<length.
 *
 * This could be used for iteration together with
 * UTF16_CHAR_LENGTH() and UTF_IS_ERROR(),
 * but the use of UTF16_NEXT_CHAR[_UNSAFE]() and
 * UTF16_PREV_CHAR[_UNSAFE]() is more efficient for that.
 * @deprecated ICU 2.4. Renamed to U16_GET_UNSAFE, see utf_old.h.
 */
#define UTF16_GET_CHAR_UNSAFE(s, i, c) { \
    (c)=(s)[i]; \
    if(UTF_IS_SURROGATE(c)) { \
        if(UTF_IS_SURROGATE_FIRST(c)) { \
            (c)=UTF16_GET_PAIR_VALUE((c), (s)[(i)+1]); \
        } else { \
            (c)=UTF16_GET_PAIR_VALUE((s)[(i)-1], (c)); \
        } \
    } \
}

/** @deprecated ICU 2.4. Use U16_GET instead, see utf_old.h. */
#define UTF16_GET_CHAR_SAFE(s, start, i, length, c, strict) { \
    (c)=(s)[i]; \
    if(UTF_IS_SURROGATE(c)) { \
        uint16_t __c2; \
        if(UTF_IS_SURROGATE_FIRST(c)) { \
            if((i)+1<(length) && UTF_IS_SECOND_SURROGATE(__c2=(s)[(i)+1])) { \
                (c)=UTF16_GET_PAIR_VALUE((c), __c2); \
                /* strict: ((c)&0xfffe)==0xfffe is caught by UTF_IS_ERROR() and UTF_IS_UNICODE_CHAR() */ \
            } else if(strict) {\
                /* unmatched first surrogate */ \
                (c)=UTF_ERROR_VALUE; \
            } \
        } else { \
            if((i)-1>=(start) && UTF_IS_FIRST_SURROGATE(__c2=(s)[(i)-1])) { \
                (c)=UTF16_GET_PAIR_VALUE(__c2, (c)); \
                /* strict: ((c)&0xfffe)==0xfffe is caught by UTF_IS_ERROR() and UTF_IS_UNICODE_CHAR() */ \
            } else if(strict) {\
                /* unmatched second surrogate */ \
                (c)=UTF_ERROR_VALUE; \
            } \
        } \
    } else if((strict) && !UTF_IS_UNICODE_CHAR(c)) { \
        (c)=UTF_ERROR_VALUE; \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_NEXT_UNSAFE, see utf_old.h. */
#define UTF16_NEXT_CHAR_UNSAFE(s, i, c) { \
    (c)=(s)[(i)++]; \
    if(UTF_IS_FIRST_SURROGATE(c)) { \
        (c)=UTF16_GET_PAIR_VALUE((c), (s)[(i)++]); \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_APPEND_UNSAFE, see utf_old.h. */
#define UTF16_APPEND_CHAR_UNSAFE(s, i, c) { \
    if((uint32_t)(c)<=0xffff) { \
        (s)[(i)++]=(uint16_t)(c); \
    } else { \
        (s)[(i)++]=(uint16_t)(((c)>>10)+0xd7c0); \
        (s)[(i)++]=(uint16_t)(((c)&0x3ff)|0xdc00); \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_FWD_1_UNSAFE, see utf_old.h. */
#define UTF16_FWD_1_UNSAFE(s, i) { \
    if(UTF_IS_FIRST_SURROGATE((s)[(i)++])) { \
        ++(i); \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_FWD_N_UNSAFE, see utf_old.h. */
#define UTF16_FWD_N_UNSAFE(s, i, n) { \
    int32_t __N=(n); \
    while(__N>0) { \
        UTF16_FWD_1_UNSAFE(s, i); \
        --__N; \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_SET_CP_START_UNSAFE, see utf_old.h. */
#define UTF16_SET_CHAR_START_UNSAFE(s, i) { \
    if(UTF_IS_SECOND_SURROGATE((s)[i])) { \
        --(i); \
    } \
}

/** @deprecated ICU 2.4. Use U16_NEXT instead, see utf_old.h. */
#define UTF16_NEXT_CHAR_SAFE(s, i, length, c, strict) { \
    (c)=(s)[(i)++]; \
    if(UTF_IS_FIRST_SURROGATE(c)) { \
        uint16_t __c2; \
        if((i)<(length) && UTF_IS_SECOND_SURROGATE(__c2=(s)[(i)])) { \
            ++(i); \
            (c)=UTF16_GET_PAIR_VALUE((c), __c2); \
            /* strict: ((c)&0xfffe)==0xfffe is caught by UTF_IS_ERROR() and UTF_IS_UNICODE_CHAR() */ \
        } else if(strict) {\
            /* unmatched first surrogate */ \
            (c)=UTF_ERROR_VALUE; \
        } \
    } else if((strict) && !UTF_IS_UNICODE_CHAR(c)) { \
        /* unmatched second surrogate or other non-character */ \
        (c)=UTF_ERROR_VALUE; \
    } \
}
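
/*
 * Illustration only, not part of the original header: a standalone sketch
 * of the control flow of UTF16_NEXT_CHAR_SAFE() as a function, to show how
 * the "strict" flag changes the result for unpaired surrogates. All
 * sketch_/SKETCH_ names are hypothetical, and the helper restates
 * UTF_IS_UNICODE_CHAR() so that the block compiles without ICU headers.
 */
```c
#include <stdint.h>

#define SKETCH_ERROR_VALUE 0xffff  /* same value as UTF_ERROR_VALUE */

/* Restatement of UTF_IS_UNICODE_CHAR(c). */
static inline int sketch_is_char(int32_t c) {
    uint32_t u = (uint32_t)c;
    int nonchar = u >= 0xfdd0 &&
                  (u <= 0xfdef || (u & 0xfffe) == 0xfffe) &&
                  u <= 0x10ffff;
    return u < 0xd800 || (u > 0xdfff && u <= 0x10ffff && !nonchar);
}

/* Read one code point at s[*i], advance *i, never read past length. */
static inline int32_t sketch_u16_next_safe(const uint16_t *s, int32_t *i,
                                           int32_t length, int strict) {
    int32_t c = s[(*i)++];
    if ((c & 0xfc00) == 0xd800) {                 /* lead surrogate */
        uint16_t c2;
        if (*i < length && ((c2 = s[*i]) & 0xfc00) == 0xdc00) {
            ++(*i);                               /* consume the trail */
            c = (c << 10) + c2 - ((0xd800 << 10) + 0xdc00 - 0x10000);
        } else if (strict) {
            c = SKETCH_ERROR_VALUE;               /* unmatched lead */
        }
    } else if (strict && !sketch_is_char(c)) {
        c = SKETCH_ERROR_VALUE;   /* unmatched trail or noncharacter */
    }
    return c;
}
```
/*
 * In non-strict mode an unpaired surrogate is returned as its own code
 * point value; in strict mode it is replaced by the error value 0xffff.
 */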

/** @deprecated ICU 2.4. Use U16_APPEND instead, see utf_old.h. */
#define UTF16_APPEND_CHAR_SAFE(s, i, length, c) { \
    if((uint32_t)(c)<=0xffff) { \
        (s)[(i)++]=(uint16_t)(c); \
    } else if((uint32_t)(c)<=0x10ffff) { \
        if((i)+1<(length)) { \
            (s)[(i)++]=(uint16_t)(((c)>>10)+0xd7c0); \
            (s)[(i)++]=(uint16_t)(((c)&0x3ff)|0xdc00); \
        } else /* not enough space */ { \
            (s)[(i)++]=UTF_ERROR_VALUE; \
        } \
    } else /* c>0x10ffff, write error value */ { \
        (s)[(i)++]=UTF_ERROR_VALUE; \
    } \
}

/** @deprecated ICU 2.4. Renamed to U16_FWD_1, see utf_old.h. */
#define UTF16_FWD_1_SAFE(s, i, length) U16_FWD_1(s, i, length)

/** @deprecated ICU 2.4. Renamed to U16_FWD_N, see utf_old.h. */
#define UTF16_FWD_N_SAFE(s, i, length, n) U16_FWD_N(s, i, length, n)

/** @deprecated ICU 2.4. Renamed to U16_SET_CP_START, see utf_old.h. */
#define UTF16_SET_CHAR_START_SAFE(s, start, i) U16_SET_CP_START(s, start, i)

    687 /** @deprecated ICU 2.4. Renamed to U16_PREV_UNSAFE, see utf_old.h. */
    688 #define UTF16_PREV_CHAR_UNSAFE(s, i, c) { \
    689     (c)=(s)[--(i)]; \
    690     if(UTF_IS_SECOND_SURROGATE(c)) { \
    691         (c)=UTF16_GET_PAIR_VALUE((s)[--(i)], (c)); \
    692     } \
    693 }
    694 
    695 /** @deprecated ICU 2.4. Renamed to U16_BACK_1_UNSAFE, see utf_old.h. */
    696 #define UTF16_BACK_1_UNSAFE(s, i) { \
    697     if(UTF_IS_SECOND_SURROGATE((s)[--(i)])) { \
    698         --(i); \
    699     } \
    700 }
    701 
    702 /** @deprecated ICU 2.4. Renamed to U16_BACK_N_UNSAFE, see utf_old.h. */
    703 #define UTF16_BACK_N_UNSAFE(s, i, n) { \
    704     int32_t __N=(n); \
    705     while(__N>0) { \
    706         UTF16_BACK_1_UNSAFE(s, i); \
    707         --__N; \
    708     } \
    709 }
    710 
    711 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_LIMIT_UNSAFE, see utf_old.h. */
    712 #define UTF16_SET_CHAR_LIMIT_UNSAFE(s, i) { \
    713     if(UTF_IS_FIRST_SURROGATE((s)[(i)-1])) { \
    714         ++(i); \
    715     } \
    716 }
    717 
    718 /** @deprecated ICU 2.4. Use U16_PREV instead, see utf_old.h. */
    719 #define UTF16_PREV_CHAR_SAFE(s, start, i, c, strict) { \
    720     (c)=(s)[--(i)]; \
    721     if(UTF_IS_SECOND_SURROGATE(c)) { \
    722         uint16_t __c2; \
    723         if((i)>(start) && UTF_IS_FIRST_SURROGATE(__c2=(s)[(i)-1])) { \
    724             --(i); \
    725             (c)=UTF16_GET_PAIR_VALUE(__c2, (c)); \
    726             /* strict: ((c)&0xfffe)==0xfffe is caught by UTF_IS_ERROR() and UTF_IS_UNICODE_CHAR() */ \
    727         } else if(strict) {\
    728             /* unmatched second surrogate */ \
    729             (c)=UTF_ERROR_VALUE; \
    730         } \
    731     } else if((strict) && !UTF_IS_UNICODE_CHAR(c)) { \
    732         /* unmatched first surrogate or other non-character */ \
    733         (c)=UTF_ERROR_VALUE; \
    734     } \
    735 }
    736 
    737 /** @deprecated ICU 2.4. Renamed to U16_BACK_1, see utf_old.h. */
    738 #define UTF16_BACK_1_SAFE(s, start, i) U16_BACK_1(s, start, i)
    739 
    740 /** @deprecated ICU 2.4. Renamed to U16_BACK_N, see utf_old.h. */
    741 #define UTF16_BACK_N_SAFE(s, start, i, n) U16_BACK_N(s, start, i, n)
    742 
    743 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_LIMIT, see utf_old.h. */
    744 #define UTF16_SET_CHAR_LIMIT_SAFE(s, start, i, length) U16_SET_CP_LIMIT(s, start, i, length)
    745 
    746 /* Formerly utf32.h --------------------------------------------------------- */
    747 
    748 /*
    749 * Old documentation:
    750 *
    751 *   This file defines macros to deal with UTF-32 code units and code points.
    752 *   Signatures and semantics are the same as for the similarly named macros
    753 *   in utf16.h.
    754 *   utf32.h is included by utf.h after unicode/umachine.h
    755 *   and some common definitions.
    756 *   <p><b>Usage:</b>  ICU coding guidelines for if() statements should be followed when using these macros.
    757 *                  Compound statements (curly braces {}) must be used for if-else-while...
    758 *                  bodies and all macro statements should be terminated with a semicolon.</p>
    759 */
    760 
    761 /* internal definitions ----------------------------------------------------- */
    762 
    763 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    764 #define UTF32_IS_SAFE(c, strict) \
    765     (!(strict) ? \
    766         (uint32_t)(c)<=0x10ffff : \
    767         UTF_IS_UNICODE_CHAR(c))
    768 
    769 /*
    770  * For the semantics of all of these macros, see utf16.h.
    771  * The UTF-32 versions are trivial because any code point is
    772  * encoded using exactly one code unit.
    773  */
    774 
    775 /* single-code point definitions -------------------------------------------- */
    776 
    777 /* classes of code unit values */
    778 
    779 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    780 #define UTF32_IS_SINGLE(uchar) 1
    781 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    782 #define UTF32_IS_LEAD(uchar) 0
    783 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    784 #define UTF32_IS_TRAIL(uchar) 0
    785 
    786 /* number of code units per code point */
    787 
    788 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    789 #define UTF32_NEED_MULTIPLE_UCHAR(c) 0
    790 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    791 #define UTF32_CHAR_LENGTH(c) 1
    792 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    793 #define UTF32_MAX_CHAR_LENGTH 1
    794 
    795 /* average number of code units compared to UTF-16 */
    796 
    797 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    798 #define UTF32_ARRAY_SIZE(size) (size)
    799 
    800 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    801 #define UTF32_GET_CHAR_UNSAFE(s, i, c) { \
    802     (c)=(s)[i]; \
    803 }
    804 
    805 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    806 #define UTF32_GET_CHAR_SAFE(s, start, i, length, c, strict) { \
    807     (c)=(s)[i]; \
    808     if(!UTF32_IS_SAFE(c, strict)) { \
    809         (c)=UTF_ERROR_VALUE; \
    810     } \
    811 }
    812 
    813 /* definitions with forward iteration --------------------------------------- */
    814 
    815 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    816 #define UTF32_NEXT_CHAR_UNSAFE(s, i, c) { \
    817     (c)=(s)[(i)++]; \
    818 }
    819 
    820 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    821 #define UTF32_APPEND_CHAR_UNSAFE(s, i, c) { \
    822     (s)[(i)++]=(c); \
    823 }
    824 
    825 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    826 #define UTF32_FWD_1_UNSAFE(s, i) { \
    827     ++(i); \
    828 }
    829 
    830 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    831 #define UTF32_FWD_N_UNSAFE(s, i, n) { \
    832     (i)+=(n); \
    833 }
    834 
    835 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    836 #define UTF32_SET_CHAR_START_UNSAFE(s, i) { \
    837 }
    838 
    839 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    840 #define UTF32_NEXT_CHAR_SAFE(s, i, length, c, strict) { \
    841     (c)=(s)[(i)++]; \
    842     if(!UTF32_IS_SAFE(c, strict)) { \
    843         (c)=UTF_ERROR_VALUE; \
    844     } \
    845 }
    846 
    847 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    848 #define UTF32_APPEND_CHAR_SAFE(s, i, length, c) { \
    849     if((uint32_t)(c)<=0x10ffff) { \
    850         (s)[(i)++]=(c); \
    851     } else /* c>0x10ffff, write 0xfffd */ { \
    852         (s)[(i)++]=0xfffd; \
    853     } \
    854 }
    855 
    856 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    857 #define UTF32_FWD_1_SAFE(s, i, length) { \
    858     ++(i); \
    859 }
    860 
    861 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    862 #define UTF32_FWD_N_SAFE(s, i, length, n) { \
    863     if(((i)+=(n))>(length)) { \
    864         (i)=(length); \
    865     } \
    866 }
    867 
    868 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    869 #define UTF32_SET_CHAR_START_SAFE(s, start, i) { \
    870 }
    871 
    872 /* definitions with backward iteration -------------------------------------- */
    873 
    874 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    875 #define UTF32_PREV_CHAR_UNSAFE(s, i, c) { \
    876     (c)=(s)[--(i)]; \
    877 }
    878 
    879 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    880 #define UTF32_BACK_1_UNSAFE(s, i) { \
    881     --(i); \
    882 }
    883 
    884 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    885 #define UTF32_BACK_N_UNSAFE(s, i, n) { \
    886     (i)-=(n); \
    887 }
    888 
    889 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    890 #define UTF32_SET_CHAR_LIMIT_UNSAFE(s, i) { \
    891 }
    892 
    893 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    894 #define UTF32_PREV_CHAR_SAFE(s, start, i, c, strict) { \
    895     (c)=(s)[--(i)]; \
    896     if(!UTF32_IS_SAFE(c, strict)) { \
    897         (c)=UTF_ERROR_VALUE; \
    898     } \
    899 }
    900 
    901 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    902 #define UTF32_BACK_1_SAFE(s, start, i) { \
    903     --(i); \
    904 }
    905 
    906 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    907 #define UTF32_BACK_N_SAFE(s, start, i, n) { \
    908     (i)-=(n); \
    909     if((i)<(start)) { \
    910         (i)=(start); \
    911     } \
    912 }
    913 
    914 /** @deprecated ICU 2.4. Obsolete, see utf_old.h. */
    915 #define UTF32_SET_CHAR_LIMIT_SAFE(s, i, length) { \
    916 }
    917 
    918 /* Formerly utf.h, part 2 --------------------------------------------------- */
    919 
    920 /**
    921  * Estimate the number of code units for a string based on the number of UTF-16 code units.
    922  *
    923  * @deprecated ICU 2.4. Obsolete, see utf_old.h.
    924  */
    925 #define UTF_ARRAY_SIZE(size) UTF16_ARRAY_SIZE(size)
    926 
    927 /** @deprecated ICU 2.4. Renamed to U16_GET_UNSAFE, see utf_old.h. */
    928 #define UTF_GET_CHAR_UNSAFE(s, i, c)                 UTF16_GET_CHAR_UNSAFE(s, i, c)
    929 
    930 /** @deprecated ICU 2.4. Use U16_GET instead, see utf_old.h. */
    931 #define UTF_GET_CHAR_SAFE(s, start, i, length, c, strict) UTF16_GET_CHAR_SAFE(s, start, i, length, c, strict)
    932 
    933 
    934 /** @deprecated ICU 2.4. Renamed to U16_NEXT_UNSAFE, see utf_old.h. */
    935 #define UTF_NEXT_CHAR_UNSAFE(s, i, c)                UTF16_NEXT_CHAR_UNSAFE(s, i, c)
    936 
    937 /** @deprecated ICU 2.4. Use U16_NEXT instead, see utf_old.h. */
    938 #define UTF_NEXT_CHAR_SAFE(s, i, length, c, strict)  UTF16_NEXT_CHAR_SAFE(s, i, length, c, strict)
    939 
    940 
    941 /** @deprecated ICU 2.4. Renamed to U16_APPEND_UNSAFE, see utf_old.h. */
    942 #define UTF_APPEND_CHAR_UNSAFE(s, i, c)              UTF16_APPEND_CHAR_UNSAFE(s, i, c)
    943 
    944 /** @deprecated ICU 2.4. Use U16_APPEND instead, see utf_old.h. */
    945 #define UTF_APPEND_CHAR_SAFE(s, i, length, c)        UTF16_APPEND_CHAR_SAFE(s, i, length, c)
    946 
    947 
    948 /** @deprecated ICU 2.4. Renamed to U16_FWD_1_UNSAFE, see utf_old.h. */
    949 #define UTF_FWD_1_UNSAFE(s, i)                       UTF16_FWD_1_UNSAFE(s, i)
    950 
    951 /** @deprecated ICU 2.4. Renamed to U16_FWD_1, see utf_old.h. */
    952 #define UTF_FWD_1_SAFE(s, i, length)                 UTF16_FWD_1_SAFE(s, i, length)
    953 
    954 
    955 /** @deprecated ICU 2.4. Renamed to U16_FWD_N_UNSAFE, see utf_old.h. */
    956 #define UTF_FWD_N_UNSAFE(s, i, n)                    UTF16_FWD_N_UNSAFE(s, i, n)
    957 
    958 /** @deprecated ICU 2.4. Renamed to U16_FWD_N, see utf_old.h. */
    959 #define UTF_FWD_N_SAFE(s, i, length, n)              UTF16_FWD_N_SAFE(s, i, length, n)
    960 
    961 
    962 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_START_UNSAFE, see utf_old.h. */
    963 #define UTF_SET_CHAR_START_UNSAFE(s, i)              UTF16_SET_CHAR_START_UNSAFE(s, i)
    964 
    965 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_START, see utf_old.h. */
    966 #define UTF_SET_CHAR_START_SAFE(s, start, i)         UTF16_SET_CHAR_START_SAFE(s, start, i)
    967 
    968 
    969 /** @deprecated ICU 2.4. Renamed to U16_PREV_UNSAFE, see utf_old.h. */
    970 #define UTF_PREV_CHAR_UNSAFE(s, i, c)                UTF16_PREV_CHAR_UNSAFE(s, i, c)
    971 
    972 /** @deprecated ICU 2.4. Use U16_PREV instead, see utf_old.h. */
    973 #define UTF_PREV_CHAR_SAFE(s, start, i, c, strict)   UTF16_PREV_CHAR_SAFE(s, start, i, c, strict)
    974 
    975 
    976 /** @deprecated ICU 2.4. Renamed to U16_BACK_1_UNSAFE, see utf_old.h. */
    977 #define UTF_BACK_1_UNSAFE(s, i)                      UTF16_BACK_1_UNSAFE(s, i)
    978 
    979 /** @deprecated ICU 2.4. Renamed to U16_BACK_1, see utf_old.h. */
    980 #define UTF_BACK_1_SAFE(s, start, i)                 UTF16_BACK_1_SAFE(s, start, i)
    981 
    982 
    983 /** @deprecated ICU 2.4. Renamed to U16_BACK_N_UNSAFE, see utf_old.h. */
    984 #define UTF_BACK_N_UNSAFE(s, i, n)                   UTF16_BACK_N_UNSAFE(s, i, n)
    985 
    986 /** @deprecated ICU 2.4. Renamed to U16_BACK_N, see utf_old.h. */
    987 #define UTF_BACK_N_SAFE(s, start, i, n)              UTF16_BACK_N_SAFE(s, start, i, n)
    988 
    989 
    990 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_LIMIT_UNSAFE, see utf_old.h. */
    991 #define UTF_SET_CHAR_LIMIT_UNSAFE(s, i)              UTF16_SET_CHAR_LIMIT_UNSAFE(s, i)
    992 
    993 /** @deprecated ICU 2.4. Renamed to U16_SET_CP_LIMIT, see utf_old.h. */
    994 #define UTF_SET_CHAR_LIMIT_SAFE(s, start, i, length) UTF16_SET_CHAR_LIMIT_SAFE(s, start, i, length)
    995 
    996 /* Define default macros (UTF-16 "safe") ------------------------------------ */
    997 
    998 /**
    999  * Does this code unit alone encode a code point (BMP, not a surrogate)?
   1000  * Same as UTF16_IS_SINGLE.
   1001  * @deprecated ICU 2.4. Renamed to U_IS_SINGLE and U16_IS_SINGLE, see utf_old.h.
   1002  */
   1003 #define UTF_IS_SINGLE(uchar) U16_IS_SINGLE(uchar)
   1004 
   1005 /**
   1006  * Is this code unit the first one of several (a lead surrogate)?
   1007  * Same as UTF16_IS_LEAD.
   1008  * @deprecated ICU 2.4. Renamed to U_IS_LEAD and U16_IS_LEAD, see utf_old.h.
   1009  */
   1010 #define UTF_IS_LEAD(uchar) U16_IS_LEAD(uchar)
   1011 
   1012 /**
   1013  * Is this code unit one of several but not the first one (a trail surrogate)?
   1014  * Same as UTF16_IS_TRAIL.
   1015  * @deprecated ICU 2.4. Renamed to U_IS_TRAIL and U16_IS_TRAIL, see utf_old.h.
   1016  */
   1017 #define UTF_IS_TRAIL(uchar) U16_IS_TRAIL(uchar)
   1018 
   1019 /**
   1020  * Does this code point require multiple code units (is it a supplementary code point)?
   1021  * Same as UTF16_NEED_MULTIPLE_UCHAR.
   1022  * @deprecated ICU 2.4. Use U16_LENGTH or test ((uint32_t)(c)>0xffff) instead.
   1023  */
   1024 #define UTF_NEED_MULTIPLE_UCHAR(c) UTF16_NEED_MULTIPLE_UCHAR(c)
   1025 
   1026 /**
   1027  * How many code units are used to encode this code point (1 or 2)?
   1028  * Same as UTF16_CHAR_LENGTH.
   1029  * @deprecated ICU 2.4. Renamed to U16_LENGTH, see utf_old.h.
   1030  */
   1031 #define UTF_CHAR_LENGTH(c) U16_LENGTH(c)
   1032 
   1033 /**
   1034  * How many code units are used at most for any Unicode code point (2)?
   1035  * Same as UTF16_MAX_CHAR_LENGTH.
   1036  * @deprecated ICU 2.4. Renamed to U16_MAX_LENGTH, see utf_old.h.
   1037  */
   1038 #define UTF_MAX_CHAR_LENGTH U16_MAX_LENGTH
   1039 
   1040 /**
   1041  * Set c to the code point that contains the code unit i.
   1042  * i could point to the lead or the trail surrogate for the code point.
   1043  * i is not modified.
   1044  * Same as UTF16_GET_CHAR.
   1045  * \pre 0<=i<length
   1046  *
   1047  * @deprecated ICU 2.4. Renamed to U16_GET, see utf_old.h.
   1048  */
   1049 #define UTF_GET_CHAR(s, start, i, length, c) U16_GET(s, start, i, length, c)
   1050 
   1051 /**
   1052  * Set c to the code point that starts at code unit i
   1053  * and advance i to beyond the code units of this code point (post-increment).
   1054  * i must point to the first code unit of a code point.
   1055  * Otherwise c is set to the trail unit (surrogate) itself.
   1056  * Same as UTF16_NEXT_CHAR.
   1057  * \pre 0<=i<length
   1058  * \post 0<i<=length
   1059  *
   1060  * @deprecated ICU 2.4. Renamed to U16_NEXT, see utf_old.h.
   1061  */
   1062 #define UTF_NEXT_CHAR(s, i, length, c) U16_NEXT(s, i, length, c)
   1063 
   1064 /**
   1065  * Append the code units of code point c to the string at index i
   1066  * and advance i to beyond the new code units (post-increment).
   1067  * The code units beginning at index i will be overwritten.
   1068  * Same as UTF16_APPEND_CHAR.
   1069  * \pre 0<=c<=0x10ffff
   1070  * \pre 0<=i<length
   1071  * \post 0<i<=length
   1072  *
   1073  * @deprecated ICU 2.4. Use U16_APPEND instead, see utf_old.h.
   1074  */
   1075 #define UTF_APPEND_CHAR(s, i, length, c) UTF16_APPEND_CHAR_SAFE(s, i, length, c)
   1076 
   1077 /**
   1078  * Advance i to beyond the code units of the code point that begins at i.
   1079  * I.e., advance i by one code point.
   1080  * Same as UTF16_FWD_1.
   1081  * \pre 0<=i<length
   1082  * \post 0<i<=length
   1083  *
   1084  * @deprecated ICU 2.4. Renamed to U16_FWD_1, see utf_old.h.
   1085  */
   1086 #define UTF_FWD_1(s, i, length) U16_FWD_1(s, i, length)
   1087 
   1088 /**
   1089  * Advance i to beyond the code units of the n code points where the first one begins at i.
   1090  * I.e., advance i by n code points.
   1091  * Same as UTF16_FWD_N.
   1092  * \pre 0<=i<length
   1093  * \post 0<i<=length
   1094  *
   1095  * @deprecated ICU 2.4. Renamed to U16_FWD_N, see utf_old.h.
   1096  */
   1097 #define UTF_FWD_N(s, i, length, n) U16_FWD_N(s, i, length, n)
   1098 
   1099 /**
   1100  * Take the random-access index i and adjust it so that it points to the beginning
   1101  * of a code point.
   1102  * The input index points to any code unit of a code point and is moved to point to
   1103  * the first code unit of the same code point. i is never incremented.
   1104  * In other words, if i points to a trail surrogate that is preceded by a matching
   1105  * lead surrogate, then i is decremented. Otherwise it is not modified.
   1106  * This can be used to start an iteration with UTF_NEXT_CHAR() from a random index.
   1107  * Same as UTF16_SET_CHAR_START.
   1108  * \pre start<=i<length
   1109  * \post start<=i<length
   1110  *
   1111  * @deprecated ICU 2.4. Renamed to U16_SET_CP_START, see utf_old.h.
   1112  */
   1113 #define UTF_SET_CHAR_START(s, start, i) U16_SET_CP_START(s, start, i)
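
/*
 * Illustrative sketch, not part of ICU: the index adjustment described
 * above for UTF_SET_CHAR_START, written as a function that returns the
 * adjusted index instead of modifying i in place. The sketch_ names are
 * hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

static int sketch_is_lead(uint16_t u)  { return (u & 0xfc00) == 0xd800; }
static int sketch_is_trail(uint16_t u) { return (u & 0xfc00) == 0xdc00; }

/*
 * Return i moved back to the first code unit of the code point that
 * contains s[i]. The index only moves when s[i] is a trail surrogate
 * directly preceded by a lead surrogate; it is never incremented.
 */
static int32_t sketch_set_cp_start(const uint16_t *s, int32_t start, int32_t i) {
    assert(start <= i);
    if (i > start && sketch_is_trail(s[i]) && sketch_is_lead(s[i - 1])) {
        --i;
    }
    return i;
}
```

/*
 * An unmatched trail surrogate at i==start is left where it is, because
 * there is no preceding code unit that could be its lead.
 */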
   1114 
   1115 /**
   1116  * Set c to the code point that has code units before i
   1117  * and move i backward (towards the beginning of the string)
   1118  * to the first code unit of this code point (pre-decrement).
   1119  * i must point to the first code unit after the last unit of a code point (i==length is allowed).
   1120  * Same as UTF16_PREV_CHAR.
   1121  * \pre start<i<=length
   1122  * \post start<=i<length
   1123  *
   1124  * @deprecated ICU 2.4. Renamed to U16_PREV, see utf_old.h.
   1125  */
   1126 #define UTF_PREV_CHAR(s, start, i, c) U16_PREV(s, start, i, c)
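
/*
 * Illustrative sketch, not part of ICU: backward reading as described
 * above for UTF_PREV_CHAR, written as a function. The sketch_ names are
 * hypothetical. As in the non-strict macros, an unmatched surrogate is
 * returned as is.
 */

```c
#include <assert.h>
#include <stdint.h>

static int sketch_is_lead(uint16_t u)  { return (u & 0xfc00) == 0xd800; }
static int sketch_is_trail(uint16_t u) { return (u & 0xfc00) == 0xdc00; }

/* Return the code point that ends just before s[*i] and move *i to its start. */
static int32_t sketch_prev(const uint16_t *s, int32_t start, int32_t *i) {
    int32_t c;
    assert(start < *i);
    c = s[--(*i)];
    if (sketch_is_trail((uint16_t)c) && *i > start && sketch_is_lead(s[*i - 1])) {
        uint16_t lead = s[--(*i)];
        c = 0x10000 + ((lead - 0xd800) << 10) + (c - 0xdc00);
    }
    return c;
}
```

/*
 * Starting with i equal to the string length and calling the function
 * until i reaches start visits the code points in reverse order.
 */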
   1127 
   1128 /**
   1129  * Move i backward (towards the beginning of the string)
   1130  * to the first code unit of the code point that has code units before i.
   1131  * I.e., move i backward by one code point.
   1132  * i must point to the first code unit after the last unit of a code point (i==length is allowed).
   1133  * Same as UTF16_BACK_1.
   1134  * \pre start<i<=length
   1135  * \post start<=i<length
   1136  *
   1137  * @deprecated ICU 2.4. Renamed to U16_BACK_1, see utf_old.h.
   1138  */
   1139 #define UTF_BACK_1(s, start, i) U16_BACK_1(s, start, i)
   1140 
   1141 /**
   1142  * Move i backward (towards the beginning of the string)
   1143  * to the first code unit of the n code points that have code units before i.
   1144  * I.e., move i backward by n code points.
   1145  * i must point to the first code unit after the last unit of a code point (i==length is allowed).
   1146  * Same as UTF16_BACK_N.
   1147  * \pre start<i<=length
   1148  * \post start<=i<length
   1149  *
   1150  * @deprecated ICU 2.4. Renamed to U16_BACK_N, see utf_old.h.
   1151  */
   1152 #define UTF_BACK_N(s, start, i, n) U16_BACK_N(s, start, i, n)
   1153 
   1154 /**
   1155  * Take the random-access index i and adjust it so that it points beyond
   1156  * a code point. The input index points beyond any code unit
   1157  * of a code point and is moved to point beyond the last code unit of the same
   1158  * code point. i is never decremented.
   1159  * In other words, if i points to a trail surrogate that is preceded by a matching
   1160  * lead surrogate, then i is incremented. Otherwise it is not modified.
   1161  * This can be used to start an iteration with UTF_PREV_CHAR() from a random index.
   1162  * Same as UTF16_SET_CHAR_LIMIT.
   1163  * \pre start<i<=length
   1164  * \post start<i<=length
   1165  *
   1166  * @deprecated ICU 2.4. Renamed to U16_SET_CP_LIMIT, see utf_old.h.
   1167  */
   1168 #define UTF_SET_CHAR_LIMIT(s, start, i, length) U16_SET_CP_LIMIT(s, start, i, length)
   1169 
   1170 #endif /* U_HIDE_DEPRECATED_API */
   1171 
   1172 #endif
   1173 
   1174