Copyright (c) 2003-2005, International Business Machines Corporation and others. All Rights Reserved.
uciter8: Lenient reading of 8-bit Unicode with a UCharIterator

This sample demonstrates reading
8-bit Unicode text leniently, accepting a mix of UTF-8 and CESU-8
and also accepting single surrogates.
UTF-8-style macros are defined as well as a UCharIterator.
The macros are incomplete (they do not assemble code points from pairs of surrogates)
but sufficient for the iterator.

If you wish to use the lenient-UTF/CESU-8 UCharIterator in a context outside of
this sample, then copy the uit_len8.c file,
as well as either the uit_len8.h header or just the prototype that it contains.

*** Warning: ***
This UCharIterator reads an arbitrary mix of UTF-8 and CESU-8 text.
It does not conform to any one Unicode charset specification,
and its use may lead to security risks.


Files:
    uciter8.c      Main source file in C
    uit_len8.c     Lenient-UTF/CESU-8 UCharIterator implementation
    uit_len8.h     Header file with the prototype for the lenient-UTF/CESU-8 UCharIterator
    uciter8.sln    Windows MSVC workspace.  Double-click this to get started.
    uciter8.vcproj Windows MSVC project file

To Build uciter8 on Windows
    1.  Install and build ICU.
    2.  In MSVC, open the workspace file icu\source\samples\uciter8\uciter8.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window.
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the uciter8 directory, e.g.
            cd c:\icu\source\samples\uciter8\debug
    4.  Run it:
            uciter8

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.
        The steps to build ICU will look something
        like this:
           cd <icu directory>/source
           runConfigureICU <platform-name> --prefix <icu install directory> [other options]
           gmake all

    2.  Install ICU:
           gmake install

    3.  Compile:
           cd <icu directory>/source/samples/uciter8
           gmake ICU_PREFIX=<icu install directory>

To Run on Unixes
    cd <icu directory>/source/samples/uciter8

    gmake ICU_PREFIX=<icu install directory> check
        -or-

    export LD_LIBRARY_PATH=<icu install directory>/lib:.:$LD_LIBRARY_PATH
    uciter8


 Note:  The name of the LD_LIBRARY_PATH variable is different on some systems.
        If in doubt, run the sample using "gmake check", and note the name of
        the variable that is used there.  LD_LIBRARY_PATH is the correct name
        for Linux and Solaris.
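
Illustration of lenient reading
    The description at the top of this file says the sample accepts a mix of
    UTF-8 and CESU-8 and also accepts single surrogates. The following is a
    minimal, self-contained sketch of what such decoding could look like; it
    is not the code in uit_len8.c and does not use ICU. The function name
    lenient8_to_utf16 is hypothetical, and the handling of ill-formed bytes
    (emit U+FFFD and resume at the next byte) is an assumption made for this
    sketch only. It converts bytes to UTF-16 code units, which is the unit a
    UCharIterator works with.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ICU or uit_len8.c).
 * Decodes 8-bit Unicode text leniently into UTF-16 code units:
 *   - 4-byte UTF-8 sequences become a surrogate pair,
 *   - 3-byte sequences that encode surrogates (CESU-8 pairs or single
 *     surrogates) are passed through unchanged,
 *   - ill-formed input yields U+FFFD (an assumption of this sketch).
 * Returns the number of code units written; stops early if out is full.
 */
size_t lenient8_to_utf16(const unsigned char *s, size_t len,
                         uint16_t *out, size_t cap)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned char b = s[i];
        uint32_t c, min;
        size_t trail, k;

        /* Classify the lead byte: payload bits, trail count, minimum value. */
        if (b < 0x80)                { c = b;        trail = 0; min = 0; }
        else if ((b & 0xE0) == 0xC0) { c = b & 0x1F; trail = 1; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { c = b & 0x0F; trail = 2; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { c = b & 0x07; trail = 3; min = 0x10000; }
        else                         { c = 0xFFFD;   trail = 0; min = 0; }

        /* Accumulate six bits from each trail byte (10xxxxxx). */
        for (k = 1; k <= trail; ++k) {
            if (i + k >= len || (s[i + k] & 0xC0) != 0x80) {
                break;
            }
            c = (c << 6) | (uint32_t)(s[i + k] & 0x3F);
        }

        /* Truncated, overlong, or out-of-range: replace and resync. */
        if (k <= trail || c < min || c > 0x10FFFF) {
            c = 0xFFFD;
            trail = 0;
        }
        i += 1 + trail;

        if (c >= 0x10000) {
            /* Supplementary code point from UTF-8: split into a pair. */
            if (n + 2 > cap) {
                break;
            }
            c -= 0x10000;
            out[n++] = (uint16_t)(0xD800 + (c >> 10));
            out[n++] = (uint16_t)(0xDC00 + (c & 0x3FF));
        } else {
            /* BMP value, including surrogates: this is the lenient part. */
            if (n + 1 > cap) {
                break;
            }
            out[n++] = (uint16_t)c;
        }
    }
    return n;
}
```

    With this sketch, the UTF-8 bytes F0 90 80 80 and the CESU-8 bytes
    ED A0 80 ED B0 80 both produce the code units D800 DC00, and a lone
    ED A0 80 produces the single surrogate D800. A strict UTF-8 decoder
    would reject the last two inputs, which is why the warning above applies.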