Name: icu
URL: http://site.icu-project.org/
Version: 52.1
License: MIT
Security Critical: yes

Description:
This directory contains the source code of ICU 52.1 for C/C++.

1. It was obtained with the following:

  $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-52-1 icu52

  The following directories, which we don't use, are removed:

  - as_is
  - packaging
  - source/layout
  - source/layoutex
  - source/data/xml

  patches/configure.patch is applied to get runConfigureICU to work in the
  icudata generation step without the layout and layoutex directories by
  removing the corresponding Makefiles from the ac_config variable.

2. Apply the following patches for platform-related headers (putilimpl.h and
   others).

  - patches/putil.patch for Android, QNX and newlib (NaCl-newlib).
    Upstream bug for Android : http://bugs.icu-project.org/trac/ticket/10478
    Upstream bug for QNX : http://bugs.icu-project.org/trac/ticket/10811
    Upstream bug for newlib : http://bugs.icu-project.org/trac/ticket/10873

  - patches/platform_nacl.patch to add U_PF_NATIVE_CLIENT
    Upstream bug : http://bugs.icu-project.org/trac/ticket/11033

3. Breakiterator patches

  - Apply patches/brkitr.patch
    * word.txt
      a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
         FQDN labels can be split at '.'
      b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
         See http://unicode.org/cldr/trac/ticket/6555
    * line.txt
      a. Use Japanese rules for all locales because Japanese tailoring only
         affects Japanese-specific characters.
         See http://unicode.org/cldr/trac/ticket/3974
      b. Minor changes in CL, OP and IS definitions to handle 'comma-variants'
         more consistently.
         See http://unicode.org/cldr/trac/ticket/6557
      c. Fix line breaking for Chinese characters and quotation marks.
         See http://unicode.org/cldr/trac/ticket/4200 and
         http://crbug.com/39779

  - Add a new file brklocal.mk (copied from brkfiles.mk) with line_ja.txt
    and word_POSIX.txt dropped from the build list.

  - Apply patches/khmer-dictbe.patch and put in a smaller Khmer dictionary
    (source/data/brkitr/khmerdict.txt) obtained from
    http://bugs.icu-project.org/trac/ticket/9451

  - Add several common Chinese words that were dropped previously to
    source/data/cjdict/brkitr/cjdict.txt
    Patch: patches/cjdict.patch
    Upstream bug : http://bugs.icu-project.org/trac/ticket/10888

  - android/brkitr.patch (to be applied for the Android build only) :
    Reverts some of the changes to the Chinese/Japanese segmentation rules in
    patches/brkitr.patch to reduce binary size for Android.

4. Converter changes

  - converters.patch :
    a. Revise existing mapping tables.
    b. Remove a lot of unused aliases in the converter alias table
       (source/data/mappings/convrtrs.txt), leading to a 40kB size reduction.

  - Add source/data/mappings/ucmlocal.txt to list only the converters we need.
  - Add three new tables per the WHATWG encoding standards for EUC-JP,
    Shift_JIS and CP866.
    They're generated with scripts/{eucjp, sjis, ibm866}_gen.sh.
  - Add three 'fake' tables for ISO-2022-CN(-Ext) : noop-*.ucm.

  - uconv.patch
    a. ucnv2022 uses 3 fake tables for ISO-2022-CN(-Ext) instead of two
       huge tables.
    b. ISO-2022-JP-[1-4] is dropped.
    c. SCSU, BOCU, ISCII and UTF-7 conversions are disabled, leading to
       a 47kB reduction in code size.

5. Locale changes

  - patches/locale1.patch :
    a. Exemplar character set changes for zh*, ja + 9 Indian locales
    b. Minor fixes for Korean, a few Indic (AmPmMarkers) and
       others (datetime format)

  - Locale build configuration files: To include the full locale data
    for Chrome's UI languages and the minimum locale data for other locales,
    add reslocal.mk or {trns,sprep,rbnf,coll}local.mk files to
    source/data/{coll,curr,lang,locales,region,translit,zone,rbnf,sprep}.

    This, along with #8 (data.build.patch), #3 (brkitr) and #4 (converter),
    cuts down the data size by ~ 11MB.

  - Run scripts/trim_data.sh : About 2.1MB data size reduction.
    a. Trim the locale data for Chrome's UI languages :
       locales, lang, region, currency
    b. Trim the locale data for non-UI languages to the bare minimum :
       ExemplarCharacters, LocaleScript, layout, and the name of the
       language for a locale in its native language.
    c. Remove the legacy Chinese character set-based collations
       (big5han/gb2312han), which make no sense and which nobody uses.

  - android/patch_locale.sh (to be run for the Android build only) :
    a. Make changes to source/data/{curr,region,lang} to exclude these data
       except the language and script names of zh_Hans and zh_Hant.
    b. Remove exemplar cities in timezone data (data/zone).
    c. Keep only the minimal calendar data in data/locales.

  - Add tg.txt to source/data/locales and source/data/lang to add the minimal
    locale data necessary for the spellchecker. In both directories, add
    tg.txt to reslocal.mk.

6. Timezone data update

  - Grab the latest version of the following timezone data files and
    put them in source/data/misc.

      metaZones.txt
      timezoneTypes.txt
      windowsZones.txt
      zoneinfo64.txt

    As of August 2014, the latest version is 2014f and the above files
    are available at
    http://source.icu-project.org/repos/icu/data/trunk/tzdata/icunew/2014f/44/

7. Transliterator customization

  - Add css3transform.txt to source/data/translit.
  - Put the following line in trnslocal.mk

      TRANSLIT_SOURCE=css3transform.txt

8. Build-related changes

  - patches/wpo.patch
    Upstream bugs : http://bugs.icu-project.org/trac/ticket/8043
                    http://bugs.icu-project.org/trac/ticket/5701
  - patches/vscomp.patch for building with Visual Studio on Windows.
    a. Do not use WINDOWS_LOCALE_API in locmap.c
    b. Do not redefine stringpiece::npos
    c. Fix a Windows build failure with U_USING_ICU_NAMESPACE=0
       (upstream bug : http://bugs.icu-project.org/trac/ticket/10486 ,
       fixed in ICU 53)
    d. Explicitly use the Windows 'A' API when the argument is an LPSTR in
       wintz.c
       Upstream bug : http://bugs.icu-project.org/trac/ticket/10870

  - patches/data.build.patch :
    Remove unnecessary resources : invuca, unames, collator source, stringprep
  - patches/data.build.win.patch :
    Windows-only data build patch.

  - patches/clang_win.patch :
    Take care of 3 warnings from clang and MSVC 2013.
    Upstream bug : http://bugs.icu-project.org/trac/ticket/11102

9. Pre-built data files are checked in with the following steps on Linux:

  a. Make an ICU data build directory outside the Chromium source tree
     and cd to that directory.
  b. Run

       ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout

  c. Run 'make'
  d. 'make' will fail in the 1st pass. Copy
     ${CHROME_ICU_TREE_TOP}/source/data/in/coll/invuca.icu
     to {BUILD_DIR_ROOT}/data/out/build/icudt52l/coll and re-run 'make'
     in {BUILD_DIR_ROOT}/data.

  e. 'make' will fail again when pkgdata looks for css3transform.res. Edit
     data/out/tmp/icudata.lst to replace 'css3transform.res' with 'root.res'
     (see http://bugs.icu-project.org/trac/ticket/10570 ) and run 'make' again.

  - source/data/in/icudtl.dat : Built on Linux with all the patches
    above applied. icudt52l.dat is generated in
    {BUILD_DIR_ROOT}/data/out/tmp and copied to the above location with the
    version number (52) dropped.

  - {mac,linux}/icudtl_dat.S : Built on Linux with all the
    patches above (except android/brkitr.patch) applied and checked in.
    This file will be generated in {BUILD_DIR_ROOT}/data/out/tmp as
    icudt52l_dat.S, but '52' is dropped while copying.

    mac/icudtl_dat.S is identical to linux/icudtl_dat.S except for
    the header portion. With "linux/icudtl_dat.S" in its place,
    run scripts/make_mac_assembly.sh to generate it.

  - android/icudtl_dat.S : Built on Linux with all the patches above and
    android/brkitr.patch applied and android/patch_locale.sh executed.
    '52' is dropped from the name generated in the build tree.

  - android/icudtl.dat : Generated as icudt52l.dat in
    {BUILD_DIR_ROOT}/data/out/tmp along with icudt52l_dat.S and
    copied to the above location with '52' dropped from its name.

  - windows/icudt.dll (by default, we set icu_use_icu_data_flag to 1
    and don't use this file.)

    a. Check out a clean copy of icu52 from upstream on Windows,
       outside the Chrome tree.

       $ svn export --native-eol LF http://source.icu-project.org/repos/icu/icu/tags/release-52-1 ${SEPARATE_ICU_ROOT}/icu52

    b. Copy ${CHROME_ICU_ROOT}/source/data/in/icudtl.dat to
       ${SEPARATE_ICU_ROOT}/source/data/in/icudt52l.dat
    c. Copy ${CHROME_ICU_ROOT}/source/data/makedata.mak to
       ${SEPARATE_ICU_ROOT}/source/data/makedata.mak
    d. In Visual Studio, open the source/allinone/allinone.sln solution
       in ${SEPARATE_ICU_ROOT}
    e. Build the 'makedata' target
    f. icudt52.dll will be generated in ${SEPARATE_ICU_ROOT}/bin
    g. Copy that icudt52.dll to ${CHROME_ICU_ROOT}/windows/icudt.dll
       and check that in.

10. Change export of U_ICUDATA_ENTRY_POINT from U_IMPORT to U_EXPORT.
  - patches/declspec.patch

11. Cherry-pick an upstream patch to fix a bug in bidi.
  - patches/bidi.patch
  - Upstream bug : http://bugs.icu-project.org/trac/ticket/11054
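
Note on step 9: the two mechanical edits described there (the icudata.lst
workaround in 9e, and dropping '52' from the generated file names before
checking them in) could be scripted roughly as below. This is only a sketch
and is not one of the checked-in scripts: the function names
drop_icu_version and fix_icudata_lst are hypothetical, and every path is
supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helpers for step 9; not part of the ICU or Chromium trees.

# Print a generated file name with the ICU major version removed,
# e.g. icudt52l.dat -> icudtl.dat, icudt52l_dat.S -> icudtl_dat.S.
#   $1: generated file name
#   $2: ICU major version (52 for this import)
drop_icu_version() {
  printf '%s\n' "$1" | sed "s/^icudt$2/icudt/"
}

# Apply the step 9e workaround: replace 'css3transform.res' with
# 'root.res' in the package list that pkgdata reads.
#   $1: path to icudata.lst (data/out/tmp/icudata.lst in the build dir)
fix_icudata_lst() {
  lst=$1
  tmp="$lst.tmp"
  # Write to a temporary file first so a failed sed leaves the list intact.
  sed 's/css3transform\.res/root.res/' "$lst" > "$tmp" && mv "$tmp" "$lst"
}
```

One possible use, after 'make' fails for the second time: run
fix_icudata_lst data/out/tmp/icudata.lst, re-run 'make', then copy
data/out/tmp/icudt52l.dat to source/data/in/ under the name printed by
drop_icu_version icudt52l.dat 52.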