      1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
      2 "http://www.w3.org/TR/html4/loose.dtd">
      3 <html>
      4 
      5 <head>
      6 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      7 <meta http-equiv="Content-Language" content="en-us">
      8 <link rel="stylesheet" href="http://www.unicode.org/reports/reports.css"
      9 	type="text/css">
     10 <title>UTS #35: Unicode LDML: General</title>
     11 <style type="text/css">
     12 <!--
     13 .dtd {
     14 	font-family: monospace;
     15 	font-size: 90%;
     16 	background-color: #CCCCFF;
     17 	border-style: dotted;
     18 	border-width: 1px;
     19 }
     20 
     21 .xmlExample {
     22 	font-family: monospace;
     23 	font-size: 80%
     24 }
     25 
     26 .blockedInherited {
     27 	font-style: italic;
     28 	font-weight: bold;
     29 	border-style: dashed;
     30 	border-width: 1px;
     31 	background-color: #FF0000
     32 }
     33 
     34 .inherited {
     35 	font-weight: bold;
     36 	border-style: dashed;
     37 	border-width: 1px;
     38 	background-color: #00FF00
     39 }
     40 
     41 .element {
     42 	font-weight: bold;
     43 	color: red;
     44 }
     45 
     46 .attribute {
     47 	font-weight: bold;
     48 	color: maroon;
     49 }
     50 
     51 .attributeValue {
     52 	font-weight: bold;
     53 	color: blue;
     54 }
     55 
     56 li, p {
     57 	margin-top: 0.5em;
     58 	margin-bottom: 0.5em
     59 }
     60 
     61 h2, h3, h4, table {
     62 	margin-top: 1.5em;
     63 	margin-bottom: 0.5em;
     64 }
     65 -->
     66 </style>
     67 </head>
     68 
     69 <body>
     70 
     71 	<table class="header" width="100%">
     72 		<tr>
     73 			<td class="icon"><a href="http://unicode.org"> <img
     74 					alt="[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif"
     75 					width="34" height="33"
     76 					style="vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>&nbsp;
     77 				<a class="bar" href="http://www.unicode.org/reports/">Technical
     78 					Reports</a></td>
     79 		</tr>
     80 		<tr>
     81 			<td class="gray">&nbsp;</td>
     82 		</tr>
     83 	</table>
     84 	<div class="body">
     85 		<h2 style="text-align: center">
     86 			Unicode Technical
     87 			Standard #35
     88 		</h2>
     89 		<h1 style="text-align: center">
     90 			Unicode Locale Data Markup Language (LDML)<br> Part 2: General
     91 		</h1>
     92 
     93 		<!-- At least the first row of this header table should be identical across the parts of this UTS. -->
     94 		<table border="1" cellpadding="2" cellspacing="0" class="wide">
     95 			<tr>
     96 				<td>Version</td>
     97 				<td>34</td>
     98 			</tr>
     99 			<tr>
    100 				<td>Editors</td>
				<td>Yoshito Umaoka (<a href="mailto:yoshito_umaoka@us.ibm.com">yoshito_umaoka@us.ibm.com</a>)
    102 					and <a href="tr35.html#Acknowledgments">other CLDR committee
    103 						members</a></td>
    104 			</tr>
    105 		</table>
    106 
    107 		<p>
    108 			For the full header, summary, and status, see <a href="tr35.html">
    109 				Part 1: Core</a>
    110 		</p>
    111 
    112 		<h3>
    113 			<i>Summary</i>
    114 		</h3>
    115 		<p>
    116 			This document describes parts of an XML format (<i>vocabulary</i>)
    117 			for the exchange of structured locale data. This format is used in
    118 			the <a href="http://cldr.unicode.org/">Unicode Common Locale Data
    119 				Repository</a>.
    120 		</p>
    121 
    122 		<p>
    123 			This is a partial document, describing general parts of the LDML:
    124 			display names &amp; transforms, etc. For the other parts of the LDML
    125 			see the <a href="tr35.html">main LDML document</a> and the links
    126 			above.
    127 		</p>
    128 
    129 		<h3>
    130 			<i>Status</i>
    131 		</h3>
    132 
    133 		<!-- NOT YET APPROVED 
    134 		<p>
    135 				<i class="changed">This is a<b><font color="#ff3333">
    136 				draft </font></b>document which may be updated, replaced, or superseded by
    137 				other documents at any time. Publication does not imply endorsement
    138 				by the Unicode Consortium. This is not a stable document; it is
    139 				inappropriate to cite this document as other than a work in
    140 				progress.
    141 			</i>
    142 		</p>
    143 		 END NOT YET APPROVED -->
    144 		<!-- APPROVED -->
    145 		<p>
    146 			<i>This document has been reviewed by Unicode members and other
    147 				interested parties, and has been approved for publication by the
    148 				Unicode Consortium. This is a stable document and may be used as
    149 				reference material or cited as a normative reference by other
    150 				specifications.</i>
    151 		</p>
    152 		<!-- END APPROVED -->
    153 
    154 		<blockquote>
    155 			<p>
    156 				<i><b>A Unicode Technical Standard (UTS)</b> is an independent
    157 					specification. Conformance to the Unicode Standard does not imply
    158 					conformance to any UTS.</i>
    159 			</p>
    160 		</blockquote>
    161 		<p>
    162 			<i>Please submit corrigenda and other comments with the CLDR bug
    163 				reporting form [<a href="tr35.html#Bugs">Bugs</a>]. Related
    164 				information that is useful in understanding this document is found
    165 				in the <a href="tr35.html#References">References</a>. For the latest
    166 				version of the Unicode Standard see [<a href="tr35.html#Unicode">Unicode</a>].
    167 				For a list of current Unicode Technical Reports see [<a
    168 				href="tr35.html#Reports">Reports</a>]. For more information about
    169 				versions of the Unicode Standard, see [<a href="tr35.html#Versions">Versions</a>].
    170 			</i>
    171 		</p>
    172 
    173 		<!-- This section of Parts should be identical in all of the parts of this UTS. -->
    174 		<h2>
    175 			<a name="Parts" href="#Parts">Parts</a>
    176 		</h2>
    177 		<p>The LDML specification is divided into the following parts:</p>
    178 		<ul class="toc">
    179 			<li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
    180 				locales, basic structure)
    181 			</li>
    182 			<li>Part 2: <a href="tr35-general.html#Contents">General</a>
    183 				(display names &amp; transforms, etc.)
    184 			</li>
    185 			<li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
    186 				(number &amp; currency formatting)
    187 			</li>
    188 			<li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
    189 				time, time zone formatting)
    190 			</li>
    191 			<li>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
    192 				(sorting, searching, grouping)
    193 			</li>
    194 			<li>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
    195 				(supplemental data)
    196 			</li>
    197 			<li>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
    198 				(keyboard mappings)
    199 			</li>
    200 		</ul>
    201 		<h2>
    202 			<a name="Contents" href="#Contents">Contents of Part 2, General</a>
    203 		</h2>
    204 		<!-- START Generated TOC: CheckHtmlFiles -->
    205 		<ul class="toc">
    206 			<li>1 <a href="#Display_Name_Elements">Display Name Elements</a></li>
    207 			<li>2 <a href="#Layout_Elements">Layout Elements</a></li>
    208 			<li>3 <a href="#Character_Elements">Character Elements</a>
    209 				<ul class="toc">
    210 					<li>3.1 <a href="#Exemplars">Exemplars</a>
    211 						<ul class="toc">
    212 							<li>3.1.1 <a href="#ExemplarSyntax">Exemplar Syntax</a></li>
    213 							<li>3.1.2 <a href="#Restrictions">Restrictions</a></li>
    214 						</ul>
    215 					</li>
    216 					<li>3.2 <a href="#Character_Mapping">Mapping</a></li>
    217 					<li>3.3 <a href="#IndexLabels">Index Labels</a></li>
    218 					<li>3.4 <a href="#Ellipsis">Ellipsis</a></li>
    219 					<li>3.5 <a href="#Character_More_Info">More Information</a></li>
    220 					<li>3.6 <a href="#Character_Parse_Lenient">Parse Lenient</a></li>
    221 				</ul>
    222 			</li>
    223 			<li>4 <a href="#Delimiter_Elements">Delimiter Elements</a></li>
    224 			<li>5 <a href="#Measurement_System_Data">Measurement System
    225 					Data</a>
    226 				<ul class="toc">
    227 					<li>5.1 <a href="#Measurement_Elements">Measurement
    228 							Elements (deprecated)</a></li>
    229 				</ul>
    230 			</li>
    231 			<li>6 <a href="#Unit_Elements">Unit Elements</a>
    232 				<ul class="toc">
    233 					<li>6.1 <a href="#perUnitPatterns">per Unit patterns</a></li>
    234 					<li>6.2 <a href="#Unit_Sequences">Unit Sequences</a></li>
    235 					<li>6.3 <a href="#durationUnit">durationUnit</a></li>
    236 					<li>6.4 <a href="#coordinateUnit">coordinateUnit</a></li>
    237 					<li>6.5 <a href="#Territory_Based_Unit_Preferences">Territory-Based
    238 							Unit Preferences</a></li>
    239 				</ul>
    240 			</li>
    241 			<li>7 <a href="#POSIX_Elements">POSIX Elements</a></li>
    242 			<li>8 <a href="#Reference_Elements">Reference Element</a></li>
    243 			<li>9 <a href="#Segmentations">Segmentations</a>
    244 				<ul class="toc">
    245 					<li>9.1 <a href="#Segmentation_Inheritance">Segmentation
    246 							Inheritance</a></li>
    247 					<li>9.2 <a href="#Segmentation_Exceptions">Segmentation
    248 							Suppressions</a></li>
    249 				</ul>
    250 			</li>
    251 			<li>10 <a href="#Transforms">Transforms</a>
    252 				<ul class="toc">
    253 					<li>10.1 <a href="#Inheritance">Inheritance</a>
    254 						<ul class="toc">
    255 							<li>10.1.1 <a href="#Pivots">Pivots</a></li>
    256 						</ul>
    257 					</li>
    258 					<li>10.2 <a href="#Variants">Variants</a></li>
    259 					<li>10.3 <a href="#Transform_Rules_Syntax">Transform Rules
    260 							Syntax</a>
    261 						<ul class="toc">
    262 							<li>10.3.1 <a href="#Dual_Rules">Dual Rules</a></li>
    263 							<li>10.3.2 <a href="#Context">Context</a></li>
    264 							<li>10.3.3 <a href="#Revisiting">Revisiting</a></li>
    265 							<li>10.3.4 <a href="#Example">Example</a></li>
    266 							<li>10.3.5 <a href="#Rule_Syntax">Rule Syntax</a></li>
    267 							<li>10.3.6 <a href="#Transform_Rules">Transform Rules</a></li>
    268 							<li>10.3.7 <a href="#Variable_Definition_Rules">Variable
    269 									Definition Rules</a></li>
    270 							<li>10.3.8 <a href="#Filter_Rules">Filter Rules</a></li>
    271 							<li>10.3.9 <a href="#Conversion_Rules">Conversion Rules</a></li>
    272 							<li>10.3.10 <a
    273 								href="#Intermixing_Transform_Rules_and_Conversion_Rules">Intermixing
    274 									Transform Rules and Conversion Rules</a></li>
    275 							<li>10.3.11 <a href="#Inverse_Summary">Inverse Summary</a></li>
    276 						</ul>
    277 					</li>
    278 				</ul>
    279 			</li>
    280 			<li>11 <a href="#ListPatterns">List Patterns</a>
    281 				<ul class="toc">
    282 					<li>11.1 <a href="#List_Gender">Gender of Lists</a></li>
    283 				</ul>
    284 			</li>
    285 			<li>12 <a href="#Context_Transform_Elements">ContextTransform
    286 					Elements</a>
    287 				<ul class="toc">
    288 					<li>Table: <a
    289 						href="#contextTransformUsage_type_attribute_values">Element
    290 							contextTransformUsage type attribute values</a></li>
    291 				</ul>
    292 			</li>
    293 			<li>13 <a href="#Choice_Patterns">Choice Patterns</a></li>
    294 			<li>14 <a href="#Annotations">Annotations and Labels</a>
    295 			  <ul class="toc">
    296 			    <li>14.1 <a href="#SynthesizingNames">Synthesizing Sequence Names</a></li>
    297 			    <li>14.2 <a href="#Character_Labels">Annotations Character Labels</a></li>
    298 			    <li>14.3 <a href="#Typographic_Names">Typographic Names</a></li>
    299 		      </ul>
    300 			</li>
    301 		</ul>
    302 		<!-- END Generated TOC: CheckHtmlFiles -->
    303 		<h2>
    304 			1 <a name="Display_Name_Elements" href="#Display_Name_Elements">Display
    305 				Name Elements</a>
    306 		</h2>
    307 		<p class="dtd">&lt;!ELEMENT localeDisplayNames ( alias | (
    308 			localeDisplayPattern?, languages?, scripts?, territories?,
    309 			subdivisions?, variants?, keys?, types?, transformNames?,
    310 			measurementSystemNames?, codePatterns?, special* ) )&gt;</p>
    311 		<p>
    312 			Display names for scripts, languages, countries, currencies, and
    313 			variants in this locale are supplied by this element. They supply
    314 			localized names for these items for use in user-interfaces for
    315 			various purposes such as displaying menu lists, displaying a language
    316 			name in a dialog, and so on. Capitalization should follow the
    317 			conventions used in the middle of running text; the
    318 			&lt;contextTransforms&gt; element may be used to specify the
    319 			appropriate capitalization for other contexts (see <i>Section 12
    320 				<a href="#Context_Transform_Elements">ContextTransform Elements</a>
    321 			</i>). Examples are given below.
    322 		</p>
    323 
    324 		<blockquote>
    325 			<p class="note">
    326 				<b>Note:</b> The "<span style="color: blue">en</span>" locale may
    327 				contain translated names for deprecated codes for debugging
    328 				purposes. Translation of deprecated codes into other languages is
    329 				discouraged.
    330 			</p>
    331 		</blockquote>
    332 
		<p>Where present, the display names must be unique; that is, two
			distinct codes must not get the same display name. (There is one
			exception to this: in time zones, where parsing results would give
			the same GMT offset, the standard and daylight display names can be
			the same across different time zone IDs.)</p>
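		<p>The uniqueness requirement can be checked mechanically. The
			following is a minimal sketch, assuming the display names of one
			category have already been loaded into a plain dictionary keyed by
			code; the function name <code>find_duplicate_display_names</code>
			and the sample data are hypothetical and are not part of LDML.</p>

```python
from collections import defaultdict


def find_duplicate_display_names(names):
    """Return {display name: [codes]} for every name shared by two or more codes.

    `names` maps a code (for example a language code) to its display name.
    This is an illustrative helper, not an API defined by the specification.
    """
    codes_by_name = defaultdict(list)
    for code, name in names.items():
        codes_by_name[name].append(code)
    # Keep only the names that are used by more than one code.
    return {
        name: sorted(codes)
        for name, codes in codes_by_name.items()
        if len(codes) > 1
    }


# Hypothetical data: "xx" deliberately collides with "af".
sample = {"ab": "Abkhazian", "aa": "Afar", "af": "Afrikaans", "xx": "Afrikaans"}
print(find_duplicate_display_names(sample))  # {'Afrikaans': ['af', 'xx']}
```

		<p>An empty result means that the category satisfies the
			requirement. The time zone exception described above would need
			separate handling.</p>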
    338 
    339 		<p>
    340 			Any translations should follow customary practice for the locale in
    341 			question. For more information, see [<a href="tr35.html#DataFormats">Data
    342 				Formats</a>].
    343 		</p>
    344 
    345 		<p class="element2">&lt;localeDisplayPattern&gt;</p>
    346 
    347 		<p class="dtd">&lt;!ELEMENT localeDisplayPattern ( alias |
    348 			(localePattern*, localeSeparator*, localeKeyTypePattern*, special*) )
    349 			&gt;</p>
    350 
		<p>The &lt;localeDisplayPattern&gt; element specifies how to
			construct display names for compound language (locale) IDs such as
			"pt_BR", which contain additional subtags beyond the initial language
			code. When the &lt;languages&gt; data does not explicitly specify a
			display name such as "Brazilian Portuguese" for a given compound
			language ID, the &lt;localeDisplayPattern&gt; data is used to
			construct a display name such as "Portuguese (Brazil)" from the
			display names of the subtags.</p>
    356 
    357 		<p>It includes three sub-elements:</p>
    358 		<ul>
    359 			<li>The &lt;localePattern&gt; element specifies a pattern such
    360 				as "{0} ({1})" in which {0} is replaced by the display name for the
    361 				primary language subtag and {1} is replaced by a list of the display
    362 				names for the remaining subtags.</li>
    363 			<li>The &lt;localeSeparator&gt; element specifies a pattern such
    364 				as "{0}, {1}" used when appending a subtag display name to the list
    365 				in the &lt;localePattern&gt; subpattern {1} above. If that list
    366 				includes more than one display name, then &lt;localeSeparator&gt;
    367 				subpattern {1} represents a new display name to be appended to the
    368 				current list in {0}. <em>Note: Before CLDR 24, the
    369 					&lt;localeSeparator&gt; element specified a separator string such
    370 					as ", ", not a pattern.</em>
    371 			</li>
    372 			<li>The &lt;localeKeyTypePattern&gt; element specifies the
    373 				pattern used to display key-type pairs, such as "{0}: {1}"</li>
    374 		</ul>
    375 
    376 		<p>For example, for the locale identifier
    377 			zh_Hant_CN_co_pinyin_cu_USD, the display would be "Chinese
    378 			(Traditional, China, Pinyin Sort Order, Currency: USD)". The key-type
    379 			for co_pinyin doesn't use the localeKeyTypePattern because there is a
    380 			translation for the key-type in English:</p>
    381 
    382 		<blockquote>
    383 			<p>&lt;type type="pinyin" key="collation"&gt;Pinyin Sort
    384 				Order&lt;/type&gt;</p>
    385 		</blockquote>
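		<p>The way the three patterns combine can be sketched as follows.
			This is an illustration only, assuming the English patterns quoted
			above and assuming that the display names of the individual subtags
			have already been looked up; the function names and the key name
			"Sort Order" are hypothetical.</p>

```python
import re


def apply_pattern(pattern, *args):
    """Replace each {n} placeholder in `pattern` with args[n] in a single pass."""
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], pattern)


def format_key_type(key_name, type_name, type_translation=None,
                    key_type_pattern="{0}: {1}"):
    """Use the translated key-type name if one exists, else localeKeyTypePattern."""
    if type_translation is not None:
        return type_translation
    return apply_pattern(key_type_pattern, key_name, type_name)


def format_locale_display_name(language_name, subtag_names,
                               locale_pattern="{0} ({1})",
                               locale_separator="{0}, {1}"):
    """Combine a language name with the display names of the remaining subtags."""
    if not subtag_names:
        return language_name
    # Build the list for {1} by appending one name at a time with localeSeparator.
    combined = subtag_names[0]
    for name in subtag_names[1:]:
        combined = apply_pattern(locale_separator, combined, name)
    return apply_pattern(locale_pattern, language_name, combined)


# zh_Hant_CN_co_pinyin_cu_USD, with the subtag names already resolved.
names = [
    "Traditional",
    "China",
    format_key_type("Sort Order", "pinyin", type_translation="Pinyin Sort Order"),
    format_key_type("Currency", "USD"),
]
print(format_locale_display_name("Chinese", names))
# Chinese (Traditional, China, Pinyin Sort Order, Currency: USD)
```

		<p>Substituting all placeholders in one pass avoids accidentally
			re-expanding a placeholder-like string that happens to occur inside
			a display name.</p>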
    386 
    387 		<p class="element2">&lt;languages&gt;</p>
    388 
    389 		<p>
    390 			This contains a list of elements that provide the user-translated
    391 			names for language codes, as described in <i> <a
    392 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    393 					3, Unicode Language and Locale Identifiers</a></i>.
    394 		</p>
    395 
    396 		<blockquote>
    397 			<pre>&lt;language type="<span style="color: blue">ab</span>"&gt;<span
    398 					style="color: blue">Abkhazian</span>&lt;/language&gt;
    399 &lt;language type="<span style="color: blue">aa</span>"&gt;<span
    400 					style="color: blue">Afar</span>&lt;/language&gt;
    401 &lt;language type="<span style="color: blue">af</span>"&gt;<span
    402 					style="color: blue">Afrikaans</span>&lt;/language&gt;
    403 &lt;language type="<span style="color: blue">sq</span>"&gt;<span
    404 					style="color: blue">Albanian</span>&lt;/language&gt;
    405 </pre>
    406 		</blockquote>
    407 		<p>There should be no expectation that the list of
    408 			languages with translated names be complete: there are thousands of
    409 			languages that could have translated names. For debugging purposes or
    410 			comparison, when a language display name is missing, the Description
    411 			field of the language subtag registry can be used to supply a
    412 			fallback English user-readable name.</p>
		<p>The type can actually be any locale ID as specified above. The
			set of locale IDs that have their own translations is not fixed, and
			depends on the locale. For example, in one language one could
			translate the following locale IDs, and in another, fall back on the
			normal composition.</p>
    417 
    418 		<table border="1" cellpadding="4" cellspacing="0">
    419 			<tr>
    420 				<th width="33%">type</th>
    421 				<th width="33%">translation</th>
    422 				<th width="34%">composition</th>
    423 			</tr>
    424 			<tr>
    425 				<td width="33%">nl_BE</td>
    426 				<td width="33%">Flemish</td>
    427 				<td width="34%">Dutch (Belgium)</td>
    428 			</tr>
    429 			<tr>
    430 				<td width="33%">zh_Hans</td>
    431 				<td width="33%">Simplified Chinese</td>
    432 				<td width="34%">Chinese (Simplified)</td>
    433 			</tr>
    434 			<tr>
    435 				<td width="33%">en_GB</td>
    436 				<td width="33%">British English</td>
    437 				<td width="34%">English (United Kingdom)</td>
    438 			</tr>
    439 		</table>
    440 
		<p>Thus when the display name for a complete locale ID is formed by
			composition, the longest match in the language type is used, and the
			remaining fields (if any) are added using composition.</p>
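		<p>The longest-match step can be sketched as follows. This is a
			simplified illustration, assuming that the &lt;language&gt; entries
			are held in a dictionary keyed by type and that only leading
			sequences of subtags are tried; the function name and the fallback
			to the raw language code are assumptions, not requirements of this
			specification.</p>

```python
def split_longest_match(locale_id, language_names):
    """Return (display name of the longest matching prefix, remaining subtags).

    `language_names` maps <language> type values such as "en_GB" to names.
    Prefixes are tried from longest to shortest; if nothing matches, the
    raw language code is returned as the name (an assumption of this sketch).
    """
    subtags = locale_id.split("_")
    for end in range(len(subtags), 0, -1):
        candidate = "_".join(subtags[:end])
        if candidate in language_names:
            return language_names[candidate], subtags[end:]
    return subtags[0], subtags[1:]


language_names = {
    "en": "English",
    "en_GB": "British English",
    "zh": "Chinese",
    "zh_Hans": "Simplified Chinese",
}
print(split_longest_match("zh_Hans_CN", language_names))
# ('Simplified Chinese', ['CN'])
print(split_longest_match("en_US", language_names))
# ('English', ['US'])
```

		<p>The remaining subtags are then translated individually and
			appended with the &lt;localePattern&gt; and &lt;localeSeparator&gt;
			patterns.</p>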
    444 
		<p>Alternate short forms may be provided for some languages (and
			for territories and other display names), for example:</p>
    447 
    448 		<blockquote>
    449 			<pre>&lt;language type="<span style="color: blue">az</span>"&gt;<span
    450 					style="color: blue">Azerbaijani</span>&lt;/language&gt;
    451 &lt;language type="<span style="color: blue">az</span>" alt="<span
    452 					style="color: blue">short</span>"&gt;<span style="color: blue">Azeri</span>&lt;/language&gt;
    453 &lt;language type="<span style="color: blue">en_GB</span>"&gt;<span
    454 					style="color: blue">British English</span>&lt;/language&gt;
    455 &lt;language type="<span style="color: blue">en_GB</span>" alt="<span
    456 					style="color: blue">short</span>"&gt;<span style="color: blue">U.K. English</span>&lt;/language&gt;
    457 &lt;language type="<span style="color: blue">en_US</span>"&gt;<span
    458 					style="color: blue">American English</span>&lt;/language&gt;
    459 &lt;language type="<span style="color: blue">en_US</span>" alt="<span
    460 					style="color: blue">short</span>"&gt;<span style="color: blue">U.S. English</span>&lt;/language&gt;
    461 </pre>
    462 		</blockquote>
    463 
    464 		<p class="element2">&lt;scripts&gt;</p>
    465 
    466 		<p>
			This element can contain any number of script elements. Each script
    468 			element provides the localized name for a script code, as described
    469 			in <i> <a
    470 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    471 					3, Unicode Language and Locale Identifiers</a>
    472 			</i>(see also <i>UAX #24: Script Names</i> [<a
    473 				href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]). For
    474 			example, in the language of this locale, the name for the Latin
			script might be "Romana", and the name for the Cyrillic script might be "Kyrillica".
    476 			That would be expressed with the following.
    477 		</p>
    478 
    479 		<blockquote>
    480 			<pre>&lt;script type="<span style="color: blue">Latn</span>"&gt;<span
    481 					style="color: blue">Romana</span>&lt;/script&gt;
    482 &lt;script type="<span style="color: blue">Cyrl</span>"&gt;<span
    483 					style="color: blue">Kyrillica</span>&lt;/script&gt;
    484 </pre>
    485 		</blockquote>
    486 
    487 		<p>The script names are most commonly used in conjunction with a
    488 			language name, using the &lt;localePattern&gt; combining pattern, and
    489 			the default form of the script name should be suitable for such use.
    490 			When a script name requires a different form for stand-alone use,
    491 			this can be specified using the "stand-alone" alternate:</p>
    492 
    493 		<blockquote>
    494 			<pre>&lt;script type="<span style="color: blue">Hans</span>"&gt;<span
    495 					style="color: blue">Simplified</span>&lt;/script&gt;
    496 &lt;script type="<span style="color: blue">Hans</span>" alt="<span
    497 					style="color: blue">stand-alone</span>"&gt;<span
    498 					style="color: blue">Simplified Han</span>&lt;/script&gt;
    499 &lt;script type="<span style="color: blue">Hant</span>"&gt;<span
    500 					style="color: blue">Traditional</span>&lt;/script&gt;
    501 &lt;script type="<span style="color: blue">Hant</span>" alt="<span
    502 					style="color: blue">stand-alone</span>"&gt;<span
    503 					style="color: blue">Traditional Han</span>&lt;/script&gt;
    504 </pre>
    505 		</blockquote>
    506 
    507 		<p>This will produce results such as the following:</p>
    508 		<ul>
    509 			<li>Display name of language + script, using
    510 				&lt;localePattern&gt;: Chinese (Simplified)</li>
			<li>Display name of script alone, using the "stand-alone"
				alternate: Simplified Han</li>
    513 		</ul>
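		<p>Selecting between the default and the "stand-alone" form can be
			sketched as follows. This is a minimal illustration, assuming the
			script names are stored in a dictionary keyed by (type, alt); the
			function name and the fallback to the default form when no
			alternate exists are assumptions of the sketch.</p>

```python
SCRIPT_NAMES = {
    ("Hans", None): "Simplified",
    ("Hans", "stand-alone"): "Simplified Han",
    ("Hant", None): "Traditional",
    ("Hant", "stand-alone"): "Traditional Han",
    ("Latn", None): "Latin",  # no stand-alone alternate is needed here
}


def script_display_name(code, names, stand_alone=False):
    """Return the script name, preferring the stand-alone alternate on request."""
    if stand_alone:
        alternate = names.get((code, "stand-alone"))
        if alternate is not None:
            return alternate
    # Default form, suitable for combining with a language name.
    return names[(code, None)]


# Combined with a language name through a "{0} ({1})" localePattern.
print("{0} ({1})".format("Chinese", script_display_name("Hans", SCRIPT_NAMES)))
# Chinese (Simplified)

# Script shown on its own.
print(script_display_name("Hans", SCRIPT_NAMES, stand_alone=True))
# Simplified Han
```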
    514 
    515 		<p class="element2">&lt;territories&gt;</p>
    516 
    517 		<p>
    518 			This contains a list of elements that provide the user-translated
    519 			names for territory codes, as described in <i> <a
    520 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    521 					3, Unicode Language and Locale Identifiers</a></i>.
    522 		</p>
    523 
    524 		<blockquote>
    525 			<pre>&lt;territory type="<span style="color: blue">AD</span>"&gt;<span
    526 					style="color: blue">Andorra</span>&lt;/territory&gt;
    527 &lt;territory type="<span style="color: blue">AF</span>"&gt;<span
    528 					style="color: blue">Afghanistan</span>&lt;/territory&gt;
    529 &lt;territory type="<span style="color: blue">AL</span>"&gt;<span
    530 					style="color: blue">Albania</span>&lt;/territory&gt;
    531 &lt;territory type="<span style="color: blue">AO</span>"&gt;<span
    532 					style="color: blue">Angola</span>&lt;/territory&gt;
    533 &lt;territory type="<span style="color: blue">DZ</span>"&gt;<span
    534 					style="color: blue">Algeria</span>&lt;/territory&gt;
    535 &lt;territory type="<span style="color: blue">GB</span>"&gt;<span
    536 					style="color: blue">United Kingdom</span>&lt;/territory&gt;
    537 &lt;territory type="<span style="color: blue">GB</span>" alt="<span
    538 					style="color: blue">short</span>"&gt;<span style="color: blue">U.K.</span>&lt;/territory&gt;
    539 &lt;territory type="<span style="color: blue">US</span>"&gt;<span
    540 					style="color: blue">United States</span>&lt;/territory&gt;
    541 &lt;territory type="<span style="color: blue">US</span>" alt="<span
    542 					style="color: blue">short</span>"&gt;<span style="color: blue">U.S.</span>&lt;/territory&gt;
    543 </pre>
    544 		</blockquote>
    545 
    546 		<p class="element2">&lt;variants&gt;</p>
    547 
    548 		<p>
    549 			This contains a list of elements that provide the user-translated
    550 			names for the <i>variant_code</i> values described in <i> <a
    551 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    552 					3, Unicode Language and Locale Identifiers</a>
    553 			</i>.
    554 		</p>
    555 
    556 		<blockquote>
    557 			<pre>&lt;variant type="<span style="color: blue">nynorsk</span>"&gt;<span
    558 					style="color: blue">Nynorsk</span>&lt;/variant&gt;
    559 </pre>
    560 		</blockquote>
    561 
    562 		<p class="element2">&lt;keys&gt;</p>
    563 
    564 		<p>
    565 			This contains a list of elements that provide the user-translated
    566 			names for the <i>key</i> values described in <i> <a
    567 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    568 					3, Unicode Language and Locale Identifiers</a></i>.
    569 		</p>
    570 
    571 		<blockquote>
    572 			<pre>&lt;key type="<span style="color: blue">collation</span>"&gt;<span
    573 					style="color: blue">Sortierung</span>&lt;/key&gt;
    574 </pre>
    575 		</blockquote>
    576 
    577 		<p class="element2">&lt;types&gt;</p>
    578 
    579 		<p>
    580 			This contains a list of elements that provide the user-translated
			names for the <i>type</i> values described in <i> <a
    582 				href="tr35.html#Unicode_Language_and_Locale_Identifiers">Section
    583 					3, Unicode Language and Locale Identifiers</a>
    584 			</i>. Since the translation of an option name may depend on the <i>key</i>
    585 			it is used with, the latter is optionally supplied.
    586 		</p>
    587 
    588 		<blockquote>
    589 			<pre>&lt;type type="<span style="color: blue">phonebook</span>" key="<span
    590 					style="color: blue">collation</span>"&gt;<span style="color: blue">Telefonbuch</span>&lt;/type&gt;
    591 </pre>
    592 		</blockquote>
    593 
    594 		<p class="element2">&lt;measurementSystemNames&gt;</p>
    595 
    596 		<p>
    597 			This contains a list of elements that provide the user-translated
    598 			names for systems of measurement. The types currently supported are "<span
    599 				style="color: blue">US</span>", "<span style="color: blue">metric</span>",
    600 			and "<span style="color: blue">UK</span>".
    601 		</p>
    602 
    603 		<blockquote>
    604 			<pre>&lt;measurementSystemName type="<span style="color: blue">US</span>"&gt;<span
					style="color: blue">U.S.</span>&lt;/measurementSystemName&gt;
    606 </pre>
    607 		</blockquote>
    608 
    609 		<p class="note">
    610 			<b>Note:</b> In the future, we may need to add display names for the
    611 			particular measurement units (millimeter versus millimetre versus
    612 			whatever the Greek, Russian, etc are), and a message format for
    613 			positioning those with respect to numbers. For example, "{number}
    614 			{unitName}" in some languages, but "{unitName} {number}" in others.
    615 		</p>
    616 
    617 		<p class="element2">&lt;transformNames&gt;</p>
    618 
    619 		<p>&nbsp; </p>
    620 
    621 		<blockquote>
    622 			<pre>&lt;transformName type="<span style="color: blue">Numeric</span>"&gt;<span
					style="color: blue">Numeric</span>&lt;/transformName&gt;
    624 </pre>
    625 		</blockquote>
    626 
    627 		<p class="element2">&lt;codePatterns&gt;</p>
    628 
    629 		<blockquote>
    630 			<pre>&lt;codePattern type="<span style="color: blue">language</span>"&gt;<span
					style="color: blue">Language: {0}</span>&lt;/codePattern&gt;
    632 </pre>
    633 		</blockquote>
    634 		<p class="dtd">
    635 			&lt;!ELEMENT subdivisions ( alias | ( subdivision | special )* ) &gt;<br>
    636 			&lt;!ELEMENT subdivision ( #PCDATA )&gt;
    637 		</p>
    638 		<p>Note that the subdivision names are in separate files, in the
    639 			subdivisions/ directory. The type values are the fully qualified
			subdivision names. For example:</p>
    641 		<p class="xmlExample">
    642 			&lt;subdivision type=&quot;AL-04&quot;&gt;Fier
    643 			County&lt;/subdivision&gt;<br> &lt;subdivision
    644 			type=&quot;AL-FR&quot;&gt;Fier&lt;/subdivision&gt; &lt;!-- in AL-04 :
    645 			Fier County --&gt;<br> &lt;subdivision
			type=&quot;AL-LU&quot;&gt;Lushnjë&lt;/subdivision&gt; &lt;!-- in
			AL-04 : Fier County --&gt;<br> &lt;subdivision
			type=&quot;AL-MK&quot;&gt;Mallakastër&lt;/subdivision&gt; &lt;!-- in
    649 			AL-04 : Fier County --&gt;
    650 		</p>
    651 		<p>
    652 			See also <strong>Part 6</strong> <em>Section 2.1.1 <a
    653 				href="tr35-info.html#Subdivision_Containment">Subdivision
    654 					Containment</a></em>.
    655 		</p>
    656 
    657 
    658 		<h2>
    659 			2 <a name="Layout_Elements" href="#Layout_Elements">Layout
    660 				Elements</a>
    661 		</h2>
    662 
    663 
    664 		<p class="dtd">&lt;!ELEMENT layout ( alias | (orientation*,
    665 			inList*, inText*, special*) ) &gt;</p>
		<p>This top-level element specifies general layout features. It
			currently has only one non-deprecated child element,
			&lt;orientation&gt; (other than &lt;special&gt;, which is always
			permitted).</p>
    669 
    670 		<p class="dtd">
    671 			&lt;!ELEMENT orientation ( characterOrder*, lineOrder*, special* )
    672 			&gt;<br> &lt;!ELEMENT characterOrder ( #PCDATA ) &gt;<br>
    673 			&lt;!ELEMENT lineOrder ( #PCDATA ) &gt;
    674 		</p>
    675 
    676 		<p>The lineOrder and characterOrder elements specify the default
    677 			general ordering of lines within a page, and characters within a
    678 			line. The possible values are:</p>
    679 
    680 		<table>
    681 			<tr>
    682 				<th>Direction</th>
    683 				<th>Value</th>
    684 			</tr>
    685 			<tr>
    686 				<td rowspan="2">Vertical</td>
    687 				<td>top-to-bottom</td>
    688 			</tr>
    689 			<tr>
    690 				<td>bottom-to-top</td>
    691 			</tr>
    692 			<tr>
    693 				<td rowspan="2">Horizontal</td>
    694 				<td>left-to-right</td>
    695 			</tr>
    696 			<tr>
    697 				<td>right-to-left</td>
    698 			</tr>
    699 		</table>
    700 
    701 		<p>
    702 			If the value of lineOrder is one of the vertical values, then the
    703 			value of characterOrder must be one of the horizontal values, and
    704 			vice versa. For example, for English the lines are top-to-bottom, and
			the characters are left-to-right. For Mongolian (in the Mongolian
			script) the lines are right-to-left, and the characters are
			top-to-bottom. This does not override the ordering behavior of bidirectional
    708 			text; it does, however, supply the paragraph direction for that text
    709 			(for more information, see <i>UAX #9: The Bidirectional Algorithm</i>
    710 			[<a href="http://www.unicode.org/reports/tr41/#UAX9">UAX9</a>]).
    711 		</p>
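		<p>The constraint between the two values can be expressed as a
			simple check. The following is a minimal sketch; the function name
			<code>is_valid_orientation</code> is hypothetical.</p>

```python
VERTICAL = {"top-to-bottom", "bottom-to-top"}
HORIZONTAL = {"left-to-right", "right-to-left"}


def is_valid_orientation(line_order, character_order):
    """True if one value is vertical and the other is horizontal."""
    return (
        (line_order in VERTICAL and character_order in HORIZONTAL)
        or (line_order in HORIZONTAL and character_order in VERTICAL)
    )


# English: lines top-to-bottom, characters left-to-right.
print(is_valid_orientation("top-to-bottom", "left-to-right"))  # True
# Both values vertical: rejected.
print(is_valid_orientation("top-to-bottom", "bottom-to-top"))  # False
```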
    712 
    713 		<p>For dates, times, and other data to appear in the right order,
    714 			the display for them should be set to the orientation of the locale.</p>
    715 
    716 		<p>&lt;inList&gt; (deprecated)</p>
    717 
    718 		<p>
    719 			The &lt;inList&gt; element is deprecated and has been superseded by
    720 			the &lt;contextTransforms&gt; element; see <i>Section 12 <a
    721 				href="#Context_Transform_Elements">ContextTransform Elements</a>
    722 			</i>.
    723 		</p>
    724 
		<p>This element controls whether display names (language,
			territory, etc.) are title cased in GUI menu lists and the like. It is
    727 			only used in languages where the normal display is lower case, but
    728 			title case is used in lists. There are two options:</p>
    729 
    730 		<pre>&lt;inList casing="titlecase-words"&gt;</pre>
    731 		<pre>&lt;inList casing="titlecase-firstword"&gt;</pre>
    732 
    733 		<p>
    734 			In both cases, the title case operation is the default title case
    735 			function defined by Chapter 3 of <i>[<a href="tr35.html#Unicode">Unicode</a>]
    736 			</i>. In the second case, only the first word (using the word boundaries
    737 			for that locale) will be title cased. The results can be fine-tuned
    738 			by using alt="list" on any element where titlecasing as defined by
    739 			the Unicode Standard will produce the wrong value. For example,
			suppose that "turc de Crimée" is a value, and the title case should
			be "Turc de Crimée". Then that can be expressed using the alt="list"
    742 			value.
    743 		</p>
    744 
    745 		<p>&lt;inText&gt; (deprecated)</p>
    746 
    747 		<p>
			The &lt;inText&gt; element is deprecated and has been superseded by
    749 			the &lt;contextTransforms&gt; element; see <i>Section 12 <a
    750 				href="#Context_Transform_Elements">ContextTransform Elements</a>
    751 			</i>.
    752 		</p>
    753 
    754 		<p>This element indicates the casing of the data in the category
    755 			identified by the inText type attribute, when that data is written in
			text or how it would appear in a dictionary. For example:</p>
    757 
    758 		<pre>&lt;inText type="languages"&gt;lowercase-words&lt;/inText&gt;</pre>
    759 
    760 		<p>indicates that language names embedded in text are normally
			written in lower case. The possible values and their meanings are:</p>
    762 
    763 		<ul>
			<li>titlecase-words: all words in the phrase should be title
				case</li>
			<li>titlecase-firstword: the first word should be title case</li>
			<li>lowercase-words: all words in the phrase should be lower
				case</li>
			<li>mixed: a mixture of upper and lower case is permitted.
				Generally used when the correct value is unknown.</li>
    771 		</ul>
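		<p>As an illustration of the casing values listed above, the
			following sketch applies each one to a phrase. It is a
			simplification under stated assumptions: words are split on spaces
			rather than on the locale's word boundaries, and Python's built-in
			case mappings stand in for the default title case function of the
			Unicode Standard. The function names are hypothetical.</p>

```python
def titlecase_word(word):
    # Simplification: uppercase the first character, lowercase the rest.
    return word[:1].upper() + word[1:].lower()


def apply_casing(text, casing):
    """Apply one of the deprecated inList/inText casing values to text."""
    words = text.split(" ")
    if casing == "titlecase-words":
        return " ".join(titlecase_word(w) for w in words)
    if casing == "titlecase-firstword":
        # Only the first word changes; the remaining words are kept as-is.
        return " ".join([titlecase_word(words[0])] + words[1:])
    if casing == "lowercase-words":
        return text.lower()
    if casing == "mixed":
        # Mixed case is permitted, so the text is returned unchanged.
        return text
    raise ValueError(f"unknown casing value: {casing!r}")


print(apply_casing("langue des signes", "titlecase-words"))
print(apply_casing("langue des signes", "titlecase-firstword"))
```

		<p>Mechanical title casing of every word also capitalizes particles
			such as "des", which is the kind of result that the alt="list"
			override is meant to correct.</p>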
    772 
    773 
    774 		<h2>
    775 			3 <a name="Character_Elements" href="#Character_Elements">Character
    776 				Elements</a>
    777 		</h2>
    778 
    779 
    780 		<p class="dtd">&lt;!ELEMENT characters ( alias | ( exemplarCharacters*, ellipsis*, moreInformation*, stopwords*, indexLabels*, mapping*, parseLenients*, special* ) ) &gt;</p>
    781 		<p>
    782 			The &lt;characters&gt; element provides optional information about
    783 			characters that are in common use in the locale, and information that
    784 			can be helpful in picking resources or data appropriate for the
    785 			locale, such as when choosing among character encodings that are
    786 			typically used to transmit data in the language of the locale. It may
    787 			also be used to help reduce confusability issues: see [<a
				href="http://www.unicode.org/reports/tr41/#UTR36">UTR36</a>]. It
    789 			typically only occurs in a language locale, not in a
    790 			language/territory locale. The stopwords are an experimental feature,
    791 			and should not be used.
    792 		</p>
    793 		<h3>
    794 			3.1 <a name="Exemplars" href="#Exemplars">Exemplars</a>
    795 		</h3>
    796 
    797 		<p>Exemplars are characters used by a language, separated into
    798 			different categories. The following table provides a summary, with
    799 			more details below.</p>
    800 		<table>
    801 			<tr>
    802 				<th scope="col">Type</th>
    803 				<th scope="col">Description</th>
    804 				<th scope="col">Examples</th>
    805 			</tr>
    806 			<tr>
    807 				<td>main / standard</td>
    808 				<td>Main letters used in the language</td>
    809 				<td style="font-family: Georgia, 'Times New Roman', Times, serif">a-z
    810 					  </td>
    811 			</tr>
    812 			<tr>
    813 				<td><span class="element2">auxiliary</span></td>
    814 				<td>Additional characters for common foreign words, technical
    815 					usage</td>
    816 				<td style="font-family: Georgia, 'Times New Roman', Times, serif">
    817 					                                 
    818 					  </td>
    819 			</tr>
    820 			<tr>
    821 				<td><span class="element2">index</span></td>
    822 				<td>Characters for the header of an index</td>
    823 				<td style="font-family: Georgia, 'Times New Roman', Times, serif">A
    824 					B C D E F G H I J K L M N O P Q R S T U V W X Y Z</td>
    825 			</tr>
    826 			<tr>
    827 				<td>punctuation</td>
    828 				<td>Common punctuation</td>
    829 				<td style="font-family: Georgia, 'Times New Roman', Times, serif">-
    830 					   , ; \: ! ? .      ( ) [ ]  @ * / &amp; #    </td>
    831 			</tr>
    832 		  <tr>
    833 			  <td>numbers</td>
    834 			  <td>The characters needed to display the common number formats: decimal, percent, and currency.</td>
    835 			  <td style="font-family: Georgia, 'Times New Roman', Times, serif">[\u061C\u200E \- ,   . %    + 0 1 2 3 4 5 6 7 8 9]</td>
    836 		  </tr>
    837 		</table>
    838 		<p>
    839 			The basic exemplar character sets (main and auxiliary) contain the
    840 			commonly used letters for a given modern form of a language, which
			can be used for testing and for determining the appropriate repertoire of
    842 			letters for charset conversion or collation. ("Letter" is interpreted
    843 			broadly, as anything having the property Alphabetic in the [<a
    844 				href="http://unicode.org/reports/tr41/#UAX44">UAX44</a>], which also
    845 			includes syllabaries and ideographs.) It is not a complete set of
    846 			letters used for a language, nor should it be considered to apply to
    847 			multiple languages in a particular country. Punctuation and other
    848 			symbols should not be included in the main and auxiliary sets. In
    849 			particular, format characters like CGJ are not included.
    850 		</p>
    851 		<p>
    852 			There are five sets altogether: main, auxiliary, punctuation, numbers, and
    853 			index. The <i>main</i> set should contain the minimal set required
    854 			for users of the language, while the <i>auxiliary</i> exemplar set is
    855 			designed to encompass additional characters: those non-native or
    856 			historical characters that would customarily occur in common
    857 			publications, dictionaries, and so on. Major style guidelines are
    858 			good references for the auxiliary set. So, for example, if Irish
			newspapers and magazines would commonly have Danish names using å,
			for example, then it would be appropriate to include å in the
    861 			auxiliary exemplar characters; just not in the main exemplar set.
    862 			Thus English has the following:
    863 		</p>
    864 
    865 		<p>
    866 			&lt;exemplarCharacters&gt;[a b c d e f g h i j k l m n o p q r s t u
    867 			v w x y z]&lt;/exemplarCharacters&gt;<br> &lt;exemplarCharacters
    868 			type="auxiliary"&gt;[                       
    869 			             ]&lt;/exemplarCharacters&gt;
    870 		</p>
    871 
    872 		<p>For a given language, there are a few factors that help for
    873 			determining whether a character belongs in the auxiliary set, instead
    874 			of the main set:</p>
    875 
    876 		<ul>
    877 			<li>The character is not available on all normal keyboards.</li>
    878 			<li>It is acceptable to always use spellings that avoid that
    879 				character.</li>
    880 		</ul>
    881 
    882 		<p>For example, the exemplar character set for en (English) is the
    883 			set [a-z]. This set does not contain the accented letters that are
			sometimes seen in words like "résumé" or "naïve", because it is
    885 			acceptable in common practice to spell those words without the
    886 			accents. The exemplar character set for fr (French), on the other
    887 			hand, must contain those characters: [a-z              
			]. The main set typically includes those letters commonly
			considered part of the language's "alphabet".</p>
    890 
    891 		<p>
    892 			The <em>punctuation</em> set consists of common punctuation
    893 			characters that are used with the language (corresponding to main and
    894 			auxiliary). Symbols may also be included where they are common in
    895 			plain text, such as . It does not include characters with narrow
    896 			technical usage, such as dictionary punctuation/symbols or copy-edit
    897 			symbols. For example, English would have something like the
    898 			following:
    899 		</p>
    900 
    901 		<blockquote>
    902 			-    <br> , ; : ! ? .  <br> ' &lsquo; &rsquo; " &ldquo;
    903 			&rdquo;   <br> ( ) [ ] { }  <br>    @ &amp;   / #
    904 			%   *  <br> +     &lt;  =   &gt; <br>
    905 		</blockquote>
    906 
    907 		<p>
			The numbers exemplar set does not currently include lesser-used characters: exponential notation (3.1  10, , NAN). Nor does it contain the units or currency symbols such as $, , , It does contain %, because that occurs in the percent format. It may contain some special formatting characters like the RLM. A full list of the currency symbols used with that locale is in the &lt;currencies&gt; element, while the units can be obtained from the &lt;units&gt; element (both using inheritance, of course). The digits used in each numbering system are listed in
			numberingSystems.xml. For more information, see <em><strong>Part
					3: <a href="tr35-numbers.html#Contents">Numbers</a></strong>, Section 2 <a href="tr35-numbers.html#Number_Elements">Number
		Elements</a></em>. </p>
    912         <p> <em>Examples for zh.xml:</em> </p>
    913         <table>
    914           <tr>
    915             <th scope="col">Type</th>
    916             <th scope="col">Description</th>
    917           </tr>
    918           <tr>
    919             <td>defaultNumberingSystem</td>
    920             <td>latn</td>
    921           </tr>
    922           <tr>
    923             <td>otherNumberingSystems/native</td>
    924             <td>hanidec</td>
    925           </tr>
    926           <tr>
    927             <td>otherNumberingSystems/traditional</td>
    928             <td>hans</td>
    929           </tr>
    930           <tr>
    931             <td>otherNumberingSystems/finance</td>
    932             <td>hansfin</td>
    933           </tr>
    934         </table>
    935         <p>When determining the character repertoire needed to support a
    936 			language, a reasonable initial set would include at least the
    937 			characters in the main and punctuation exemplar sets, along with the
			digits and common symbols associated with the numbering systems supported
    939 			for the locale (see <i> <a
    940 				href="tr35-numbers.html#Numbering_Systems">Numbering Systems</a></i>).
    941 		</p>
    942 
    943 		<p>
    944 			The <em>index</em> characters are a set of characters for use as a UI
    945 			"index", that is, a list of clickable characters (or character
    946 			sequences) that allow the user to see a segment of a larger "target"
    947 			list. For details see the <a
    948 				href="tr35-collation.html#Collation_Indexes">Unicode LDML:
    949 				Collation</a> document. The index set may only contain characters whose
    950 			lowercase versions are in the main and auxiliary exemplar sets,
    951 			though for cased languages the index exemplars are typically in
    952 			uppercase. Characters from the auxiliary exemplar set may be
    953 			necessary in the index set if it needs to properly handle items such
    954 			as names which may require characters not included in the main
    955 			exemplar set.
    956 		</p>
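		<p>The restriction on the index set can be expressed as a simple
			check. This is a sketch only: the function name is hypothetical, the
			sets are assumed to be already parsed into Python sets of strings,
			and Python's locale-independent <code>str.lower()</code> stands in
			for the locale's lowercasing.</p>

```python
def index_items_outside_exemplars(index, main, auxiliary):
    """Return index items whose lowercase form is in neither exemplar set."""
    allowed = set(main) | set(auxiliary)
    return {item for item in index if item.lower() not in allowed}


# "Q" is reported because "q" is in neither the main nor the auxiliary set.
print(index_items_outside_exemplars({"A", "Q"}, {"a"}, set()))
```

		<p>An empty result means that every index character is backed by the
			main or auxiliary exemplar set.</p>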
    957 
    958 		<p>Here is a sample of the XML structure:</p>
    959 
    960 		<pre>&lt;exemplarCharacters type="index"&gt;[A B C D E F G H I J K L M N O P Q R S T U V W X Y Z]&lt;/exemplarCharacters&gt;</pre>
    961 
		<p>The display of the index characters can be modified with the
			index label elements, discussed in <i>Section 3.3 <a
				href="#IndexLabels">Index Labels</a></i>.</p>
    964 
    965 		<h4>
    966 			3.1.1 <a name="ExemplarSyntax" href="#ExemplarSyntax">Exemplar
    967 				Syntax</a>
    968 		</h4>
    969 
    970 
    971 		<p>
    972 			In all of the exemplar characters, the list of characters is in the <a
    973 				href="tr35.html#Unicode_Sets">Unicode Set</a> format, which normally
    974 			allows boolean combinations of sets of letters and Unicode
    975 			properties.
    976 		</p>
    977 
    978 		<p>
			Sequences of characters that act like a single letter in the language,
			especially in collation, are included within braces, such as [a-z
    981 			         {cs} {dz} {dzs} {gy} ...]. The characters should be
    982 			in normalized form (NFC). Where combining marks are used
    983 			generatively, and apply to a large number of base characters (such as
    984 			in Indic scripts), the individual combining marks should be included.
    985 			Where they are used with only a few base characters, the specific
			combinations should be included. Wherever there is not a precomposed
			character (that is, a single code point) for a given combination,
    988 			that must be included within braces. For example, to include
    989 			sequences from the <a href="http://unicode.org/standard/where/">Where
    990 				is my Character?</a> page on the Unicode site, one would write: [{ch}
    991 			{t} {x} {} {} {i} {}], but for French one would just write
    992 			[a-z    ...]. When in doubt use braces, since it does no harm to
    993 			include them around single code points: for example, [a-z {} {} {}
    994 			...].
    995 		</p>
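		<p>To make the syntax concrete, the following sketch parses the
			restricted form used for exemplars (single characters, ranges,
			brace-enclosed sequences, and backslash escapes) into a set of
			strings. It is an illustrative assumption about how such a parser
			might look, not the ICU implementation; the function name is
			hypothetical, and only <code>\uXXXX</code> and single-character
			escapes are handled.</p>

```python
def parse_exemplar_set(text):
    """Parse a restricted Unicode Set such as "[a-c {cs} \\-]" into strings."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("an exemplar set must be enclosed in [ ]")
    body = text[1:-1]
    items = []          # parsed elements, in source order
    in_range = False    # True after an unescaped "-" that follows an element
    i = 0
    while i < len(body):
        ch = body[i]
        if ch.isspace():            # whitespace is not significant
            i += 1
            continue
        if ch == "-" and items and not in_range:
            in_range = True
            i += 1
            continue
        if ch == "{":               # multi-character sequence, e.g. {dzs}
            end = body.index("}", i)
            token = body[i + 1:end].replace(" ", "")
            i = end + 1
        elif ch == "\\" and body[i + 1:i + 2] == "u":
            token = chr(int(body[i + 2:i + 6], 16))   # \uXXXX escape
            i += 6
        elif ch == "\\":            # escaped literal such as \- or \:
            token = body[i + 1]
            i += 2
        else:
            token = ch
            i += 1
        if in_range:
            start = items.pop()
            if len(start) != 1 or len(token) != 1:
                raise ValueError("range endpoints must be single characters")
            items.extend(chr(c) for c in range(ord(start), ord(token) + 1))
            in_range = False
        else:
            items.append(token)
    if in_range:
        raise ValueError("range is missing its end point")
    return set(items)


print(sorted(parse_exemplar_set("[a-c {cs} {dz}]")))
```

		<p>Sequences in braces are kept as single multi-character strings,
			which matches their role as units that act like one letter.</p>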
    996 
    997 		<p>If the letter 'z' were only ever used in the combination 'tz',
    998 			then we might have [a-y {tz}] in the main set. (The language would
    999 			probably have plain 'z' in the auxiliary set, for use in foreign
   1000 			words.) If combining characters can be used productively in
   1001 			combination with a large number of others (such as say Indic matras),
   1002 			then they are not listed in all the possible combinations, but
   1003 			separately, such as:</p>
   1004 
   1005 		<blockquote>[   - -    -      - 
   1006 			-    -  -  -  - -    -   -]</blockquote>
   1007 
   1008 		<p>The exemplar character set for Han characters is composed
   1009 			somewhat differently. It is even harder to draw a clear line for Han
   1010 			characters, since usage is more like a frequency curve that slowly
   1011 			trails off to the right in terms of decreasing frequency. So for this
   1012 			case, the exemplar characters simply contain a set of reasonably
   1013 			frequent characters for the language.</p>
   1014 
   1015 		<p>The ordering of the characters in the set is irrelevant, but
   1016 			for readability in the XML file the characters should be in sorted
   1017 			order according to the locale's conventions. The main and auxiliary
   1018 			sets should only contain lower case characters (except for the
   1019 			special case of Turkish and similar languages, where the dotted
   1020 			capital I should be included); the upper case letters are to be
   1021 			mechanically added when the set is used. For more information on
   1022 			casing, see the discussion of Special Casing in the Unicode Character
   1023 			Database.</p>
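		<p>The mechanical addition of upper case letters might look like the
			following sketch. It assumes the exemplars are already available as
			a Python set of strings and uses the locale-independent
			<code>str.upper()</code>; the function name is hypothetical.</p>

```python
def with_uppercase(exemplars):
    """Return the exemplars together with their uppercase forms."""
    expanded = set(exemplars)
    expanded.update(item.upper() for item in exemplars)
    return expanded


# Turkish-style data: the dotted capital I is listed explicitly, because the
# locale-independent mapping turns both "i" and dotless "ı" into "I".
print(sorted(with_uppercase({"i", "\u0131", "\u0130"})))
```

		<p>This shows why the dotted capital I has to be present in the data
			for Turkish and similar languages: a default case mapping never
			produces it from a lower case letter.</p>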
   1024 
   1025 		<h4>
   1026 			3.1.2 <a name="Restrictions" href="#Restrictions">Restrictions</a>
   1027 		</h4>
   1028 
   1029 
   1030 		<ol>
   1031 			<li>The main, auxiliary and index sets are normally restricted
   1032 				to those letters with a specific <a
   1033 				href="http://unicode.org/Public/UNIDATA/Scripts.txt">Script </a>character
   1034 				property (that is, not the values Common or Inherited) or required <a
   1035 				href="http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt">Default_Ignorable_Code_Point</a>
   1036 				characters (such as a non-joiner), or combining marks, or the <a
   1037 				href="http://www.unicode.org/Public/UNIDATA/auxiliary/WordBreakProperty.txt">Word_Break</a>
   1038 				properties <a name="Katakana" href="#Katakana">Katakana</a>, <a
   1039 				name="ALetter" href="#ALetter">ALetter</a>, or <a name="MidLetter"
   1040 				href="#MidLetter">MidLetter</a>.
   1041 			</li>
   1042 
   1043 			<li>The auxiliary set should not overlap with the main set.
   1044 				There is one exception to this: Hangul Syllables and CJK Ideographs
   1045 				can overlap between the sets.</li>
   1046 
   1047 			<li>Any <a
   1048 				href="http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt">Default_Ignorable_Code_Point</a>s
				should be in the auxiliary set or, if they are only needed for
   1050 				currency formatting, in the currency set. These can include
   1051 				characters such as U+200E LEFT-TO-RIGHT MARK and U+200F
   1052 				RIGHT-TO-LEFT MARK which may be needed in bidirectional text in
   1053 				order for date, currency or other formats to display correctly.
   1054 			</li>
   1055 			<li>For exemplar characters the <a href="tr35.html#Unicode_Sets">Unicode
   1056 					Set</a> format is restricted so as to not use properties or boolean
				combinations.
   1058 			</li>
   1059 		</ol>
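		<p>The second restriction lends itself to an automated check. The
			sketch below reports items that appear in both the main and
			auxiliary sets, apart from the permitted exception. It assumes the
			sets are already parsed into Python sets of strings, and it
			approximates "Hangul Syllables and CJK Ideographs" by character name
			prefix, which is an assumption of this example rather than a
			definition from the specification.</p>

```python
import unicodedata

# Name prefixes used to approximate the permitted exception.
EXEMPT_PREFIXES = ("HANGUL SYLLABLE", "CJK UNIFIED IDEOGRAPH")


def may_overlap(item):
    """True when every character of item is a Hangul syllable or CJK ideograph."""
    return all(
        unicodedata.name(ch, "").startswith(EXEMPT_PREFIXES) for ch in item
    )


def disallowed_overlap(main, auxiliary):
    """Return items present in both sets that are not covered by the exception."""
    return {item for item in set(main) & set(auxiliary) if not may_overlap(item)}


print(disallowed_overlap({"a", "b", "\u4e2d"}, {"b", "\u4e2d", "\u00e9"}))
```

		<p>Here "b" is reported, while the shared CJK ideograph is
			accepted.</p>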
   1060 
   1061 		<h3>
   1062 			3.2 <a name="Character_Mapping" href="#Character_Mapping">Mapping</a>
   1063 		</h3>
   1064 
   1065 		<p>
   1066 			<b>This element has been deprecated.</b> For information on its
   1067 			structure and how it was intended to specify locale-specific
   1068 			preferred encodings for various purposes (e-mail, web), see the <a
   1069 				href="http://www.unicode.org/reports/tr35/tr35-39/tr35-general.html#Character_Mapping">Mapping</a>
   1070 			section from the CLDR 27 version of the LDML Specification.
   1071 		</p>
   1072 
   1073 
   1074 		<h3>
   1075 			3.3 <a name="IndexLabels" href="#IndexLabels">Index Labels</a>
   1076 		</h3>
   1077 
   1078 		<p>
   1079 			<b>This element and its subelements have been deprecated.</b> For
   1080 			information on its structure and how it was intended to provide data
   1081 			for a compressed display of index exemplar characters where space is
   1082 			limited, see the <a
   1083 				href="http://www.unicode.org/reports/tr35/tr35-39/tr35-general.html#IndexLabels">Index
   1084 				Labels</a> section from the CLDR 27 version of the LDML Specification.
   1085 		</p>
   1086 
   1087 		<p class="dtd">&lt;!ELEMENT indexLabels (indexSeparator*,
   1088 			compressedIndexSeparator*, indexRangePattern*, indexLabelBefore*,
   1089 			indexLabelAfter*, indexLabel*) &gt;</p>
   1090 
   1091 
   1092 		<h3>
   1093 			3.4 <a name="Ellipsis" href="#Ellipsis">Ellipsis</a>
   1094 		</h3>
   1095 
   1096 		<p class="dtd">
   1097 			&lt;!ELEMENT ellipsis ( #PCDATA ) &gt;<br> &lt;!ATTLIST ellipsis
   1098 			type ( initial | medial | final | word-initial | word-medial |
   1099 			word-final ) #IMPLIED &gt;
   1100 		</p>
   1101 
   1102 		<p>The ellipsis element provides patterns for use when truncating
   1103 			strings. There are three versions: initial for removing an initial
   1104 			part of the string (leaving final characters); medial for removing
   1105 			from the center of the string (leaving initial and final characters),
   1106 			and final for removing a final part of the string (leaving initial
   1107 			characters). For example, the following uses the ellipsis character
   1108 			in all three cases (although some languages may have different
   1109 			characters for different positions).</p>
   1110 
   1111 		<p>
   1112 			<code>
				&lt;ellipsis type="initial"&gt;…{0}&lt;/ellipsis&gt;<br>
				&lt;ellipsis type="medial"&gt;{0}…{1}&lt;/ellipsis&gt;<br>
				&lt;ellipsis type="final"&gt;{0}…&lt;/ellipsis&gt;
   1116 			</code>
   1117 		</p>
   1118 		<p>There are alternatives for cases where the breaks are on a word
			boundary, where some languages include a space. For example, such a
			case would be:</p>
   1121 		<p>
			<code>&lt;ellipsis type="word-initial"&gt;… {0}&lt;/ellipsis&gt;</code>
   1124 		</p>
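		<p>As a sketch of how the three basic patterns might be applied when
			truncating a string, consider the following. The pattern values
			assume the ellipsis character U+2026 in each position, as the text
			describes; the function name and the way the kept characters are
			split for the medial case are assumptions of this example.</p>

```python
# Assumed pattern data: U+2026 HORIZONTAL ELLIPSIS in each position.
PATTERNS = {
    "initial": "\u2026{0}",
    "medial": "{0}\u2026{1}",
    "final": "{0}\u2026",
}


def truncate(text, keep, kind="final", patterns=PATTERNS):
    """Keep `keep` characters of text and mark the removed part."""
    if len(text) <= keep:
        return text
    if kind == "initial":            # remove the start, keep final characters
        return patterns[kind].format(text[-keep:])
    if kind == "final":              # remove the end, keep initial characters
        return patterns[kind].format(text[:keep])
    if kind == "medial":             # remove the middle, keep both ends
        head = (keep + 1) // 2
        tail = keep - head
        return patterns[kind].format(text[:head], text[len(text) - tail:])
    raise ValueError(f"unknown ellipsis type: {kind!r}")


print(truncate("Hello, world", 5, "final"))
print(truncate("Hello, world", 5, "initial"))
print(truncate("Hello, world", 4, "medial"))
```

		<p>The word-boundary variants would be selected instead when the cut
			falls between words.</p>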
   1125 
   1126 		<h3>
   1127 			3.5 <a name="Character_More_Info" href="#Character_More_Info">More
   1128 				Information</a>
   1129 		</h3>
   1130 
   1131 
   1132 		<p>The moreInformation string is one that can be displayed in an
   1133 			interface to indicate that more information is available. For
   1134 			example:</p>
   1135 		<p>&lt;moreInformation&gt;?&lt;/moreInformation&gt;</p>
   1136 		<h3> 3.6 <a name="Character_Parse_Lenient" href="#Character_Parse_Lenient">Parse Lenient</a> </h3>
   1137 		  <p  class='dtd'>&lt;!ELEMENT parseLenients ( alias | ( parseLenient*, special* ) ) &gt;<br>
   1138 		    &lt;!ATTLIST parseLenients scope (general | number | date) #REQUIRED &gt;<br>
   1139 		    &lt;!ATTLIST parseLenients level (lenient | stricter) #REQUIRED &gt;</p>
   1140 		  <p class='dtd'>&lt;!ELEMENT parseLenient ( #PCDATA ) &gt;<br>
   1141 		    &lt;!ATTLIST parseLenient sample CDATA #REQUIRED &gt;<br>
   1142 		    &lt;!ATTLIST parseLenient alt NMTOKENS #IMPLIED &gt;<br>
   1143 		    &lt;!ATTLIST parseLenient draft (approved | contributed | provisional | unconfirmed) #IMPLIED &gt;<br>
   1144 		  </p>
   1145 <p>Example:</p>
   1146 <pre>&lt;parseLenients scope=&quot;date&quot; level=&quot;lenient&quot;&gt;
   1147   &lt;parseLenient sample=&quot;-&quot;&gt;[\-./]&lt;/parseLenient&gt;
   1148   &lt;parseLenient sample=&quot;:&quot;&gt;[\:]&lt;/parseLenient&gt;
   1149 &lt;/parseLenients&gt;</pre>
   1150 <p>The parseLenient elements are used to indicate that characters within a particular UnicodeSet are normally to be treated as equivalent when doing a lenient parse. The <strong>scope</strong> attribute value defines where the lenient sets are intended for use. The <strong>level</strong> attribute value is included for future expansion; currently the only value is &quot;lenient&quot;.</p>
   1151 <p>The <strong>sample</strong> attribute value is a paradigm element of that UnicodeSet, but the only reason for pulling it out separately is so that different classes of characters are separated, and to enable inheritance overriding. The first version of this data is populated with the data used for lenient parsing from ICU.</p>
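<p>One way a lenient parser could use this data is to map every character of a set to its sample before parsing. The following is a minimal sketch under that assumption; the table mirrors the date example above, and the function and variable names are hypothetical.</p>

```python
# Sample character -> characters treated as equivalent to it (date scope).
LENIENT_DATE = {
    "-": set("-./"),
    ":": {":"},
}


def normalize_lenient(text, lenient_sets=LENIENT_DATE):
    """Replace each character found in a lenient set with that set's sample."""
    table = {}
    for sample, members in lenient_sets.items():
        for ch in members:
            table[ord(ch)] = sample
    return text.translate(table)


print(normalize_lenient("2016/01.02 10:30"))
```

<p>After normalization, a strict parser that expects only the sample characters can process the input.</p>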
   1152 
   1153 		<h2>
   1154 			4 <a name="Delimiter_Elements" href="#Delimiter_Elements">Delimiter
   1155 				Elements</a>
   1156 		</h2>
   1157 
   1158 
   1159 		<p class="dtd">&lt;!ELEMENT delimiters (alias | (quotationStart*,
   1160 			quotationEnd*, alternateQuotationStart*, alternateQuotationEnd*,
   1161 			special*)) &gt;</p>
   1162 
   1163 		<p>The delimiters supply common delimiters for bracketing
   1164 			quotations. The quotation marks are used with simple quoted text,
   1165 			such as:</p>
   1166 
   1167 		<blockquote>
			<p>He said, “Don’t be absurd!”</p>
   1169 		</blockquote>
   1170 
   1171 		<p>When quotations are nested, the quotation marks and alternate
   1172 			marks are used in an alternating fashion:</p>
   1173 
   1174 		<blockquote>
			<p>He said, “Remember what the Mad Hatter said: ‘Not the same
				thing a bit! Why you might just as well say that “I see what I eat”
				is the same thing as “I eat what I see”!’”</p>
   1178 		</blockquote>
   1179 
   1180 		<p>
			<code>&lt;quotationStart&gt;</code>
			<span style="color: blue">“</span>
			<code>&lt;/quotationStart&gt;</code>
			<br>
			<code>&lt;quotationEnd&gt;</code>
			<span style="color: blue">”</span>
			<code>&lt;/quotationEnd&gt;</code>
			<br>
			<code>&lt;alternateQuotationStart&gt;</code>
			<span style="color: blue">‘</span>
			<code>&lt;/alternateQuotationStart&gt;</code>
			<br>
			<code>&lt;alternateQuotationEnd&gt;</code>
			<span style="color: blue">’</span>
			<code>&lt;/alternateQuotationEnd&gt;</code>
   1196 		</p>
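		<p>The alternation between the primary and alternate marks can be
			sketched as follows. The default marks are assumed English-style
			values, and the function name and the <code>depth</code> parameter
			are hypothetical.</p>

```python
# Assumed values: (quotationStart, quotationEnd,
#                  alternateQuotationStart, alternateQuotationEnd)
ENGLISH_MARKS = ("\u201c", "\u201d", "\u2018", "\u2019")


def quote(text, depth=0, marks=ENGLISH_MARKS):
    """Wrap text in the marks appropriate for the given nesting depth."""
    if depth % 2 == 0:
        start, end = marks[0], marks[1]     # primary marks at even depths
    else:
        start, end = marks[2], marks[3]     # alternate marks at odd depths
    return f"{start}{text}{end}"


inner = quote("Not the same thing a bit!", depth=1)
print(quote("Remember what the Mad Hatter said: " + inner))
```

		<p>A third level of nesting would use <code>depth=2</code> and
			return to the primary marks.</p>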
   1197 
   1198 
   1199 		<h2>
   1200 			5 <a name="Measurement_System_Data" href="#Measurement_System_Data">Measurement
   1201 				System Data</a>
   1202 		</h2>
   1203 
   1204 
   1205 		<p class="dtd">
   1206 			&lt;!ELEMENT measurementData ( measurementSystem*, paperSize* ) &gt;<br>
   1207 			<br> &lt;!ELEMENT measurementSystem EMPTY &gt;<br>
   1208 			&lt;!ATTLIST measurementSystem type ( metric | US | UK ) #REQUIRED
   1209 			&gt;<br> &lt;!ATTLIST measurementSystem category ( temperature )
   1210 			#IMPLIED &gt;<br>&lt;!ATTLIST measurementSystem territories
   1211 			NMTOKENS #REQUIRED &gt;<br> <br> &lt;!ELEMENT paperSize
   1212 			EMPTY &gt;<br> &lt;!ATTLIST paperSize type ( A4 | US-Letter )
   1213 			#REQUIRED &gt;<br> &lt;!ATTLIST paperSize territories NMTOKENS
   1214 			#REQUIRED &gt;
   1215 		</p>
   1216 
   1217 		<p>The measurement system is the normal measurement system in
   1218 			common everyday use (except for date/time). For example:</p>
   1219 
   1220 		<pre>&lt;measurementData&gt;
   1221  &lt;measurementSystem type=&quot;metric&quot;  territories=&quot;001&quot;/&gt;
   1222  &lt;measurementSystem type=&quot;US&quot;  territories=&quot;LR MM US&quot;/&gt;
   1223  &lt;measurementSystem type=&quot;metric&quot; category=&quot;temperature&quot; territories=&quot;LR MM&quot;/&gt;
   1224  &lt;measurementSystem type=&quot;US&quot; category=&quot;temperature&quot; territories=&quot;BS BZ KY PR PW&quot;/&gt;
   1225  &lt;measurementSystem type=&quot;UK&quot;  territories=&quot;GB&quot;/&gt;
   1226  &lt;paperSize type=&quot;A4&quot;  territories=&quot;001&quot;/&gt;
   1227  &lt;paperSize type=&quot;US-Letter&quot;  territories=&quot;BZ CA CL CO CR GT MX NI PA PH PR SV US VE&quot;/&gt;
   1228 &lt;/measurementData&gt;</pre>
   1229 
   1230 		<p>The values are "metric", "US", or "UK"; others may be added
   1231 			over time.</p>
   1232 		<ul>
   1233 			<li>The "metric" value indicates the use of SI [<a
   1234 				href="tr35.html#ISO1000">ISO1000</a>] base or derived units, or
   1235 				non-SI units accepted for use with SI: for example, meters,
   1236 				kilograms, liters, and degrees Celsius.
   1237 			</li>
   1238 			<li>The "US" value indicates the customary system of measurement
   1239 				as used in the United States: feet, inches, pints, quarts, degrees
   1240 				Fahrenheit, and so on.</li>
   1241 			<li>The "UK" value indicates the mix of metric units and
   1242 				Imperial units (feet, inches, pints, quarts, and so on) used in the
   1243 				United Kingdom, in which Imperial volume units such
   1244 				as pint, quart, and gallon are different sizes than in the "US"
   1245 				customary system. For more detail about specific units
   1246 				for various usages, see <strong>Part 6: Supplemental:</strong> <em>Section 2.4.1
   1247 				<a href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units for
   1248 				Specific Usages</a></em>.
   1249 			</li>
   1250 		</ul>
   1251 		<p>In some cases, it may be common to use different measurement
   1252 			systems for different categories of measurements. For example, the
   1253 			following indicates that for the category of temperature, in the
   1254 			regions LR and MM, it is more common to use metric units than US
   1255 			units.</p>
   1256 
   1257 		<pre>
   1258 			&lt;measurementSystem type=&quot;metric&quot; category=&quot;temperature&quot; territories=&quot;LR MM&quot;/&gt;
   1259 		</pre>
   1260 
		<p>The paperSize element gives the height and width of paper
   1262 			used for normal business letters. The values are "A4" and
   1263 			"US-Letter".</p>
   1264 
   1265 		<p>For both measurementSystem entries and paperSize entries, later
   1266 			entries for specific territories such as "US" will override the value
   1267 			assigned to that territory by earlier entries for more inclusive
   1268 			territories such as "001".</p>
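		<p>The override rule can be illustrated with a small lookup over
			the sample data above. This is a sketch under explicit assumptions:
			"001" is treated as containing every territory (a real
			implementation would consult territory containment data), entries
			with a category are assumed to take precedence over entries without
			one, and the function name is hypothetical.</p>

```python
# (type, category, territories) tuples copied from the sample measurementData.
ENTRIES = [
    ("metric", None, "001"),
    ("US", None, "LR MM US"),
    ("metric", "temperature", "LR MM"),
    ("US", "temperature", "BS BZ KY PR PW"),
    ("UK", None, "GB"),
]


def measurement_system(territory, category=None, entries=ENTRIES):
    """Return the measurement system for a territory; later entries win."""
    result = None
    # First pass: general entries. Second pass: category-specific entries.
    passes = [None] if category is None else [None, category]
    for wanted in passes:
        for system, entry_category, territories in entries:
            codes = territories.split()
            if entry_category == wanted and ("001" in codes or territory in codes):
                result = system
    return result


print(measurement_system("FR"))
print(measurement_system("US"))
print(measurement_system("LR", "temperature"))
```

		<p>For "US", the entry for "001" matches first and is then
			overridden by the later, more specific entry.</p>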
   1269 
   1270 		<p>The measurement information was formerly in the main LDML file,
   1271 			and had a somewhat different format.</p>
   1272 			
   1273 		<p>Again, for finer-grained detail about specific units
   1274 			for various usages, see <strong>Part 6: Supplemental:</strong> <em>Section 2.4.1
   1275 			<a href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units for
   1276 			Specific Usages</a></em>.</p>
   1277 
   1278 		<h3>
   1279 			5.1 <a name="Measurement_Elements" href="#Measurement_Elements">Measurement
   1280 				Elements (deprecated)</a>
   1281 		</h3>
   1282 
   1283 
   1284 		<p class="dtd">&lt;!ELEMENT measurement (alias |
   1285 			(measurementSystem?, paperSize?, special*)) &gt;</p>
   1286 		<p>The measurement element is deprecated in the main LDML files,
   1287 			because the data is more appropriately organized as connected to
   1288 			territories, not to linguistic data. Instead, the measurementData
   1289 			element in the supplemental data file should be used.</p>
   1290 
   1291 
   1292 		<h2>
   1293 			6 <a name="Unit_Elements" href="#Unit_Elements">Unit Elements</a>
   1294 		</h2>
   1295 
   1296 
   1297 		<p class="dtd">
   1298 			&lt;!ELEMENT units (alias | (unit*, unitLength*, durationUnit*,
   1299 			special*) ) &gt;<br> <br> &lt;!ELEMENT unitLength (alias |
   1300 			(compoundUnit*, unit*, coordinateUnit*, special*) ) &gt;<br>
   1301 			&lt;!ATTLIST unitLength type (long | short | narrow) #REQUIRED &gt; <br>
   1302 			<br> &lt;!ELEMENT compoundUnit (alias | (compoundUnitPattern*,
   1303 			special*) ) &gt;<br> &lt;!ATTLIST compoundUnit type NMTOKEN
   1304 			#REQUIRED &gt; <br> <br> &lt;!ELEMENT unit (alias |
   1305 			(displayName*, unitPattern*, perUnitPattern*, special*) ) &gt;<br>
   1306 			&lt;!ATTLIST unit type NMTOKEN #REQUIRED &gt; <br> <br>
   1307 			&lt;!ELEMENT durationUnit (alias | (durationUnitPattern*, special*) )
   1308 			&gt;<br> &lt;!ATTLIST durationUnit type NMTOKEN #REQUIRED &gt; <br>
   1309 			<br> &lt;!ELEMENT unitPattern ( #PCDATA ) &gt;<br>
   1310 			&lt;!ATTLIST unitPattern count (0 | 1 | zero | one | two | few | many
   1311 			| other) #REQUIRED &gt; <br> <br> &lt;!ELEMENT
   1312 			compoundUnitPattern ( #PCDATA ) &gt;<br> <br> &lt;!ELEMENT
   1313 			coordinateUnit ( alias | ( displayName*, coordinateUnitPattern*, special* ) ) &gt;<br>&lt;!ELEMENT
   1314 			coordinateUnitPattern ( #PCDATA ) &gt;<br> &lt;!ATTLIST
   1315 			coordinateUnitPattern type (north | east | south | west) #REQUIRED
   1316 			&gt; <br> <br> &lt;!ELEMENT durationUnitPattern ( #PCDATA )
   1317 			&gt;<br>
   1318 		</p>
   1319 
   1320 		<p>These elements specify the localized way of formatting
   1321 			quantities of units such as years, months, days, hours, minutes and
			seconds; for example, in English, "1 day" or "3 days". The English
   1323 			rules that produce this example are as follows ({0} indicates the
   1324 			position of the formatted numeric value):</p>
   1325 
   1326 		<pre>&lt;unit type="duration-day"&gt;
   1327 &nbsp;&nbsp;&lt;displayName&gt;days&lt;/displayName&gt;
&nbsp;&nbsp;&lt;unitPattern count="one"&gt;<span style="color: blue">{0} day</span>&lt;/unitPattern&gt;
&nbsp;&nbsp;&lt;unitPattern count="other"&gt;<span style="color: blue">{0} days</span>&lt;/unitPattern&gt;
   1330 &lt;/unit&gt;</pre>
   1331 
   1332 		<p>In addition to supporting language-specific plural cases
   1333 			such as one and other, unitPatterns support the language-independent
   1334 			explicit cases 0 and 1 for special handling of numeric values that are
   1335 			exactly 0 or 1; see
   1336 			<a href="tr35-numbers.html#Explicit_0_1_rules">Explicit 0 and 1 rules</a>.</p>
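		<p>The selection of a unitPattern can be sketched in Python as
			follows. This is a minimal illustration, not a normative algorithm:
			the dictionary layout, the function names, and the simplified English
			plural rule (integer 1 is "one", everything else is "other") are
			assumptions made for the example.</p>

```python
# Hypothetical in-memory form of the English data for duration-day.
DAY_PATTERNS = {"one": "{0} day", "other": "{0} days"}


def english_plural_category(n):
    """Simplified English rule (assumption): integer 1 is "one", all else "other"."""
    return "one" if n == 1 else "other"


def format_unit(n, patterns, plural_category=english_plural_category):
    """Pick a unitPattern for the integer n and substitute the number for {0}."""
    # Explicit count="0" / count="1" patterns win when n is exactly 0 or 1.
    for explicit in ("0", "1"):
        if n == int(explicit) and explicit in patterns:
            return patterns[explicit].replace("{0}", str(n))
    # Otherwise use the plural category, falling back to "other".
    category = plural_category(n)
    pattern = patterns.get(category, patterns["other"])
    return pattern.replace("{0}", str(n))


print(format_unit(1, DAY_PATTERNS))  # 1 day
print(format_unit(3, DAY_PATTERNS))  # 3 days
```

		<p>A real implementation would take the plural category from the
			locale's plural rules rather than from the hard-coded English rule
			used here.</p>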
   1337 		<p>
   1338 			Units, like other values with a <strong>count</strong> attribute, use
   1339 			a special inheritance. See <strong>Part 1: Core:</strong> <em>Section
   1340 				4.1 <a href="tr35.html#Multiple_Inheritance">Multiple
   1341 					Inheritance</a>
   1342 			</em>.
   1343 		</p>
		<p>The displayName is used for labels, such as in a UI. It is
			typically lowercased and uses as neutral a plural form as possible;
			the casing context then determines the proper display. For example,
			in an English UI it would appear in titlecase:</p>
   1348 		<p>
   1349 			<strong>Duration:</strong>
   1350 		</p>
   1351 		<table style="margin-left: 5em">
   1352 			<tr>
   1353 				<td>Days</td>
   1354 				<td style="color: silver">enter the vacation length</td>
   1355 			</tr>
   1356 		</table>
   1357 		<p>&nbsp;</p>
		<p>The values of the type attribute are <em>unit identifiers</em>. Syntactically, they have the following structure:</p>
   1359 		<div class='syntax'>
   1360 		<p>unit_identifier := type &quot;-&quot; unit</p>
   1361 		<p>type := [a-z]+</p>
   1362 		<p>unit := [a-z]+([-][a-z]+)*</p>
   1363 		</div>
   1364 		<p>Example:		</p>
   1365 		<p class="xmlExample">&lt;unit
   1366 			type=&quot;acceleration-g-force&quot;&gt;</p>
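		<p>The syntax above can be checked with a regular expression. The
			following Python sketch follows the grammar literally; the function
			name split_unit_identifier is hypothetical and introduced only for
			this example.</p>

```python
import re

# unit_identifier := type "-" unit
# type := [a-z]+
# unit := [a-z]+([-][a-z]+)*
UNIT_IDENTIFIER = re.compile(r"([a-z]+)-([a-z]+(?:-[a-z]+)*)")


def split_unit_identifier(identifier):
    """Return (type, unit) for a well-formed identifier, or None otherwise."""
    match = UNIT_IDENTIFIER.fullmatch(identifier)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_unit_identifier("acceleration-g-force"))  # ('acceleration', 'g-force')
```

		<p>Because the type cannot contain a hyphen, the first hyphen always
			separates the type from the unit. Note that the grammar as written
			admits only lowercase letters, so an identifier containing digits
			would not match this sketch.</p>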
   1367 		<p></p>
   1368 		<p>
			The following table lists examples of unit identifiers; it is not exhaustive. The units in CLDR are not comprehensive, and it is anticipated that
			more will be added over time. The complete list of supported units is in the
			validity data: see <em>Section <a href="tr35.html#Validity_Data">3.11
					Validity Data</a></em>.
   1373 		</p>
   1374 		<table>
   1375 			<tr>
   1376 				<td><strong>Type</strong></td>
   1377 				<td><strong>Unit</strong></td>
				<td><strong>Sample Format</strong></td>
				<td><strong>Notes</strong></td>
   1379 			</tr>
   1380 			<tr>
   1381 				<td><em>acceleration</em></td>
   1382 				<td>g-force</td>
   1383 				<td>{0} G</td>
   1384 			</tr>
   1385 			<tr>
   1386 				<td><em>acceleration</em></td>
   1387 				<td>meter-per-second-squared</td>
				<td>{0} m/s&sup2;</td>
   1389 			</tr>
   1390 			<tr>
   1391 				<td><em>angle</em></td>
   1392 				<td>revolution</td>
   1393 				<td>{0} rev</td>
   1394 			</tr>
   1395 			<tr>
   1396 				<td><em>angle</em></td>
   1397 				<td>radian</td>
   1398 				<td>{0} rad</td>
   1399 			</tr>
   1400 			<tr>
   1401 				<td><em>angle</em></td>
   1402 				<td>degree</td>
				<td>{0}&deg;</td>
   1404 			</tr>
   1405 			<tr>
   1406 				<td><em>angle</em></td>
   1407 				<td>arc-minute</td>
				<td>{0}&prime;</td>
   1409 			</tr>
   1410 			<tr>
   1411 				<td><em>angle</em></td>
   1412 				<td>arc-second</td>
				<td>{0}&Prime;</td>
   1414 			</tr>
   1415 			<tr>
   1416 				<td><em>area</em></td>
   1417 				<td>square-kilometer</td>
				<td>{0} km&sup2;</td>
   1419 			</tr>
   1420 			<tr>
   1421 				<td><em>area</em></td>
   1422 				<td>hectare</td>
   1423 				<td>{0} ha</td>
   1424 			</tr>
   1425 			<tr>
   1426 				<td><em>area</em></td>
   1427 				<td>square-meter</td>
				<td>{0} m&sup2;</td>
   1429 			</tr>
   1430 			<tr>
   1431 				<td><em>area</em></td>
   1432 				<td>square-centimeter</td>
				<td>{0} cm&sup2;</td>
   1434 			</tr>
   1435 			<tr>
   1436 				<td><em>area</em></td>
   1437 				<td>square-mile</td>
				<td>{0} mi&sup2;</td>
   1439 			</tr>
   1440 			<tr>
   1441 				<td><em>area</em></td>
   1442 				<td>acre</td>
   1443 				<td>{0} ac</td>
   1444 			</tr>
   1445 			<tr>
   1446 				<td><em>area</em></td>
   1447 				<td>square-yard</td>
				<td>{0} yd&sup2;</td>
   1449 			</tr>
   1450 			<tr>
   1451 				<td><em>area</em></td>
   1452 				<td>square-foot</td>
				<td>{0} ft&sup2;</td>
   1454 			</tr>
   1455 			<tr>
   1456 				<td><em>area</em></td>
   1457 				<td>square-inch</td>
				<td>{0} in&sup2;</td>
   1459 			</tr>
   1460 			<tr>
   1461 				<td><em>concentr</em></td>
   1462 				<td>karat</td>
   1463 				<td>{0} kt</td>
   1464 				<td>dimensionless</td>
   1465 			</tr>
   1466 			<tr>
   1467 				<td><em>concentr</em></td>
   1468 				<td>milligram-per-deciliter</td>
   1469 				<td>{0} mg/dL</td>
   1470 			</tr>
   1471 			<tr>
   1472 				<td><em>concentr</em></td>
   1473 				<td>millimole-per-liter</td>
   1474 				<td>{0} mmol/L</td>
   1475 			</tr>
   1476 			<tr>
   1477 				<td><em>concentr</em></td>
   1478 				<td>part-per-million</td>
   1479 				<td>{0} ppm</td>
   1480 				<td>dimensionless</td>
   1481 			</tr>
   1482 			<tr>
   1483 				<td><em>concentr</em></td>
   1484 				<td>percent</td>
   1485 				<td>{0}%</td>
   1486 				<td>dimensionless</td>
   1487 			</tr>
   1488 			<tr>
   1489 				<td><em>concentr</em></td>
   1490 				<td>permille</td>
				<td>{0}&permil;</td>
   1492 				<td>dimensionless</td>
   1493 			</tr>
   1494 			<tr>
   1495 				<td><em>consumption</em></td>
   1496 				<td>liter-per-kilometer</td>
   1497 				<td>{0} L/km</td>
   1498 			</tr>
   1499 			<tr>
   1500 				<td><em>consumption</em></td>
   1501 				<td>liter-per-100kilometers</td>
   1502 				<td>{0} L/100km</td>
   1503 			</tr>
   1504 			<tr>
   1505 				<td><em>consumption</em></td>
   1506 				<td>mile-per-gallon (US)</td>
   1507 				<td>{0} mpg</td>
   1508 			</tr>
   1509 			<tr>
   1510 				<td><em>consumption</em></td>
   1511 				<td>mile-per-gallon-imperial</td>
   1512 				<td>{0} mpg Imp.</td>
   1513 			</tr>
   1514 			<tr>
   1515 				<td><em>digital</em></td>
   1516 				<td>petabyte</td>
   1517 				<td>{0} PB</td>
   1518 			</tr>
   1519 			<tr>
   1520 				<td><em>digital</em></td>
   1521 				<td>terabyte</td>
   1522 				<td>{0} TB</td>
   1523 			</tr>
   1524 			<tr>
   1525 				<td><em>digital</em></td>
   1526 				<td>terabit</td>
   1527 				<td>{0} Tb</td>
   1528 			</tr>
   1529 			<tr>
   1530 				<td><em>digital</em></td>
   1531 				<td>gigabyte</td>
   1532 				<td>{0} GB</td>
   1533 			</tr>
   1534 			<tr>
   1535 				<td><em>digital</em></td>
   1536 				<td>gigabit</td>
   1537 				<td>{0} Gb</td>
   1538 			</tr>
   1539 			<tr>
   1540 				<td><em>digital</em></td>
   1541 				<td>megabyte</td>
   1542 				<td>{0} MB</td>
   1543 			</tr>
   1544 			<tr>
   1545 				<td><em>digital</em></td>
   1546 				<td>megabit</td>
   1547 				<td>{0} Mb</td>
   1548 			</tr>
   1549 			<tr>
   1550 				<td><em>digital</em></td>
   1551 				<td>kilobyte</td>
   1552 				<td>{0} kB</td>
   1553 			</tr>
   1554 			<tr>
   1555 				<td><em>digital</em></td>
   1556 				<td>kilobit</td>
   1557 				<td>{0} kb</td>
   1558 			</tr>
   1559 			<tr>
   1560 				<td><em>digital</em></td>
   1561 				<td>byte</td>
   1562 				<td>{0} byte</td>
   1563 			</tr>
   1564 			<tr>
   1565 				<td><em>digital</em></td>
   1566 				<td>bit</td>
   1567 				<td>{0} bit</td>
   1568 			</tr>
   1569 			<tr>
   1570 				<td><em>duration</em></td>
   1571 				<td>century</td>
   1572 				<td>{0} c</td>
   1573 			</tr>
   1574 			<tr>
   1575 				<td><em>duration</em></td>
   1576 				<td>year</td>
   1577 				<td>{0} y</td>
   1578 			</tr>
   1579 			<tr>
   1580 				<td><em>duration</em></td>
   1581 				<td>year-person</td>
   1582 				<td>{0} y</td>
   1583 			</tr>
   1584 			<tr>
   1585 				<td><em>duration</em></td>
   1586 				<td>month</td>
   1587 				<td>{0} m</td>
   1588 			</tr>
   1589 			<tr>
   1590 				<td><em>duration</em></td>
   1591 				<td>month-person</td>
   1592 				<td>{0} m</td>
   1593 			</tr>
   1594 			<tr>
   1595 				<td><em>duration</em></td>
   1596 				<td>week</td>
   1597 				<td>{0} w</td>
   1598 			</tr>
   1599 			<tr>
   1600 				<td><em>duration</em></td>
   1601 				<td>week-person</td>
   1602 				<td>{0} w</td>
   1603 			</tr>
   1604 			<tr>
   1605 				<td><em>duration</em></td>
   1606 				<td>day</td>
   1607 				<td>{0} d</td>
   1608 			</tr>
   1609 			<tr>
   1610 				<td><em>duration</em></td>
   1611 				<td>day-person</td>
   1612 				<td>{0} d</td>
   1613 			</tr>
   1614 			<tr>
   1615 				<td><em>duration</em></td>
   1616 				<td>hour</td>
   1617 				<td>{0} h</td>
   1618 			</tr>
   1619 			<tr>
   1620 				<td><em>duration</em></td>
   1621 				<td>minute</td>
   1622 				<td>{0} min</td>
   1623 			</tr>
   1624 			<tr>
   1625 				<td><em>duration</em></td>
   1626 				<td>second</td>
   1627 				<td>{0} s</td>
   1628 			</tr>
   1629 			<tr>
   1630 				<td><em>duration</em></td>
   1631 				<td>millisecond</td>
   1632 				<td>{0} ms</td>
   1633 			</tr>
   1634 			<tr>
   1635 				<td><em>duration</em></td>
   1636 				<td>microsecond</td>
				<td>{0} &mu;s</td>
   1638 			</tr>
   1639 			<tr>
   1640 				<td><em>duration</em></td>
   1641 				<td>nanosecond</td>
   1642 				<td>{0} ns</td>
   1643 			</tr>
   1644 			<tr>
   1645 				<td><em>electric</em></td>
   1646 				<td>ampere</td>
   1647 				<td>{0} A</td>
   1648 			</tr>
   1649 			<tr>
   1650 				<td><em>electric</em></td>
   1651 				<td>milliampere</td>
   1652 				<td>{0} mA</td>
   1653 			</tr>
   1654 			<tr>
   1655 				<td><em>electric</em></td>
   1656 				<td>ohm</td>
				<td>{0} &Omega;</td>
   1658 			</tr>
   1659 			<tr>
   1660 				<td><em>electric</em></td>
   1661 				<td>volt</td>
   1662 				<td>{0} V</td>
   1663 			</tr>
   1664 			<tr>
   1665 				<td><em>energy</em></td>
   1666 				<td>kilocalorie</td>
   1667 				<td>{0} kcal</td>
   1668 			</tr>
   1669 			<tr>
   1670 				<td><em>energy</em></td>
   1671 				<td>calorie</td>
   1672 				<td>{0} cal</td>
   1673 			</tr>
   1674 			<tr>
   1675 				<td><em>energy</em></td>
   1676 				<td>foodcalorie</td>
   1677 				<td>{0} Cal</td>
   1678 			</tr>
   1679 			<tr>
   1680 				<td><em>energy</em></td>
   1681 				<td>kilojoule</td>
   1682 				<td>{0} kJ</td>
   1683 			</tr>
   1684 			<tr>
   1685 				<td><em>energy</em></td>
   1686 				<td>joule</td>
   1687 				<td>{0} J</td>
   1688 			</tr>
   1689 			<tr>
   1690 				<td><em>energy</em></td>
   1691 				<td>kilowatt-hour</td>
   1692 				<td>{0} kWh</td>
   1693 			</tr>
   1694 			<tr>
   1695 				<td><em>frequency</em></td>
   1696 				<td>gigahertz</td>
   1697 				<td>{0} GHz</td>
   1698 			</tr>
   1699 			<tr>
   1700 				<td><em>frequency</em></td>
   1701 				<td>megahertz</td>
   1702 				<td>{0} MHz</td>
   1703 			</tr>
   1704 			<tr>
   1705 				<td><em>frequency</em></td>
   1706 				<td>kilohertz</td>
   1707 				<td>{0} kHz</td>
   1708 			</tr>
   1709 			<tr>
   1710 				<td><em>frequency</em></td>
   1711 				<td>hertz</td>
   1712 				<td>{0} Hz</td>
   1713 			</tr>
   1714 			<tr>
   1715 				<td><em>length</em></td>
   1716 				<td>kilometer</td>
   1717 				<td>{0} km</td>
   1718 			</tr>
   1719 			<tr>
   1720 				<td><em>length</em></td>
   1721 				<td>meter</td>
   1722 				<td>{0} m</td>
   1723 			</tr>
   1724 			<tr>
   1725 				<td><em>length</em></td>
   1726 				<td>decimeter</td>
   1727 				<td>{0} dm</td>
   1728 			</tr>
   1729 			<tr>
   1730 				<td><em>length</em></td>
   1731 				<td>centimeter</td>
   1732 				<td>{0} cm</td>
   1733 			</tr>
   1734 			<tr>
   1735 				<td><em>length</em></td>
   1736 				<td>millimeter</td>
   1737 				<td>{0} mm</td>
   1738 			</tr>
   1739 			<tr>
   1740 				<td><em>length</em></td>
   1741 				<td>micrometer</td>
				<td>{0} &mu;m</td>
   1743 			</tr>
   1744 			<tr>
   1745 				<td><em>length</em></td>
   1746 				<td>nanometer</td>
   1747 				<td>{0} nm</td>
   1748 			</tr>
   1749 			<tr>
   1750 				<td><em>length</em></td>
   1751 				<td>picometer</td>
   1752 				<td>{0} pm</td>
   1753 			</tr>
   1754 			<tr>
   1755 				<td><em>length</em></td>
   1756 				<td>mile</td>
   1757 				<td>{0} mi</td>
   1758 			</tr>
   1759 			<tr>
   1760 				<td><em>length</em></td>
   1761 				<td>yard</td>
   1762 				<td>{0} yd</td>
   1763 			</tr>
   1764 			<tr>
   1765 				<td><em>length</em></td>
   1766 				<td>foot</td>
   1767 				<td>{0} ft</td>
   1768 			</tr>
   1769 			<tr>
   1770 				<td><em>length</em></td>
   1771 				<td>inch</td>
   1772 				<td>{0} in</td>
   1773 			</tr>
   1774 			<tr>
   1775 				<td><em>length</em></td>
   1776 				<td>parsec</td>
   1777 				<td>{0} pc</td>
   1778 			</tr>
   1779 			<tr>
   1780 				<td><em>length</em></td>
   1781 				<td>light-year</td>
   1782 				<td>{0} ly</td>
   1783 			</tr>
   1784 			<tr>
   1785 				<td><em>length</em></td>
   1786 				<td>astronomical-unit</td>
   1787 				<td>{0} au</td>
   1788 			</tr>
   1789 			<tr>
   1790 				<td><em>length</em></td>
   1791 				<td>furlong</td>
   1792 				<td>{0} fur</td>
   1793 			</tr>
   1794 			<tr>
   1795 				<td><em>length</em></td>
   1796 				<td>fathom</td>
   1797 				<td>{0} fm</td>
   1798 			</tr>
   1799 			<tr>
   1800 				<td><em>length</em></td>
   1801 				<td>nautical-mile</td>
   1802 				<td>{0} nmi</td>
   1803 			</tr>
   1804 			<tr>
   1805 				<td><em>length</em></td>
   1806 				<td>mile-scandinavian</td>
   1807 				<td>{0} smi</td>
   1808 			</tr>
   1809 			<tr>
   1810 				<td><em>length</em></td>
   1811 				<td>point</td>
   1812 				<td>{0} pt</td>
   1813 				<td> typographic point, 1/72 inch</td>
   1814 			</tr>
   1815 			<tr>
   1816 				<td><em>light</em></td>
   1817 				<td>lux</td>
   1818 				<td>{0} lx</td>
   1819 			</tr>
   1820 			<tr>
   1821 				<td><em>mass</em></td>
   1822 				<td>metric-ton</td>
   1823 				<td>{0} t</td>
   1824 			</tr>
   1825 			<tr>
   1826 				<td><em>mass</em></td>
   1827 				<td>kilogram</td>
   1828 				<td>{0} kg</td>
   1829 			</tr>
   1830 			<tr>
   1831 				<td><em>mass</em></td>
   1832 				<td>gram</td>
   1833 				<td>{0} g</td>
   1834 			</tr>
   1835 			<tr>
   1836 				<td><em>mass</em></td>
   1837 				<td>milligram</td>
   1838 				<td>{0} mg</td>
   1839 			</tr>
   1840 			<tr>
   1841 				<td><em>mass</em></td>
   1842 				<td>microgram</td>
				<td>{0} &mu;g</td>
   1844 			</tr>
   1845 			<tr>
   1846 				<td><em>mass</em></td>
   1847 				<td>ton</td>
   1848 				<td>{0} tn</td>
   1849 			</tr>
   1850 			<tr>
   1851 				<td><em>mass</em></td>
   1852 				<td>stone</td>
   1853 				<td>{0} st</td>
   1854 			</tr>
   1855 			<tr>
   1856 				<td><em>mass</em></td>
   1857 				<td>pound</td>
   1858 				<td>{0} lb</td>
   1859 			</tr>
   1860 			<tr>
   1861 				<td><em>mass</em></td>
   1862 				<td>ounce</td>
   1863 				<td>{0} oz</td>
   1864 			</tr>
   1865 			<tr>
   1866 				<td><em>mass</em></td>
   1867 				<td>ounce-troy</td>
   1868 				<td>{0} oz t</td>
   1869 			</tr>
   1870 			<tr>
   1871 				<td><em>mass</em></td>
   1872 				<td>carat</td>
   1873 				<td>{0} CD</td>
   1874 			</tr>
   1875 			<tr>
   1876 				<td><em>power</em></td>
   1877 				<td>gigawatt</td>
   1878 				<td>{0} GW</td>
   1879 			</tr>
   1880 			<tr>
   1881 				<td><em>power</em></td>
   1882 				<td>megawatt</td>
   1883 				<td>{0} MW</td>
   1884 			</tr>
   1885 			<tr>
   1886 				<td><em>power</em></td>
   1887 				<td>kilowatt</td>
   1888 				<td>{0} kW</td>
   1889 			</tr>
   1890 			<tr>
   1891 				<td><em>power</em></td>
   1892 				<td>watt</td>
   1893 				<td>{0} W</td>
   1894 			</tr>
   1895 			<tr>
   1896 				<td><em>power</em></td>
   1897 				<td>milliwatt</td>
   1898 				<td>{0} mW</td>
   1899 			</tr>
   1900 			<tr>
   1901 				<td><em>power</em></td>
   1902 				<td>horsepower</td>
   1903 				<td>{0} hp</td>
   1904 			</tr>
   1905 			<tr>
   1906 				<td><em>pressure</em></td>
   1907 				<td>hectopascal</td>
   1908 				<td>{0} hPa</td>
   1909 			</tr>
   1910 			<tr>
   1911 				<td><em>pressure</em></td>
   1912 				<td>millimeter-of-mercury</td>
   1913 				<td>{0} mm Hg</td>
   1914 			</tr>
   1915 			<tr>
   1916 				<td><em>pressure</em></td>
   1917 				<td>pound-per-square-inch</td>
   1918 				<td>{0} psi</td>
   1919 			</tr>
   1920 			<tr>
   1921 				<td><em>pressure</em></td>
   1922 				<td>inch-hg</td>
   1923 				<td>{0} inHg</td>
   1924 			</tr>
   1925 			<tr>
   1926 				<td><em>pressure</em></td>
   1927 				<td>millibar</td>
   1928 				<td>{0} mbar</td>
   1929 			</tr>
   1930 			<tr>
   1931 				<td><em>pressure</em></td>
   1932 				<td>atmosphere</td>
   1933 				<td>{0} atm</td>
   1934 			</tr>
   1935 			<tr>
   1936 				<td><em>speed</em></td>
   1937 				<td>kilometer-per-hour</td>
   1938 				<td>{0} km/h</td>
   1939 			</tr>
   1940 			<tr>
   1941 				<td><em>speed</em></td>
   1942 				<td>meter-per-second</td>
   1943 				<td>{0} m/s</td>
   1944 			</tr>
   1945 			<tr>
   1946 				<td><em>speed</em></td>
   1947 				<td>mile-per-hour</td>
   1948 				<td>{0} mi/h</td>
   1949 			</tr>
   1950 			<tr>
   1951 				<td><em>speed</em></td>
   1952 				<td>knot</td>
   1953 				<td>{0} kn</td>
   1954 			</tr>
   1955 			<tr>
   1956 				<td><em>temperature</em></td>
   1957 				<td>generic</td>
				<td>{0}&deg;</td>
   1959 			</tr>
   1960 			<tr>
   1961 				<td><em>temperature</em></td>
   1962 				<td>celsius</td>
				<td>{0}&deg;C</td>
   1964 			</tr>
   1965 			<tr>
   1966 				<td><em>temperature</em></td>
   1967 				<td>fahrenheit</td>
				<td>{0}&deg;F</td>
   1969 			</tr>
   1970 			<tr>
   1971 				<td><em>temperature</em></td>
   1972 				<td>kelvin</td>
   1973 				<td>{0} K</td>
   1974 			</tr>
   1975 			<tr>
   1976 				<td><em>volume</em></td>
   1977 				<td>cubic-kilometer</td>
				<td>{0} km&sup3;</td>
   1979 			</tr>
   1980 			<tr>
   1981 				<td><em>volume</em></td>
   1982 				<td>cubic-meter</td>
				<td>{0} m&sup3;</td>
   1984 			</tr>
   1985 			<tr>
   1986 				<td><em>volume</em></td>
   1987 				<td>cubic-centimeter</td>
				<td>{0} cm&sup3;</td>
   1989 			</tr>
   1990 			<tr>
   1991 				<td><em>volume</em></td>
   1992 				<td>cubic-mile</td>
				<td>{0} mi&sup3;</td>
   1994 			</tr>
   1995 			<tr>
   1996 				<td><em>volume</em></td>
   1997 				<td>cubic-yard</td>
				<td>{0} yd&sup3;</td>
   1999 			</tr>
   2000 			<tr>
   2001 				<td><em>volume</em></td>
   2002 				<td>cubic-foot</td>
				<td>{0} ft&sup3;</td>
   2004 			</tr>
   2005 			<tr>
   2006 				<td><em>volume</em></td>
   2007 				<td>cubic-inch</td>
				<td>{0} in&sup3;</td>
   2009 			</tr>
   2010 			<tr>
   2011 				<td><em>volume</em></td>
   2012 				<td>megaliter</td>
   2013 				<td>{0} ML</td>
   2014 			</tr>
   2015 			<tr>
   2016 				<td><em>volume</em></td>
   2017 				<td>hectoliter</td>
   2018 				<td>{0} hL</td>
   2019 			</tr>
   2020 			<tr>
   2021 				<td><em>volume</em></td>
   2022 				<td>liter</td>
   2023 				<td>{0} L</td>
   2024 			</tr>
   2025 			<tr>
   2026 				<td><em>volume</em></td>
   2027 				<td>deciliter</td>
   2028 				<td>{0} dL</td>
   2029 			</tr>
   2030 			<tr>
   2031 				<td><em>volume</em></td>
   2032 				<td>centiliter</td>
   2033 				<td>{0} cL</td>
   2034 			</tr>
   2035 			<tr>
   2036 				<td><em>volume</em></td>
   2037 				<td>milliliter</td>
   2038 				<td>{0} mL</td>
   2039 			</tr>
   2040 			<tr>
   2041 				<td><em>volume</em></td>
   2042 				<td>pint-metric</td>
   2043 				<td>{0} mpt</td>
   2044 			</tr>
   2045 			<tr>
   2046 				<td><em>volume</em></td>
   2047 				<td>cup-metric</td>
   2048 				<td>{0} mc</td>
   2049 			</tr>
   2050 			<tr>
   2051 				<td><em>volume</em></td>
   2052 				<td>acre-foot</td>
   2053 				<td>{0} ac ft</td>
   2054 			</tr>
   2055 			<tr>
   2056 				<td><em>volume</em></td>
   2057 				<td>bushel</td>
   2058 				<td>{0} bu</td>
   2059 			</tr>
   2060 			<tr>
   2061 				<td><em>volume</em></td>
   2062 				<td>gallon (US)</td>
   2063 				<td>{0} gal</td>
   2064 			</tr>
   2065 			<tr>
   2066 				<td><em>volume</em></td>
   2067 				<td>gallon-imperial</td>
   2068 				<td>{0} gal Imp.</td>
   2069 			</tr>
   2070 			<tr>
   2071 				<td><em>volume</em></td>
   2072 				<td>quart</td>
   2073 				<td>{0} qt</td>
   2074 			</tr>
   2075 			<tr>
   2076 				<td><em>volume</em></td>
   2077 				<td>pint</td>
   2078 				<td>{0} pt</td>
   2079 			</tr>
   2080 			<tr>
   2081 				<td><em>volume</em></td>
   2082 				<td>cup</td>
   2083 				<td>{0} c</td>
   2084 			</tr>
   2085 			<tr>
   2086 				<td><em>volume</em></td>
   2087 				<td>fluid-ounce</td>
   2088 				<td>{0} fl oz</td>
   2089 			</tr>
   2090 			<tr>
   2091 				<td><em>volume</em></td>
   2092 				<td>tablespoon</td>
   2093 				<td>{0} tbsp</td>
   2094 			</tr>
   2095 			<tr>
   2096 				<td><em>volume</em></td>
   2097 				<td>teaspoon</td>
   2098 				<td>{0} tsp</td>
   2099 			</tr>
   2100 		</table>
   2101 		<p>
   2102 			There are three widths: <strong>long</strong>, <strong>short</strong>,
   2103 			and <strong>narrow</strong>. As usual, the narrow forms may not be
			unique: in English, 1&prime; could mean 1 minute of arc or 1 foot. Thus
   2105 			narrow forms should only be used where the context makes the meaning
   2106 			clear.
   2107 		</p>
   2108 		<p>
   2109 			Where the unit of measurement is one of the <a
   2110 				href="http://physics.nist.gov/cuu/Units/units.html">International
   2111 				System of Units (SI)</a>, the short and narrow forms will typically use
			the international symbols, such as &quot;mm&quot; for millimeter. They may,
			however, be different if that is customary for the language or
			locale. For example, in Russian it may be more typical to see the
			Cyrillic characters &quot;&#1084;&#1084;&quot;.
   2116 		</p>
		<p>Units are included for translation even where they are not
			typically used in a particular locale, such as kilometers in the US,
			or inches in Germany. This is to account for use by travelers and
			specialized domains, such as the German &quot;Fernseher von 32 bis 55
			Zoll (80 bis 140 cm)&quot; for TV screen size in inches and centimeters.</p>
		<p>For temperature, there is a special unit &lt;unit
			type=&quot;temperature-generic&quot;&gt;, which is used when it is
			clear from context whether Celsius or Fahrenheit is implied.</p>
		<p>For duration, there are special units such as &lt;unit
			type=&quot;duration-year-person&quot;&gt; and &lt;unit
			type=&quot;duration-week-person&quot;&gt; for indicating the age of a
			person, which requires special forms in some languages. For example,
			in "zh", references to a person being 3 days old or 30 years old
			would use the forms &quot;3&#22825;&quot; and &quot;30&#23681;&quot; respectively.</p>
   2131 		<h3>
   2132 			6.1 <a name="perUnitPatterns" href="#perUnitPatterns">per Unit
   2133 				patterns</a><a name="compoundUnitPattern" href="#compoundUnitPattern"></a>
   2134 		</h3>
   2135 		<p>
   2136 			A common combination of units is X per Y, such as <em>miles per
   2137 				hour</em> or <em>liters per second</em>. Some units already have
   2138 			'precomputed' forms, such as <strong>kilometer-per-hour</strong>;
   2139 			where such units exist, they should be used in preference. There are
   2140 			two other patterns that can be used to compose unit symbols or names.
   2141 		</p>
   2142 		<p>
			<strong>compoundUnit</strong>: This is used to construct a pattern
			from two unit names. For example, a form such as &quot;{0} per
			{1}&quot; or &quot;{0}/{1}&quot; can be used to construct cases such
			as &quot;2 feet<strong> per </strong>second&quot; or &quot;ft<strong>/</strong>s&quot;.
   2147 		</p>
   2148 		<p>
			<strong>perUnitPattern</strong>: This is used as the denominator
   2150 			with another unit name. For example, a form such as &quot;{0} per
   2151 			second&quot; can be used to form &quot;2 feet<strong> per
   2152 				second</strong>&quot;.
   2153 		</p>
   2154 		<p>The difference between these is that in some inflected
   2155 			languages, the compoundUnit cannot be used to form grammatical
			phrases. This is typically because &quot;per&quot; and
			&quot;second&quot; combine in a non-trivial way. For such languages,
   2158 			the compoundUnit should only be used as a fallback, when there is no
   2159 			other recourse.</p>
		<p>When constructing a pattern for value=V, numeratorUnit=N,
			denominatorUnit=D, the following process is used.</p>
   2162 		<ol>
   2163 			<li>If there is a compound form for N/D already available, use
   2164 				it.</li>
   2165 			<li>Otherwise, format the N pattern with the number using plural
   2166 				categories.
   2167 				<ul>
					<li>&rArr; &quot;3 kilograms&quot;</li>
   2169 				</ul>
   2170 			</li>
   2171 			<li>See if there is a <strong>perUnitPattern</strong> for D.
   2172 
   2173 				<ol>
   2174 					<li>If so, then substitute the formatted numerator into the <strong>perUnitPattern</strong>
   2175 						<ul>
							<li>&quot;3 kilograms&quot; + &quot;{0} per second&quot; &rArr;
								&quot;3 kilograms per second&quot;</li>
   2178 						</ul>
   2179 					</li>
   2180 					<li>If not, get the <strong>compoundUnit</strong> pattern, and
   2181 						substitute the formatted numerator for {0} and the singular form
   2182 						of the denominator for {1}, after stripping the {0} and trimming
   2183 						spaces.
   2184 						<ul>
							<li>&quot;3 kilograms&quot; + &quot;{0} per {1}&quot; +
								&quot;{0} second&quot; &rArr;</li>
							<li>&quot;3 kilograms&quot; + &quot;{0} per {1}&quot; +
								&quot;second&quot; &rArr;</li>
							<li>&quot;3 kilograms per second&quot;</li>
   2190 						</ul></li>
   2191 				</ol>
   2192 			</li>
   2193 		</ol>
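		<p>The steps above can be sketched in Python as follows. The data
			tables, the function names, and the simplified English plural rule
			are assumptions made for illustration; they are not CLDR data or a
			reference implementation.</p>

```python
# Hypothetical English long-form data; the dictionary layout is an assumption.
UNIT_PATTERNS = {
    "kilogram": {"one": "{0} kilogram", "other": "{0} kilograms"},
    "second": {"one": "{0} second", "other": "{0} seconds"},
    "kilometer-per-hour": {"one": "{0} kilometer per hour",
                           "other": "{0} kilometers per hour"},
}
PER_UNIT_PATTERNS = {"second": "{0} per second"}
COMPOUND_UNIT_PATTERN = "{0} per {1}"


def plural_category(value):
    """Simplified English plural rule (assumption)."""
    return "one" if value == 1 else "other"


def format_per(value, numerator, denominator, per_unit_patterns=PER_UNIT_PATTERNS):
    # Step 1: use a precomputed compound unit when one exists.
    compound = numerator + "-per-" + denominator
    if compound in UNIT_PATTERNS:
        pattern = UNIT_PATTERNS[compound][plural_category(value)]
        return pattern.replace("{0}", str(value))
    # Step 2: format the numerator with the number, using plural categories.
    pattern = UNIT_PATTERNS[numerator][plural_category(value)]
    formatted = pattern.replace("{0}", str(value))
    # Step 3.1: prefer the perUnitPattern of the denominator.
    if denominator in per_unit_patterns:
        return per_unit_patterns[denominator].replace("{0}", formatted)
    # Step 3.2: fall back to compoundUnit with the singular denominator,
    # after stripping its {0} placeholder and trimming spaces.
    singular = UNIT_PATTERNS[denominator]["one"].replace("{0}", "").strip()
    return COMPOUND_UNIT_PATTERN.replace("{0}", formatted).replace("{1}", singular)


print(format_per(3, "kilogram", "second"))  # 3 kilograms per second
print(format_per(50, "kilometer", "hour"))  # 50 kilometers per hour
```

		<p>Passing an empty per_unit_patterns table exercises the compoundUnit
			fallback, which for this English data yields the same string as the
			perUnitPattern path.</p>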
   2194 		<p>The patterns can have different unit lengths, so the
   2195 			appropriate unit length should be used (with fallbacks if necessary).</p>
   2196 		<h3>
   2197 			6.2 <a name="Unit_Sequences" href="#Unit_Sequences">Unit
   2198 				Sequences</a>
   2199 		</h3>
   2200 		<p>
			Units may be used in composed sequences, such as <strong>5&deg;
				30&prime;</strong> for 5 degrees 30 minutes, or <strong>3 ft 2 in.</strong> For that
   2203 			purpose, the appropriate width of the unit listPattern can be used to
   2204 			compose the units in a sequence.
   2205 		</p>
   2206 		<pre>&lt;listPattern type=&quot;unit&quot;&gt; (for the long form)
   2207 &lt;listPattern type=&quot;unit-narrow&quot;&gt;
   2208 &lt;listPattern type=&quot;unit-short&quot;&gt;
   2209 </pre>
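		<p>As an illustration, the following Python sketch joins already
			formatted units with a list pattern. It assumes that each
			listPattern supplies parts named "2", "start", "middle", and "end",
			as list patterns generally do; the pattern values shown are
			placeholders invented for this example, and the function names are
			hypothetical.</p>

```python
# Hypothetical pattern parts for two widths (assumed values, not CLDR data).
LIST_PATTERNS = {
    "unit-short": {"2": "{0}, {1}", "start": "{0}, {1}",
                   "middle": "{0}, {1}", "end": "{0}, {1}"},
    "unit-narrow": {"2": "{0} {1}", "start": "{0} {1}",
                    "middle": "{0} {1}", "end": "{0} {1}"},
}


def fill(pattern, first, second):
    """Substitute two strings for the {0} and {1} placeholders."""
    return pattern.replace("{0}", first).replace("{1}", second)


def join_units(parts, patterns):
    """Compose formatted units into one sequence using the list pattern parts."""
    if len(parts) == 0:
        return ""
    if len(parts) == 1:
        return parts[0]
    if len(parts) == 2:
        return fill(patterns["2"], parts[0], parts[1])
    # Work from the end: "end" joins the last two items, "middle" prepends
    # each inner item, and "start" prepends the first item.
    result = fill(patterns["end"], parts[-2], parts[-1])
    for part in reversed(parts[1:-2]):
        result = fill(patterns["middle"], part, result)
    return fill(patterns["start"], parts[0], result)


print(join_units(["3 ft", "2 in"], LIST_PATTERNS["unit-narrow"]))  # 3 ft 2 in
```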
   2210 		<h3>
   2211 			6.3 <a name="durationUnit" href="#durationUnit">durationUnit</a>
   2212 		</h3>
   2213 		<p>The durationUnit is a special type of unit used for composed
   2214 			time unit durations.</p>
   2215 		<pre>&lt;durationUnit type=&quot;hms&quot;&gt;
   2216   &lt;durationUnitPattern&gt;h:mm:ss&lt;/durationUnitPattern&gt; &lt;!-- 33:04:59 --&gt;
   2217 &lt;/durationUnit&gt;   </pre>
   2218 		<p>The type contains a skeleton, where 'h' stands for hours, 'm'
			for minutes, and 's' for seconds. These are the same symbols used in
   2220 			availableFormats, except that there is no need to distinguish
   2221 			different forms of the hour.</p>
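		<p>A minimal Python sketch of applying such a pattern follows. It
			assumes that a run of N identical letters means zero-padding to at
			least N digits, as in date patterns, and that the skeleton is "hms";
			the function name format_duration is hypothetical.</p>

```python
import re


def format_duration(total_seconds, pattern="h:mm:ss"):
    """Render a duration given in whole seconds with an hms-style pattern."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    values = {"h": hours, "m": minutes, "s": seconds}

    def render(match):
        field = match.group(0)
        # Assumption: a run of N letters pads the value to at least N digits.
        return str(values[field[0]]).zfill(len(field))

    return re.sub(r"h+|m+|s+", render, pattern)


print(format_duration(33 * 3600 + 4 * 60 + 59))  # 33:04:59
```

		<p>The hours field is not wrapped at 24, which is why a value such as
			33:04:59 can be produced.</p>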
   2222 
   2223 		<h3>
   2224 			6.4 <a name="coordinateUnit" href="#coordinateUnit">coordinateUnit</a>
   2225 		</h3>
   2226 		<p>
   2227 			The <strong>coordinateUnitPattern</strong> is a special type of
   2228 			pattern used for composing degrees of latitude and longitude, with an
   2229 			indicator of the quadrant. There are exactly 4 type values,
   2230 			plus a displayName for the items in this category. An angle
			is composed using the appropriate combination of the <strong>angle-degree</strong>,
   2232 			<strong>angle-arc-minute</strong> and <strong>angle-arc-second</strong>
   2233 			values. It is then substituted for the placeholder field {0} in the
   2234 			appropriate <strong>coordinateUnit</strong> pattern.
   2235 		</p>
   2236 		<p class="xmlExample">
   2237 			&lt;displayName&gt;direction&lt;/displayName&gt;<br>
   2238 			&lt;coordinateUnitPattern
   2239 			type=&quot;east&quot;&gt;{0}E&lt;/coordinateUnitPattern&gt;<br>
   2240 			&lt;coordinateUnitPattern
   2241 			type=&quot;north&quot;&gt;{0}N&lt;/coordinateUnitPattern&gt;<br>
   2242 			&lt;coordinateUnitPattern
   2243 			type=&quot;south&quot;&gt;{0}S&lt;/coordinateUnitPattern&gt;<br>
   2244 			&lt;coordinateUnitPattern
   2245 			type=&quot;west&quot;&gt;{0}W&lt;/coordinateUnitPattern&gt;
   2246 		</p>
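		<p>The composition can be sketched in Python as follows. The narrow
			angle patterns and the use of a single space to join them are
			assumptions made for this example; the coordinate patterns are the
			ones shown above, and the function name format_coordinate is
			hypothetical.</p>

```python
# Hypothetical narrow patterns for angle-degree, angle-arc-minute, angle-arc-second.
ANGLE_PATTERNS = ["{0}\u00b0", "{0}\u2032", "{0}\u2033"]
# coordinateUnitPattern values taken from the example above.
COORDINATE_PATTERNS = {"east": "{0}E", "north": "{0}N", "south": "{0}S", "west": "{0}W"}


def format_coordinate(degrees, minutes, seconds, direction):
    """Compose the angle, then substitute it for {0} in the quadrant pattern."""
    values = (degrees, minutes, seconds)
    parts = [p.replace("{0}", str(v)) for p, v in zip(ANGLE_PATTERNS, values)]
    angle = " ".join(parts)  # joining with a space is an assumption
    return COORDINATE_PATTERNS[direction].replace("{0}", angle)


print(format_coordinate(37, 25, 19, "north"))
```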
   2247 
   2248 		<h3>
   2249 			6.5 <a name="Territory_Based_Unit_Preferences"
   2250 				href="#Territory_Based_Unit_Preferences">Territory-Based Unit
   2251 				Preferences</a>
   2252 		</h3>
   2253 		<p>Different locales have different preferences
   2254 			for which unit or combination of units is used for a particular
			usage, such as measuring a person's height. This is more fine-grained
   2256 			than merely a preference for metric versus US or UK measurement
   2257 			systems. For example, one locale may use meters alone, while another
   2258 			may use centimeters alone or a combination of meters and centimeters;
   2259 			a third may use inches alone, or (informally) a combination of feet
   2260 			and inches.</p>
   2261 		<p>
   2262 			The &lt;unitPreferenceData&gt; element, described in <a
   2263 				href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units
   2264 				for Specific Usages</a>, provides information on which unit or
   2265 			combination of units is used for various purposes in different
   2266 			locales, with options for the level of formality and the scale of the
			measurement (e.g., measuring the height of an adult versus that of an
   2268 			infant).
   2269 		</p>
   2270 
   2271 		<h2>
   2272 			7 <a name="POSIX_Elements" href="#POSIX_Elements">POSIX Elements</a>
   2273 		</h2>
   2274 
   2275 
   2276 		<p class="dtd">
   2277 			&lt;!ELEMENT posix (alias | (messages*, special*)) &gt;<br>
   2278 			&lt;!ELEMENT messages (alias | ( yesstr*, nostr*)) &gt;
   2279 		</p>
   2280 
   2281 		<p>The following are included for compatibility with POSIX.</p>
   2282 
   2283 		<p>
   2284 			&lt;posix&gt;<br> &nbsp;&nbsp;&nbsp;&nbsp;&lt;posix:messages&gt;<br>
   2285 			&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;posix:yesstr&gt;<span
   2286 				style="color: #0000FF">ja</span>&lt;/posix:yesstr&gt;<br>
   2287 			&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;posix:nostr&gt;<span
   2288 				style="color: #0000FF">nein</span>&lt;/posix:nostr&gt;<br>
   2289 			&nbsp;&nbsp;&nbsp;&nbsp;&lt;/posix:messages&gt;<br>
			&lt;/posix&gt;
   2291 		</p>
   2292 
   2293 		<ol>
   2294 			<li>The values for yesstr and nostr contain a colon-separated
   2295 				list of strings that would normally be recognized as "yes" and "no"
   2296 				responses. For cased languages, this shall include only the lower
   2297 				case version. POSIX locale generation tools must generate the upper
   2298 				case equivalents, and the abbreviated versions, and add the English
   2299 				words wherever they do not conflict. Examples:
   2300 				<ul>
					<li>ja &rarr; ja:Ja:j:J:yes:Yes:y:Y</li>
					<li>ja &rarr; ja:Ja:j:J:yes:Yes // exclude y:Y if it conflicts with
   2303 						the native "no".</li>
   2304 				</ul>
   2305 			</li>
   2306 
   2307 			<li>The older elements yesexpr and noexpr are deprecated. They
   2308 				should instead be generated from yesstr and nostr so that they match
   2309 				all the responses.</li>
   2310 		</ol>
   2311 
   2312 		<p>So for English, the appropriate strings and expressions would
   2313 			be as follows:</p>
   2314 
   2315 		<p>
   2316 			yesstr "yes:y"<br> nostr "no:n"
   2317 		</p>
   2318 
   2319 		<p>The generated yesexpr and noexpr would be:</p>
   2320 
   2321 		<p>
   2322 			<code>
   2323 				yesexpr "^([yY]([eE][sS])?)"<br>
   2324 			</code>
   2325 			This would match y,Y,yes,yeS,yEs,yES,Yes,YeS,YEs,YES.<br> <br>
   2326 			<code>noexpr "^([nN][oO]?)"</code>
   2327 			<br> This would match n,N,no,nO,No,NO.
   2328 		</p>
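		<p>The expansion and the generated expressions described above can
			be sketched as follows. This is only an illustration under stated
			assumptions: the function names <code>expand_responses</code> and
			<code>to_expr</code> are hypothetical, the abbreviation is assumed
			to be the first letter of each word, and the generated expression is
			a plain alternation that accepts the same responses as the compact
			form shown above.</p>

```python
import re


def expand_responses(native, english, conflicts=()):
    """Expand a colon-separated lowercase list into the full response list.

    For each word this adds the capitalized form and the one-letter
    abbreviation in both cases. English words are appended unless they
    collide with an entry in `conflicts` (for example the native "no").
    """
    out = []
    for word in native.split(":") + english.split(":"):
        for form in (word, word.capitalize(), word[0], word[0].upper()):
            if form not in out and form.lower() not in conflicts:
                out.append(form)
    return ":".join(out)


def to_expr(strs):
    """Build an anchored expression matching every case variant of each word."""
    alternatives = []
    # Longer words first, so that "yes" is tried before "y".
    for word in sorted(strs.split(":"), key=len, reverse=True):
        pattern = ""
        for ch in word:
            if ch.upper() != ch:
                pattern += "[" + ch + ch.upper() + "]"
            else:
                pattern += re.escape(ch)
        alternatives.append(pattern)
    return "^(" + "|".join(alternatives) + ")"
```

		<p>Under these assumptions, <code>expand_responses("ja", "yes")</code>
			yields the first list in the examples above, and passing
			<code>conflicts={"y"}</code> yields the second.</p>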
   2329 
   2330 
   2331 		<h2>
   2332 			8 <a name="Reference_Elements" href="#Reference_Elements">Reference
   2333 				Element</a>
   2334 		</h2>
   2335 
   2336 
   2337 		<p>(Use only in supplemental data; deprecated for ldml.dtd and
   2338 			locale data)</p>
   2339 		<p class="dtd">
   2340 			&lt;!ELEMENT references ( reference* ) &gt;<br> &lt;!ELEMENT
   2341 			reference ( #PCDATA ) &gt;<br> &lt;!ATTLIST reference type
   2342 			NMTOKEN #REQUIRED&gt;<br> &lt;!ATTLIST reference standard ( true
   2343 			| false ) #IMPLIED &gt;<br> &lt;!ATTLIST reference uri CDATA
   2344 			#IMPLIED &gt;
   2345 		</p>
   2346 
   2347 		<p>The references section supplies a central location for
   2348 			specifying references and standards. The uri should be supplied if at
			all possible. If the reference is not online, then an ISBN should be supplied,
   2350 			such as in the following example:</p>
   2351 
   2352 		<p class="example">
   2353 			&lt;reference type="R2"
			uri="http://www.ur.se/nyhetsjournalistik/3lan.html"&gt;Landskoder på
   2355 			Internet&lt;/reference&gt;<br> &lt;reference type="R3"
   2356 			uri="URN:ISBN:91-47-04974-X"&gt;Svenska skrivregler&lt;/reference&gt;
   2357 		</p>
   2358 
   2359 
   2360 		<h2>
   2361 			9 <a name="Segmentations" href="#Segmentations">Segmentations</a>
   2362 		</h2>
   2363 
   2364 		<p class="dtd">&lt;!ELEMENT segmentations ( alias | segmentation*)
   2365 			&gt;</p>
   2366 		<p class="dtd">
   2367 			&lt;!ELEMENT segmentation ( alias | (variables?, segmentRules? ,
   2368 			exceptions?, suppressions?) | special*) &gt; <br> &lt;!ATTLIST
   2369 			segmentation type NMTOKEN #REQUIRED &gt;
   2370 		</p>
   2371 		<p class="dtd">&lt;!ELEMENT variables ( alias | variable*) &gt;</p>
   2372 		<p class="dtd">
   2373 			&lt;!ELEMENT variable ( #PCDATA ) &gt;<br> &lt;!ATTLIST variable
   2374 			id CDATA #REQUIRED &gt;
   2375 		</p>
   2376 		<p class="dtd">&lt;!ELEMENT segmentRules ( alias | rule*) &gt;</p>
   2377 		<p class="dtd">
   2378 			&lt;!ELEMENT rule ( #PCDATA ) &gt;<br> &lt;!ATTLIST rule id
   2379 			NMTOKEN #REQUIRED &gt;
   2380 		</p>
   2381 		<p class="dtd">&lt;!ELEMENT suppressions ( suppression* ) &gt;</p>
   2382 		<p class="dtd">&lt;!ATTLIST suppressions type NMTOKEN "standard"
   2383 			&gt;</p>
   2384 		<p class="dtd">&lt;!ATTLIST suppressions draft ( approved |
   2385 			contributed | provisional | unconfirmed ) #IMPLIED &gt;</p>
   2386 		<p class="dtd">&lt;!ELEMENT suppression ( #PCDATA ) &gt;</p>
   2387 
   2388 		<p>
   2389 			The segmentations element provides for segmentation of text into
   2390 			words, lines, or other segments. The structure is based on [<a
   2391 				href="http://www.unicode.org/reports/tr41/#UAX29">UAX29</a>]
   2392 			notation, but adapted to be machine-readable. It uses a list of
   2393 			variables (representing character classes) and a list of rules. Each
   2394 			must have an id attribute.
   2395 		</p>
   2396 
   2397 		<p>
   2398 			The rules in <i>root</i> implement the segmentations found in [<a
   2399 				href="http://www.unicode.org/reports/tr41/#UAX29">UAX29</a>] and [<a
   2400 				href="http://www.unicode.org/reports/tr41/#UAX14">UAX14</a>], for
   2401 			grapheme clusters, words, sentences, and lines. They can be
   2402 			overridden by rules in child locales.
   2403 		</p>
   2404 
   2405 		<p>Here is an example:</p>
   2406 
   2407 		<pre>&lt;segmentations&gt;
   2408   &lt;segmentation type="GraphemeClusterBreak"&gt;
   2409     &lt;variables&gt;
   2410       &lt;variable id="$CR"&gt;\p{Grapheme_Cluster_Break=CR}&lt;/variable&gt;
   2411       &lt;variable id="$LF"&gt;\p{Grapheme_Cluster_Break=LF}&lt;/variable&gt;
   2412       &lt;variable id="$Control"&gt;\p{Grapheme_Cluster_Break=Control}&lt;/variable&gt;
   2413       &lt;variable id="$Extend"&gt;\p{Grapheme_Cluster_Break=Extend}&lt;/variable&gt;
   2414       &lt;variable id="$L"&gt;\p{Grapheme_Cluster_Break=L}&lt;/variable&gt;
   2415       &lt;variable id="$V"&gt;\p{Grapheme_Cluster_Break=V}&lt;/variable&gt;
   2416       &lt;variable id="$T"&gt;\p{Grapheme_Cluster_Break=T}&lt;/variable&gt;
   2417       &lt;variable id="$LV"&gt;\p{Grapheme_Cluster_Break=LV}&lt;/variable&gt;
   2418       &lt;variable id="$LVT"&gt;\p{Grapheme_Cluster_Break=LVT}&lt;/variable&gt;
   2419     &lt;/variables&gt;
   2420     &lt;segmentRules&gt;
      &lt;rule id="3"&gt; $CR × $LF &lt;/rule&gt;
      &lt;rule id="4"&gt; ( $Control | $CR | $LF ) ÷ &lt;/rule&gt;
      &lt;rule id="5"&gt; ÷ ( $Control | $CR | $LF ) &lt;/rule&gt;
      &lt;rule id="6"&gt; $L × ( $L | $V | $LV | $LVT ) &lt;/rule&gt;
      &lt;rule id="7"&gt; ( $LV | $V ) × ( $V | $T ) &lt;/rule&gt;
      &lt;rule id="8"&gt; ( $LVT | $T) × $T &lt;/rule&gt;
      &lt;rule id="9"&gt; × $Extend &lt;/rule&gt;
   2428     &lt;/segmentRules&gt;
   2429   &lt;/segmentation&gt;
   2430 ...</pre>
   2431 
   2432 		<p>
   2433 			<b>Variables:</b> All variable ids must start with a $, and otherwise
   2434 			be valid identifiers according to the Unicode definitions in [<a
   2435 				href="http://www.unicode.org/reports/tr41/#UAX31">UAX31</a>]. The
   2436 			contents of a variable is a regular expression using variables and <a
   2437 				href="tr35.html#Unicode_Sets">UnicodeSet</a>s. The ordering of
   2438 			variables is important; they are evaluated in order from first to
   2439 			last (see <i><a href="#Segmentation_Inheritance">Section 9.1
   2440 					Segmentation Inheritance</a></i>). It is an error to use a variable before
   2441 			it is defined.
   2442 		</p>
   2443 
   2444 		<p>
			<b>Rules:</b> The contents of a rule use the syntax of [<a
				href="http://www.unicode.org/reports/tr41/#UAX29">UAX29</a>]. The
			rules are evaluated in numeric id order (which may not be the order
			in which they appear in the file). The first rule that matches
			determines the status of a boundary position, that is, whether it
			breaks or not. Thus ÷ means a break is allowed; × means a break is
			forbidden. It is an error if the rule does not contain exactly one of
			these characters (except where a rule has no contents at all), or if
			the rule uses a variable that has not been defined.
   2454 		</p>
   2455 
   2456 		<p>There are some implicit rules:</p>
   2457 
   2458 		<ul>
			<li>The implicit initial rules are always "start-of-text ÷" and
				"÷ end-of-text"; these are not to be included explicitly.</li>
			<li>The implicit final rule is always "Any ÷ Any". This is not
				to be included explicitly.</li>
   2463 		</ul>
   2464 
   2465 		<blockquote>
   2466 			<p>
   2467 				<b>Note:</b> A rule like X Format* -&gt; X in [<a
   2468 					href="http://www.unicode.org/reports/tr41/#UAX29">UAX29</a>] and [<a
   2469 					href="http://www.unicode.org/reports/tr41/#UAX14">UAX14</a>] is not
   2470 				supported. Instead, this needs to be expressed as normal regular
   2471 				expressions. The normal way to support this is to modify the
   2472 				variables, such as in the following example:
   2473 			</p>
   2474 
   2475 			<pre id="line870">&lt;variable id="$Format"&gt;\p{Word_Break=Format}&lt;/variable&gt;
   2476 &lt;variable id="$Katakana"&gt;\p{Word_Break=Katakana}&lt;/variable&gt;
   2477 ...
   2478 &lt;!-- In place of rule 3, add format and extend to everything --&gt;
   2479 &lt;variable id="$X"&gt;[$Format $Extend]*&lt;/variable&gt;
   2480 &lt;variable id="$Katakana"&gt;($Katakana $X)&lt;/variable&gt;
   2481 &lt;variable id="$ALetter"&gt;($ALetter $X)&lt;/variable&gt;
   2482 ...</pre>
   2483 		</blockquote>
   2484 
   2485 		<h3>
   2486 			9.1 <a name="Segmentation_Inheritance"
   2487 				href="#Segmentation_Inheritance">Segmentation Inheritance</a>
   2488 		</h3>
   2489 
   2490 
   2491 		<p>Variables and rules both inherit from the parent.</p>
   2492 
   2493 		<p>
   2494 			<b>Variables:</b> The child&#39;s variable list is logically appended
   2495 			to the parent&#39;s, and evaluated in that order. For example:
   2496 		</p>
   2497 
   2498 		<p>
   2499 			<font color="#0000FF"><code>// in parent</code></font>
   2500 			<code>
   2501 				<br> &lt;variable id="$AL"&gt;[:linebreak=AL:]&lt;/variable&gt;<br>
   2502 				&lt;variable id="$YY"&gt;[[:linebreak=XX:]$AL]&lt;/variable&gt;
   2503 			</code>
   2504 			<font color="#0000FF"><code>// adds $AL</code></font>
   2505 		</p>
   2506 
   2507 		<p>
   2508 			<font color="#0000FF"><code>// in child</code></font>
   2509 			<code>
   2510 				<br> &lt;variable id="$AL"&gt;[$AL &amp;&amp;
   2511 				[^a-z]]&lt;/variable&gt; <font color="#0000FF">// changes
   2512 					$AL, does not affect $YY</font><br> &lt;variable
   2513 				id="$ABC"&gt;[abc]&lt;/variable&gt;
   2514 			</code>
			<font color="#0000FF"><code>// adds new variable</code></font>
   2516 		</p>
   2517 
   2518 		<p>
   2519 			<b>Rules:</b> The rules are also logically appended to the
   2520 			parent&#39;s. Because rules are evaluated in numeric id order, to
   2521 			insert a rule in between others just requires using an intermediate
   2522 			number. For example, to insert a rule after id="10.1" and before
   2523 			id="10.2", just use id="10.15". To delete a rule, use empty contents,
   2524 			such as:
   2525 		</p>
   2526 
   2527 		<p>
   2528 			<code>&lt;rule id="3"/&gt;</code>
   2529 			<font color="#0000FF"><code> // deletes rule 3</code></font>
   2530 		</p>
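		<p>The inheritance behavior for variables and rules can be sketched
			as follows. This is a simplified illustration, not a definitive
			implementation: the function names are hypothetical, variable
			references are resolved by plain text substitution, and rule ids are
			assumed to parse as decimal numbers.</p>

```python
def resolve_variables(parent_vars, child_vars):
    """Evaluate the parent's variable list followed by the child's, in order.

    Each list holds (name, expression) pairs. A redefinition in the child
    does not change variables that were already evaluated in the parent.
    """
    values = {}
    for name, expr in list(parent_vars) + list(child_vars):
        # Substitute variables defined so far, longest names first.
        for known in sorted(values, key=len, reverse=True):
            expr = expr.replace(known, values[known])
        values[name] = expr
    return values


def merge_rules(parent, child):
    """Merge child rules over parent rules and return them in evaluation order.

    Both arguments map rule ids (strings such as "10.15") to rule text.
    A child rule with empty contents deletes the inherited rule.
    """
    merged = dict(parent)
    for rule_id, text in child.items():
        if text.strip():
            merged[rule_id] = text
        else:
            merged.pop(rule_id, None)
    # Rules are evaluated in numeric id order, not in file order.
    return sorted(merged.items(), key=lambda item: float(item[0]))
```

		<p>With a parent holding rules 3, 10.1 and 10.2, a child that supplies
			rule 10.15 and an empty rule 3 ends up with the evaluation order
			10.1, 10.15, 10.2.</p>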
   2531 
   2532 
   2533 		<h3>
   2534 			9.2 <a name="Segmentation_Exceptions" href="#Segmentation_Exceptions">Segmentation
   2535 				Suppressions </a>
   2536 		</h3>
   2537 
   2538 		<p>
   2539 			<b>Note:</b> As of CLDR 26, the
   2540 			<code>&lt;suppressions&gt;</code>
   2541 			data is to be considered a technology preview. Data currently in CLDR
   2542 			was extracted from the Unicode Localization Interoperability project,
   2543 			or ULI. See <a href="http://uli.unicode.org">http://uli.unicode.org</a>
   2544 			for more information on the ULI project.
   2545 		</p>
   2546 
   2547 		<p>
   2548 			The segmentation <b>suppressions</b> list provides a set of cases
   2549 			which, though otherwise identified as a segment by rules, should be
   2550 			skipped (suppressed) during segmentation.
   2551 		</p>
   2552 
   2553 		<p>For example, in the English phrase "Mr. Smith", CLDR
			segmentation rules would normally find a Sentence Break between "Mr."
   2555 			and "Smith". However, typically, "Mr." is just an abbreviation for
   2556 			"Mister", and not actually the end of a sentence.</p>
   2557 
   2558 		<p>
   2559 			Each suppression has a separate
   2560 			<code>&lt;suppression&gt;</code>
   2561 			element, whose contents are the break to be skipped.
   2562 		</p>
   2563 
   2564 		<p>Example:</p>
   2565 
   2566 		<pre>
   2567     &lt;segmentation type="SentenceBreak"&gt;
   2568       &lt;suppressions type="standard" draft="provisional"&gt;
   2569         &lt;suppression&gt;Maj.&lt;/suppression&gt;
   2570         &lt;suppression&gt;Mr.&lt;/suppression&gt;
   2571         &lt;suppression&gt;Lt.Cdr.&lt;/suppression&gt;
   2572 	. . .
   2573       &lt;/suppressions&gt;
   2574     &lt;/segmentation&gt;
   2575                 </pre>
   2576 
   2577 		<p>
   2578 			<b>Note:</b> These elements were called
   2579 			<code>&lt;exceptions&gt;</code>
   2580 			and
   2581 			<code>&lt;exception&gt;</code>
   2582 			prior to CLDR 26, but those names are now deprecated.
   2583 		</p>
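		<p>One possible way to apply a suppressions list is to filter the
			break positions produced by the segmentation rules. The sketch below
			assumes a hypothetical <code>filter_breaks</code> function that
			receives candidate break offsets computed elsewhere; it only checks
			whether the text before a candidate ends with a suppression, which is
			a simplification of what a real implementation would need.</p>

```python
def filter_breaks(text, breaks, suppressions):
    """Drop candidate break offsets that directly follow a suppression.

    `breaks` holds offsets into `text` where the rules found a boundary.
    """
    kept = []
    for offset in breaks:
        before = text[:offset].rstrip()
        if any(before.endswith(item) for item in suppressions):
            continue  # for example, the break after "Mr. " is skipped
        kept.append(offset)
    return kept


sample = "Mr. Smith left. He was late."
remaining = filter_breaks(sample, [4, 16, 28], ["Maj.", "Mr.", "Lt.Cdr."])
```

		<p>Here the candidate break at offset 4 (between "Mr." and "Smith")
			is removed, while the breaks after the two real sentences remain.</p>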
   2584 
   2585 		<h2>
   2586 			10 <a name="Transforms" href="#Transforms">Transforms</a>
   2587 		</h2>
   2588 
   2589 
   2590 		<p>
   2591 			Transforms provide a set of rules for transforming text via a
   2592 			specialized set of context-sensitive matching rules. They are
   2593 			commonly used for transliterations or transcriptions, but also other
   2594 			transformations such as full-width to half-width (for <i>katakana</i>
   2595 			characters). The rules can be simple one-to-one relationships between
   2596 			characters, or involve more complicated mappings. Here is an example:
   2597 		</p>
   2598 
   2599 		<pre>&lt;transform source="Greek" target="Latin" variant="UNGEGN" direction="both"&gt;
   2600 ...
   2601   &lt;comment&gt;Useful variables&lt;/comment&gt;
  &lt;tRule&gt;$gammaLike = [ΓΚΞΧγκξχ] ;&lt;/tRule&gt;
  &lt;tRule&gt;$egammaLike = [GKXCgkxc] ;&lt;/tRule&gt;
...
  &lt;comment&gt;Rules are predicated on running NFD first, and NFC afterwards&lt;/comment&gt;
  &lt;tRule&gt;::NFD (NFC) ;&lt;/tRule&gt;
...
  &lt;tRule&gt;λ ↔ l ;&lt;/tRule&gt;
  &lt;tRule&gt;Λ ↔ L ;&lt;/tRule&gt;
...
  &lt;tRule&gt;γ } $gammaLike ↔ n } $egammaLike ;&lt;/tRule&gt;
  &lt;tRule&gt;γ ↔ g ;&lt;/tRule&gt;
   2613 ...
   2614   &lt;tRule&gt;::NFC (NFD) ;&lt;/tRule&gt;
   2615 ...
   2616 &lt;/transform&gt;</pre>
   2617 
   2618 		<p>The source and target values are valid locale identifiers,
   2619 			where &#39;und&#39; means an unspecified language, plus some
   2620 			additional extensions.</p>
   2621 
   2622 		<ul>
   2623 			<li>The long names of a script according to [<a
   2624 				href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>] may be
   2625 				used instead of the short script codes. The script identifier may
   2626 				also omit und; that is, "und_Latn" may be written as just "Latn".
   2627 			</li>
   2628 
			<li>The English long names of languages may also be used
				instead of the language codes.</li>
   2631 
   2632 			<li>The term "Any" may be used instead of a solitary "und".</li>
   2633 
   2634 			<li>Other identifiers may be used for special purposes. In CLDR,
   2635 				these include: Accents, Digit, Fullwidth, Halfwidth, Jamo,
   2636 				NumericPinyin, Pinyin, Publishing, Tone. (Other than these values,
   2637 				valid private use locale identifiers should be used, such as
   2638 				"x-Special".)</li>
   2639 
			<li>When presenting localized transform names, the "und_" is
				normally omitted. Thus for a transliterator with the ID
				"und_Latn-und_Grek" (or the equivalent "Latin-Greek"), the
				translated name for Greek would be Λατινικό-Ελληνικό.</li>
   2644 		</ul>
   2645 		<p>In version 29.0, BCP47 identifiers were added
   2646 			as aliases (while retaining the old identifiers). The following table
   2647 			shows the relationship between the old identifiers and the BCP47
   2648 			format identifiers.</p>
   2649 		<table class='simple'>
   2650 			<tbody>
   2651 				<tr>
   2652 					<th>Old ID</th>
   2653 					<th>BCP47 ID</th>
   2654 					<th>Comments</th>
   2655 				</tr>
   2656 				<tr>
   2657 					<td><strong>es_FONIPA</strong>-es_419_FONIPA</td>
   2658 					<td>es-419-fonipa-t-<strong>es-fonipa</strong></td>
   2659 					<td rowspan="2">The order reverses with -t-. That is, the
   2660 						language subtag part is what results.</td>
   2661 				</tr>
   2662 				<tr>
   2663 					<td><strong>hy_AREVMDA</strong>-hy_AREVMDA_FONIPA</td>
   2664 					<td>hy-arevmda-fonipa-t-<strong>hy-arevmda</strong></td>
   2665 				</tr>
   2666 				<tr>
   2667 					<td><strong>Devanagari</strong>-Latin</td>
   2668 					<td>und-Latn-t-<strong>und-deva</strong></td>
   2669 					<td rowspan="2">Scripts add <strong>und-</strong></td>
   2670 				</tr>
   2671 				<tr>
   2672 					<td><strong>Latin</strong>-Devanagari</td>
   2673 					<td>und-Deva-t-<strong>und-latn</strong></td>
   2674 				</tr>
   2675 				<tr>
   2676 					<td>Greek-Latin/UNGEGN</td>
   2677 					<td>und-Latn-t-und-grek-<strong>m0-ungegn</strong></td>
   2678 					<td>Variants use the <strong>-m0-</strong> key.
   2679 					</td>
   2680 				</tr>
   2681 				<tr>
   2682 					<td>Russian-Latin/BGN</td>
   2683 					<td>ru<strong>-Latn</strong>-t-ru-m0-bgn
   2684 					</td>
					<td>Languages will have a script when it isn&#39;t the default.</td>
   2686 				</tr>
   2687 				<tr>
   2688 					<td>Any-Hex/xml</td>
   2689 					<td>und-t-<strong>d0-hex</strong>-m0-xml
   2690 					</td>
   2691 					<td rowspan="2"><strong>Any</strong> becomes <strong>und</strong>,
   2692 						and keys <strong>d0</strong> (destination) and <strong>s0</strong>
   2693 						(source) are used for non-locales.</td>
   2694 				</tr>
   2695 				<tr>
   2696 					<td>Hex-Any/xml</td>
   2697 					<td>und-t-<strong>s0-hex</strong>-m0-xml
   2698 					</td>
   2699 				</tr>
   2700 				<tr>
   2701 					<td>Any-<strong>Publishing</strong></td>
   2702 					<td>und-t-d0-<strong>publish</strong></td>
   2703 					<td rowspan="2">Non-locales are normally the lowercases of the
   2704 						old ID, but may change because of BCP47 length restrictions.</td>
   2705 				</tr>
   2706 				<tr>
   2707 					<td><strong>Publishing</strong>-Any</td>
   2708 					<td>und-t-s0-<strong>publish</strong></td>
   2709 				</tr>
   2710 			</tbody>
   2711 		</table>
   2712 		<p>Note that the script and region codes are cased
   2713 			iff they are in the main subtag, but are lowercase in extensions.</p>
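		<p>The locale-to-locale rows of the table above follow a regular
			pattern, which can be sketched as follows. The function name
			<code>to_bcp47_id</code> is hypothetical, and the sketch assumes the
			source and target have already been converted to BCP47 tags (for
			example "Devanagari" to "und-Deva"). It does not cover the
			<strong>d0</strong> and <strong>s0</strong> keys used for
			non-locales.</p>

```python
def to_bcp47_id(source, target, variant=""):
    """Compose a BCP47 transform ID from BCP47 source and target tags.

    The target becomes the main tag, the source follows "-t-" in
    lowercase, and a variant, if any, is appended under the "m0" key.
    """
    result = target + "-t-" + source.lower()
    if variant:
        result += "-m0-" + variant.lower()
    return result


greek_latin = to_bcp47_id("und-Grek", "und-Latn", "UNGEGN")
```

		<p>With these inputs, <code>greek_latin</code> corresponds to the
			Greek-Latin/UNGEGN row of the table.</p>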
   2714 		<h3>
   2715 			10.1 <a name="Inheritance" href="#Inheritance">Inheritance</a>
   2716 		</h3>
   2717 
   2718 		<p>The CLDR transforms are built using the following locale
   2719 			inheritance. While this inheritance is not required of LDML
   2720 			implementations, the transforms supplied with CLDR may not otherwise
   2721 			behave as expected without some changes.</p>
   2722 
   2723 		<p>For either the source or the target, the fallback starts from
   2724 			the maximized locale ID (using the likely-subtags data). It also uses
   2725 			the country for lookup before the base language is reached, and root
   2726 			is never accessed: instead the script(s) associated with the language
   2727 			are used. Where there are multiple scripts, the maximized script is
   2728 			tried first, and then the other scripts associated with the language
   2729 			(from supplemental data).</p>
   2730 
   2731 		<p>
   2732 			For example, see the bolded items below in the fallback chain for <strong>az_IR</strong>.
   2733 		</p>
   2734 
   2735 		<table>
   2736 			<tr>
   2737 				<th>&nbsp;</th>
   2738 				<th>Locale ID</th>
   2739 				<th>Comments</th>
   2740 			</tr>
   2741 			<tr>
   2742 				<td>1</td>
   2743 				<td><strong>az_Arab_IR</strong></td>
   2744 				<td>The maximized locale for az_IR</td>
   2745 			</tr>
   2746 			<tr>
   2747 				<td>2</td>
   2748 				<td>az_Arab</td>
   2749 				<td>Normal fallback</td>
   2750 			</tr>
   2751 			<tr>
   2752 				<td>3</td>
   2753 				<td><strong>az_IR</strong></td>
   2754 				<td>Inserted country locale</td>
   2755 			</tr>
   2756 			<tr>
   2757 				<td>4</td>
   2758 				<td>az</td>
   2759 				<td>Normal fallback</td>
   2760 			</tr>
   2761 			<tr>
   2762 				<td>5</td>
   2763 				<td><strong>Arab</strong></td>
   2764 				<td>Maximized script</td>
   2765 			</tr>
   2766 			<tr>
   2767 				<td>6</td>
   2768 				<td><strong>Cyrl</strong></td>
   2769 				<td>Other associated script</td>
   2770 			</tr>
   2771 		</table>
   2772 
		<p>The source, target, and variant use "laddered" fallback, where
			the source changes the most quickly (using the above rules), then the
			target (using the above rules), and finally the variant, if any, is
			discarded. That is, in pseudo code:</p>
   2777 
   2778 		<ul>
   2779 			<li>for variant in {variant, ""}
   2780 				<ul>
   2781 					<li>for target in target-chain
   2782 						<ul>
   2783 							<li>for source in source-chain
   2784 								<ul>
   2785 									<li>transform = lookup source-target/variant</li>
   2786 									<li>if transform != null return transform</li>
   2787 								</ul>
   2788 							</li>
   2789 						</ul>
   2790 					</li>
   2791 				</ul>
   2792 			</li>
   2793 		</ul>
   2794 
   2795 		<p>
   2796 			For example, here is the fallback chain for <strong>ru_RU-el_GR/BGN</strong>.
   2797 		</p>
   2798 		<div align="center">
   2799 			<table>
   2800 				<tr>
   2801 					<th>source</th>
   2802 					<th>&nbsp;</th>
   2803 					<th>target</th>
   2804 					<th>variant</th>
   2805 				</tr>
   2806 				<tr>
   2807 					<td>ru_RU</td>
   2808 					<td>-</td>
   2809 					<td>el_GR</td>
   2810 					<td>/BGN</td>
   2811 				</tr>
   2812 				<tr>
   2813 					<td>ru</td>
   2814 					<td>-</td>
   2815 					<td>el_GR</td>
   2816 					<td>/BGN</td>
   2817 				</tr>
   2818 				<tr>
   2819 					<td>Cyrl</td>
   2820 					<td>-</td>
   2821 					<td>el_GR</td>
   2822 					<td>/BGN</td>
   2823 				</tr>
   2824 				<tr>
   2825 					<td>ru_RU</td>
   2826 					<td>-</td>
   2827 					<td>el</td>
   2828 					<td>/BGN</td>
   2829 				</tr>
   2830 				<tr>
   2831 					<td>ru</td>
   2832 					<td>-</td>
   2833 					<td>el</td>
   2834 					<td>/BGN</td>
   2835 				</tr>
   2836 				<tr>
   2837 					<td>Cyrl</td>
   2838 					<td>-</td>
   2839 					<td>el</td>
   2840 					<td>/BGN</td>
   2841 				</tr>
   2842 				<tr>
   2843 					<td>ru_RU</td>
   2844 					<td>-</td>
   2845 					<td>Grek</td>
   2846 					<td>/BGN</td>
   2847 				</tr>
   2848 				<tr>
   2849 					<td>ru</td>
   2850 					<td>-</td>
   2851 					<td>Grek</td>
   2852 					<td>/BGN</td>
   2853 				</tr>
   2854 				<tr>
   2855 					<td>Cyrl</td>
   2856 					<td>-</td>
   2857 					<td>Grek</td>
   2858 					<td>/BGN</td>
   2859 				</tr>
   2860 				<tr>
   2861 					<td>ru_RU</td>
   2862 					<td>-</td>
   2863 					<td>el_GR</td>
   2864 					<td></td>
   2865 				</tr>
   2866 				<tr>
   2867 					<td>ru</td>
   2868 					<td>-</td>
   2869 					<td>el_GR</td>
   2870 					<td></td>
   2871 				</tr>
   2872 				<tr>
   2873 					<td>Cyrl</td>
   2874 					<td>-</td>
   2875 					<td>el_GR</td>
   2876 					<td></td>
   2877 				</tr>
   2878 				<tr>
   2879 					<td>ru_RU</td>
   2880 					<td>-</td>
   2881 					<td>el</td>
   2882 					<td></td>
   2883 				</tr>
   2884 				<tr>
   2885 					<td>ru</td>
   2886 					<td>-</td>
   2887 					<td>el</td>
   2888 					<td></td>
   2889 				</tr>
   2890 				<tr>
   2891 					<td>Cyrl</td>
   2892 					<td>-</td>
   2893 					<td>el</td>
   2894 					<td></td>
   2895 				</tr>
   2896 				<tr>
   2897 					<td>ru_RU</td>
   2898 					<td>-</td>
   2899 					<td>Grek</td>
   2900 					<td></td>
   2901 				</tr>
   2902 				<tr>
   2903 					<td>ru</td>
   2904 					<td>-</td>
   2905 					<td>Grek</td>
   2906 					<td></td>
   2907 				</tr>
   2908 				<tr>
   2909 					<td>Cyrl</td>
   2910 					<td>-</td>
   2911 					<td>Grek</td>
   2912 					<td></td>
   2913 				</tr>
   2914 			</table>
   2915 		</div>
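		<p>The laddered lookup that produces the table above can be sketched
			as follows. The function names and the <code>registry</code>
			dictionary are hypothetical, and the per-side fallback chains are
			taken as inputs rather than computed from likely-subtags data.</p>

```python
def laddered_ids(source_chain, target_chain, variant):
    """Yield candidate transform IDs in laddered fallback order."""
    variants = [variant, ""] if variant else [""]
    for var in variants:                 # the variant is discarded last
        for target in target_chain:      # the target changes more slowly
            for source in source_chain:  # the source changes most quickly
                suffix = "/" + var if var else ""
                yield source + "-" + target + suffix


def find_transform(registry, source_chain, target_chain, variant):
    """Return the first registered transform along the chain, or None."""
    for candidate in laddered_ids(source_chain, target_chain, variant):
        if candidate in registry:
            return registry[candidate]
    return None


chain = list(laddered_ids(["ru_RU", "ru", "Cyrl"], ["el_GR", "el", "Grek"], "BGN"))
```

		<p>The resulting <code>chain</code> has 18 entries, starting with
			ru_RU-el_GR/BGN and ending with Cyrl-Grek, in the same order as the
			rows of the table.</p>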
   2916 		<p>Japanese and Korean are special, since they can
   2917 			be represented by combined script codes, such as ja_Jpan, ja_Hrkt,
   2918 			ja_Hira, or ja_Kana. These need to be considered in the above
   2919 			fallback chain as well.</p>
   2920 		<h4>
   2921 			10.1.1 <a name="Pivots" href="#Pivots">Pivots</a>
   2922 		</h4>
   2923 		<p>
   2924 			Transforms can also use <i>pivots</i>. These are used when there is
   2925 			no direct transform between a source and target, but there are
   2926 			transforms X-Y and Y-Z. In such a case, the transforms can be
			internally chained to get X-Z = X-Y;Y-Z. This is done explicitly with
   2928 			the Indic script transforms: to get Devanagari-Latin, internally it
   2929 			is done by transforming first from Devanagari to Interindic (an
   2930 			internal superset encoding for Indic scripts), then from Interindic
   2931 			to Latin. This allows there to be only N sets of transform rules for
   2932 			the Indic scripts: each one to and from Interindic. These pivots are
   2933 			explicitly represented in the CLDR transforms.</p>
   2934 		<p>Note that the characters currently used by Interindic are private use characters. To prevent those from leaking out into text, transforms converting from Interindic must ensure that they convert all the possible values used in Interindic.</p>
   2935 		<p>
   2936 			The pivots can also be produced automatically (implicitly), as a
   2937 			fallback. A particularly useful pivot is IPA, since that tends to
   2938 			preserve pronunciation. For example, <em>Czech to IPA</em> can be
   2939 			chained with <em>IPA to Katakana</em> to get <em>Czech to
   2940 				Katakana</em>.
   2941 		</p>
   2942 		<p>CLDR often has special forms of IPA: not just
   2943 			&quot;und-FONIPA&quot; but &quot;cs-FONIPA&quot;: specifically IPA
   2944 			that has come from Czech. These variants typically preserve some
			features of the source language (such as double consonants) that
			are indistinguishable from single consonants in that language, but
   2947 			that are often preserved in traditional transliterations. Thus when
   2948 			matching prospective pivots, FONIPA is treated specially. If there is
   2949 			an exact match, that match is used (such as cs-cs_FONIPA +
   2950 			cs_FONIPA-ko). Otherwise, the language is ignored, as for example in
   2951 			cs-cs_FONIPA + ru_FONIPA-ko.</p>
   2952 		<p>The interaction of implicit pivots and
   2953 			inheritance may result in a longer inheritance chain lookup than
   2954 			desired, so implementers may consider having some sort of caching
   2955 			mechanism to increase performance.</p>
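		<p>The implicit pivot matching described above can be sketched as
			follows. This is a minimal illustration: <code>find_pivot</code> is a
			hypothetical name, the available transforms are modeled as a set of
			(source, target) pairs, and locale inheritance and caching are left
			out.</p>

```python
def find_pivot(available, source, target):
    """Return a list of (source, target) pairs leading from source to target.

    A direct transform wins. Otherwise an exact pivot is preferred, and as
    a last resort two FONIPA pivots are joined with the language ignored.
    Returns None when no chain of at most two transforms exists.
    """
    if (source, target) in available:
        return [(source, target)]
    firsts = [pair for pair in sorted(available) if pair[0] == source]
    seconds = [pair for pair in sorted(available) if pair[1] == target]
    for first in firsts:
        for second in seconds:
            if first[1] == second[0]:
                return [first, second]
    for first in firsts:
        for second in seconds:
            if first[1].endswith("_FONIPA") and second[0].endswith("_FONIPA"):
                return [first, second]
    return None


loose = find_pivot({("cs", "cs_FONIPA"), ("ru_FONIPA", "ko")}, "cs", "ko")
```

		<p>In this example no cs_FONIPA-ko transform is available, so
			<code>loose</code> joins cs-cs_FONIPA with ru_FONIPA-ko, as in the
			second case described above.</p>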
   2956 		<h3>
   2957 			10.2 <a name="Variants" href="#Variants">Variants</a>
   2958 		</h3>
   2959 
   2960 		<p>
   2961 			Variants used in CLDR include UNGEGN and BGN, both indicating sources
   2962 			for transliterations. There is an additional attribute
   2963 			<code>private="true"</code>
   2964 			which is used to indicate that the transform is meant for internal
   2965 			use, and should not be displayed as a separate choice in a UI.
   2966 		</p>
   2967 
		<p>There are many different systems of transliteration. The goals
			for the "unqualified" script transliterations are:</p>
   2970 
   2971 		<ol>
   2972 			<li>to be lossless when going to Latin and back</li>
   2973 			<li>to be as lossless as possible when going to other scripts</li>
   2974 			<li>to abide by a common standard as much as possible (possibly
   2975 				supplemented to meet goals 1 and 2).</li>
   2976 		</ol>
   2977 
   2978 		<p>Language-to-language transliterations, and variant
   2979 			script-to-script transliterations are generally transcriptions, and
   2980 			not expected to be lossless.</p>
   2981 
   2982 		<p>Additional transliterations may also be defined, such as
   2983 			customized language-specific transliterations (such as between
   2984 			Russian and French), or those that match a particular transliteration
   2985 			standard, such as the following:</p>
   2986 
   2987 		<ul>
   2988 			<li>UNGEGN - United Nations Group of Experts on Geographical
   2989 				Names</li>
   2990 			<li>BGN - United States Board on Geographic Names</li>
   2991 			<li>ISO9 - ISO/IEC 9</li>
   2992 			<li>ISO15915 - ISO/IEC 15915</li>
   2993 			<li>ISCII91 - ISCII 91</li>
   2994 			<li>KMOCT - South Korean Ministry of Culture &amp; Tourism</li>
   2995 			<li>USLC - US Library of Congress</li>
   2996 			<li>UKPCGN - Permanent Committee on Geographical Names for
   2997 				British Official Use</li>
   2998 			<li>RUGOST - Russian Main Administration of Geodesy and
   2999 				Cartography</li>
   3000 		</ul>
   3001 
   3002 		<p>
   3003 			The rules for transforms are described in Section 10.3 <a
   3004 				href="#Transform_Rules_Syntax">Transform Rules Syntax</a>. For more
   3005 			information on Transliteration, see <a
   3006 				href="http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines">Transliteration
   3007 				Guidelines</a>.
   3008 		</p>
   3009 
   3010 		<h3>
   3011 			10.3 <a name="Transform_Rules_Syntax" href="#Transform_Rules_Syntax">Transform
   3012 				Rules Syntax</a>
   3013 		</h3>
   3014 
   3015 
   3016 		<p class="dtd">
   3017 			&lt;!ELEMENT transforms ( transform*) &gt;<br> &lt;!ELEMENT
   3018 			transform ((comment | tRule)*) &gt;<br> &lt;!ATTLIST transform
   3019 			source CDATA #IMPLIED &gt;<br> &lt;!ATTLIST transform target
   3020 			CDATA #IMPLIED &gt;<br> &lt;!ATTLIST transform variant CDATA
   3021 			#IMPLIED &gt;<br> &lt;!ATTLIST transform direction ( forward |
   3022 			backward | both ) "both" &gt;<br> &lt;!ATTLIST
   3023 				transform alias CDATA #IMPLIED &gt; <br>  &lt;!--@VALUE--&gt;
   3024 				<br> &lt;!ATTLIST transform backwardAlias CDATA #IMPLIED &gt; <br>
   3025 				 &lt;!--@VALUE--&gt;
   3026 			<br> &lt;!ATTLIST transform visibility ( internal | external )
   3027 			"external" &gt;<br> &lt;!ELEMENT comment (#PCDATA) &gt;<br>
   3028 			&lt;!ELEMENT tRule (#PCDATA) &gt;
   3029 		</p>
   3030 		<p>
   3031 			The transform attributes indicate the <strong>source</strong>, <strong>target</strong>,
   3032 			<strong>direction</strong>, and <strong>alias</strong>es. For
   3033 			example:
   3034 		</p>
   3035 		<p class='example'>
   3036 			&lt;transform<br>  source=&quot;ja_Hrkt&quot;<br> 
   3037 			target=&quot;ja_Latn&quot;<br>  variant=&quot;BGN&quot;<br>
   3038 			 direction=&quot;forward&quot;<br> 
   3039 			draft=&quot;provisional&quot;<br> 
   3040 			alias=&quot;Katakana-Latin/BGN ja-Latn-t-ja-hrkt-m0-bgn&quot;&gt;
   3041 		</p>
   3042 		<p>
   3043 			The direction is either <strong>forward</strong> or <strong>both</strong>
   3044 			(<strong>backward</strong> is possible in theory, but not used). This
   3045 			indicates which directions the rules support.
   3046 		</p>
   3047 		<p>
			If the direction is <strong>forward</strong>, then an ID is composed
			from <strong>source + &quot;-&quot; + target + &quot;/&quot;
				+ variant</strong>. If the direction is <strong>both</strong>, then the
			inverse ID is also valid: <strong>target + &quot;-&quot; +
				source + &quot;/&quot; + variant</strong>. The <strong>alias</strong>
			attribute contains a space-delimited list of alternative forward IDs,
			while the <strong>backwardAlias</strong> contains a space-delimited
			list of alternative backward IDs. The BCP47 versions of the IDs will be
   3056 			in the <strong>alias</strong> and/or <strong>backwardAlias</strong>
   3057 			attributes.
   3058 		</p>
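		<p>The way IDs are derived from the attributes can be sketched as
			follows. The function name <code>transform_ids</code> is
			hypothetical, and the sketch assumes that the forward ID reads
			source-target/variant, in line with the alias "Katakana-Latin/BGN"
			in the example above.</p>

```python
def transform_ids(source, target, variant="", direction="both",
                  alias="", backward_alias=""):
    """Return (forward_ids, backward_ids) for a transform's attributes."""
    suffix = "/" + variant if variant else ""
    forward = []
    backward = []
    if direction in ("forward", "both"):
        forward = [source + "-" + target + suffix] + alias.split()
    if direction in ("backward", "both"):
        backward = [target + "-" + source + suffix] + backward_alias.split()
    return forward, backward


forward_ids, backward_ids = transform_ids(
    "ja_Hrkt", "ja_Latn", "BGN", "forward",
    alias="Katakana-Latin/BGN ja-Latn-t-ja-hrkt-m0-bgn",
)
```

		<p>For the example element, <code>forward_ids</code> holds the
			composed ID followed by the two aliases, and <code>backward_ids</code>
			is empty because the direction is forward.</p>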
   3059 		<p>
   3060 			The <strong>visibility</strong> attribute indicates whether the IDs
   3061 			should be externally visible, or whether they are only used
   3062 			internally.
   3063 		</p>
   3064 		<p>In previous versions, the rules were expressed
   3065 			as fine-grained XML. That was discarded in CLDR version 29, in favor
   3066 			of a simpler format where the separate rules are simply terminated
   3067 			with &quot;;&quot;.</p>
   3068 		<p>
   3069 			The transform rules are similar to regular-expression substitutions,
   3070 			but adapted to the specific domain of text transformations. The rules
   3071 			and comments in this discussion will be intermixed, with # marking
   3072 			the comments. The simplest rule is a
   3073 			conversion rule, which replaces one string of characters with
   3074 			another. The conversion rule takes the following form:
   3075 		</p>
   3076 
   3077 		<table cellspacing="0" cellpadding="8" border="1">
   3078 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>xy &rarr; z ;</code></td>
   3080 			</tr>
   3081 		</table>
   3082 
   3083 		<p>This converts any substring "xy" into "z". Rules are executed
   3084 			in order; consider the following rules:</p>
   3085 
   3086 		<table cellspacing="0" cellpadding="8" border="1">
   3087 			<tr>
   3088 				<td valign="top" bgcolor="#eeeeee"><code>
						sch &rarr; sh ;<br> ss &rarr; z ;
   3090 					</code></td>
   3091 			</tr>
   3092 		</table>
   3093 
		<p>These conversion rules transform "bass school" into "baz shool".
			The transform walks through the string from start to finish. Thus
			given the rules above "bassch" will convert to "bazch", because the
			"ss" rule matches earlier in the string than the "sch" rule (later, we'll
			see a way to override this behavior). If two rules can both apply at
			a given point in the string, then the transform applies the first
			rule in the list.</p>
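		<p>The matching order described above can be modeled with a small Python sketch. It is a simplification rather than the actual implementation: rules are plain (match, replacement) string pairs, the helper name <code>apply_rules</code> is hypothetical, and match strings are assumed to be non-empty.</p>

```python
def apply_rules(text, rules):
    """Apply (match, replacement) rules in a single left-to-right pass.

    At each position the rules are tried in list order; the first one that
    matches is applied and the scan continues after the matched text.
    """
    out = []
    pos = 0
    while pos < len(text):
        for match, replacement in rules:
            if text.startswith(match, pos):
                out.append(replacement)
                pos += len(match)
                break
        else:
            # No rule matched here: copy one character unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


RULES = [("sch", "sh"), ("ss", "z")]
print(apply_rules("bass school", RULES))
print(apply_rules("bassch", RULES))
```

		<p>In "bassch" the "ss" match starts one character before the "sch" match, so it is applied first and the remaining "ch" no longer matches anything.</p>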
   3101 
		<p>All of the ASCII characters except numbers and letters are
			reserved for use in the rule syntax, as are the characters &rarr;, &larr;, &harr;.
			Normally, these characters do not need to be converted. However, to
			convert them use either a pair of single quotes or a backslash. The pair
			of single quotes can be used to surround a whole string of text. The
			backslash affects only the character immediately after it. For example,
			to convert from a U+2190 (&larr;) LEFTWARDS ARROW to the string "arrow
			sign" (with a space), use one of the following rules:</p>
   3110 
   3111 		<table cellspacing="0" cellpadding="8" border="1">
   3112 			<tr>
   3113 				<td valign="top" bgcolor="#eeeeee"><code>
						\&larr; &rarr; arrow\ sign ;<br> '&larr;' &rarr; 'arrow sign' ;<br>
						'&larr;' &rarr; arrow' 'sign ;
   3117 					</code></td>
   3118 			</tr>
   3119 		</table>
   3120 
		<p>Spaces may be inserted anywhere without any effect on the
			rules. Use extra space to separate items out for clarity without
			worrying about the effects. This feature is particularly useful with
			combining marks; it is handy to put some spaces around a mark to separate
			it from the surrounding text. The following is an example:</p>
   3126 
   3127 		<table cellspacing="0" cellpadding="8" border="1">
   3128 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>&nbsp;&#x345; &rarr; i ; #
						an iota-subscript diacritic turns into an i.</code></td>
   3131 			</tr>
   3132 		</table>
   3133 
   3134 		<p>For a real space in the rules, place quotes around it. For a
   3135 			real backslash, either double it \\, or quote it '\'. For a real
   3136 			single quote, double it '', or place a backslash before it \'.</p>
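		<p>The quoting and escaping conventions can be illustrated with a short Python sketch that reduces one side of a rule to its literal text. The function name <code>literal_text</code> is hypothetical, and the sketch covers only what is described above: unquoted spaces are dropped, a backslash outside quotes makes the next character literal, single quotes delimit literal text, and a doubled single quote stands for one quote character.</p>

```python
def literal_text(side):
    """Return the literal characters denoted by one side of a rule."""
    out = []
    i = 0
    in_quotes = False
    while i < len(side):
        ch = side[i]
        if ch == "'":
            if side.startswith("''", i):
                out.append("'")          # doubled quote: one literal quote
                i += 2
                continue
            in_quotes = not in_quotes    # opening or closing quote
        elif in_quotes:
            out.append(ch)               # everything inside quotes is literal
        elif ch == "\\":
            i += 1                       # backslash: take the next character as-is
            if i < len(side):
                out.append(side[i])
        elif not ch.isspace():
            out.append(ch)               # unquoted spaces are ignored
        i += 1
    return "".join(out)


for side in ("arrow\\ sign", "'arrow sign'", "arrow' 'sign"):
    print(repr(literal_text(side)))
```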
   3137 
   3138 		<p>Any text that starts with a hash mark and concludes a line is a
   3139 			comment. Comments help document how the rules work. The following
   3140 			shows a comment in a rule:</p>
   3141 
   3142 		<table cellspacing="0" cellpadding="8" border="1">
   3143 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>x &rarr; ks ; #
						change every x into ks</code></td>
   3146 			</tr>
   3147 		</table>
   3148 
		<p>The \u and \x hex notations can be used instead of any
			letter. For instance, instead of using the Greek &pi;, one could write
			either of the following:</p>
   3152 
   3153 		<table cellspacing="0" cellpadding="8" border="1">
   3154 			<tr>
   3155 				<td valign="top" bgcolor="#eeeeee"><code>
						\u03C0 &rarr; p ;<br> \x{3C0} &rarr; p ;
   3157 					</code></td>
   3158 			</tr>
   3159 		</table>
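		<p>As an illustration, the two hex notations can be decoded with a regular expression. This Python sketch assumes exactly four hex digits after \u and one to six hex digits inside \x{...}; the helper name <code>decode_hex_escapes</code> is hypothetical.</p>

```python
import re

# \uXXXX (four hex digits) or \x{H...} (one to six hex digits, an assumption).
_HEX_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\x\{([0-9A-Fa-f]{1,6})\}")


def decode_hex_escapes(text):
    """Replace each hex escape in text with the character it denotes."""
    def replace(match):
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))
    return _HEX_ESCAPE.sub(replace, text)


# Both notations denote U+03C0 GREEK SMALL LETTER PI.
print(decode_hex_escapes(r"\u03C0") == decode_hex_escapes(r"\x{3C0}"))
```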
   3160 
   3161 		<p>One can also define and use variables, such as:</p>
   3162 
   3163 		<table cellspacing="0" cellpadding="8" border="1">
   3164 			<tr>
   3165 				<td valign="top" bgcolor="#eeeeee"><code>
						$pi = \u03C0 ;<br> $pi &rarr; p ;
   3167 					</code></td>
   3168 			</tr>
   3169 		</table>
   3170 
   3171 		<h4>
   3172 			10.3.1 <a name="Dual_Rules" href="#Dual_Rules">Dual Rules</a>
   3173 		</h4>
		<p>Rules can also specify what happens when an inverse transform
			is formed. To do this, we reverse the direction of the arrow, using
			"&larr;" instead of "&rarr;". Thus the above example becomes:</p>
   3177 
   3178 		<table cellspacing="0" cellpadding="8">
   3179 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>$pi &larr; p ;</code></td>
   3181 			</tr>
   3182 		</table>
   3183 
		<p>With the inverse transform, "p" will convert to the Greek &pi;.
			These two directions can be combined together into a dual conversion
			rule by using the "&harr;" operator, yielding:</p>
   3187 
   3188 		<table cellspacing="0" cellpadding="8" border="1">
   3189 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>$pi &harr; p ;</code></td>
   3191 			</tr>
   3192 		</table>
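		<p>The relationship between the three operators can be sketched in Python as follows. The sketch splits a context-free rule into the pair used by the normal transform and the pair used by the inverse. The name <code>split_rule</code> is hypothetical, and quoted or escaped operator characters are not handled.</p>

```python
# The three conversion operators, written as escapes so the source stays ASCII.
FORWARD, BACKWARD, DUAL = "\u2192", "\u2190", "\u2194"


def split_rule(rule):
    """Return (forward, inverse) for a simple rule without contexts.

    Each element is a (match, replacement) pair, or None when the rule
    does not apply in that direction.
    """
    for op in (DUAL, FORWARD, BACKWARD):
        if op in rule:
            left, right = (part.strip() for part in rule.rstrip("; ").split(op, 1))
            forward = (left, right) if op in (FORWARD, DUAL) else None
            inverse = (right, left) if op in (BACKWARD, DUAL) else None
            return forward, inverse
    raise ValueError("no conversion operator in rule: " + rule)


print(split_rule("$pi " + DUAL + " p ;"))
```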
   3193 
   3194 		<h4>
   3195 			10.3.2 <a name="Context" href="#Context">Context</a>
   3196 		</h4>
   3197 
   3198 		<p>Context can be used to have the results of a transformation be
   3199 			different depending on the characters before or after. The following
   3200 			rule removes hyphens, but only when they follow lowercase characters:
   3201 		</p>
   3202 
   3203 		<table cellspacing="0" cellpadding="8" border="1">
   3204 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code> [:Lowercase:]
						{ '-' &rarr; ; </code></td>
   3207 			</tr>
   3208 		</table>
   3209 
   3210 		<p>Contexts can be before or after or both, such as in a rule to
   3211 			remove hyphens between lowercase and uppercase letters:</p>
   3212 		<table cellspacing="0" cellpadding="8" border="1">
   3213 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>[:Lowercase:] {
						'-' } [:Uppercase:] &rarr; ;</code></td>
   3216 			</tr>
   3217 		</table>
   3218 		<p>Each context is optional and may be empty; the following two
   3219 			rules are equivalent:</p>
   3220 		<table cellspacing="0" cellpadding="8" border="1">
   3221 			<tr>
   3222 				<td valign="top" bgcolor="#eeeeee"><code>
						$pi &harr; p ;<br> {$pi} &harr; {p} ;
   3224 					</code></td>
   3225 			</tr>
   3226 		</table>
		<p>
			The context itself (<code>[:Lowercase:]</code>) is unaffected by the
			replacement; only the text within braces is changed.
		</p>
   3233 		<p>
   3234 			Character classes (UnicodeSets) in the contexts can contain the
   3235 			special symbol $, which means off either end of the string. It is
   3236 			roughly similar to $ and ^ in regex. Unlike normal regex, however, it
   3237 			can occur in character classes. Thus the following rule removes
   3238 			hyphens that are after lowercase characters, <em>or</em> are at the
   3239 			start of a string.
   3240 		</p>
   3241 		<table cellspacing="0" cellpadding="8" border="1">
   3242 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>[[:Lowercase:]$]
						{'-' &rarr; ;</code></td>
   3245 			</tr>
   3246 		</table>
   3247 
   3248 		<p>
   3249 			Thus the negation of a UnicodeSet will normally also match before or
   3250 			after the end of a string. The following will remove hyphens that are
   3251 			not after lowercase characters<em>, including hyphens at the
   3252 				start of a string</em>.
   3253 		</p>
   3254 		<table cellspacing="0" cellpadding="8" border="1">
   3255 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>[^[:Lowercase:]]
						{'-' &rarr; ;</code></td>
   3258 			</tr>
   3259 		</table>
		<p>It will thus convert "-B A-B a-b" to "B AB a-b".</p>
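		<p>For readers more familiar with regular expressions, the context rules above correspond roughly to lookbehind and lookahead assertions. The Python sketch below is an approximation only: it uses the ASCII ranges [a-z] and [A-Z] in place of the Unicode properties, and the function names are hypothetical. Note that a negative lookbehind also succeeds at the start of the string, which mirrors the behavior of a negated set in a context.</p>

```python
import re


def remove_hyphen_after_lowercase(text):
    # Before-context only: the hyphen must follow a lowercase letter.
    return re.sub(r"(?<=[a-z])-", "", text)


def remove_hyphen_between_cases(text):
    # Before- and after-context: lowercase before, uppercase after.
    return re.sub(r"(?<=[a-z])-(?=[A-Z])", "", text)


def remove_hyphen_not_after_lowercase(text):
    # Negated before-context: also matches at the start of the string.
    return re.sub(r"(?<![a-z])-", "", text)


print(remove_hyphen_not_after_lowercase("-B A-B a-b"))
```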
   3261 		<h4>
   3262 			10.3.3 <a name="Revisiting" href="#Revisiting">Revisiting</a>
   3263 		</h4>
   3264 
		<p>If the resulting text contains a vertical bar "|", then
			processing will proceed from that point and the
			transform will revisit part of the resulting text. Thus the | marks a
			"cursor" position. For example, if we have the following, then the
			string "xa" will convert to "yw".</p>
   3270 
   3271 		<table cellspacing="0" cellpadding="8" border="1">
   3272 			<tr>
   3273 				<td valign="top" bgcolor="#eeeeee"><code>
						x &rarr; y | z ;<br> z a &rarr; w;
   3275 					</code></td>
   3276 			</tr>
   3277 		</table>
   3278 
   3279 		<p>First, "xa" is converted to "yza". Then the processing will
   3280 			continue from after the character "y", pick up the "za", and convert
   3281 			it. Had we not had the "|", the result would have been simply "yza".
   3282 			The '@' character can be used as filler character to place the
   3283 			revisiting point off the start or end of the string. Thus the
   3284 			following causes x to be replaced, and the cursor to be backed up by
   3285 			two characters.</p>
   3286 
   3287 		<table cellspacing="0" cellpadding="8" border="1">
   3288 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>x &rarr; |@@y;</code></td>
   3290 			</tr>
   3291 		</table>
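		<p>The effect of the cursor can be modeled with a small Python sketch. It is a simplified model, not the actual implementation: each rule is a hypothetical (match, completed, revisit) triple, the '@' filler is not modeled, and a rule whose completed part is empty and whose revisit part matches again would loop forever.</p>

```python
def apply_with_cursor(text, rules):
    """Apply (match, completed, revisit) rules left to right.

    After a replacement the cursor is placed after the completed part,
    so the revisit part is scanned again together with the following text.
    """
    pos = 0
    while pos < len(text):
        for match, completed, revisit in rules:
            if text.startswith(match, pos):
                text = text[:pos] + completed + revisit + text[pos + len(match):]
                pos += len(completed)
                break
        else:
            pos += 1  # no rule matched here; move on
    return text


# x -> y | z ;  z a -> w ;
print(apply_with_cursor("xa", [("x", "y", "z"), ("za", "w", "")]))
# The same rules without the cursor: x -> yz ;  z a -> w ;
print(apply_with_cursor("xa", [("x", "yz", ""), ("za", "w", "")]))
```

		<p>With the cursor, "xa" first becomes "yza" and the scan resumes after "y", where "za" matches the second rule; without it, the scan resumes after "yz" and only "a" is left to examine.</p>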
   3292 
   3293 		<h4>
   3294 			10.3.4 <a name="Example" href="#Example">Example</a>
   3295 		</h4>
   3296 
   3297 		<p>The following shows how these features are combined together in
   3298 			the Transliterator "Any-Publishing". This transform converts the
   3299 			ASCII typewriter conventions into text more suitable for desktop
   3300 			publishing (in English). It turns straight quotation marks or UNIX
   3301 			style quotation marks into curly quotation marks, fixes multiple
   3302 			spaces, and converts double-hyphens into a dash.</p>
   3303 
   3304 		<table cellspacing="0" cellpadding="8" border="1">
   3305 			<tr>
				<td valign="top" bgcolor="#eeeeee"><code>
						# Variables<br> <br> $single = \' ;<br> $space = ' ' ;<br>
						$double = \" ;<br> $back = \` ;<br> $tab = '\u0008' ;<br> <br>
						# the following is for spaces, line ends, (, [, {, ...<br>
						$makeRight = [[:separator:][:start punctuation:][:initial punctuation:]] ;<br> <br>
						# fix UNIX quotes<br> <br>
						$back $back &rarr; &ldquo; ; # generate right d.q.m. (double quotation mark)<br>
						$back &rarr; &lsquo; ;<br> <br>
						# fix typewriter quotes, by context<br> <br>
						$makeRight { $double &harr; &ldquo; ; # convert a double to right d.q.m. after certain chars<br>
						^ { $double &rarr; &ldquo; ; # convert a double at the start of the line.<br>
						$double &harr; &rdquo; ; # otherwise convert to a left q.m.<br> <br>
						$makeRight {$single} &harr; &lsquo; ; # do the same for s.q.m.s<br>
						^ {$single} &rarr; &lsquo; ;<br>
						$single &harr; &rsquo;;<br> <br>
						# fix multiple spaces and hyphens<br> <br>
						$space {$space} &rarr; ; # collapse multiple spaces<br>
						'--' &harr; &mdash; ; # convert fake dash into real one
					</code></td>
   3324 			</tr>
   3325 		</table>
   3326 		<p>There is an online demo where the rules can be tested, at:</p>
   3327 		<p>
   3328 			<a target="demo" href="http://unicode.org/cldr/utility/transform.jsp">http://unicode.org/cldr/utility/transform.jsp</a>
   3329 		</p>
   3330 		<h4>
   3331 			10.3.5 <a name="Rule_Syntax" href="#Rule_Syntax">Rule Syntax</a>
   3332 		</h4>
   3333 
   3334 		<p>The following describes the full format of the list of rules
   3335 			used to create a transform. Each rule in the list is terminated by a
   3336 			semicolon. The list consists of the following:</p>
   3337 
   3338 		<ul>
   3339 			<li>an optional filter rule</li>
   3340 			<li>zero or more transform rules</li>
   3341 			<li>zero or more variable-definition rules</li>
   3342 			<li>zero or more conversion rules</li>
   3343 			<li>an optional inverse filter rule</li>
   3344 		</ul>
   3345 
   3346 		<p>The filter rule, if present, must appear at the beginning of
   3347 			the list, before any of the other rules.&nbsp; The inverse filter
   3348 			rule, if present, must appear at the end of the list, after all of
   3349 			the other rules.&nbsp; The other rules may occur in any order and be
   3350 			freely intermixed.</p>
   3351 
   3352 		<p>The rule list can also generate the inverse of the transform.
   3353 			In that case, the inverse of each of the rules is used, as described
   3354 			below.</p>
   3355 
   3356 		<h4>
   3357 			10.3.6 <a name="Transform_Rules" href="#Transform_Rules">Transform
   3358 				Rules</a>
   3359 		</h4>
   3360 
   3361 		<p>Each transform rule consists of two colons followed by a
   3362 			transform name, which is of the form source-target. For example:</p>
   3363 
   3364 		<table cellspacing="0" cellpadding="8" border="1">
   3365 			<tr>
   3366 				<td valign="top" bgcolor="#eeeeee"><code>
   3367 						:: NFD ;<br> :: und_Latn-und_Greek ;<br> :: Latin-Greek;
   3368 						# alternate form
   3369 					</code></td>
   3370 			</tr>
   3371 		</table>
   3372 
   3373 		<p>If either the source or target is 'und', it can be omitted,
   3374 			thus 'und_NFC' is equivalent to 'NFC'. For compatibility, the English
   3375 			names for scripts can be used instead of the und_Latn locale name,
   3376 			and "Any" can be used instead of "und". Case is not significant.</p>
   3377 
   3378 		<p>The following transforms are defined not by rules, but by the
   3379 			operations in the Unicode Standard, and may be used in building any
   3380 			other transform:</p>
   3381 
   3382 		<blockquote>
   3383 			<b>Any-NFC, Any-NFD, Any-NFKD, Any-NFKC</b> - the normalization forms
   3384 			defined by [<a href="http://www.unicode.org/reports/tr41/#UAX15">UAX15</a>].<br>
   3385 			<p>
   3386 				<b>Any-Lower, Any-Upper, Any-Title</b> - full case transformations,
   3387 				defined by [<a href="tr35.html#Unicode">Unicode</a>] Chapter 3.
   3388 			</p>
   3389 		</blockquote>
   3390 
   3391 		<p>In addition, the following special cases are defined:</p>
   3392 
   3393 		<blockquote>
			<b>Any-Null</b> - has no effect; that is, each character is left
			alone.<br> <b>Any-Remove</b> - maps each character to the empty
			string; that is, removes each character.
   3397 		</blockquote>
   3398 
   3399 		<p>The inverse of a transform rule uses parentheses to indicate
   3400 			what should be done when the inverse transform is used. For example:</p>
   3401 
   3402 		<table cellspacing="0" cellpadding="8" border="1">
   3403 			<tr>
   3404 				<td valign="top" bgcolor="#eeeeee"><code>
   3405 						:: lower () ; # only executed for the normal<br> :: (lower) ;
   3406 						# only executed for the inverse<br> :: lower ; # executed for
   3407 						both the normal and the inverse
   3408 					</code></td>
   3409 			</tr>
   3410 		</table>
   3411 
   3412 		<h4>
   3413 			10.3.7 <a name="Variable_Definition_Rules"
   3414 				href="#Variable_Definition_Rules">Variable Definition Rules</a>
   3415 		</h4>
   3416 
   3417 		<p>Each variable definition is of the following form:</p>
   3418 
   3419 		<table cellspacing="0" cellpadding="8" border="1">
   3420 			<tr>
   3421 				<td valign="top" bgcolor="#eeeeee"><code>$variableName =
   3422 						contents ;</code></td>
   3423 			</tr>
   3424 		</table>
   3425 
   3426 		<p>
   3427 			The variable name can contain letters and digits, but must start with
   3428 			a letter. More precisely, the variable names use Unicode identifiers
   3429 			as defined by [<a href="http://www.unicode.org/reports/tr41/#UAX31">UAX31</a>].
   3430 			The identifier properties allow for the use of foreign letters and
   3431 			numbers.
   3432 		</p>
   3433 
		<p>The contents of a variable definition is any sequence of
			Unicode sets and characters. For example:</p>
   3436 
   3437 		<table cellspacing="0" cellpadding="8" border="1">
   3438 			<tr>
   3439 				<td valign="top" bgcolor="#eeeeee"><code>$mac = M [aA]
   3440 						[cC] ;</code></td>
   3441 			</tr>
   3442 		</table>
   3443 
		<p>Variables are only replaced within other variable definition
			rules and within conversion rules. They have no effect on
			transform rules.</p>
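		<p>Variable replacement can be pictured as textual substitution, as in the following Python sketch. It is an approximation: variable names are limited to ASCII letters and digits rather than full UAX31 identifiers, a variable must be defined before it is used, and the helper names are hypothetical. A "$" that is not followed by a letter, such as the end-of-string symbol inside a set, is left untouched.</p>

```python
import re

# ASCII-only approximation of a variable reference such as $pi or $makeRight.
_VARIABLE = re.compile(r"\$([A-Za-z][A-Za-z0-9]*)")


def expand_variables(text, definitions):
    """Replace each $name in text with its definition (KeyError if undefined)."""
    return _VARIABLE.sub(lambda m: definitions[m.group(1)], text)


def define(definitions, name, contents):
    """Record a variable, expanding variables used inside its contents."""
    definitions[name] = expand_variables(contents, definitions)


definitions = {}
define(definitions, "mac", "M [aA] [cC]")
define(definitions, "name", "$mac donald")
print(definitions["name"])
```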
   3447 
   3448 		<h4>
   3449 			10.3.8 <a name="Filter_Rules" href="#Filter_Rules">Filter Rules</a>
   3450 		</h4>
   3451 
   3452 		<p>A filter rule consists of two colons followed by a UnicodeSet.
   3453 			This filter is global in that only the characters matching the filter
   3454 			will be affected by any transform rules or conversion rules. The
   3455 			inverse filter rule consists of two colons followed by a UnicodeSet
   3456 			in parentheses. This filter is also global for the inverse transform.</p>
   3457 
   3458 		<p>For example, the Hiragana-Latin transform can be implemented by
   3459 			"pivoting" through the Katakana converter, as follows:</p>
   3460 
   3461 		<table cellspacing="0" cellpadding="8" border="1">
   3462 			<tr>
   3463 				<td valign="top" bgcolor="#eeeeee"><code>
   3464 						:: [:^Katakana:] ; # do not touch any katakana that was in the
   3465 						text!<br> :: Hiragana-Katakana;<br> :: Katakana-Latin;<br>
   3466 						:: ([:^Katakana:]) ; # do not touch any katakana that was in the
   3467 						text<br>
   3468 						&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
   3469 						# for the inverse either!
   3470 					</code></td>
   3471 			</tr>
   3472 		</table>
   3473 
   3474 		<p>The filters keep the transform from mistakenly converting any
   3475 			of the "pivot" characters. Note that this is a case where a rule list
   3476 			contains no conversion rules at all, just transform rules and
   3477 			filters.</p>
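		<p>The effect of a global filter can be sketched in Python by applying a transform only to runs of characters that pass the filter. This is a simplified model under stated assumptions: the filter is an arbitrary predicate on single characters, the transform is any string function, and the name <code>apply_filtered</code> is hypothetical.</p>

```python
from itertools import groupby


def apply_filtered(transform, in_filter, text):
    """Apply transform to each maximal run of characters accepted by in_filter.

    Characters rejected by the filter are copied through unchanged.
    """
    pieces = []
    for matches, run in groupby(text, key=in_filter):
        chunk = "".join(run)
        pieces.append(transform(chunk) if matches else chunk)
    return "".join(pieces)


# Upper-case only the letters a, b and c; everything else is left alone.
print(apply_filtered(str.upper, lambda ch: ch in "abc", "abcdef-cab"))
```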
   3478 
   3479 		<h4>
   3480 			10.3.9 <a name="Conversion_Rules" href="#Conversion_Rules">Conversion
   3481 				Rules</a>
   3482 		</h4>
   3483 
		<p>Conversion rules can be forward, backward, or dual. The
			complete conversion rule syntax is described below:</p>
   3486 
   3487 		<p>
   3488 			<b>Forward</b>
   3489 		</p>
   3490 
   3491 		<blockquote>
   3492 			<p>A forward conversion rule is of the following form:</p>
   3493 
   3494 			<blockquote>
				<pre>before_context { text_to_replace } after_context &rarr; completed_result | result_to_revisit ;</pre>
   3496 			</blockquote>
   3497 
   3498 			<p>If there is no before_context, then the "{" can be omitted. If
   3499 				there is no after_context, then the "}" can be omitted. If there is
   3500 				no result_to_revisit, then the "|" can be omitted. A forward
   3501 				conversion rule is only executed for the normal transform and is
   3502 				ignored when generating the inverse transform.</p>
   3503 		</blockquote>
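		<p>The five parts of a forward conversion rule, and the omission rules for "{", "}" and "|", can be illustrated with a small Python parser. This is a sketch only: the name <code>parse_forward_rule</code> is hypothetical, and quoting, escaping, and sets that themselves contain braces are not handled.</p>

```python
FORWARD = "\u2192"  # the forward conversion operator


def parse_forward_rule(rule):
    """Split a forward rule into its five parts.

    Returns (before_context, text_to_replace, after_context,
    completed_result, result_to_revisit); omitted parts are empty strings.
    """
    left, right = rule.rstrip(" ;").split(FORWARD, 1)
    before, brace, rest = left.partition("{")
    if not brace:
        # No "{": there is no before_context.
        before, rest = "", left
    text, _, after = rest.partition("}")
    completed, _, revisit = right.partition("|")
    return tuple(part.strip() for part in (before, text, after, completed, revisit))


print(parse_forward_rule("a { b } c " + FORWARD + " d | e ;"))
```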
   3504 
   3505 		<p>
   3506 			<b>Backward</b>
   3507 		</p>
   3508 
   3509 		<blockquote>
   3510 			<p>A backward conversion rule is of the following form:</p>
   3511 
   3512 			<blockquote>
				<pre>completed_result | result_to_revisit &larr; before_context { text_to_replace } after_context ;</pre>
   3514 			</blockquote>
   3515 
   3516 			<p>The same omission rules apply as in the case of forward
   3517 				conversion rules. A backward conversion rule is only executed for
   3518 				the inverse transform and is ignored when generating the normal
   3519 				transform.</p>
   3520 		</blockquote>
   3521 
   3522 		<p>
   3523 			<b>Dual</b>
   3524 		</p>
   3525 		<blockquote>
   3526 			<p>A dual conversion rule combines a forward conversion rule and
   3527 				a backward conversion rule into one, as discussed above. It is of
   3528 				the form:</p>
   3529 
   3530 			<table cellspacing="0" cellpadding="8" border="1">
   3531 				<tr>
					<td valign="top" bgcolor="#eeeeee"><code>a { b | c } d
							&harr; e { f | g } h ;</code></td>
   3534 				</tr>
   3535 			</table>
   3536 
   3537 			<p>When generating the normal transform and the inverse, the
   3538 				revisit mark "|" and the before and after contexts are ignored on
   3539 				the sides where they do not belong. Thus, the above is exactly
   3540 				equivalent to the sequence of the following two rules:</p>
   3541 
   3542 			<table cellspacing="0" cellpadding="8" border="1">
   3543 				<tr>
   3544 					<td valign="top" bgcolor="#eeeeee"><code>
							a { b c } d &rarr; f | g ;<br>
							b | c &larr; e { f g } h ;
   3547 						</code></td>
   3548 				</tr>
   3549 			</table>
   3550 		</blockquote>
   3551 
   3552 		<h4>
   3553 			10.3.10 <a name="Intermixing_Transform_Rules_and_Conversion_Rules"
   3554 				href="#Intermixing_Transform_Rules_and_Conversion_Rules">
   3555 				Intermixing Transform Rules and Conversion Rules</a>
   3556 		</h4>
   3557 
   3558 		<p>Transform rules and conversion rules may be freely intermixed.
   3559 			Inserting a transform rule into the middle of a set of conversion
   3560 			rules has an important side effect.</p>
   3561 
   3562 		<p>Normally, conversion rules are considered together as a
   3563 			group.&nbsp; The only time their order in the rule set is important
   3564 			is when more than one rule matches at the same point in the
   3565 			string.&nbsp; In that case, the one that occurs earlier in the rule
   3566 			set wins.&nbsp; In all other situations, when multiple rules match
   3567 			overlapping parts of the string, the one that matches earlier wins.</p>
   3568 
   3569 		<p>Transform rules apply to the whole string.&nbsp; If you have
   3570 			several transform rules in a row, the first one is applied to the
   3571 			whole string, then the second one is applied to the whole string, and
   3572 			so on.&nbsp; To reconcile this behavior with the behavior of
   3573 			conversion rules, transform rules have the side effect of breaking a
   3574 			surrounding set of conversion rules into two groups: First all of the
   3575 			conversion rules before the transform rule are applied as a group to
   3576 			the whole string in the usual way, then the transform rule is applied
   3577 			to the whole string, and then the conversion rules after the
   3578 			transform rule are applied as a group to the whole string.&nbsp; For
   3579 			example, consider the following rules:</p>
   3580 
   3581 		<table cellspacing="0" cellpadding="8" border="1">
   3582 			<tr>
   3583 				<td valign="top" bgcolor="#eeeeee"><code>
						abc &rarr; xyz;<br> xyz &rarr; def;<br> ::Upper;
   3585 					</code></td>
   3586 			</tr>
   3587 		</table>
   3588 
		<p>If you apply these rules to "abcxyz", you get "XYZDEF".&nbsp;
			If you move the "::Upper;" to the middle of the rule set and change
			the cases accordingly, then applying this to "abcxyz" produces
			"DEFDEF".</p>
   3593 
   3594 		<table cellspacing="0" cellpadding="8" border="1">
   3595 			<tr>
   3596 				<td valign="top" bgcolor="#eeeeee"><code>
						abc &rarr; xyz;<br> ::Upper;<br> XYZ &rarr; DEF;
   3598 					</code></td>
   3599 			</tr>
   3600 		</table>
   3601 
		<p>This is because "::Upper;" causes the transliterator to reset
			to the beginning of the string. The first rule turns the string into
			"xyzxyz", the second rule upper cases the whole thing to "XYZXYZ",
			and the third rule turns this into "DEFDEF".</p>
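		<p>The pass structure can be sketched in Python as follows. This is a simplified model: a rule list is represented as a sequence of steps, where each step is either a list of (match, replacement) pairs applied as one group or a whole-string function standing in for a transform rule. The names <code>apply_rules</code> and <code>run_passes</code> are hypothetical.</p>

```python
def apply_rules(text, rules):
    """One left-to-right pass over text with a group of conversion rules."""
    out = []
    pos = 0
    while pos < len(text):
        for match, replacement in rules:
            if text.startswith(match, pos):
                out.append(replacement)
                pos += len(match)
                break
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


def run_passes(text, steps):
    """Apply each step to the whole string, in order."""
    for step in steps:
        text = step(text) if callable(step) else apply_rules(text, step)
    return text


# ::Upper at the end: one group of conversion rules, then upper-casing.
print(run_passes("abcxyz", [[("abc", "xyz"), ("xyz", "def")], str.upper]))
# ::Upper in the middle: the conversion rules are split into two groups.
print(run_passes("abcxyz", [[("abc", "xyz")], str.upper, [("XYZ", "DEF")]]))
```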
   3606 
   3607 		<p>This can be useful when a transform naturally occurs in
   3608 			multiple passes.&nbsp; Consider this rule set:</p>
   3609 
   3610 		<table cellspacing="0" cellpadding="8" border="1">
   3611 			<tr>
   3612 				<td valign="top" bgcolor="#eeeeee"><code>
						[:Separator:]* &rarr; ' ';<br> 'high school' &rarr; 'H.S.';<br>
						'middle school' &rarr; 'M.S.';<br> 'elementary school' &rarr; 'E.S.';
   3615 					</code></td>
   3616 			</tr>
   3617 		</table>
   3618 
		<p>If you apply these rules to "high school", you get "H.S.", but if
			you apply them to "high&nbsp; school" (with two spaces), you just get
			"high school" (with one space). To have "high&nbsp; school" (with two
			spaces) turn into "H.S.", you'd either have to have the first rule
			back up some arbitrary distance (far enough to see "elementary", if
			you want all the rules to work), or you have to include the whole
			left-hand side of the first rule in the other rules, which can make
			them hard to read and maintain:</p>
   3627 
   3628 		<table cellspacing="0" cellpadding="8" border="1">
   3629 			<tr>
   3630 				<td valign="top" bgcolor="#eeeeee"><code>
						$space = [:Separator:]*;<br> high $space school &rarr; 'H.S.';<br>
						middle $space school &rarr; 'M.S.';<br> elementary $space school &rarr;
						'E.S.';
   3634 					</code></td>
   3635 			</tr>
   3636 		</table>
   3637 
   3638 		<p>
   3639 			Instead, you can simply insert 
   3640 			<code>::Null;</code>
   3641 			 in order to get things to work right:
   3642 		</p>
   3643 
   3644 		<table cellspacing="0" cellpadding="8" border="1">
   3645 			<tr>
   3646 				<td valign="top" bgcolor="#eeeeee"><code>
						[:Separator:]* &rarr; ' ';<br> ::Null;<br> 'high school' &rarr;
						'H.S.';<br> 'middle school' &rarr; 'M.S.';<br> 'elementary
						school' &rarr; 'E.S.';
   3650 					</code></td>
   3651 			</tr>
   3652 		</table>
   3653 
		<p>The "::Null;" has no effect of its own (the null transform, by
			definition, does not do anything), but it splits the other rules into
			two passes: The first rule is applied to the whole string,
			normalizing all runs of white space into single spaces, and then we
			start over at the beginning of the string to look for the phrases.
			"high&nbsp;&nbsp;&nbsp; school" (with four spaces) gets correctly
			converted to "H.S.".</p>
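		<p>The two-pass behavior can be imitated in Python with one whitespace-normalizing pass followed by a phrase pass. This is only an analogy: <code>\s+</code> stands in for a run of separators, sequential <code>str.replace</code> calls stand in for the second group of conversion rules (equivalent here because the phrases do not overlap), and the names are hypothetical.</p>

```python
import re

PHRASES = [
    ("high school", "H.S."),
    ("middle school", "M.S."),
    ("elementary school", "E.S."),
]


def abbreviate_schools(text):
    # Pass 1: collapse every run of white space into a single space.
    text = re.sub(r"\s+", " ", text)
    # Pass 2: start over from the beginning and look for the phrases.
    for phrase, abbreviation in PHRASES:
        text = text.replace(phrase, abbreviation)
    return text


print(abbreviate_schools("high    school"))
```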
   3661 
   3662 		<p>This can also sometimes be useful with rules that have
   3663 			overlapping domains.&nbsp; Consider this rule set from before:</p>
   3664 
   3665 		<table cellspacing="0" cellpadding="8" border="1">
   3666 			<tr>
   3667 				<td valign="top" bgcolor="#eeeeee"><code>
						sch &rarr; sh ;<br> ss &rarr; z ;
   3669 					</code></td>
   3670 			</tr>
   3671 		</table>
   3672 
		<p>Applying these rules to "bassch" results in "bazch" because "ss"
			matches earlier in the string than "sch". If you really wanted
			"bassh", that is, if you wanted the first rule to win even when the
			second rule matches earlier in the string, you'd either have to add
			another rule for this special case...</p>
   3678 
   3679 		<table cellspacing="0" cellpadding="8" border="1">
   3680 			<tr>
   3681 				<td valign="top" bgcolor="#eeeeee"><code>
						sch &rarr; sh ;<br> ssch &rarr; ssh;<br> ss &rarr; z ;
   3683 					</code></td>
   3684 			</tr>
   3685 		</table>
   3686 
   3687 		<p>...or you could use a transform rule to apply the conversions
   3688 			in two passes:</p>
   3689 
   3690 		<table cellspacing="0" cellpadding="8" border="1">
   3691 			<tr>
   3692 				<td valign="top" bgcolor="#eeeeee"><code>
						sch &rarr; sh ;<br> ::Null;<br> ss &rarr; z ;
   3694 					</code></td>
   3695 			</tr>
   3696 		</table>
   3697 
   3698 		<h4>
   3699 			10.3.11 <a name="Inverse_Summary" href="#Inverse_Summary">Inverse
   3700 				Summary</a>
   3701 		</h4>
   3702 
   3703 		<p>The following table shows how the same rule list generates two
   3704 			different transforms, where the inverse is restated in terms of
   3705 			forward rules (this is a contrived example, simply to show the
   3706 			reordering):</p>
   3707 
   3708 		<table>
   3709 			<tr bgcolor="#99ccff">
   3710 				<th bgcolor="#cccccc">Original Rules</th>
   3711 				<th bgcolor="#cccccc">Forward</th>
   3712 				<th bgcolor="#cccccc">Inverse</th>
   3713 			</tr>
   3714 			<tr bgcolor="#99ccff">
   3715 				<td bgcolor="#eeeeee"><code>
						:: [:Uppercase Letter:] ;<br> :: latin-greek ;<br> ::
						greek-japanese ;<br> x &harr; y ;<br> z &rarr; w ;<br> r &larr; m
						; <br> :: upper;<br> a &rarr; b ;<br> c &harr; d ;<br>
						:: any-publishing ;<br> :: ([:Number:]) ;
   3720 					</code></td>
   3721 				<td bgcolor="#eeeeee"><code>
						:: [:Uppercase Letter:] ;<br> :: latin-greek ;<br> ::
						greek-japanese ;<br> x &rarr; y ;<br> z &rarr; w ;<br> ::
						upper ;<br> a &rarr; b ;<br> c &rarr; d ;<br> ::
						any-publishing ;<br>
   3726 					</code></td>
   3727 				<td bgcolor="#eeeeee"><code>
						:: [:Number:] ;<br> :: publishing-any ;<br> d &rarr; c ;<br>
						:: lower ;<br> y &rarr; x ;<br> m &rarr; r ;<br> ::
						japanese-greek ;<br> :: greek-latin ;<br>
   3731 					</code></td>
   3732 			</tr>
   3733 		</table>
   3734 
		<p>Note how the irrelevant rules (the inverse filter rule and the
			rules containing &larr;) are omitted (ignored, actually) in the forward
			direction, and notice how things are reversed: the transform rules
			are inverted and happen in the opposite order, and the groups of
			conversion rules are also executed in the opposite relative order
			(although the rules within each group are executed in the same
			order).</p>
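		<p>The reordering shown in the table can be reproduced with a short Python sketch. It is a model of the table only, under stated assumptions: rule lists are represented as hypothetical tuples, a transform name of the form source-target is inverted by swapping its two halves, and "upper" and "lower" are treated as each other's inverse because that is what the table shows.</p>

```python
FORWARD, BACKWARD, DUAL = "\u2192", "\u2190", "\u2194"
CASE_INVERSES = {"upper": "lower", "lower": "upper"}  # assumption taken from the table


def invert_name(name):
    """Invert a transform name: swap source and target, or flip the case name."""
    if "-" in name:
        source, target = name.split("-", 1)
        return target + "-" + source
    return CASE_INVERSES.get(name, name)


def invert_rule_list(items):
    """Restate the inverse of a rule list in terms of forward rules.

    items holds ("filter", set), ("inverse_filter", set), ("transform", name)
    and ("rule", left, op, right) tuples in their original order.
    """
    result = [("filter", item[1]) for item in items if item[0] == "inverse_filter"]
    segments = []  # a transform item, or a list holding one group of rules
    for item in items:
        if item[0] == "transform":
            segments.append(item)
        elif item[0] == "rule":
            if not segments or not isinstance(segments[-1], list):
                segments.append([])
            segments[-1].append(item)
    for segment in reversed(segments):  # groups and transforms run in reverse order
        if isinstance(segment, list):
            for _, left, op, right in segment:  # order inside a group is kept
                if op in (BACKWARD, DUAL):
                    result.append(("rule", right, FORWARD, left))
        else:
            result.append(("transform", invert_name(segment[1])))
    return result


ORIGINAL = [
    ("filter", "[:Uppercase Letter:]"),
    ("transform", "latin-greek"), ("transform", "greek-japanese"),
    ("rule", "x", DUAL, "y"), ("rule", "z", FORWARD, "w"), ("rule", "r", BACKWARD, "m"),
    ("transform", "upper"),
    ("rule", "a", FORWARD, "b"), ("rule", "c", DUAL, "d"),
    ("transform", "any-publishing"),
    ("inverse_filter", "[:Number:]"),
]

for item in invert_rule_list(ORIGINAL):
    print(item)
```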
   3742 
   3743 		<h2>
   3744 			11 <a name="ListPatterns" href="#ListPatterns">List Patterns</a>
   3745 		</h2>
   3746 
   3747 
   3748 		<p class="dtd">&lt;!ELEMENT listPatterns (alias | (listPattern*,
   3749 			special*)) &gt;</p>
   3750 
   3751 		<p class="dtd">
   3752 			&lt;!ELEMENT listPattern (alias | (listPatternPart*, special*)) &gt;<br>
   3753 			&lt;!ATTLIST listPattern type (NMTOKEN) #IMPLIED &gt;
   3754 		</p>
   3755 
   3756 		<p class="dtd">
   3757 			&lt;!ELEMENT listPatternPart ( #PCDATA ) &gt;<br> &lt;!ATTLIST
   3758 			listPatternPart type (start | middle | end | 2 | 3) #REQUIRED &gt;
   3759 		</p>
   3760 
		<p>List patterns can be used to format variable-length lists of
			things in a locale-sensitive manner, such as "Monday, Tuesday,
			Friday, and Saturday" (in English) versus "lundi, mardi, vendredi et
			samedi" (in French). Consider the following example:</p>
   3765 
   3766 		<pre class="example">&lt;listPatterns&gt;
   3767  &lt;listPattern&gt;
   3768   &lt;listPatternPart type="2"&gt;{0} and {1}&lt;/listPatternPart&gt;
   3769   &lt;listPatternPart type="start"&gt;{0}, {1}&lt;/listPatternPart&gt;
   3770   &lt;listPatternPart type="middle"&gt;{0}, {1}&lt;/listPatternPart&gt;
   3771   &lt;listPatternPart type="end"&gt;{0}, and {1}&lt;/listPatternPart&gt;
   3772  &lt;/listPattern&gt;
   3773 &lt;/listPatterns&gt;</pre>
   3774 
		<p>The data is used as follows: If there is a type that matches
			exactly the number of elements in the desired list (such as "2" in
			the above list), then use that pattern. Otherwise,</p>
   3778 
   3779 		<ol>
   3780 			<li>Format the last two elements with the "end" format.</li>
			<li>Then use the "middle" format to add on subsequent elements working
				towards the front, all but the very first element. That is, {1} is
				what you've already done, and {0} is the previous element.</li>
   3784 			<li>Then use "start" to add the front element, again with {1} as
   3785 				what you've done so far, and {0} is the first element.</li>
   3786 		</ol>
   3787 		<p>Thus a list (a,b,c,...m, n) is formatted as:
   3788 			start(a,middle(b,middle(c,middle(...end(m, n))...)))</p>
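		<p>The procedure above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the pattern data is passed as a plain dictionary keyed by the type values, the names <code>fill</code> and <code>format_list</code> are hypothetical, and the handling of empty lists, single-element lists, and two-element lists without a "2" pattern is an assumption.</p>

```python
import re


def fill(pattern, *args):
    """Substitute {0}, {1}, ... in pattern with the given arguments."""
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], pattern)


def format_list(items, patterns):
    count = len(items)
    if count == 0:
        return ""            # assumption: an empty list formats as nothing
    if count == 1:
        return items[0]      # assumption: a single element is returned as-is
    if str(count) in patterns:
        # A pattern whose type matches the element count exactly.
        return fill(patterns[str(count)], *items)
    result = fill(patterns["end"], items[-2], items[-1])
    if count == 2:
        return result        # assumption: "end" alone when no "2" pattern exists
    # Work towards the front: {0} is the previous element, {1} the text so far.
    for item in reversed(items[1:-2]):
        result = fill(patterns["middle"], item, result)
    return fill(patterns["start"], items[0], result)


ENGLISH = {
    "2": "{0} and {1}",
    "start": "{0}, {1}",
    "middle": "{0}, {1}",
    "end": "{0}, and {1}",
}

print(format_list(["Monday", "Tuesday", "Friday", "Saturday"], ENGLISH))
```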
   3789 
   3790 
   3791 		<p>The following type attributes are in use:</p>
   3792 		<table border="1" cellpadding="2" cellspacing="0" class='simple'>
   3793 		  <tr>
   3794 		    <th>type attribute value</th>
   3795 		    <th>Description</th>
   3796 		    <th>Examples</th>
   3797 	      </tr>
   3798 		  <tr>
   3799 		    <td nowrap>standard (or no <strong>type</strong>)</td>
   3800 		    <td>A typical 'and' list for arbitrary placeholders</td>
   3801 		    <td nowrap><em>January, February, and March</em></td>
   3802 	      </tr>
   3803 			  <tr>
   3804 		    <td>standard-short</td>
		    <td>A short version of an 'and' list, suitable for use with short or abbreviated placeholder values</td>
   3806 		    <td><em>Jan., Feb., and Mar.</em></td>
   3807 	      </tr>
   3808 	  <tr>
   3809 		    <td>or</td>
   3810 		    <td>A typical 'or' list for arbitrary placeholders</td>
   3811 		    <td><em>January, February, or March</em></td>
   3812 	      </tr>
   3813 	  <tr>
   3814 	    <td>or-short</td>
   3815 	    <td>A short version of an 'or' list</td>
   3816 	    <td><em>Jan., Feb., or Mar.</em></td>
   3817 	    </tr>
   3818 	  <tr>
   3819 	    <td>unit</td>
   3820 	    <td>A list suitable for wide units</td>
   3821 	    <td><em>3 feet, 7 inches</em></td>
   3822 	    </tr>
   3823 	  <tr>
   3824 	    <td>unit-short</td>
   3825 	    <td>A list suitable for short units</td>
   3826 	    <td><em>3 ft, 7 in</em></td>
   3827 	    </tr>
   3828 	  <tr>
   3829 	    <td>unit-narrow</td>
   3830 	    <td>A list suitable for narrow units, where space on the screen is very limited.</td>
	    <td><em>3′ 7″</em></td>
   3832 	    </tr>
   3833       </table>
		<p>In many languages there may not be a difference among many of these lists. In others, the spacing, the length or presence of a conjunction, and the separators may change.</p>
   3835 
   3836 		<h3>
   3837 			11.1 <a name="List_Gender" href="#List_Gender">Gender of Lists</a>
   3838 		</h3>
   3839 
   3840 
   3841 		<p class="dtd">
   3842 			&lt;!-- Gender List support --&gt;<br> &lt;!ELEMENT gender (
   3843 			personList+ ) &gt;<br> &lt;!ELEMENT personList EMPTY &gt;<br>
   3844 			&lt;!ATTLIST personList type ( neutral | mixedNeutral | maleTaints )
   3845 			#REQUIRED &gt;<br> &lt;!ATTLIST personList locales NMTOKENS
   3846 			#REQUIRED &gt;<br>
   3847 		</p>
   3848 
   3849 		<p>This can be used to determine the gender of a list of 2 or more
   3850 			persons, such as "Tom and Mary", for use with gender-selection
   3851 			messages. For example,</p>
   3852 
   3853 		<pre class="example">
   3854   &lt;supplementalData&gt;
   3855     &lt;gender&gt;
   3856       &lt;!-- neutral: gender(list) = other --&gt;
   3857       &lt;personList type="neutral" locales="af da en..."/&gt;
   3858 
   3859       &lt;!-- mixedNeutral: gender(all male) = male, gender(all female) = female, otherwise gender(list) = other --&gt;
   3860       &lt;personList type="mixedNeutral" locales="el"/&gt;
   3861 
   3862       &lt;!-- maleTaints: gender(all female) = female, otherwise gender(list) = male --&gt;
   3863       &lt;personList type="maleTaints" locales="ar ca..."/&gt;
   3864     &lt;/gender&gt;
   3865   &lt;/supplementalData&gt;</pre>
   3866 
   3867 		<p>There are three ways the gender of a list can be formatted:</p>
   3868 
   3869 		<ol>
   3870 			<li><b>neutral:</b> A gender-independent "other" form will be
   3871 				used for the list.</li>
   3872 
			<li><b>mixedNeutral:</b> If the elements of the list are all
				male, the "male" form is used for the list. If all the elements of the
				list are female, the "female" form is used. If the list has a mix of
				male, female and neutral names, the "other" form is used.</li>

			<li><b>maleTaints:</b> If all the elements of the list are
				female, the "female" form is used; otherwise the "male" form is used.</li>
   3880 		</ol>
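		<p>The three behaviors can be expressed as a small function. The sketch below is illustrative only: the function name and the representation of each person's gender as one of the strings "male", "female" or "other" are assumptions, and it presumes a list of two or more persons as described above.</p>

```python
def list_gender(genders, list_type):
    """Return the gender form ("male", "female" or "other") for a list.

    `genders` holds one value per person; `list_type` is the personList
    type that applies to the locale.
    """
    genders = list(genders)
    if list_type == "neutral":
        # neutral: gender(list) = other
        return "other"

    all_male = all(g == "male" for g in genders)
    all_female = all(g == "female" for g in genders)

    if list_type == "mixedNeutral":
        # all male -> male, all female -> female, otherwise other
        if all_male:
            return "male"
        if all_female:
            return "female"
        return "other"

    if list_type == "maleTaints":
        # all female -> female, otherwise male
        return "female" if all_female else "male"

    raise ValueError("unknown personList type: %r" % (list_type,))
```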
   3881 
   3882 
   3883 		<h2>
   3884 			12 <a name="Context_Transform_Elements"
   3885 				href="#Context_Transform_Elements">ContextTransform Elements</a>
   3886 		</h2>
   3887 
   3888 
   3889 		<p class="dtd">
   3890 			&lt;!ELEMENT contextTransforms ( alias | (contextTransformUsage*,
   3891 			special*)) &gt;<br> &lt;!ELEMENT contextTransformUsage ( alias |
   3892 			(contextTransform*, special*)) &gt;<br> &lt;!ATTLIST
   3893 			contextTransformUsage type CDATA #REQUIRED &gt;<br> &lt;!ELEMENT
   3894 			contextTransform ( #PCDATA ) &gt;<br> &lt;!ATTLIST
   3895 			contextTransform type ( uiListOrMenu | stand-alone ) #REQUIRED &gt;
   3896 		</p>
   3897 
   3898 		<p>CLDR locale elements provide data for display names or symbols
   3899 			in many categories. The default capitalization for these elements is
   3900 			intended to be the form used in the middle of running text. In many
   3901 			languages, other capitalization may be required in other contexts,
   3902 			depending on the type of name or symbol.</p>
   3903 
   3904 		<p>
			Each &lt;contextTransformUsage&gt; element's type attribute specifies
   3906 			a category of data from the table below; the element includes one or
   3907 			more &lt;contextTransform&gt; elements that specify how to perform
   3908 			capitalization of this category of data in different contexts. The
   3909 			&lt;contextTransform&gt; elements are needed primarily for cases in
   3910 			which the capitalization is other than the default form used in the
   3911 			middle of running text. However, it is also useful to mark cases in
   3912 			which it is <em>known</em> that no transformation from this default
   3913 			form is needed; this may be necessary, for example, to override the
   3914 			transformation specified by a parent locale. The following values are
   3915 			currently defined for the &lt;contextTransform&gt; element:
   3916 		</p>
   3917 
   3918 		<ul>
   3919 			<li>"titlecase-firstword" designates the case in which raw CLDR
   3920 				text that is in middle-of-sentence form, typically lowercase, needs
   3921 				to have its first word titlecased.</li>
   3922 			<li>"no-change" designates the case in which it is known that no
   3923 				change from the raw CLDR text (middle-of-sentence form) is needed.</li>
   3924 		</ul>
   3925 
   3926 		<p>Four contexts for capitalization behavior are currently
   3927 			identified. Two need no data, and hence have no corresponding
   3928 			&lt;contextTransform&gt; elements:</p>
   3929 
   3930 		<ul>
   3931 			<li>In the middle of running text: This is the default form, so
   3932 				no additional data is required.</li>
			<li>At the beginning of a complete sentence: The initial word is
				titlecased; no additional data is required to indicate this.</li>
   3935 		</ul>
   3936 
   3937 		<p>Two other contexts require &lt;contextTransform&gt; elements if
   3938 			their capitalization behavior is other than the default for running
   3939 			text. The context is identified by the type attribute, as follows:</p>
   3940 
   3941 		<ul>
   3942 			<li>uiListOrMenu: Capitalization appropriate to a user-interface
   3943 				list or menu.</li>
			<li>stand-alone: Capitalization appropriate to an isolated
				user-interface element (e.g. an isolated name on a calendar page).</li>
   3946 		</ul>
   3947 
   3948 		<p>Example:</p>
   3949 
   3950 		<pre>    &lt;contextTransforms&gt;
   3951         &lt;contextTransformUsage type="languages"&gt;
   3952              &lt;contextTransform type="uiListOrMenu"&gt;titlecase-firstword&lt;/contextTransform&gt;
   3953              &lt;contextTransform type="stand-alone"&gt;titlecase-firstword&lt;/contextTransform&gt;
   3954         &lt;/contextTransformUsage&gt;
   3955         &lt;contextTransformUsage type="month-format-except-narrow"&gt;
   3956              &lt;contextTransform type="uiListOrMenu"&gt;titlecase-firstword&lt;/contextTransform&gt;
   3957         &lt;/contextTransformUsage&gt;
   3958         &lt;contextTransformUsage type="month-standalone-except-narrow"&gt;
   3959              &lt;contextTransform type="uiListOrMenu"&gt;titlecase-firstword&lt;/contextTransform&gt;
   3960         &lt;/contextTransformUsage&gt;
   3961     &lt;/contextTransforms&gt;</pre>
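		<p>One way an implementation might consume this data is sketched below. Everything except the attribute values is an assumption made for illustration: the function names, the dictionary keyed by (usage type, context type), the context labels "middle-of-sentence" and "beginning-of-sentence", and the rule that a specific usage type is consulted before "all". Titlecasing is reduced to the first character; a real implementation would use locale-sensitive titlecasing.</p>

```python
def titlecase_first_word(text):
    """Simplified titlecasing: titlecase only the first character."""
    return text[:1].title() + text[1:]


def apply_context_transform(text, usage, context, transforms):
    """Capitalize raw (middle-of-sentence) CLDR text for a context.

    `transforms` maps (usage type, context type) to a contextTransform
    value, "titlecase-firstword" or "no-change". This layout is hypothetical.
    """
    # Two contexts need no data.
    if context == "middle-of-sentence":
        return text
    if context == "beginning-of-sentence":
        return titlecase_first_word(text)

    # uiListOrMenu and stand-alone are data driven.
    value = transforms.get((usage, context))
    if value is None:
        # Assumption: fall back to the special "all" usage type.
        value = transforms.get(("all", context))
    if value == "titlecase-firstword":
        return titlecase_first_word(text)
    # "no-change" or no data: keep the default form.
    return text


# Data corresponding to the example above.
EXAMPLE_TRANSFORMS = {
    ("languages", "uiListOrMenu"): "titlecase-firstword",
    ("languages", "stand-alone"): "titlecase-firstword",
    ("month-format-except-narrow", "uiListOrMenu"): "titlecase-firstword",
    ("month-standalone-except-narrow", "uiListOrMenu"): "titlecase-firstword",
}
```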
   3962 
   3963 		<table cellspacing="0" cellpadding="2" border="1" class='simple'>
   3964 			<caption>
   3965 				<a name="contextTransformUsage_type_attribute_values"
   3966 					href="#contextTransformUsage_type_attribute_values">Element
   3967 					contextTransformUsage type attribute values</a>
   3968 			</caption>
   3969 			<tr>
   3970 				<th>type attribute value</th>
   3971 				<th>Description</th>
   3972 			</tr>
   3973 			<tr>
   3974 				<td>all</td>
   3975 				<td>Special value, indicates that the specified transformation
   3976 					applies to all of the categories below</td>
   3977 			</tr>
   3978 			<tr>
   3979 				<td>language</td>
   3980 				<td>localeDisplayNames language names</td>
   3981 			</tr>
   3982 			<tr>
   3983 				<td>script</td>
   3984 				<td>localeDisplayNames script names</td>
   3985 			</tr>
   3986 			<tr>
   3987 				<td>territory</td>
   3988 				<td>localeDisplayNames territory names</td>
   3989 			</tr>
   3990 			<tr>
   3991 				<td>variant</td>
   3992 				<td>localeDisplayNames variant names</td>
   3993 			</tr>
   3994 			<tr>
   3995 				<td>key</td>
   3996 				<td>localeDisplayNames key names</td>
   3997 			</tr>
   3998 			<tr>
   3999 				<td>keyValue</td>
   4000 				<td>localeDisplayNames key value type names</td>
   4001 			</tr>
   4002 			<tr>
   4003 				<td>month-format-except-narrow</td>
   4004 				<td>dates/calendars/calendar[type=*]/months format wide and
   4005 					abbreviated month names</td>
   4006 			</tr>
   4007 			<tr>
   4008 				<td>month-standalone-except-narrow</td>
   4009 				<td>dates/calendars/calendar[type=*]/months stand-alone wide
   4010 					and abbreviated month names</td>
   4011 			</tr>
   4012 			<tr>
   4013 				<td>month-narrow</td>
   4014 				<td>dates/calendars/calendar[type=*]/months format and
   4015 					stand-alone narrow month names</td>
   4016 			</tr>
   4017 			<tr>
   4018 				<td>day-format-except-narrow</td>
   4019 				<td>dates/calendars/calendar[type=*]/days format wide and
   4020 					abbreviated day names</td>
   4021 			</tr>
   4022 			<tr>
   4023 				<td>day-standalone-except-narrow</td>
   4024 				<td>dates/calendars/calendar[type=*]/days stand-alone wide and
   4025 					abbreviated day names</td>
   4026 			</tr>
   4027 			<tr>
   4028 				<td>day-narrow</td>
   4029 				<td>dates/calendars/calendar[type=*]/days format and
   4030 					stand-alone narrow day names</td>
   4031 			</tr>
   4032 			<tr>
   4033 				<td>era-name</td>
   4034 				<td>dates/calendars/calendar[type=*]/eras (wide) era names</td>
   4035 			</tr>
   4036 			<tr>
   4037 				<td>era-abbr</td>
   4038 				<td>dates/calendars/calendar[type=*]/eras abbreviated era names</td>
   4039 			</tr>
   4040 			<tr>
   4041 				<td>era-narrow</td>
   4042 				<td>dates/calendars/calendar[type=*]/eras narrow era names</td>
   4043 			</tr>
   4044 			<tr>
   4045 				<td>quarter-format-wide</td>
   4046 				<td>dates/calendars/calendar[type=*]/quarters format wide
   4047 					quarter names</td>
   4048 			</tr>
   4049 			<tr>
   4050 				<td>quarter-standalone-wide</td>
   4051 				<td>dates/calendars/calendar[type=*]/quarters stand-alone wide
   4052 					quarter names</td>
   4053 			</tr>
   4054 			<tr>
   4055 				<td>quarter-abbreviated</td>
   4056 				<td>dates/calendars/calendar[type=*]/quarters format and
   4057 					stand-alone abbreviated quarter names</td>
   4058 			</tr>
   4059 			<tr>
   4060 				<td>quarter-narrow</td>
   4061 				<td>dates/calendars/calendar[type=*]/quarters format and
   4062 					stand-alone narrow quarter names</td>
   4063 			</tr>
   4064 			<tr>
   4065 				<td>calendar-field</td>
				<td>dates/fields/field[type=*]/displayName field names<br>(for
					relative forms see type "relative" below)
   4068 				</td>
   4069 			</tr>
   4070 			<tr>
   4071 				<td>zone-exemplarCity</td>
   4072 				<td>dates/timeZoneNames/zone[type=*]/exemplarCity city names</td>
   4073 			</tr>
   4074 			<tr>
   4075 				<td>zone-long</td>
   4076 				<td>dates/timeZoneNames/zone[type=*]/long zone names</td>
   4077 			</tr>
   4078 			<tr>
   4079 				<td>zone-short</td>
   4080 				<td>dates/timeZoneNames/zone[type=*]/short zone names</td>
   4081 			</tr>
   4082 			<tr>
   4083 				<td>metazone-long</td>
   4084 				<td>dates/timeZoneNames/metazone[type=*]/long metazone names</td>
   4085 			</tr>
   4086 			<tr>
   4087 				<td>metazone-short</td>
   4088 				<td>dates/timeZoneNames/metazone[type=*]/short metazone names</td>
   4089 			</tr>
   4090 			<tr>
   4091 				<td>symbol</td>
   4092 				<td>numbers/currencies/currency[type=*]/symbol symbol names</td>
   4093 			</tr>
   4094 			<tr>
   4095 				<td>currencyName</td>
   4096 				<td>numbers/currencies/currency[type=*]/displayName currency
   4097 					names</td>
   4098 			</tr>
   4099 			<tr>
   4100 				<td>currencyName-count</td>
   4101 				<td>numbers/currencies/currency[type=*]/displayName[count=*]
   4102 					currency names for use with count</td>
   4103 			</tr>
   4104 			<tr>
   4105 				<td>relative</td>
   4106 				<td>dates/fields/field[type=*]/relative and
   4107 					dates/fields/field[type=*]/relativeTime relative field names</td>
   4108 			</tr>
   4109 			<tr>
   4110 				<td>unit-pattern</td>
   4111 				<td>units/unitLength[type=*]/unit[type=*]/unitPattern[count=*]
   4112 					unit names</td>
   4113 			</tr>
   4114 			<tr>
   4115 				<td>number-spellout</td>
   4116 				<td>rbnf/rulesetGrouping[type=*]/ruleset[type=*]/rbnfrule
   4117 					number spellout rules</td>
   4118 			</tr>
   4119 		</table>
   4120 
   4121 		<h2>
   4122 			13 <a name="Choice_Patterns" href="#Choice_Patterns">Choice
   4123 				Patterns</a>
   4124 		</h2>
   4125 
   4126 
   4127 		<p>A choice pattern is a string that chooses among a number of
   4128 			strings, based on numeric value. It has the following form:</p>
   4129 
		<p>
			&lt;choice_pattern&gt; = &lt;choice&gt; ( '|' &lt;choice&gt; )*<br>
			&lt;choice&gt; = &lt;number&gt;&lt;relation&gt;&lt;string&gt;<br>
			&lt;number&gt; = ('+' | '-')? ('∞' | [0-9]+ ('.' [0-9]+)?)<br>
			&lt;relation&gt; = '&lt;' | '≤'
		</p>
   4137 
   4138 		<p>The interpretation of a choice pattern is that given a number
   4139 			N, the pattern is scanned from right to left, for each choice
   4140 			evaluating &lt;number&gt; &lt;relation&gt; N. The first choice that
   4141 			matches results in the corresponding string. If no match is found,
   4142 			then the first string is used. For example:</p>
   4143 
   4144 		<table border="1" cellpadding="0" cellspacing="0">
   4145 			<tr>
   4146 				<td width="33%">Pattern</td>
   4147 				<td width="33%">N</td>
   4148 				<td width="34%">Result</td>
   4149 			</tr>
   4150 			<tr>
				<td width="33%" rowspan="4">0≤Rf|1≤Ru|1&lt;Re</td>
				<td width="33%">-∞, -3, -1, -0.000001</td>
				<td width="34%">Rf (defaulted to first string)</td>
			</tr>
			<tr>
				<td width="33%">0, 0.01, 0.9999</td>
				<td width="34%">Rf</td>
			</tr>
			<tr>
				<td width="33%">1</td>
				<td width="34%">Ru</td>
			</tr>
			<tr>
				<td width="33%">1.00001, 5, 99, ∞</td>
   4166 				<td width="34%">Re</td>
   4167 			</tr>
   4168 		</table>
   4169 		<p>Quoting is done using ' characters, as in date or number
   4170 			formats.</p>
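		<p>The right-to-left evaluation can be illustrated with a short Python sketch. It assumes that the two relation characters are '&lt;' and '≤' and that infinity is written '∞'; the function names are hypothetical, and quoting with ' characters is deliberately not handled, so a literal '|' inside a string would be split incorrectly.</p>

```python
import re

# <number><relation><string>, with <number> optionally signed or infinite.
_CHOICE = re.compile(r"([+-]?)(∞|[0-9]+(?:\.[0-9]+)?)([<≤])(.*)", re.DOTALL)


def parse_choice_pattern(pattern):
    """Split a choice pattern into (number, relation, string) triples."""
    choices = []
    for part in pattern.split("|"):  # simplification: no quote handling
        match = _CHOICE.fullmatch(part)
        if match is None:
            raise ValueError("malformed choice: %r" % (part,))
        sign, digits, relation, text = match.groups()
        number = float("inf") if digits == "∞" else float(digits)
        if sign == "-":
            number = -number
        choices.append((number, relation, text))
    return choices


def select_choice(pattern, n):
    """Return the string chosen for the number `n`."""
    choices = parse_choice_pattern(pattern)
    # Scan from right to left, evaluating <number> <relation> N.
    for number, relation, text in reversed(choices):
        if number < n or (relation == "≤" and number == n):
            return text
    # No match: default to the first string.
    return choices[0][2]
```

		<p>For the pattern in the table above, <code>select_choice("0≤Rf|1≤Ru|1&lt;Re", 1)</code> first tests 1 &lt; 1, which fails, and then 1 ≤ 1, which matches, so the second string is chosen.</p>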
   4171 		<h2>
   4172 			14 <a name="Annotations" href="#Annotations">Annotations and Labels</a>
   4173 		</h2>
   4174 		<p>Annotations provide information about characters, typically
   4175 			used in input. For example, on a mobile keyboard they can be used to
   4176 			do completion. They are typically used for symbols, especially emoji
   4177 			characters.  </p>
		<p>For more information, see version 5.0 of <a href="http://unicode.org/reports/tr51/">UTR #51, Unicode Emoji</a>. (Note that during the period between the publication of CLDR v31 and that of Emoji 5.0, the Latest Proposed Update link should be used to get to the draft specification for Emoji 5.0.)
   4179 		</p>
   4180 
   4181 		<p class="dtd">&lt;!ELEMENT annotations ( annotation* ) &gt;</p>
   4182 		<p class="dtd">&lt;!ELEMENT annotation ( #PCDATA ) &gt;</p>
   4183 		<p class="dtd">&lt;!ATTLIST annotation cp CDATA #REQUIRED &gt;</p>
   4184 		<p class="dtd">&lt;!ATTLIST annotation type (tts) #IMPLIED &gt;</p>
   4185 
   4186 		<p>There are two kinds of annotations: <strong>short names</strong>, and <strong>keywords</strong>.</p>
   4187       <p>With an attribute <strong>type="tts"</strong>, the value is  a <strong>short name</strong>, such as one that can be used for text-to-speech. It should be treated as one of the element values for other
   4188           purposes.</p>
   4189         <p>When there is no<strong> type </strong>attribute, the value is a set of <strong>keywords</strong>, delimited by |. Spaces around each element are to be trimmed. The <strong>keywords</strong> are  words associated with the character(s) that might be used in searching for the character, or in predictive typing on keyboards. The short name itself can be used as a keyword.</p>
   4190         <p>Here is an example from German:</p>
   4191 
   4192 		<pre class="example">
&lt;annotation cp="👎"&gt;schlecht | Hand | Daumen | nach unten&lt;/annotation&gt;
&lt;annotation cp="👎" type="tts"&gt;Daumen runter&lt;/annotation&gt;
   4195 </pre>
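		<p>A reader for this structure might look like the following sketch. The function name and the shape of the returned dictionary are assumptions for illustration, and the <strong>cp</strong> value is kept as an opaque string, so a UnicodeSet in [] is not expanded here.</p>

```python
import xml.etree.ElementTree as ET


def read_annotations(xml_text):
    """Collect keywords and short names per cp value.

    Returns {cp: {"keywords": [...], "tts": str or None}}; this layout
    is hypothetical.
    """
    result = {}
    root = ET.fromstring(xml_text)
    for node in root.iter("annotation"):
        entry = result.setdefault(node.get("cp"), {"keywords": [], "tts": None})
        value = node.text or ""
        if node.get("type") == "tts":
            # type="tts": the value is the short name.
            entry["tts"] = value.strip()
        else:
            # No type: keywords delimited by |, with surrounding spaces trimmed.
            entry["keywords"] = [k.strip() for k in value.split("|") if k.strip()]
    return result
```

		<p>Because the short name can itself serve as a keyword, a search index built from this data could add the "tts" value to the keyword list for each entry.</p>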
   4196 
   4197 		<p>The cp attribute value has two formats: either a single string, or if contained within [] a UnicodeSet. The latter format can contain 
			multiple code points or strings. A code point or string can occur in multiple annotation
   4199 			element <strong>cp</strong> values, such as the following, which also contains the
   4200 			&quot;thumbs down&quot; character.</p>
   4201 		<pre class="example"><span >&lt;annotation cp='[---]'&gt;hand&lt;/annotation&gt;</span></pre>
   4202 		<p>Both for short names and keywords, values do not have to match between different languages. They should be the most common values that people using <em>that</em> language
			would associate with those characters. For example, a &quot;black heart&quot; might
   4204 			have the association of &quot;wicked&quot; in English, but not in some other languages.</p>
   4205 		<p>The cp value may contain sequences, but does not contain any Emoji or Text
   4206   		Variant (VS15 &amp; VS16) characters. All such characters should be removed before looking up any short names and keywords.</p>
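		<p>Removing the variation selectors before a lookup is a one-line filter. In this sketch the function name is hypothetical; VS15 and VS16 are U+FE0E and U+FE0F.</p>

```python
# VS15 (text presentation) and VS16 (emoji presentation).
VARIATION_SELECTORS = {"\uFE0E", "\uFE0F"}


def lookup_key(sequence):
    """Strip VS15 and VS16 so the result can be matched against cp values."""
    return "".join(ch for ch in sequence if ch not in VARIATION_SELECTORS)
```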
   4207 		<h3>
   4208 			14.1 <a name="SynthesizingNames" href="#SynthesizingNames">Synthesizing Sequence Names</a>
   4209 		</h3>
   4210 		<p>Many emoji are represented by sequences of characters. When there are no annotation
   4211 			elements for that string, the short name can be synthesized as follows.
   4212 			<strong>Note:</strong> The process details may change after the release of this
   4213 			specification, and may further change in the future if other sequences are added.
   4214 			Please see the <a href='https://sites.google.com/site/cldr/index/downloads/cldr-30#TOC-Known-Issues'>Known
   4215 			Issues</a> section of the CLDR download page for any updates.</p>
   4216 		<ol>
   4217 		  <li>If  <strong>sequence</strong> is an <strong>emoji flag sequence</strong>, look up the territory name in CLDR for the
   4218 		  		corresponding ASCII characters and return as the short name. For example, the regional
		  		indicator symbols P+F would map to Französisch-Polynesien in German.</li>
   4220 		  <li>If <strong>sequence</strong> is an <strong>emoji tag sequence</strong>, look up the subdivision name in CLDR for the
   4221 		  		corresponding ASCII characters and return as the short name. For example, the TAG characters gbsct would map to Schottland in German.</li>
		  <li>If  <strong>sequence</strong> is a keycap sequence or 🔟, use the characterLabel for &quot;keycap&quot;
		  		as the <strong>prefixName</strong> and  set the <strong>suffix</strong> to be the sequence (or &quot;10&quot; in the case of 🔟), then go to step 8.</li>
   4224 		  <li>Let<strong> suffix</strong> and <strong>prefixName</strong> be &quot;&quot;.</li>
   4225 		  <li>If  <strong>sequence</strong> contains any emoji modifiers, move them (in order) into <strong>suffix</strong>, removing them from  <strong>sequence</strong>.		  </li>
		  <li>If  <strong>sequence</strong> is a &quot;KISS&quot;, &quot;HEART&quot; or &quot;FAMILY&quot; emoji
		  		ZWJ sequence, move the characters in  <strong>sequence</strong> to the front of <strong>suffix</strong>, and set the <strong>sequence</strong> to be  &quot;💏&quot;, &quot;💑&quot;, or &quot;👪&quot;
		  		respectively, and go to step 7.
   4229 		        <ol>
		      <li>A KISS sequence contains ZWJ, &quot;❤&quot;,  and &quot;💋&quot;, which are skipped in moving to <strong>suffix</strong>.</li>
		      <li>A HEART sequence contains ZWJ and &quot;❤&quot;, which are skipped in moving to <strong>suffix</strong>.</li>
		      <li>A FAMILY sequence contains only characters from the set {👦, 👧, 👨, 👩, 👴, 👵, 👶}.
		      		Nothing is skipped in  moving to <strong>suffix</strong>, except ZWJ.</li>
   4234 	        </ol>
   4235 		  </li>
		  <li>If   <strong>sequence</strong> ends with ♂ or ♀, and does not have a name, remove the ♂ or ♀ and move the name for &quot;👨&quot; or
	      &quot;👩&quot; respectively to the start of<strong> prefixName</strong>.</li>
   4238 		  <li>Transform   <strong>sequence</strong> and append to <strong>prefixName</strong>, by successively getting  names for the longest subsequences, skipping any singleton ZWJ characters. If there is more than one name,  use the listPattern for unit-short, type=2 to link them.</li>
   4239 		  <li>Transform <strong>suffix</strong> into <strong>suffixName</strong> in the same manner.</li>
   4240 		  <li>If both the <strong>prefixName</strong> and <strong>suffixName</strong> are non-empty, form the name by joining them with the  &quot;category-list&quot; characterLabelPattern and return it. Otherwise return whichever of them is non-empty.</li>
   4241 	    </ol>
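		<p>Steps 1 and 2 depend on recovering the ASCII characters behind a flag or tag sequence. The sketch below shows one way to do that, assuming a flag sequence is a pair of regional indicator symbols (U+1F1E6 to U+1F1FF) and a subdivision tag sequence is U+1F3F4 followed by TAG characters and U+E007F CANCEL TAG. The function names are hypothetical, and the lookup of the territory or subdivision name from the returned code is left to the caller.</p>

```python
REGIONAL_A = 0x1F1E6                 # REGIONAL INDICATOR SYMBOL LETTER A
REGIONAL_Z = 0x1F1FF                 # REGIONAL INDICATOR SYMBOL LETTER Z
BLACK_FLAG = 0x1F3F4                 # base character of subdivision flags
TAG_OFFSET = 0xE0000                 # TAG characters mirror ASCII at this offset
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E
CANCEL_TAG = 0xE007F


def flag_region_code(sequence):
    """Return the ASCII region code for a flag sequence, else None."""
    cps = [ord(ch) for ch in sequence]
    if len(cps) == 2 and all(REGIONAL_A <= cp <= REGIONAL_Z for cp in cps):
        return "".join(chr(cp - REGIONAL_A + ord("A")) for cp in cps)
    return None


def tag_subdivision_code(sequence):
    """Return the ASCII subdivision code for a tag sequence, else None."""
    cps = [ord(ch) for ch in sequence]
    if len(cps) < 3 or cps[0] != BLACK_FLAG or cps[-1] != CANCEL_TAG:
        return None
    tags = cps[1:-1]
    if not all(TAG_FIRST <= cp <= TAG_LAST for cp in tags):
        return None
    return "".join(chr(cp - TAG_OFFSET) for cp in tags)
```

		<p>For the examples in steps 1 and 2, these functions yield "PF" and "gbsct", which can then be used as keys into the territory and subdivision names of the target locale.</p>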
   4242 		<p>The synthesized keywords can follow a similar process.</p>
   4243 		<ol>
   4244 		  <li>For an <strong>emoji flag sequence</strong> or <strong>emoji tag sequence</strong> representing a subdivision, use &quot;flag&quot;.</li>
   4245 		  <li>For keycap sequences, use &quot;keycap&quot;.</li>
   4246 		  <li>For other sequences, add the keywords for the subsequences used to get the short names for <strong>prefixName</strong>, and the short names used for <strong>suffixName</strong>.</li>
   4247 	    </ol>
   4248 		<p>Some examples for   English data (v30) are given in the following table.</p>
   4249 	  <table cellspacing="0" cellpadding="2" border="1">
   4250         <caption>Synthesized Emoji Sequence Names</caption>
   4251 		  <tbody>
   4252 		    <tr>
   4253 		      <th>Sequence</th>
   4254 		      <th>Short Name</th>
   4255 		      <th>Keywords</th>
   4256 	        </tr>
   4257 		    <tr>
   4258 		      <td></td>
   4259 		      <td>European Union</td>
   4260 		      <td>flag</td>
   4261 	        </tr>
   4262 		    <tr>
   4263 		      <td>#</td>
   4264 		      <td>keycap: #</td>
   4265 		      <td>keycap</td>
   4266 	        </tr>
   4267 		    <tr>
   4268 		      <td>9</td>
   4269 		      <td>keycap: 9</td>
   4270 		      <td>keycap</td>
   4271 	        </tr>
   4272 		    <tr>
   4273 		      <td></td>
   4274 		      <td>kiss</td>
   4275 		      <td>couple</td>
   4276 	        </tr>
   4277 		    <tr>
   4278 		      <td></td>
   4279 		      <td>kiss: woman, woman</td>
   4280 		      <td>couple, woman</td>
   4281 	        </tr>
   4282 		    <tr>
   4283 		      <td></td>
   4284 		      <td>couple with heart</td>
   4285 		      <td>love, couple</td>
   4286 	        </tr>
   4287 		    <tr>
   4288 		      <td></td>
   4289 		      <td>couple with heart: woman, woman</td>
   4290 		      <td>love, couple, woman</td>
   4291 	        </tr>
   4292 		    <tr>
   4293 		      <td></td>
   4294 		      <td>family</td>
   4295 		      <td>family</td>
   4296 	        </tr>
   4297 		    <tr>
   4298 		      <td></td>
   4299 		      <td>family: woman, woman, girl</td>
   4300 		      <td>woman, family, girl</td>
   4301 	        </tr>
   4302 		    <tr>
   4303 		      <td></td>
   4304 		      <td>boy: light skin tone</td>
   4305 		      <td>young, light skin tone, boy</td>
   4306 	        </tr>
   4307 		    <tr>
   4308 		      <td></td>
   4309 		      <td>woman: dark skin tone</td>
   4310 		      <td>woman, dark skin tone</td>
   4311 	        </tr>
   4312 		    <tr>
   4313 		      <td></td>
   4314 		      <td>man judge</td>
   4315 		      <td>scales, justice, man</td>
   4316 	        </tr>
   4317 		    <tr>
   4318 		      <td></td>
   4319 		      <td>man judge: dark skin tone</td>
   4320 		      <td>scales, justice, dark skin tone, man</td>
   4321 	        </tr>
   4322 		    <tr>
   4323 		      <td></td>
   4324 		      <td>woman judge</td>
   4325 		      <td>woman, scales, judge</td>
   4326 	        </tr>
   4327 		    <tr>
   4328 		      <td></td>
   4329 		      <td>woman judge: medium-light skin tone</td>
   4330 		      <td>woman, scales, medium-light skin tone, judge</td>
   4331 	        </tr>
   4332 		    <tr>
   4333 		      <td></td>
   4334 		      <td>police officer</td>
   4335 		      <td>police, cop, officer</td>
   4336 	        </tr>
   4337 		    <tr>
   4338 		      <td></td>
   4339 		      <td>police officer: dark skin tone</td>
   4340 		      <td>police, cop, officer, dark skin tone</td>
   4341 	        </tr>
   4342 		    <tr>
   4343 		      <td></td>
   4344 		      <td>man police officer</td>
   4345 		      <td>police, cop, officer, man</td>
   4346 	        </tr>
   4347 		    <tr>
   4348 		      <td></td>
   4349 		      <td>man police officer: medium-light skin tone</td>
   4350 		      <td>police, cop, officer, medium-light skin tone, man</td>
   4351 	        </tr>
   4352 		    <tr>
   4353 		      <td></td>
   4354 		      <td>woman police officer</td>
   4355 		      <td>police, woman, cop, officer</td>
   4356 	        </tr>
   4357 		    <tr>
   4358 		      <td></td>
   4359 		      <td>woman police officer: dark skin tone</td>
   4360 		      <td>police, woman, cop, officer, dark skin tone</td>
   4361 	        </tr>
   4362 		    <tr>
   4363 		      <td></td>
   4364 		      <td>person biking</td>
   4365 		      <td>cyclist, bicycle, biking</td>
   4366 	        </tr>
   4367 		    <tr>
   4368 		      <td></td>
   4369 		      <td>person biking: dark skin tone</td>
   4370 		      <td>cyclist, bicycle, biking, dark skin tone</td>
   4371 	        </tr>
   4372 		    <tr>
   4373 		      <td></td>
   4374 		      <td>man biking</td>
   4375 		      <td>cyclist, bicycle, biking, man</td>
   4376 	        </tr>
   4377 		    <tr>
   4378 		      <td></td>
   4379 		      <td>man biking: dark skin tone</td>
   4380 		      <td>cyclist, bicycle, biking, dark skin tone, man</td>
   4381 	        </tr>
   4382 		    <tr>
   4383 		      <td></td>
   4384 		      <td>woman biking</td>
   4385 		      <td>cyclist, woman, bicycle, biking</td>
   4386 	        </tr>
   4387 		    <tr>
   4388 		      <td></td>
   4389 		      <td>woman biking: dark skin tone</td>
   4390 		      <td>cyclist, woman, bicycle, biking, dark skin tone</td>
   4391 	        </tr>
   4392 	      </tbody>
   4393 	  </table>
   4394 
   4395 
   4396 	  <p>
   4397 			For more information, see <a href='http://unicode.org/reports/tr51'>Unicode
   4398 				Emoji</a>.
   4399 		</p>
   4400 	  		<h3>
   4401 			14.2 <a name="Character_Labels" href="#Character_Labels">Annotations Character Labels</a>
   4402 		</h3>
   4403 	  		<p class="dtd">&lt;!ELEMENT characterLabels ( alias | ( characterLabelPattern*, characterLabel*, special* ) ) &gt; </p>
   4404 	  		<p class="dtd">&lt;!ELEMENT characterLabelPattern ( #PCDATA ) &gt; </p>
   4405 	  		<p class="dtd">&lt;!ATTLIST characterLabelPattern type NMTOKEN #REQUIRED &gt;</p>
	  		<p class="dtd">&lt;!ATTLIST characterLabelPattern count (0 | 1 | zero | one | two | few | many | other) #IMPLIED &gt;     &lt;!-- count only used for certain patterns --&gt;</p>
   4407 	  		<p class="dtd">&lt;!ELEMENT characterLabel ( #PCDATA ) &gt; </p>
   4408 	  		<p class="dtd">&lt;!ATTLIST characterLabel type NMTOKEN #REQUIRED &gt;</p>
   4409             <p>The character labels can be used for categories or groups of characters in a character picker or keyboard palette. They have the above structure. Items with special meanings are explained below. Many of the categories are based on terms used in Unicode. Consult the <a href='http://www.unicode.org/glossary/'>Unicode Glossary</a> where the meaning is not clear.</p>
   4410 <p>The following are special patterns used in composing labels.</p>
   4411 <table>
   4412 <caption>characterLabelPattern</caption>
   4413 <tr>
   4414   <th>Type</th>
   4415   <th>English</th>
   4416   <th>Description of the group specified.</th>
   4417 </tr>
   4418 <tr><th>all</th><td>{0}  all</td>
   4419 <td>Used where the title {0} is just a subset. For example, {0} might be &quot;Latin&quot;, and contain the most common Latin characters. Then &quot;Latin  all&quot; would be all of them.</td></tr>
   4420 <tr><th>category-list</th><td>{0}: {1}</td>
   4421 <td>Use for a name, where {0} is the main item like &quot;Family&quot;, and {1} is a list of one or more components or subcategories. The list is formatted using a list pattern.</td></tr>
   4422 <tr><th>compatibility</th><td>{0}  compatibility</td>
   4423 <td>For grouping Unicode compatibility characters separately, such as &quot;Arabic  compatibility&quot;.</td></tr>
   4424 <tr><th>enclosed</th><td>{0}  enclosed</td>
   4425 <td>For indicating enclosed forms, such as &quot;digits  enclosed&quot;</td></tr>
   4426 <tr><th>extended</th><td>{0}  extended</td>
   4427 <td>For indicating a group of &quot;extended&quot; characters (special use, technical, etc.)</td></tr>
   4428 <tr><th>historic</th><td>{0}  historic</td>
   4429   <td>For indicating a group of &quot;historic&quot; characters (no longer in common use).</td></tr>
   4430 <tr><th>miscellaneous</th><td>{0}  miscellaneous</td>
   4431   <td>For indicating a group of &quot;miscellaneous&quot; characters (typically that don't fall into a broader class).</td></tr>
   4432 <tr><th>other</th><td>{0}  other</td>
   4433   <td>Used where the title {0} is just a subset. For example, {0} might be &quot;Latin&quot;, and contain the most common Latin characters. Then &quot;Latin  other&quot; would be the rest of them.</td></tr>
   4434 <tr><th>scripts</th><td>scripts  {0}</td>
   4435 <td>For indicating a group of &quot;scripts&quot; characters matching {0}. The value for {0} may be a geographic indicator, like &quot;Africa&quot; (although there are specific combinations listed below), or some other designation, like &quot;other&quot; (from below).</td></tr>
   4436 <tr>
   4437   <th>strokes</th><td>{0} strokes</td>
   4438   <td>Used as an index title for CJK characters. It takes a &quot;count&quot; value, which allows the right plural form to be specified for the language.</td></tr>
   4439 </table>
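<p>The patterns above can be applied with ordinary positional substitution. The following is a minimal sketch, not a reference implementation: the <code>PATTERNS</code> and <code>STROKES</code> tables, the helper names, and the comma-joined list formatter are all hypothetical, and the dash patterns are assumed to use U+2014 as their separator. A real implementation would load the patterns, list patterns, and plural rules from CLDR data.</p>

```python
# Hypothetical in-memory copy of a few English characterLabelPattern values.
# The separator in the dash patterns is assumed to be U+2014.
PATTERNS = {
    "all": "{0} \u2014 all",
    "category-list": "{0}: {1}",
    "other": "{0} \u2014 other",
    "scripts": "scripts \u2014 {0}",
}

# Plural forms for the "strokes" pattern, keyed by plural category.
# These English strings are assumed for illustration.
STROKES = {"one": "{0} stroke", "other": "{0} strokes"}


def format_label(pattern_type, *args):
    """Substitute positional arguments into the pattern of the given type."""
    return PATTERNS[pattern_type].format(*args)


def format_list(items):
    """Simplified stand-in for a CLDR list pattern: joins with commas."""
    return ", ".join(items)


def format_strokes(count):
    """Pick a plural form using the English rule (1 is "one", else "other")."""
    category = "one" if count == 1 else "other"
    return STROKES[category].format(count)


if __name__ == "__main__":
    print(format_label("category-list", "Family", format_list(["Man", "Woman", "Girl"])))
    print(format_label("other", "Latin"))
    print(format_strokes(12))
```

<p>Keeping the plural choice separate from the substitution mirrors the &quot;count&quot; value on the strokes pattern: the caller selects the form for the number first, then fills in {0}.</p>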
   4440 <p>The following are character labels. Where the meaning of the label is fairly clear (like "animal") or is in the Unicode glossary, it is omitted.</p>
   4441 <table>
   4442 <caption>characterLabel</caption>
   4443 <tr><th>activities</th><td>activity</td>
   4444 <td>Human activities, such as running.</td></tr>
   4445 <tr><th>african_scripts</th><td>African script</td>
   4446 <td>Scripts associated with the continent of Africa.</td></tr>
   4447 <tr><th>american_scripts</th><td>American script</td>
   4448 <td>Scripts associated with the continents of North and South America.</td></tr>
   4449 <tr><th>animals_nature</th><td>animal or nature</td>
   4450   <td>A broad category used for animals and other things found in nature.</td></tr>
   4451 <tr><th>arrows</th><td>arrow</td>
   4452 <td>Arrow symbols</td></tr>
   4453 <tr><th>body</th><td>body</td>
   4454 <td>Symbols for body parts, such as an arm.</td></tr>
   4455 <tr><th>box_drawing</th><td>box drawing</td>
   4456 <td>Unicode box-drawing characters (geometric shapes)</td></tr>
   4457 <tr><th>bullets_stars</th><td>bullet or star</td>
   4458 <td>Unicode bullets (such as &#x2022; or &#x2023; or &#x2043;) or stars (such as &#x2605; or &#x2606;).</td></tr>
   4459 <tr><th>consonantal_jamo</th><td>consonantal jamo</td>
   4460   <td>Korean Jamo consonants.</td></tr>
   4461 <tr><th>currency_symbols</th><td>currency symbol</td>
   4462   <td>Symbols such as $, &yen;, &pound;.</td></tr>
   4463 <tr><th>dash_connector</th><td>dash or connector</td>
   4464   <td>Characters like _ or &ndash;.</td></tr>
   4465 <tr><th>dingbats</th><td>dingbat</td>
   4466 <td>Font dingbat characters, such as &#x2702; or &#x2708;.</td></tr>
   4467 <tr><th>downwards_upwards_arrows</th><td>downwards upwards arrow</td>
   4468   <td>Paired arrows that point both down and up, such as &#x21C5; or &#x21F5;.</td></tr>
   4469 <tr><th>female</th><td>female</td>
   4470 <td>Indicates that a character is female or feminine in appearance.</td></tr>
   4471 <tr><th>format</th><td>format</td>
   4472 <td>A Unicode format character.</td></tr>
   4473 <tr><th>format_whitespace</th><td>format &amp; whitespace</td>
   4474   <td>A Unicode format character or whitespace.</td></tr>
   4475 <tr><th>full_width_form_variant</th><td>full-width variant</td>
   4476   <td>Full width variant, such as a wide A.</td></tr>
   4477 <tr><th>half_width_form_variant</th><td>half-width variant</td>
   4478 <td>Narrow width variant, such as a half-width katakana character.</td></tr>
   4479 <tr><th>han_characters</th><td>Han character</td>
   4480   <td>Han (aka CJK: Chinese, Japanese, or Korean) ideograph</td></tr>
   4481 <tr><th>han_radicals</th><td>Han radical</td>
   4482   <td>Radical (component) used in Han characters.</td></tr>
   4483 <tr><th>hanja</th><td>hanja</td>
   4484   <td>Korean name for Han character.</td></tr>
   4485 <tr><th>hanzi_simplified</th><td>Hanzi (simplified)</td>
   4486   <td>Simplified Chinese ideograph</td></tr>
   4487 <tr><th>hanzi_traditional</th><td>Hanzi (traditional)</td>
   4488   <td>Traditional Chinese ideograph</td></tr>
   4489 <tr><th>historic_scripts</th><td>historic script</td>
   4490   <td>Script no longer in common modern usage, such as Runes or Hieroglyphs.</td></tr>
   4491 <tr><th>ideographic_desc_characters</th><td>ideographic desc. character</td>
   4492   <td>Special Unicode characters (see the glossary).</td></tr>
   4493 <tr><th>kanji</th><td>kanji</td>
   4494   <td>Japanese Han ideograph</td></tr>
   4495 <tr><th>keycap</th><td>keycap</td>
   4496   <td>A key on a computer keyboard or phone. For example, the &quot;3&quot; key on a phone or laptop would be &quot;keycap: 3&quot;</td></tr>
   4497 <tr><th>limited_use</th><td>limited-use</td>
   4498 <td>Not in common modern use.</td></tr>
   4499 <tr><th>male</th><td>male</td>
   4500   <td>Indicates that a character is male or masculine in appearance.</td></tr>
   4501 <tr><th>modifier</th><td>modifier</td>
   4502 <td>A Unicode modifier letter or symbol.</td></tr>
   4503 <tr><th>nonspacing</th><td>nonspacing</td>
   4504   <td>Used for characters that occupy no width by themselves, such as the &uml; over the a in &auml;.</td></tr>
   4505 </table>
   4506 		  		<h3>
   4507 			14.3 <a name="Typographic_Names" href="#Typographic_Names">Typographic Names</a>
   4508 		</h3>
   4509 
   4510 		<p class='dtd'>&lt;!ELEMENT typographicNames ( alias | ( axisName*, styleName*, featureName*, special* ) ) &gt;</p>
   4511 		<p class='dtd'>&lt;!ELEMENT axisName ( #PCDATA ) &gt;<br>
   4512 		  &lt;!ATTLIST axisName type (ital | opsz | slnt | wdth | wght) #REQUIRED &gt;<br>
   4513 	  &lt;!ATTLIST axisName alt NMTOKENS #IMPLIED &gt;</p>
   4514 		<p class='dtd'>&lt;!ELEMENT styleName ( #PCDATA ) &gt;<br>
   4515 		  &lt;!ATTLIST styleName type (ital | opsz | slnt | wdth | wght) #REQUIRED &gt;<br>
   4516 		  &lt;!ATTLIST styleName subtype NMTOKEN #REQUIRED &gt;<br>
   4517 	  &lt;!ATTLIST styleName alt NMTOKENS #IMPLIED &gt;</p>
   4518 		<p class='dtd'>&lt;!ELEMENT featureName ( #PCDATA ) &gt;<br>
   4519 		  &lt;!ATTLIST featureName type (afrc | cpsp | dlig | frac | lnum | onum | ordn | pnum | smcp | tnum | zero) #REQUIRED &gt;<br>
   4520 	  &lt;!ATTLIST featureName alt NMTOKENS #IMPLIED &gt;</p>
   4521 		<p>Typographic names provide localized names of font features for use in a UI. This is useful for apps that show the names of font styles and design axes in the user&rsquo;s language. It would also be useful for system-level libraries.</p>
   4522 		<p>The identifiers (types) use the tags from the OpenType Feature Tag Registry. Given their large number, only the names of frequently used OpenType features are available in CLDR. (Many features are not user-visible settings, but instead serve as a data channel for software to pass information to the font.)
   4523 		The example below shows one approach to using the CLDR data. Applications are free to implement their own algorithms depending on their specific needs.</p>
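<p>As an illustration of the data shape described by the DTD above, the following sketch parses a small <code>typographicNames</code> fragment into lookup tables. The fragment itself, including its English strings, subtype values, and alt values, is assumed for illustration and is not quoted from CLDR; the function name <code>load_typographic_names</code> is likewise hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Illustrative fragment shaped like the DTD above. The strings, subtypes, and
# alt values are assumptions for this sketch, not quoted from CLDR.
SAMPLE = """
<typographicNames>
  <axisName type="wdth">width</axisName>
  <axisName type="wght">weight</axisName>
  <styleName type="wdth" subtype="75">condensed</styleName>
  <styleName type="wght" subtype="200">extra-light</styleName>
  <styleName type="wght" subtype="200" alt="variant">extralight</styleName>
  <featureName type="smcp">small capitals</featureName>
</typographicNames>
"""


def load_typographic_names(xml_text):
    """Return (axis_names, style_names, feature_names) lookup dictionaries."""
    root = ET.fromstring(xml_text.strip())
    axis = {e.get("type"): e.text for e in root.findall("axisName")}
    # Key style names by (type, subtype, alt); alt is "" for the default form.
    style = {
        (e.get("type"), e.get("subtype"), e.get("alt", "")): e.text
        for e in root.findall("styleName")
    }
    feature = {e.get("type"): e.text for e in root.findall("featureName")}
    return axis, style, feature


if __name__ == "__main__":
    axis, style, feature = load_typographic_names(SAMPLE)
    print(axis["wght"], "|", style[("wght", "200", "")], "|", feature["smcp"])
```

<p>Keying style names by (type, subtype, alt) keeps the default form and its alternates side by side, which is convenient when a lookup needs to fall back from an alternate to the default.</p>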
   4524 <p>To find a localized subfamily name such as &ldquo;Extraleicht Schmal&rdquo; for a font called &ldquo;Extralight Condensed&rdquo;, a system or application library might do the following: </p>
   4525         <ol>
   4526           <li>
   4527             <p>Determine the set of languages in which the subfamily name can potentially be returned. This is the union of the languages for which the font contains &lsquo;name&rsquo; table entries with ID 2 or 17, plus the languages for which CLDR supplies typographic names. </p>
   4528           </li>
   4529           <li>
   4530             <p>Use a language matching algorithm such as in ICU to find the best available language given the user preferences. The resulting subfamily name will be localized to this language. </p>
   4531           </li>
   4532           <li>
   4533             <p>If the font&rsquo;s &lsquo;name&rsquo; table contains a typographic subfamily name (&lsquo;name&rsquo; ID 17) in this language and all font variation axes are set to their defaults, return this name. </p>
   4534           </li>
   4535           <li>
   4536             <p>If the font&rsquo;s &lsquo;name&rsquo; table contains a font subfamily name (&lsquo;name&rsquo; ID 2) in this language and all font variation axes are set to their defaults, return this name. </p>
   4537           </li>
   4538           <li>
   4539             <p>If the font has a style attributes (STAT) table, look up the design axis tags and their ordering. If the font has no STAT table, assume [Width, Weight, Slant] as the axis ordering, and infer the font&rsquo;s style attributes from other available data in the font (e.g., the OS/2 table). </p>
   4540           </li>
   4541           <li>For each design axis, find a localized style name for its value.
   4542              <ol>
   4543             <li>If the font&rsquo;s style attributes point to a &lsquo;name&rsquo; table entry that is available in the result language, use this name.</li>
   4544             <li>Otherwise, generate a fallback name from CLDR styleName data. 
   4545                <ol>
   4546                 <li>The type key is the OpenType axis tag (for example, &lsquo;wght&rsquo;). The subtype and alt keys are taken from the entry in English CLDR whose string is equal to the English name in the font. For example, when the font uses a weight whose English style name is &ldquo;Extralight&rdquo;, this will lead to subtype = &ldquo;200&rdquo; and alt = &ldquo;variant&rdquo;. If there is no match, take the axis value (&ldquo;200&rdquo;) for subtype and the empty string for alt. </li>
   4547               <li>Look up (type, subtype) in a data table derived from CLDR&rsquo;s style names. If CLDR supplies multiple alternate names for this (type, subtype), use the one whose &ldquo;alt&rdquo; key matches; otherwise, use the default alternate (which has no &ldquo;alt&rdquo; attribute in CLDR).</li>
   4548             </ol>
   4549           </li>
   4550         </ol>
   4551         </li>
   4552         <li>Concatenate the strings, with a separator between them.</li>
   4553         </ol>
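<p>The CLDR fallback in the last two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the <code>EN_STYLES</code> and <code>DE_STYLES</code> tables, their strings, and all function names are hypothetical, the font is represented only as a list of (axis tag, axis value, English style name) tuples in its axis order, and the earlier steps (language matching and &lsquo;name&rsquo; table lookups) are omitted.</p>

```python
# Hypothetical tables derived from CLDR styleName data, keyed by
# (type, subtype, alt). The strings are assumed for illustration only.
EN_STYLES = {
    ("wght", "200", ""): "extra-light",
    ("wght", "200", "variant"): "extralight",
    ("wdth", "75", ""): "condensed",
}
DE_STYLES = {
    ("wght", "200", ""): "Extraleicht",
    ("wdth", "75", ""): "Schmal",
}


def find_keys(axis_tag, axis_value, english_name):
    """Derive (subtype, alt) from the English entry whose string equals the
    font's English style name; otherwise use the axis value and an empty alt."""
    for (typ, subtype, alt), text in EN_STYLES.items():
        if typ == axis_tag and text.casefold() == english_name.casefold():
            return subtype, alt
    # "g" formatting renders both 200 and 200.0 as "200".
    return format(axis_value, "g"), ""


def localized_style(axis_tag, axis_value, english_name, styles):
    """Prefer the entry with the matching alt key, else the default (alt "")."""
    subtype, alt = find_keys(axis_tag, axis_value, english_name)
    name = styles.get((axis_tag, subtype, alt))
    if name is None:
        name = styles.get((axis_tag, subtype, ""))
    return name


def subfamily_name(axes, styles, separator=" "):
    """Concatenate the per-axis names in the given axis order."""
    parts = [localized_style(tag, value, en, styles) for tag, value, en in axes]
    return separator.join(p for p in parts if p)


if __name__ == "__main__":
    font_axes = [("wght", 200, "Extralight"), ("wdth", 75, "Condensed")]
    print(subfamily_name(font_axes, DE_STYLES))
```

<p>In this sketch the English name &ldquo;Extralight&rdquo; matches the assumed alternate entry, so the lookup first tries (wght, 200, variant) in the target language and then falls back to the default entry when no such alternate exists there.</p>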
   4554 
   4555 	  <hr>
   4556 		<p class="copyright">
   4557 			Copyright &copy; 2001&ndash;2018 Unicode, Inc. All
   4558 			Rights Reserved. The Unicode Consortium makes no expressed or implied
   4559 			warranty of any kind, and assumes no liability for errors or
   4560 			omissions. No liability is assumed for incidental and consequential
   4561 			damages in connection with or arising out of the use of the
   4562 			information or programs contained or accompanying this technical
   4563 			report. The Unicode <a href="http://unicode.org/copyright.html">Terms
   4564 				of Use</a> apply.
   4565 		</p>
   4566 		<p class="copyright">Unicode and the Unicode logo are trademarks
   4567 			of Unicode, Inc., and are registered in some jurisdictions.</p>
   4568 	</div>
   4569 
   4570 </body>
   4571 
   4572 </html>
   4573