      1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
      2 "http://www.w3.org/TR/html4/loose.dtd">
      3 <html>
      4 
      5 <head>
      6 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      7 <meta http-equiv="Content-Language" content="en-us">
      8 <link rel="stylesheet" href="http://unicode.org/reports/reports.css"
      9 	type="text/css">
     10 <title>UTS #35: Unicode Locale Data Markup Language</title>
     11 <style type="text/css">
     12 <!--
     13 .dtd {
     14 	font-family: monospace;
     15 	font-size: 90%;
     16 	background-color: #CCCCFF;
     17 	border-style: dotted;
     18 	border-width: 1px;
     19 }
     20 
     21 .xmlExample {
     22 	font-family: monospace;
     23 	font-size: 80%
     24 }
     25 
     26 .blockedInherited {
     27 	font-style: italic;
     28 	font-weight: bold;
     29 	border-style: dashed;
     30 	border-width: 1px;
     31 	background-color: #FF0000
     32 }
     33 
     34 .inherited {
     35 	font-weight: bold;
     36 	border-style: dashed;
     37 	border-width: 1px;
     38 	background-color: #00FF00
     39 }
     40 
     41 .element {
     42 	font-weight: bold;
     43 	color: red;
     44 }
     45 
     46 .attribute {
     47 	font-weight: bold;
     48 	color: maroon;
     49 }
     50 
     51 .attributeValue {
     52 	font-weight: bold;
     53 	color: blue;
     54 }
     55 
     56 li, p {
     57 	margin-top: 0.5em;
     58 	margin-bottom: 0.5em
     59 }
     60 
     61 h2, h3, h4, h5, table {
     62 	margin-top: 1.5em;
     63 	margin-bottom: 0.5em;
     64 }
     65 
     66 h5 {
     67 	font-size: medium;
     68 	font-style: italic
     69 }
     70 -->
     71 </style>
     72 </head>
     73 
     74 <body>
     75 
     76 	<table class="header" width="100%">
     77 		<tr>
     78 			<td class="icon"><a href="http://unicode.org"> <img
     79 					alt="[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif"
     80 					width="34" height="33"
     81 					style="vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>&nbsp;
     82 				<a class="bar" href="http://www.unicode.org/reports/">Technical
     83 					Reports</a></td>
     84 		</tr>
     85 		<tr>
     86 			<td class="gray">&nbsp;</td>
     87 		</tr>
     88 	</table>
     89 	<div class="body">
     90 		<h2 style="text-align: center">
     91 			Unicode Technical Standard #35
     92 		</h2>
     93 		<h1 style="text-align: center">Unicode Locale Data Markup Language (LDML)</h1>
     94 
     95 		<!-- At least the first row of this header table should be identical across the parts of this UTS. -->
     96 		<table border="1" cellpadding="2" cellspacing="0" class="wide">
     97 			<tr>
     98 				<td>Version</td>
     99 				<td>34</td>
    100 			</tr>
    101 			<tr>
    102 				<td>Editors</td>
    103 				<td><a
    104 					href="https://plus.google.com/114199149796022210033?rel=author">
    105 						Mark Davis</a> (<a href="mailto:markdavis (a] google.com">markdavis (a] google.com</a>)
    106 					and <a href="tr35.html#Acknowledgments">other CLDR committee
    107 						members</a></td>
    108 			</tr>
    109 			<tr>
    110 				<td>Date</td>
    111 				<td>2018-10-10</td>
    112 			</tr>
    113 			<tr>
    114 				<!-- This link must be made live when posting the final version but is disabled during proposed update stage. -->
    115 				<td>This Version</td>
    116 				<td>
    117 				<a href="http://www.unicode.org/reports/tr35/tr35-53/tr35.html">
    118 				http://www.unicode.org/reports/tr35/tr35-53/tr35.html</a></td>
    119 			</tr>
    120 			<tr>
    121 				<td>Previous Version</td>
    122 				<td>
    123 				<a href="http://www.unicode.org/reports/tr35/tr35-51/tr35.html">http://www.unicode.org/reports/tr35/tr35-51/tr35.html</a></td>
    124 			</tr>
    125 			<tr>
    126 				<td>Latest Version</td>
    127 				<td><a href="http://www.unicode.org/reports/tr35/">http://www.unicode.org/reports/tr35/</a></td>
    128 			</tr>
    129 			<tr>
    130 				<td>Corrigenda</td>
    131 				<td><a href="http://unicode.org/cldr/corrigenda.html">http://unicode.org/cldr/corrigenda.html</a></td>
    132 			</tr>
    133 			<tr>
    134 				<td>Latest Proposed Update</td>
    135 				<td><a href="http://www.unicode.org/reports/tr35/proposed.html">http://www.unicode.org/reports/tr35/proposed.html</a></td>
    136 			</tr>
    137 			<tr>
    138 				<td>Namespace</td>
    139 				<td><a href="http://cldr.unicode.org/">http://cldr.unicode.org/</a></td>
    140 			</tr>
    141 			<tr>
    142 				<td>DTDs</td>
    143 				<td><a href="http://unicode.org/cldr/dtd/34/">
    144 				http://unicode.org/cldr/dtd/34/</a></td>
    145 			</tr>
    146 			<tr>
    147 				<td>Revision</td>
    148 				<td><a href="#Modifications">53</a></td>
    149 			</tr>
    150 		</table>
    151 		<h3>
    152 			<i>Summary</i>
    153 		</h3>
    154 		<p>
    155 			This document describes an XML format (<i>vocabulary</i>) for the
    156 			exchange of structured locale data. This format is used in the <a
    157 				href="http://cldr.unicode.org/">Unicode Common Locale Data
    158 				Repository</a>.
    159 		</p>
    160 
    161 		<h3>
    162 			<i>Status</i>
    163 		</h3>
    164 
    165 		<!-- NOT YET APPROVED 
    166 		<p>
    167 				<i class="changed">This is a<b><font color="#ff3333">
    168 				draft </font></b>document which may be updated, replaced, or superseded by
    169 				other documents at any time. Publication does not imply endorsement
    170 				by the Unicode Consortium. This is not a stable document; it is
    171 				inappropriate to cite this document as other than a work in
    172 				progress.
    173 			</i>
    174 		</p>
    175 		 END NOT YET APPROVED -->
    176 		<!-- APPROVED -->
    177 		<p>
    178 			<i>This document has been reviewed by Unicode members and other
    179 				interested parties, and has been approved for publication by the
    180 				Unicode Consortium. This is a stable document and may be used as
    181 				reference material or cited as a normative reference by other
    182 				specifications.</i>
    183 		</p>
    184 		<!-- END APPROVED -->
    185 
    186 		<blockquote>
    187 			<p>
    188 				<i><b>A Unicode Technical Standard (UTS)</b> is an independent
    189 					specification. Conformance to the Unicode Standard does not imply
    190 					conformance to any UTS.</i>
    191 			</p>
    192 		</blockquote>
    193 		<p>
    194 			<i>Please submit corrigenda and other comments with the CLDR bug
    195 				reporting form [<a href="http://cldr.unicode.org/index/bug-reports">Bugs</a>].
    196 				Related information that is useful in understanding this document is
    197 				found in the <a href="#References">References</a>. For the latest
    198 				version of the Unicode Standard see [<a
    199 				href="http://www.unicode.org/versions/latest/">Unicode</a>]. For a
    200 				list of current Unicode Technical Reports see [<a
    201 				href="http://www.unicode.org/reports/">Reports</a>]. For more
    202 				information about versions of the Unicode Standard, see [<a
    203 				href="http://www.unicode.org/versions/">Versions</a>].
    204 			</i>
    205 		</p>
    206 
    207 		<!-- This section of Parts should be identical in all of the parts of this UTS. -->
    208 		<h2>
    209 			<a name="Parts" href="#Parts">Parts</a>
    210 		</h2>
    211 		<p>The LDML specification is divided into the following parts:</p>
    212 		<ul class="toc">
    213 			<li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
    214 				locales, basic structure)
    215 			</li>
    216 			<li>Part 2: <a href="tr35-general.html#Contents">General</a>
    217 				(display names &amp; transforms, etc.)
    218 			</li>
    219 			<li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
    220 				(number &amp; currency formatting)
    221 			</li>
    222 			<li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
    223 				time, time zone formatting)
    224 			</li>
    225 			<li>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
    226 				(sorting, searching, grouping)
    227 			</li>
    228 			<li>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
    229 				(supplemental data)
    230 			</li>
    231 			<li>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
    232 				(keyboard mappings)
    233 			</li>
    234 		</ul>
    235 
    236 		<h2>
    237 			<a name="Contents" href="#Contents">Contents of Part 1, Core</a>
    238 		</h2>
    239 		<!-- START Generated TOC: CheckHtmlFiles -->
    240 		<ul class="toc">
    241 			<li>1 <a href="#Introduction">Introduction</a>
    242 				<ul class="toc">
    243 					<li>1.1 <a href="#Conformance">Conformance</a></li>
    244 				</ul>
    245 			</li>
    246 			<li>2 <a href="#Locale">What is a Locale?</a></li>
    247 			<li>3 <a href="#Identifiers">Unicode Language and Locale
    248 					Identifiers</a>
    249 				<ul class="toc">
    250 					<li>3.1 <a href="#Unicode_language_identifier">Unicode
    251 							Language Identifier</a></li>
    252 					<li>3.2 <a href="#Unicode_locale_identifier">Unicode
    253 							Locale Identifier</a></li>
    254 					<li>3.3 <a href="#BCP_47_Conformance">BCP 47 Conformance</a>
    255 						<ul class="toc">
    256 							<li>3.3.1 <a href="#BCP_47_Language_Tag_Conversion">BCP
    257 									47 Language Tag Conversion</a></li>
    258 						</ul>
    259 					</li>
    260 					<li>3.4 <a href="#Field_Definitions">Language Identifier
    261 							Field Definitions</a>
    262 						<ul class="toc">
    263 							<li>Table: <a href="#Language_Locale_Field_Definitions">Language
    264 									Identifier Field Definitions</a></li>
    265 						</ul>
    266 					</li>
    267 					<li>3.5 <a href="#Special_Codes">Special Codes</a>
    268 						<ul class="toc">
    269 							<li>3.5.1 <a href="#Unknown_or_Invalid_Identifiers">Unknown
    270 									or Invalid Identifiers</a></li>
    271 							<li>3.5.2 <a href="#Numeric_Codes">Numeric Codes</a></li>
    272 							<li>3.5.3 <a href="#Private_Use">Private Use Codes</a>
    273 								<ul class="toc">
    274 									<li>Table: <a href="#Private_Use_CLDR">Private Use
    275 											Codes in CLDR</a></li>
    276 								</ul>
    277 							</li>
    278 						</ul>
    279 					</li>
    280 					<li>3.6 <a href="#Locale_Extension_Key_and_Type_Data">Unicode
    281 							BCP 47 U Extension</a>
    282 						<ul class="toc">
    283 							<li>3.6.1 <a href="#Key_And_Type_Definitions_">Key And
    284 									Type Definitions</a>
    285 								<ul class="toc">
    286 									<li>Table: <a href="#Key_Type_Definitions">Key/Type
    287 											Definitions</a></li>
    288 								</ul>
    289 							</li>
    290 							<li>3.6.2 <a href="#Numbering System Data">Numbering
    291 									System Data</a></li>
    292 							<li>3.6.3 <a href="#Time_Zone_Identifiers">Time Zone
    293 									Identifiers</a></li>
    294 							<li>3.6.4 <a href="#Unicode_Locale_Extension_Data_Files">U
    295 									Extension Data Files</a>
    296 							</li>
    297 							<li>3.6.5 <a href="#Unicode_Subdivision_Codes">Subdivision
    298 									Codes</a>
    299 								<ul class="toc">
    300 									<li>3.6.5.1 <a href="#Validity">Validity</a></li>
    301 								</ul>
    302 							</li>
    303 						</ul>
    304 					</li>
    305 					<li>3.7 <a href="#t_Extension">Unicode BCP 47 T Extension</a>
    306 						<ul class="toc">
    307 							<li>3.7.1 <a href="#Transformed_Content_Data_File">T
    308 									Extension Data Files</a></li>
    309 						</ul>
    310 					</li>
    311 					<li>3.8 <a href="#Compatibility_with_Older_Identifiers">Compatibility
    312 							with Older Identifiers</a>
    313 						<ul class="toc">
    314 							<li>3.8.1 <a href="#Old_Locale_Extension_Syntax">Old
    315 									Locale Extension Syntax</a>
    316 								<ul class="toc">
    317 									<li>Table: <a href="#Locale_Extension_Mappings">Locale
    318 											Extension Mappings</a></li>
    319 								</ul>
    320 							</li>
    321 							<li>3.8.2 <a href="#Legacy_Variants">Legacy Variants</a>
    322 								<ul class="toc">
    323 									<li>Table: <a href="#Legacy_Variant_Mappings">Legacy
    324 											Variant Mappings</a></li>
    325 								</ul>
    326 							</li>
    327 							<li>3.8.3 <a href="#Relation_to_OpenI18n">Relation to
    328 									OpenI18n</a></li>
    329 						</ul>
    330 					</li>
    331 					<li>3.9 <a href="#Transmitting_Locale_Information">Transmitting
    332 							Locale Information</a>
    333 						<ul class="toc">
    334 							<li>3.9.1 <a href="#Message_Formatting_and_Exceptions">Message
    335 									Formatting and Exceptions</a></li>
    336 						</ul>
    337 					</li>
    338 					<li>3.10 <a href="#Language_and_Locale_IDs">Unicode
    339 							Language and Locale IDs</a>
    340 						<ul class="toc">
    341 							<li>3.10.1 <a href="#Written_Language">Written Language</a></li>
    342 						  <li>3.10.2 <a href="#Hybrid_Locale">Hybrid Locale Identifiers</a></li>
    343 						</ul>
    344 					</li>
    345 					<li>3.11 <a href="#Validity_Data">Validity Data</a></li>
    346 				</ul>
    347 			</li>
    348 			<li>4 <a href="#Locale_Inheritance">Locale Inheritance and
    349 					Matching</a>
    350 				<ul class="toc">
    351 					<li>4.1 <a href="#Lookup">Lookup</a>
    352 						<ul class="toc">
    353 							<li>4.1.1 <a href="#Bundle_vs_Item_Lookup">Bundle vs
    354 									Item Lookup</a>
    355 								<ul class="toc">
    356 									<li>Table: <a href="#Lookup-Differences">Lookup
    357 											Differences</a></li>
    358 								</ul>
    359 							</li>
    360 							<li>4.1.2 <a href="#Multiple_Inheritance">Lateral
    361 									Inheritance</a>
    362 								<ul class="toc">
    363 									<li>Table: <a href="#Count_Fallback_normal">Count
    364 											Fallback: normal</a></li>
    365 									<li>Table: <a href="#Count_Fallback_currency">Count
    366 											Fallback: currency</a></li>
    367 								</ul>
    368 							</li>
    369 							<li>4.1.3 <a href="#Parent_Locales">Parent Locales</a></li>
    370 						</ul>
    371 					</li>
    372 					<li>4.2 <a href="#Inheritance_and_Validity">Inheritance
    373 							and Validity</a>
    374 						<ul class="toc">
    375 							<li>4.2.1 <a href="#Definitions">Definitions</a></li>
    376 							<li>4.2.2 <a href="#Resolved_Data_File">Resolved Data
    377 									File</a></li>
    378 							<li>4.2.3 <a href="#Valid_Data">Valid Data</a></li>
    379 							<li>4.2.4 <a href="#Checking_for_Draft_Status">Checking
    380 									for Draft Status</a></li>
    381 							<li>4.2.5 <a href="#Keyword_and_Default_Resolution">Keyword
    382 									and Default Resolution</a></li>
    383 							<li>4.2.6 <a 
    384 				href="#Inheritance_vs_Related">Inheritance vs Related Information</a></li>
    385 						</ul>
    386 					</li>
    387 					<li>4.3 <a href="#Likely_Subtags">Likely Subtags</a></li>
    388 					<li>4.4 <a href="#LanguageMatching">Language Matching</a>
    389 					  <ul>
    390 					    <li>4.4.1 <a href="#EnhancedLanguageMatching">Enhanced Language Matching</a></li>
    391 				      </ul>
    392 					</li>
    393 				</ul>
    394 			</li>
    395 			<li>5 <a href="#XML_Format">XML Format</a>
    396 				<ul class="toc">
    397 					<li>5.1 <a href="#Common_Elements">Common Elements</a>
    398 						<ul class="toc">
    399 							<li>5.1.1 <a href="#special">Element special</a>
    400 								<ul class="toc">
    401 									<li>5.1.1.1 <a href="#Sample_Special_Elements">Sample
    402 											Special Elements</a></li>
    403 								</ul>
    404 							</li>
    405 							<li>5.1.2 <a href="#Alias_Elements">Element alias</a>
    406 								<ul class="toc">
    407 									<li>Table: <a href="#Inheritance_with_source_locale_">Inheritance
    408 											with source=&quot;locale&quot;</a></li>
    409 								</ul>
    410 							</li>
    411 							<li>5.1.3 <a href="#Element_displayName">Element
    412 									displayName</a></li>
    413 							<li>5.1.4 <a href="#Escaping_Characters">Escaping
    414 									Characters</a></li>
    415 						</ul>
    416 					</li>
    417 					<li>5.2 <a href="#Common_Attributes">Common Attributes</a>
    418 						<ul class="toc">
    419 							<li>5.2.1 <a href="#Attribute_type">Attribute type</a></li>
    420 							<li>5.2.2 <a href="#Attribute_draft">Attribute draft</a></li>
    421 							<li>5.2.3 <a href="#alt_attribute">Attribute alt</a></li>
    422 						</ul>
    423 					</li>
    424 					<li>5.3 <a href="#Common_Structures">Common Structures</a>
    425 						<ul class="toc">
    426 							<li>5.3.1 <a href="#Date_Ranges">Date and Date Ranges</a></li>
    427 							<li>5.3.2 <a href="#Text_Directionality">Text
    428 									Directionality</a></li>
    429 							<li>5.3.3 <a href="#Unicode_Sets">Unicode Sets</a>
    430 								<ul class="toc">
    431 									<li>5.3.3.1 <a href="#Lists_of_Code_Points">Lists of
    432 											Code Points</a></li>
    433 									<li>5.3.3.2 <a href="#Unicode_Properties">Unicode
    434 											Properties</a></li>
    435 									<li>5.3.3.3 <a href="#Boolean_Operations">Boolean
    436 											Operations</a></li>
    437 									<li>5.3.3.4 <a href="#UnicodeSet_Examples">UnicodeSet
    438 											Examples</a></li>
    439 								</ul>
    440 							</li>
    441 							<li>5.3.4 <a href="#String_Range">String Range</a></li>
    442 						</ul>
    443 					</li>
    444 					<li>5.4 <a href="#Identity_Elements">Identity Elements</a></li>
    445 					<li>5.5 <a href="#Valid_Attribute_Values">Valid Attribute
    446 							Values</a></li>
    447 					<li>5.6 <a href="#Canonical_Form">Canonical Form</a>
    448 						<ul class="toc">
    449 							<li>5.6.1 <a href="#Content">Content</a></li>
    450 							<li>5.6.2 <a href="#Ordering">Ordering</a></li>
    451 							<li>5.6.3 <a href="#Comments">Comments</a></li>
    452 						</ul>
    453 					</li>
    454                     	<li>5.7 <a href="#DTD_Annotations">DTD Annotations</a></li>
    455 
    456 				</ul>
    457 			</li>
    458 			<li>6 <a href="#Property_Data">Property Data</a>
    459 				<ul class="toc">
    460 					<li>6.1 <a href="#Script_Metadata">Script Metadata</a></li>
    461 					<li>6.2 <a href="#Extended_Pictographic">Extended Pictographic</a></li>
    462 					<li>6.3 <a href="#Labels.txt">Labels.txt</a></li>
    463 				</ul>
    464 			</li>
    465 			<li>7 <a href="#Format_Parse_Issues">Issues in Formatting
    466 					and Parsing</a>
    467 				<ul class="toc">
    468 					<li>7.1 <a href="#Lenient_Parsing">Lenient Parsing</a>
    469 						<ul class="toc">
    470 							<li>7.1.1 <a href="#Motivation">Motivation</a></li>
    471 							<li>7.1.2 <a href="#Loose_Matching">Loose Matching</a></li>
    472 						</ul>
    473 					</li>
    474 					<li>7.2 <a href="#Invalid_Patterns">Handling Invalid
    475 							Patterns</a></li>
    476 				</ul>
    477 			</li>
    478 			<li>Annex A <a href="#Deprecated_Structure">Deprecated Structure</a>
    479 				<ul class="toc">
    480 					<li>A.1 <a href="#Fallback_Elements">Element fallback</a></li>
    481 					<li>A.2 <a href="#BCP47_Keyword_Mapping">BCP 47 Keyword
    482 							Mapping</a></li>
    483 					<li>A.3 <a href="#Choice_Patterns">Choice Patterns</a></li>
    484 					<li>A.4 <a href="#Element_default">Element default</a></li>
    485 					<li>A.5 <a href="#Deprecated_Common_Attributes">Deprecated
    486 							Common Attributes</a>
    487 						<ul>
    488 							<li>A.5.1 <a href="#Attribute_standard">Attribute
    489 									standard</a></li>
    490 							<li>A.5.2 <a href="#Attribute_draft_nonLeaf">Attribute
    491 									draft in non-leaf elements</a></li>
    492 						</ul>
    493 					</li>
    494 					<li>A.6 <a href="#Element_base">Element base</a></li>
    495 					<li>A.7 <a href="#Element_rules">Element rules</a></li>
    496 					<li>A.8 <a href="#Deprecated_subelements_of_dates">Deprecated
    497 							subelements of &lt;dates&gt;</a></li>
    498 					<li>A.9 <a href="#Deprecated_subelements_of_calendars">Deprecated
    499 							subelements of &lt;calendars&gt;</a></li>
    500 					<li>A.10 <a href="#Deprecated_subelements_of_timeZoneNames">Deprecated
    501 							subelements of &lt;timeZoneNames&gt;</a></li>
    502 					<li>A.11 <a href="#Deprecated_subelements_of_zone_metazone">Deprecated
    503 							subelements of &lt;zone&gt; and &lt;metazone&gt;</a></li>
    504 					<li>A.12 <a
    505 						href="#Renamed_attribute_values_for_contextTransformUsage">Renamed
    506 							attribute values for &lt;contextTransformUsage&gt; element</a></li>
    507 					<li>A.13 <a href="#Deprecated_subelements_of_segmentations">Deprecated
    508 							subelements of &lt;segmentations&gt;</a></li>
    509 					<li>A.14 <a href="#Element_cp">Element cp</a></li>
    510 					<li>A.15 <a href="#validSubLocales">Attribute
    511 							validSubLocales</a></li>
    512 					<li>A.16 <a href="#postCodeElements">Elements
    513 							postalCodeData, postCodeRegex</a></li>
    514 					<li>A.17 <a href="#telephoneCodeData">Element
    515 							telephoneCodeData</a></li>
    516 				</ul>
    517 			</li>
    518 			<li>Annex B <a href="#Links_to_Other_Parts">Links to Other Parts</a>
    519 				<ul class="toc">
    520 					<li>Table: <a href="#Part_2_Links">Part 2 Links: General
    521 							(display names &amp; transforms, etc.)</a></li>
    522 					<li>Table: <a href="#Part_3_Links">Part 3 Links: Numbers
    523 							(number &amp; currency formatting)</a></li>
    524 					<li>Table: <a href="#Part_4_Links">Part 4 Links: Dates
    525 							(date, time, time zone formatting)</a></li>
    526 					<li>Table: <a href="#Part_5_Links">Part 5 Links: Collation
    527 							(sorting, searching, grouping)</a></li>
    528 					<li>Table: <a href="#Part_6_Links">Part 6 Links:
    529 							Supplemental (supplemental data)</a></li>
    530 					<li>Table: <a href="#Part_7_Links">Part 7 Links: Keyboards
    531 							(keyboard mappings)</a></li>
    532 				</ul>
    533 			</li>
    534 			<li><a href="#References">References</a></li>
    535 			<li><a href="#Acknowledgments">Acknowledgments</a></li>
    536 			<li><a href="#Modifications">Modifications</a></li>
    537 		</ul>
    538 		<!-- END Generated TOC: CheckHtmlFiles -->
    539 		<h2>
    540 			<a name="Introduction" href="#Introduction">1 Introduction</a>
    541 		</h2>
    542 		<p>Not long ago, computer systems were like separate worlds,
    543 			isolated from one another. The internet and related events have
    544 			changed all that. A single system can be built of many different
    545 			components, hardware and software, all needing to work together. Many
    546 			different technologies have been important in bridging the gaps; in
    547 			the internationalization arena, Unicode has provided a lingua franca
    548 			for communicating textual data. However, there remain differences in
    549 			the locale data used by different systems.</p>
    550 		<p>The best practice for internationalization is to store and
    551 			communicate language-neutral data, and format that data for the
    552 			client. This formatting can take place on any of a number of the
    553 			components in a system; a server might format data based on the
    554 			user&#39;s locale, or it could be that a client machine does the
    555 			formatting. The same goes for parsing data, and locale-sensitive
    556 			analysis of data.</p>
		<p>
			But there remain significant differences across systems and
			applications in the locale-sensitive data used for such formatting,
			parsing, and analysis. Many of those differences are simply
			gratuitous; all within acceptable limits for human beings, but
			yielding different results. In many other cases there are outright
			errors. Whatever the reason, the differences can cause discrepancies
			to creep into a heterogeneous system. This is especially serious in
			the case of collation (sort-order), where different collation causes
			not only ordering differences, but also different results of queries!
			That is, with a query of customers with names between &quot;Abbot,
			Cosmo&quot; and &quot;Arnold, James&quot;, if different systems have
			different sort orders, different lists will be returned. (For
			comparisons across systems formatted as HTML tables, see [<a
				href="#Comparisons">Comparisons</a>].)
		</p>
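		<p>The effect on queries can be sketched in a few lines of Python.
			This is an illustration only: the customer list is invented, and the
			two sort keys (raw code point order and case-folded order) are
			stand-ins for the collations of two hypothetical systems, not
			collation data taken from CLDR.</p>

```python
def range_query(names, low, high, key):
    """Return the names whose sort key lies between those of low and high."""
    lo, hi = key(low), key(high)
    return sorted((n for n in names if lo <= key(n) <= hi), key=key)


# Invented sample data for the illustration.
customers = ["Abbot, Cosmo", "abel, Zed", "Adams, Ann", "Arnold, James", "Baker, Tom"]


def codepoint_key(name):
    # Hypothetical system A: compares raw code points, so "a" sorts after "Z".
    return name


def caseless_key(name):
    # Hypothetical system B: ignores case when comparing.
    return name.casefold()


system_a = range_query(customers, "Abbot, Cosmo", "Arnold, James", codepoint_key)
system_b = range_query(customers, "Abbot, Cosmo", "Arnold, James", caseless_key)
print(system_a)
print(system_b)
```

		<p>The same query returns three names on the first system and four
			on the second, because only the second places &quot;abel, Zed&quot;
			inside the requested range.</p>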
    573 		<blockquote>
    574 			<p class="note">
    575 				<b>Note:</b> There are many different equally valid ways in which
    576 				data can be judged to be &quot;correct&quot; for a particular
    577 				locale. The goal for the common locale data is to make it as
    578 				consistent as possible with existing locale data, and acceptable to
    579 				users in that locale.
    580 			</p>
    581 		</blockquote>
		<p>This document specifies an XML format for the communication of
			locale data: the Unicode Locale Data Markup Language (LDML). This
			provides a common format for systems to interchange locale data so
			that they can get the same results in the services provided by
			internationalization libraries. It also provides a standard format
			that can allow users to customize the behavior of a system. With it,
			for example, two implementations can exchange a specification of
			tailored collation (sorting) rules. Using the same specification,
			the two implementations will
			achieve the same results in comparing strings. Unicode LDML can also
			be used to let a user encapsulate specialized sorting behavior for a
			specific domain, or create a customized locale for a minority
			language. Unicode LDML is also used in the Unicode Common Locale Data
			Repository (CLDR). CLDR uses an open process for reconciling
			differences between the locale data used on different systems and
			validating the data, to produce a useful, common, consistent
			base of locale data.</p>
    599 		<p>
    600 			For more information, see the Common Locale Data Repository project
    601 			page [<a href="#localeProject">LocaleProject</a>].
    602 		</p>
    603 		<p>As LDML is an interchange format, it was designed for ease of
    604 			maintenance and simplicity of transformation into other formats,
    605 			above efficiency of run-time lookup and use. Implementations should
    606 			consider converting LDML data into a more compact format prior to
    607 			use.</p>
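		<p>As a rough sketch of such a conversion, the following Python
			snippet flattens a small LDML fragment into a dictionary for fast
			lookup. The fragment is a hand-written sample, and the function name
			and the choice of a plain dictionary as the compact format are
			assumptions made for illustration; real implementations typically
			use their own binary or resource-bundle formats.</p>

```python
import xml.etree.ElementTree as ET

# Hand-written sample fragment, not copied from a CLDR release.
SAMPLE_LDML = """<ldml>
  <localeDisplayNames>
    <languages>
      <language type="de">German</language>
      <language type="fr">French</language>
    </languages>
  </localeDisplayNames>
</ldml>"""


def compact_language_names(ldml_text):
    """Flatten the language display names into a {code: name} dictionary."""
    root = ET.fromstring(ldml_text)
    path = "./localeDisplayNames/languages/language"
    return {el.get("type"): el.text for el in root.iterfind(path)}


language_names = compact_language_names(SAMPLE_LDML)
print(language_names["de"])
```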
    608 		<h3>
    609 			<a name="Conformance" href="#Conformance">1.1 Conformance</a>
    610 		</h3>
    611 		<p>There are many ways to use the Unicode LDML format and the data
    612 			in CLDR, and the Unicode Consortium does not restrict the ways in
    613 			which the format or data are used. However, an implementation may
    614 			also claim conformance to LDML or to CLDR, as follows:</p>
    615 		<p>&nbsp;</p>
    616 		<p>
    617 			<i><b>UAX35-C1.</b> </i>An implementation that claims conformance to
    618 			this specification shall:
    619 		</p>
    620 		<ol>
    621 			<li>Identify the sections of the specification that it conforms
    622 				to.
    623 				<ul>
    624 					<li>For example, an implementation might claim conformance to
    625 						all LDML features except for <i>transforms</i> and <i>segments</i>.
    626 					</li>
    627 				</ul>
    628 			</li>
    629 			<li>Interpret the relevant elements and attributes of LDML
    630 				documents in accordance with the descriptions in those sections.
    631 				<ul>
    632 					<li>For example, an implementation that claims conformance to
    633 						the date format patterns must interpret the characters in such
    634 						patterns according to <a
    635 						href="tr35-dates.html#Date_Field_Symbol_Table">Date Field
    636 							Symbol Table</a>.
    637 					</li>
    638 				</ul>
    639 			</li>
			<li>Declare which types of CLDR data it uses.
				<ul>
					<li>For example, an implementation might declare that it uses
						only language names, and only those with a <i>draft</i> status of <i>contributed</i>
						or <i>approved</i>.
					</li>
				</ul>
			</li>
    648 		</ol>
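		<p>The example in the third item above could be implemented along
			the following lines. This is a minimal sketch: the XML fragment and
			its draft values are invented, the function name is hypothetical,
			and treating a missing <i>draft</i> attribute as <i>approved</i> is
			an assumption here (see <a href="#Attribute_draft">Section 5.2.2
			Attribute draft</a> for the actual definition).</p>

```python
import xml.etree.ElementTree as ET

# Invented sample; the draft values are for illustration only.
SAMPLE = """<languages>
  <language type="de">German</language>
  <language type="fr" draft="contributed">French</language>
  <language type="tlh" draft="provisional">Klingon</language>
  <language type="zxx" draft="unconfirmed">No linguistic content</language>
</languages>"""

ACCEPTED_STATUSES = {"approved", "contributed"}


def accepted_language_names(xml_text, accepted=ACCEPTED_STATUSES):
    """Keep only the language names whose draft status is in `accepted`."""
    root = ET.fromstring(xml_text)
    result = {}
    for el in root.iter("language"):
        # Assumption: an element without a draft attribute counts as approved.
        status = el.get("draft", "approved")
        if status in accepted:
            result[el.get("type")] = el.text
    return result


names = accepted_language_names(SAMPLE)
print(names)
```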
    649 		<p>
    650 			<i><b>UAX35-C2.</b> </i>An implementation that claims conformance to
    651 			Unicode locale or language identifiers shall:
    652 		</p>
    653 		<ol>
			<li>Specify whether Unicode locale extensions are allowed.</li>
			<li>Specify the canonical form used for identifiers in terms of
				casing and field separator characters.</li>
    657 		</ol>
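		<p>One possible canonical form can be sketched as follows. The
			sketch assumes the customary casing (lowercase language, titlecase
			four-letter script, uppercase two-letter or three-digit region) and
			handles only language, script, region, and variant fields; it does
			not handle extensions, and it does not validate the subtags. The
			function name and its parameters are hypothetical.</p>

```python
def canonicalize(identifier, separator="-"):
    """Normalize casing and field separators of a simple identifier."""
    fields = identifier.replace("_", "-").split("-")
    out = [fields[0].lower()]  # the language field is lowercased
    for field in fields[1:]:
        if len(field) == 4 and field.isalpha():
            out.append(field.title())  # script, such as Hant
        elif (len(field) == 2 and field.isalpha()) or (
            len(field) == 3 and field.isdigit()
        ):
            out.append(field.upper())  # region, such as TW or 419
        else:
            out.append(field.lower())  # anything else is treated as a variant
    return separator.join(out)


print(canonicalize("ZH_hant_tw"))
print(canonicalize("en-us", separator="_"))
```

		<p>An implementation claiming conformance would state which
			separator and casing it emits, whatever form it accepts on input.</p>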
    658 		<p>External specifications may also reference particular
    659 			components of Unicode locale or language identifiers, such as:</p>
    660 		<blockquote>
    661 			<p>
    662 				<i>Field X can contain any Unicode region subtag values as given
    663 					in Unicode Technical Standard #35: Unicode Locale Data Markup
    664 					Language (LDML), excluding grouping codes.</i>
    665 			</p>
    666 		</blockquote>
    667 		<h2>
    668 			<a name="Locale" href="#Locale">2 What is a Locale?</a>
    669 		</h2>
    670 		<p>Before diving into the XML structure, it is helpful to describe
    671 			the model behind the structure. People do not have to subscribe to
    672 			this model to use data in LDML, but they do need to understand it so
    673 			that the data can be correctly translated into whatever model their
    674 			implementation uses.</p>
		<p>
			The first issue is basic: <i>what is a locale?</i> In this model, a
			locale is an identifier (id) that refers to a set of user preferences
			that tend to be shared across significant swaths of the world.
			Traditionally, the data associated with this id provides support for
			formatting and parsing of dates, times, numbers, and currencies; for
			measurement units; and for sort-order (collation). It also provides
			translated names for time zones, languages, countries, and scripts.
			The data can also
			include support for text boundaries (character, word, line, and
			sentence), text transformations (including transliterations), and
			other services.
		</p>
		<p>Locale data is not cast in stone: the data used on
			someone&#39;s machine may generally reflect the US format, for
			example, but preferences can typically be set to override particular
			items, such as setting the date format to the form 2002.03.15, or using
			metric or Imperial measurement units. In the abstract, locales are
			simply one of many sets of preferences that, say, a website may want
			to remember for a particular user. Depending on the application, it
			may want to also remember the user&#39;s time zone, preferred
			currency, preferred character set, smoker/non-smoker preference, meal
			preference (vegetarian, kosher, and so on), music preference,
			religion, party affiliation, favorite charity, and so on.</p>
    698 		<p>Locale data in a system may also change over time: country
    699 			boundaries change; governments (and currencies) come and go;
    700 			committees impose new standards; bugs are found and fixed in the
    701 			source data; and so on. Thus the data needs to be versioned for
    702 			stability over time.</p>
    703 		<p>
    704 			In general terms, the locale id is a parameter that is supplied to a
    705 			particular service (date formatting, sorting, spell-checking, and so
    706 			on). The format in this document does not attempt to represent all
    707 			the data that could conceivably be used by all possible services.
    708 			Instead, it collects together data that is in common use in systems
    709 			and internationalization libraries for basic services. The main
    710 			difference among locales is in terms of language; there may also be
    711 			some differences according to different countries or regions.
    712 			However, the line between <i>locales</i> and <i>languages</i>, as
    713 			commonly used in the industry, is rather fuzzy. Note also that the
    714 			vast majority of the locale data in CLDR is in fact language data;
    715 			all non-linguistic data is separated out into a separate tree. For
    716 			more information, see <i><a href="#Language_and_Locale_IDs">Section
    717 					3.10 Language and Locale IDs</a></i>.
    718 		</p>
    719 		<p>
    720 			We will speak of data as being &quot;in locale X&quot;. That does not
    721 			imply that a locale <i>is</i> a collection of data; it is simply
    722 			shorthand for &quot;the set of data associated with the locale id
    723 			X&quot;. Each individual piece of data is called a <i>resource </i>or
    724 			<i>field</i>, and a tag indicating the key of the resource is called
    725 			a <i>resource tag.</i>
    726 		</p>
    727 		<h2>
    728 			<a name="Identifiers" href="#Identifiers"></a><a
    729 				name="Unicode_Language_and_Locale_Identifiers"
    730 				href="#Unicode_Language_and_Locale_Identifiers"> 3 Unicode
    731 				Language and Locale Identifiers</a>
    732 		</h2>
    733 		<p>
    734 			Unicode LDML uses stable identifiers based on [<a href="#BCP47">BCP47</a>]
    735 			for distinguishing among languages, locales, regions, currencies,
    736 			time zones, transforms, and so on. There are many systems for
    737 			identifiers for these entities. The Unicode LDML identifiers may not
    738 			match the identifiers used on a particular target system. If so, some
    739 			process of identifier translation may be required when using LDML
    740 			data.
    741 		</p>
    742 		<p>
    743 			The BCP 47 extensions (-u- and -t-) are described in <em>Section
    744 				3.6 <a href="#u_Extension">Unicode BCP 47 U Extension</a>
    745 			</em> and <em>Section 3.7 <a href="#BCP47_T_Extension">Unicode
    746 					BCP 47 T Extension</a></em>.
    747 		</p>
    748 		<h3>
    749 			<i><a name="Unicode_language_identifier"
    750 				href="#Unicode_language_identifier">3.1 Unicode Language
    751 					Identifier</a></i>
    752 		</h3>
    753 		<p>
    754 			A <i>Unicode language identifier</i> has the following structure
    755 			(provided in both EBNF (Perl-based) and ABNF [<a href="#RFC5234">RFC5234</a>]).
    756 			The following table defines syntactically well-formed identifiers:
    757 			they are not necessarily valid identifiers. For additional validity
    758 			criteria, see the links on the right.
    759 		</p>
    760 		<table>
    761 			<tr>
    762 				<th>&nbsp;</th>
    763 				<th><div align="center">EBNF</div></th>
    764 				<th><div align="center">ABNF</div></th>
    765 				<th><div align="center">Validity / Comments</div></th>
    766 			</tr>
    767 			<tr>
    768 				<td><code>
    769 						<a href="#unicode_language_id" name="unicode_language_id">unicode_language_id</a>
    770 					</code></td>
    771 				<td><code>
    772 						= &quot;root&quot;<br>
    773 						| (unicode_language_subtag <br>   (sep
    774 						unicode_script_subtag)? <br>  | unicode_script_subtag)<br>
    775 						 (sep unicode_region_subtag)? <br> 
    776 						 (sep
    777 						unicode_variant_subtag)* ;
    778 					</code></td>
    779 				<td><code>
    780 						= &quot;root&quot;<br>
    781 / (unicode_language_subtag <br>   [sep
    782 						unicode_script_subtag] <br>  / unicode_script_subtag)<br>
    783 						 [sep unicode_region_subtag] <br> 
    784 						 *(sep
    785 						unicode_variant_subtag)
    786 					</code></td><td>"root" is treated as a special <code>unicode_language_subtag</code></td></tr>
    787 			<tr>
    788 				<td><code>
    789 						<a href="#unicode_language_subtag" name="unicode_language_subtag">unicode_language_subtag</a>
    790 					</code></td>
    791 				<td><code> = alpha{2,3} | alpha{5,8}; </code></td>
    792 				<td><code> = 2*3ALPHA / 5*8ALPHA </code></td>
    793 				<td><code>
    794 						<a href='#unicode_language_subtag_validity'>validity</a><br>
    795 						<a href='http://unicode.org/cldr/latest/common/validity/language.xml'>latest-data</a>
    796 					</code></td>
    797 			</tr>
    798 			<tr>
    799 				<td><code>
    800 						<a href="#unicode_script_subtag" name="unicode_script_subtag">unicode_script_subtag</a>
    801 					</code></td>
    802 				<td><code>= alpha{4} ;</code></td>
    803 				<td><code>= 4ALPHA</code></td>
    804 				<td><code>
    805 						<a href='#unicode_script_subtag_validity'>validity</a><br>
    806 						<a href='http://unicode.org/cldr/latest/common/validity/script.xml'>latest-data</a>
    807 					</code></td>
    808 			</tr>
    809 			<tr>
    810 				<td><code>
    811 						<a href="#unicode_region_subtag" name="unicode_region_subtag">unicode_region_subtag</a>
    812 					</code></td>
    813 				<td><code>= (alpha{2} | digit{3}) ;</code></td>
    814 				<td><code>= 2ALPHA / 3DIGIT</code></td>
    815 				<td><code>
    816 						<a href='#unicode_region_subtag_validity'>validity</a><br>
    817 						<a href='http://unicode.org/cldr/latest/common/validity/region.xml'>latest-data</a>
    818 					</code></td>
    819 			</tr>
    820 			<tr>
    821 				<td><code>
    822 						<a href="#unicode_variant_subtag" name="unicode_variant_subtag">unicode_variant_subtag</a>
    823 					</code></td>
    824 				<td><code>
    825 						= (alphanum{5,8} <br> | digit alphanum{3}) ;
    826 					</code></td>
    827 				<td><code>
    828 						= 5*8alphanum<br>/ (DIGIT 3alphanum)
    829 					</code></td>
    830 				<td><code>
    831 						<a href='#unicode_variant_subtag_validity'>validity</a><br>
    832 						<a href='http://unicode.org/cldr/latest/common/validity/variant.xml'>latest-data</a>
    833 					</code></td>
    834 			</tr>
    835 			<tr>
    836 				<td><code>sep</code></td>
    837 				<td><code>= [-_] ;</code></td>
    838 				<td><code>= "-" / "_"</code></td>
    839 			</tr>
    840 			<tr>
    841 				<td><code>digit</code></td>
    842 				<td><code>= [0-9] ;</code></td>
    843 				<td><code>&nbsp;</code></td>
    844 			</tr>
    845 			<tr>
    846 				<td><code>alpha</code></td>
    847 				<td><code>= [A-Z a-z] ;</code></td>
    848 				<td><code>&nbsp;</code></td>
    849 			</tr>
    850 			<tr>
    851 				<td><code>alphanum</code></td>
    852 				<td><code>= [0-9 A-Z a-z] ;</code></td>
    853 				<td><code>= ALPHA / DIGIT</code></td>
    854 			</tr>
    855 		</table>
    856 		<p>
    857 			The semantics of the various subtags is explained in <em>Section
    858 				3.4 <a href="#Field_Definitions">Language Identifier Field
    859 					Definitions</a>
    860 			</em>; there are also direct links from
    861 			<code>
    862 				<a href="#unicode_language_subtag">unicode_language_subtag</a>
    863 			</code>
    864 			, etc. While theoretically the
    865 			<code>
    866 				<a href="#unicode_language_subtag">unicode_language_subtag</a>
    867 			</code>
    868 			may have more than 3 letters through the IANA registration process,
    869 			in practice that has not occurred. The
    870 			<code>
    871 				<a href="#unicode_language_subtag">unicode_language_subtag</a>
    872 			</code>
    873 			&quot;und&quot; may be omitted when there is a
    874 			<code>
    875 				<a href="#unicode_script_subtag">unicode_script_subtag</a>
    876 			</code>
    877 			; for that reason
    878 			<code>
    879 				<a href="#unicode_language_subtag">unicode_language_subtag</a>
    880 			</code>
    881 			values with 4 letters are not permitted. However, such
    882 			<code>
    883 				<a href="#unicode_language_id">unicode_language_id</a>
    884 			</code>
    885 			values are not intended for general interchange, because they are not
    886 			valid BCP 47 tags. Instead, they are intended for certain protocols
    887 			such as the identification of transliterators or font ScriptLangTag
    888 			values.
    889 		</p>
    890 		<p>For example, &quot;en-US&quot; (American English),
    891 			&quot;en_GB&quot; (British English), &quot;es-419&quot; (Latin
    892 			American Spanish), and &quot;uz-Cyrl&quot; (Uzbek in Cyrillic) are
    893 			all valid Unicode language identifiers.</p>
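		<p>The grammar above can be turned into a well-formedness check fairly
			directly. The following is a minimal sketch in Python, transcribed from
			the EBNF column; the names <code>LANGUAGE_ID</code> and
			<code>is_well_formed_language_id</code> are illustrative and not part of
			LDML. It tests syntax only, not validity against the CLDR validity
			data.</p>

```python
import re

# Subtag patterns transcribed from the EBNF column of the table above.
SEP = r"[-_]"
LANGUAGE = r"(?:[A-Za-z]{2,3}|[A-Za-z]{5,8})"
SCRIPT = r"[A-Za-z]{4}"
REGION = r"(?:[A-Za-z]{2}|[0-9]{3})"
VARIANT = r"(?:[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3})"

# "root" | (language (sep script)? | script) (sep region)? (sep variant)*
LANGUAGE_ID = re.compile(
    "(?:root|(?:" + LANGUAGE + "(?:" + SEP + SCRIPT + ")?|" + SCRIPT + ")"
    + "(?:" + SEP + REGION + ")?(?:" + SEP + VARIANT + ")*)"
)


def is_well_formed_language_id(tag: str) -> bool:
    """Return True if the whole string matches the unicode_language_id syntax."""
    return LANGUAGE_ID.fullmatch(tag) is not None


for example in ["en-US", "en_GB", "es-419", "uz-Cyrl", "root", "Latn-DE"]:
    print(example, is_well_formed_language_id(example))
```

		<p>Because the check is purely syntactic, a string such as
			<code>xx-ZZ</code> passes even if neither subtag is valid; a separate
			lookup in the validity data would be needed to reject it.</p>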
    894 		<h3>
    895 			<i><a name="Unicode_locale_identifier"
    896 				href="#Unicode_locale_identifier">3.2 Unicode Locale Identifier</a></i>
    897 		</h3>
    898 		<p>
    899 			A <i>Unicode locale identifier</i> is composed of a Unicode language
    900 			identifier plus (optional) locale extensions. It has the
    901 			following structure. The semantics of the U and T extensions are
    902 			explained in <em>Section 3.6 <a href="#u_Extension">Unicode
    903 					BCP 47 U Extension</a>
    904 			</em> and <em>Section 3.7 <a href="#BCP47_T_Extension">Unicode
    905 			BCP 47 T Extension</a></em>. Other extensions and private use extensions are supported for pass-through. The following table defines syntactically
    906 			        <em>well-formed</em> identifiers: they are not necessarily <em>valid</em> identifiers.
    907 		For additional validity criteria, see the links on the right. </p>
    908 		<table border="0">
    909 			<tr>
    910 				<th>&nbsp;</th>
    911 				<th><div align="center">EBNF</div></th>
    912 				<th><div align="center">ABNF</div></th>
    913 				<th><div align="center">Validity</div></th>
    914 			</tr>
    915 			<tr>
    916 				<td><code>
    917 						<a href="#unicode_locale_id" name="unicode_locale_id">unicode_locale_id</a>
    918 					</code></td>
    919 				<td><code>
    920 						= unicode_language_id<br>
    921 						  extensions*<br>
    922 						  
    923 pu_extensions? ; </code></td>
    924 				<td><code>
    925 						= unicode_language_id<br>
    926 						  *extensions  <br>
    927 						  
    928 				   [pu_extensions] </code></td>
    929 			</tr>
    930 		  <tr>
    931 				<td><code>
    932 				<a href="#extensions" name="extensions">extensions</a>
    933 					</code></td>
    934 				<td><code>
    935 						= unicode_locale_extensions <br>
    936 				| transformed_extensions <br>
    937 				| other_extensions ;</code></td>
    938 				<td><code>= unicode_locale_extensions <br>
    939 / transformed_extensions <br>
    940 / other_extensions</code></td>
    941 			</tr>
    942 			<tr>
    943 				<td><code>
    944 						<a href="#unicode_locale_extensions"
    945 							name="unicode_locale_extensions">unicode_locale_extensions</a>
    946 					</code></td>
    947 				<td><code>
    948 						= sep [uU]<br>  ((sep keyword)+ <br>  |(sep attribute)+
    949 						(sep keyword)*) ;
    950 					</code></td>
    951 				<td><code>
    952 						= sep &quot;u&quot; <br>  (1*(sep keyword) <br>  / 1*(sep
    953 						attribute) *(sep keyword))
    954 					</code></td>
    955 			</tr>
    956 			<tr>
    957 				<td><code>
    958 						<a href="#transformed_extensions" name="transformed_extensions">transformed_extensions</a>
    959 					</code></td>
    960 				<td><code>
    961 					= sep [tT] <br>  ((sep tlang (sep tfield)*) <br>
    962 					 | (sep tfield)+) ; </code></td>
    963 				<td><code>
    964 						= sep &quot;t&quot; <br>  ((sep tlang
    965 					*(sep tfield)) <br>  / 1*(sep tfield)) </code></td>
    966 			</tr>
    967 			<tr>
    968 				<td><code><a href="#pu_extensions" name="pu_extensions">pu_extensions</a></code></td>
    969 				<td><code>= sep [xX] <br>
    970 				   
    971 			    (sep alphanum{1,8})* ;</code></td>
    972 				<td><code>= sep &quot;x&quot; <br>
    973 				   
    974 			    *(sep 1*8alphanum)</code></td>
    975 			</tr>
    976 		  <tr>
    977 				<td><code><a href="#other_extensions" name="other_extensions">other_extensions</a></code></td>
    978 				<td><code>= [alphanum-[tTuUxX]]<br>
    979 				   
    980 			    (sep alphanum{2,8})* ;</code></td>
    981 				<td><code>= (DIGIT<br>
    982 				    
    983 			    / %x61-73<br>
    984 			      / %x76-77<br>
    985 			      
    986 			    / %x79-7A)<br>
    987  
    988 	        *(sep 2*8alphanum)</code></td>
    989 			</tr>
    990 			<tr>
    991 				<td><code>keyword</code></td>
    992 				<td><code>= key (sep type)? ;</code></td>
    993 				<td><code>= key [sep type]</code></td>
    994 			</tr>
    995 		  <tr>
    996 				<td><code>key</code></td>
    997 				<td><code>
    998 						= alphanum alpha ;
    999 					</code></td>
   1000 				<td><code>
   1001 						= alphanum ALPHA
   1002 					</code></td>
   1003 				<td><code>
   1004 						<a href="#Key_Type_Definitions">validity</a><br> 
   1005 				<a
   1006 							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
   1007 					</code></td>
   1008 			</tr>
   1009 			<tr>
   1010 				<td><code>type</code></td>
   1011 				<td><code>
   1012 						= alphanum{3,8}<br> (sep alphanum{3,8})* ;
   1013 					</code></td>
   1014 				<td><code>
   1015 						= 3*8alphanum<br> *(sep 3*8alphanum)
   1016 					</code></td>
   1017 				<td><code>
   1018 						<a href="#Key_Type_Definitions">validity</a><br> 
   1019 				<a
   1020 							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
   1021 					</code></td>
   1022 			</tr>
   1023 			<tr>
   1024 				<td><code>attribute</code></td>
   1025 				<td><code>= alphanum{3,8} ;</code></td>
   1026 				<td><code>= 3*8alphanum</code></td>
   1027 			</tr>
   1028 			<tr>
   1029 				<td><code>
   1030 						<a name="unicode_subdivision_id" href="#unicode_subdivision_id">unicode_subdivision_id</a><a
   1031 							name="unicode_subdivision_subtag"></a><a
   1032 							name="subdivision_attribute"></a>
   1033 					</code></td>
   1034 				<td><code>
   1035 						= <a href="#unicode_region_subtag">unicode_region_subtag</a> unicode_subdivision_suffix ;
   1036 					</code></td>
   1037 				<td><code>
   1038 						= <a href="#unicode_region_subtag">unicode_region_subtag</a> unicode_subdivision_suffix
   1039 					</code></td>
   1040 				<td><code>
   1041 						<a href='#unicode_subdivision_subtag_validity'>validity</a><br>
   1042 						<a
   1043 							href='http://unicode.org/cldr/latest/common/validity/subdivision.xml'>latest-data</a>
   1044 					</code></td>
   1045 
   1046 			</tr>
   1047 			<tr>
   1048 				<td><code>unicode_subdivision_suffix</code></td>
   1049 				<td><code> = alphanum{1,4} ;</code></td>
   1050 				<td><code>= 1*4alphanum</code></td>
   1051 			</tr>
   1052 			<tr>
   1053 				<td><code>
   1054 						<a name="unicode_measure_unit" href="#unicode_measure_unit">unicode_measure_unit</a>
   1055 					</code></td>
   1056 				<td><code>
   1057 						= alphanum{3,8}<br>  (sep alphanum{3,8})* ;
   1058 					</code></td>
   1059 				<td><code>
   1060 						= 3*8alphanum<br>  *(sep 3*8alphanum)
   1061 					</code></td>
   1062 				<td><code>
   1063 						<a href='#Validity_Data'>validity</a><br> 
   1064 				<a
   1065 							href='http://unicode.org/cldr/latest/common/validity/unit.xml'>latest-data</a>
   1066 					</code></td>
   1067 			</tr>
   1068 			<tr>
   1069 				<td><code>tlang</code></td>
   1070 				<td><code>
   1071 					= unicode_language_subtag<br>  (sep unicode_script_subtag)?<br>  (sep unicode_region_subtag)?<br>  (sep unicode_variant_subtag)* ; </code></td>
   1072 				<td><code>
   1073 						= unicode_language_subtag <br>  [sep unicode_script_subtag] <br>  [sep unicode_region_subtag] <br> 
   1074 						
   1075 					*(sep unicode_variant_subtag) </code></td>
   1076 			</tr>
   1077 			<tr>
   1078 				<td><code>tfield</code></td>
   1079 				<td><code>
   1080 						= tkey tvalue;
   1081 					</code></td>
   1082 				<td><code>
   1083 						= tkey tvalue
   1084 					</code></td>
   1085 				<td><code>
   1086 						<a href="#BCP47_T_Extension">validity</a><br> 
   1087 				<a
   1088 							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
   1089 					</code></td>
   1090 
   1091 			</tr>
   1092 			<tr>
   1093 				<td><code>
   1094 						tkey
   1095 					</code></td>
   1096 				<td><code>
   1097 						= alpha digit ;
   1098 					</code></td>
   1099 				<td><code>= ALPHA DIGIT</code></td>
   1100 			</tr>
   1101 			<tr>
   1102 				<td><code>
   1103 						tvalue
   1104 					</code></td>
   1105 				<td><code>= (sep alphanum{3,8})+ ;</code></td>
   1106 				<td><code>= 1*(sep 3*8alphanum)</code></td>
   1107 			</tr>
   1108 		</table>
   1109 
   1110 		<p>
   1111 			For historical reasons, this is called a Unicode locale identifier.
   1112 			However, it really functions (with few exceptions) as a <span
   1113 				class="st">language</span> identifier, and accesses <span class="st">language</span>-based
   1114 			data. Except where it would be unclear, this document uses the term
   1115 			&quot;locale&quot; data loosely to encompass both types of data: for
   1116 			more information, see <i><a href="#Language_and_Locale_IDs">Section
   1117 					3.10 Language and Locale IDs</a></i>.
   1118 		</p>
   1119 		<p></p>
   1120 		<p>As of the release of this specification, there were no other_extensions defined. The other_extensions are present in the syntax to allow implementations to preserve that information. There cannot be more than one extension with the same singleton (-u-, -t-, ...). The private use extension must come after all other extensions. 
   1121 		</p>
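		<p>The two constraints in the preceding paragraph (no repeated singleton,
			and private use last) can be checked while splitting an identifier into
			its parts. The sketch below is one possible approach in Python; the
			function name <code>split_extensions</code> is hypothetical, and the
			input is assumed to be syntactically well-formed already.</p>

```python
def split_extensions(tag):
    """Split a locale identifier into its language part and its extensions.

    Returns (language_part, extensions), where extensions maps each singleton
    to the list of subtags that follow it. Raises ValueError if a singleton
    occurs twice. Everything after "x" is treated as private use, so "x" is
    necessarily the last extension.
    """
    subtags = tag.replace("_", "-").lower().split("-")
    language_part = []
    extensions = {}
    current = None  # singleton of the extension being collected, if any
    for subtag in subtags:
        if current == "x":
            extensions["x"].append(subtag)      # private use swallows the rest
        elif len(subtag) == 1:
            if subtag in extensions:
                raise ValueError("duplicate singleton: " + subtag)
            current = subtag
            extensions[current] = []
        elif current is None:
            language_part.append(subtag)        # still inside unicode_language_id
        else:
            extensions[current].append(subtag)
    return "-".join(language_part), extensions


print(split_extensions("en-US-u-co-phonebk-x-foo"))
```

		<p>The sketch lowercases the whole identifier for comparison purposes,
			which is safe because identifier field values are case-insensitive.</p>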
   1122 		<p>As for terminology, the term <i>code</i> may also be used instead of
   1123 			&quot;subtag&quot;, and &quot;territory&quot; instead of
   1124 			&quot;region&quot;. The primary language subtag is also called the <i>base
   1125 				language code</i>. For example, the base language code for
   1126 			&quot;en-US&quot; (American English) is &quot;en&quot; (English). The
   1127 			<i>type</i> may also be referred to as a <i>value</i> or <i>key-value</i>.
   1128 		</p>
   1129 		<p>
   1130 			The identifiers can vary in case and in the separator characters. The
   1131 			&quot;-&quot; and &quot;_&quot; separators are treated as equivalent, although &quot;-&quot; is preferred.</p>
   1132 	  <p>All identifier field values are case-insensitive. Although case
   1133 			distinctions do not carry any special meaning, an implementation of
   1134 			LDML should use the casing recommendations in [<a href="#BCP47">BCP47</a>],
   1135 			especially when a Unicode locale identifier is used for locale data
   1136 		exchange in software protocols.</p>
   1137 	  <p>The canonical form of a <code><a href="#unicode_locale_id">unicode_locale_id</a></code> has:</p>
   1138 	  <ul>
   1139 	    <li>a language subtag (identifiers that begin with a script subtag are for specialized use only)</li>
   1140 	    <li>any script subtag in title case (e.g., Hant)</li>
   1141 	    <li>any region subtag in uppercase (e.g., DE)</li>
   1142         <li>all other subtags in lowercase (e.g., en)</li>
   1143 	    <li>any variants in alphabetical order (e.g., en-fonipa-scouse, not en-scouse-fonipa)</li>
   1144 	    <li>any extensions in alphabetical order by their singleton (e.g., en-t-xxx-u-yyy, not en-u-yyy-t-xxx)</li>
   1145       </ul>
   1146 		<p>
   1147 		<b>Note:</b>		    The current version of CLDR data uses some non-preferred forms for backward compatibility. This might be changed in future CLDR releases.</p>
   1148 		<ul>
   1149 		  <li>It uses uppercase letters for
   1150 		    variant subtags, while the preferred forms are all lowercase.</li>
   1151 		  <li>It uses &quot;_&quot; as the separator, while the preferred form of the separator is  "-".</li>
   1152 		  <li>It uses &quot;root&quot;, while the preferred form is &quot;und&quot;.</li>
   1153       </ul>
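		<p>The casing and ordering rules for the canonical form listed above can be
			sketched as follows. This is an illustrative Python function, not a
			normative algorithm: the name <code>canonical_form</code> is
			hypothetical, the input is assumed to be well-formed ASCII, and only the
			rules listed in this section are applied (no alias replacement).</p>

```python
def canonical_form(tag):
    """Apply the casing and ordering rules of the canonical form."""
    subtags = tag.replace("_", "-").lower().split("-")
    # The language-identifier part ends at the first singleton, if any.
    end = next((i for i, s in enumerate(subtags) if len(s) == 1), len(subtags))
    head = subtags[:end]

    out = []
    i = 0
    first_is_script = len(head[0]) == 4 and head[0].isalpha() and head[0] != "root"
    if not first_is_script:
        out.append(head[0])                      # language subtag: lowercase
        i = 1
    if i < len(head) and len(head[i]) == 4 and head[i].isalpha():
        out.append(head[i].title())              # script subtag: title case
        i += 1
    if i < len(head) and (
        (len(head[i]) == 2 and head[i].isalpha())
        or (len(head[i]) == 3 and head[i].isdigit())
    ):
        out.append(head[i].upper())              # region subtag: uppercase
        i += 1
    out.extend(sorted(head[i:]))                 # variants: alphabetical order

    # Group the remaining subtags by singleton; "x" swallows everything after it.
    groups = []
    for s in subtags[end:]:
        if groups and groups[-1][0] == "x":
            groups[-1][1].append(s)
        elif len(s) == 1:
            groups.append((s, []))
        else:
            groups[-1][1].append(s)
    groups.sort(key=lambda g: (g[0] == "x", g[0]))  # by singleton, private use last
    for singleton, values in groups:
        out.append(singleton)
        out.extend(values)
    return "-".join(out)


print(canonical_form("EN_scouse_fonipa"))   # en-fonipa-scouse
print(canonical_form("en-u-yyy-t-xxx"))     # en-t-xxx-u-yyy
```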
   1154 		<h3>
   1155 			<a name="BCP_47_Conformance" href="#BCP_47_Conformance">3.3 BCP
   1156 				47 Conformance</a>
   1157 		</h3>
   1158 		<p>
   1159 			Unicode language and locale identifiers inherit the design and the
   1160 			repertoire of subtags from [<a href="#BCP47">BCP47</a>] Language
   1161 			Tags. There are some extensions and restrictions made for the use of
   1162 			the Unicode locale identifier in CLDR:
   1163 		</p>
   1164 		<ul>
   1165 			<li>It does not allow for the full syntax of [<a href="#BCP47">BCP47</a>]:
   1166               <ul>
   1167 		  <li>No extlang subtags are allowed (as in the BCP 47 canonical form, see BCP 47 <a href="https://tools.ietf.org/html/bcp47#section-4.5">Section 4.5</a> and <a href="https://tools.ietf.org/html/bcp47#section-3.1.7" target="_blank" >Section 3.1.7</a>)</li>
   1168 		  <li>No irregular  BCP 47 grandfathered tags are allowed (these are all deprecated in BCP 47)</li>
   1169 		  <li>A tag must not start with the subtag &quot;x&quot;: thus a <em>privateuse</em> (eg x-abc) can only be after a language subtag, like &quot;und&quot;</li>
   1170 		</ul>
   1171 			</li>
   1172 			<li>It allows for certain semantic additions and constraints:
   1173 				<ul>
   1174 					<li>Certain codes that are private-use in BCP 47 and ISO are given semantics by LDML</li>
   1175 					<li>Each macrolanguage has an identified  primary encompassed language, which  is treated as an alias for the macrolanguage, and thus is replaced when canonicalizing (as allowed by BCP 47, see <a href="https://tools.ietf.org/html/bcp47#section-4.1.2">Section 4.1.2</a>)</li>
   1176 				</ul>
   1177 			</li>
   1178 					<li>It allows certain syntax for backwards compatibility  (not BCP 47-compatible):
   1179                       <ul>
   1180                         <li>The "_" character for field separator characters, as well as the "-" used in [<a href="#BCP47">BCP47</a>]
   1181                           (however, the canonical form is with &quot;-&quot;)</li>
   1182                         <li>The subtag "root" to indicate the generic locale used as the parent
   1183                           of all languages in the CLDR data model				      (&quot;und&quot; can be used instead)</li>
   1184                         <li>The language tag may begin with a script subtag rather than a language subtag. This is specialized use only, and  not required for CLDR conformance.</li>
   1185                       </ul>
   1186 		  </li>
   1187 		</ul>
   1188 		<p>There are thus two subtypes of Unicode locale identifiers:</p>
   1189 		<ul>
   1190 		  <li>the term <em>Unicode CLDR locale identifier</em> applies where the backwards compatibility syntax is used.</li>
   1191 		  <li>the term <em>Unicode BCP 47 locale identifier</em> applies otherwise. A <em>Unicode BCP 47 locale identifier</em> is  also a valid BCP 47 language tag.</li>
   1192       </ul>
   1193 		<h4>
   1194 			<a name="BCP_47_Language_Tag_Conversion"
   1195 				href="#BCP_47_Language_Tag_Conversion">3.3.1 BCP 47 Language Tag
   1196 				Conversion</a>
   1197 		</h4>
   1198 		<p>The different identifiers can be converted to one another as described in this section.</p>
   1199 
   1200 		<h5>
   1201 			<a name="Language_Tag_to_Locale_Identifier"
   1202 				href="#Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to Unicode BCP 47 Locale Identifier</a>
   1203 		</h5>
   1204 		<p>A valid [<a href="#BCP47">BCP47</a>] language tag can be converted
   1205 			to a valid Unicode BCP 47 locale identifier by performing the
   1206 		following transformation. </p>
   1207 		<ol>
   1208 			<li>Canonicalize the language tag (afterwards, there will be no
   1209 		  extlang subtags).</li>
   1210 			<li>If the BCP 47 primary language subtag matches the <i>type</i>
   1211 				attribute of a <i>languageAlias</i> element in <a
   1212 				href="tr35-info.html#Supplemental_Data">Supplemental Data</a>,
   1213 				replace the language subtag with the <i>replacement</i> value.
   1214 				<ol>
   1215 					<li>If there are additional subtags in the <i>replacement</i>
   1216 						value, add them to the result, but only if there is no
   1217 						corresponding subtag already in the tag.
   1218 					</li>
   1219 				</ol>
   1220 			</li>
   1221 			<li>If the BCP 47 region subtag matches the <i>type</i>
   1222 				attribute of a <i>territoryAlias</i> element in <a
   1223 				href="tr35-info.html#Supplemental_Data">Supplemental Data</a>,
   1224 				replace the region subtag with the <i>replacement</i> value, as
   1225 				follows:
   1226 				<ol>
   1227 					<li>If there is a single territory in the replacement, use it.</li>
   1228 					<li>If there are multiple territories:
   1229 						<ol>
   1230 							<li>Look up the most likely territory for the base language
   1231 								code (and script, if there is one).</li>
   1232 							<li>If that likely territory is in the list, use it.</li>
   1233 							<li>Otherwise, use the first territory in the list.</li>
   1234 						</ol>
   1235 					</li>
   1236 				</ol>
   1237 			</li>
   1238 		  <li>If the tag is one of the five deprecated grandfathered tags (cel-gaulish, i-default, i-enochian, i-mingo, zh-min) remaining after step #1, prefix it with &quot;und-x-&quot;.</li>
   1239 		  <li>If the first subtag is &quot;x&quot;, prefix the tag with &quot;und-&quot;.</li>
   1240 	  </ol>
   1241 		<p>The result is  a Unicode BCP 47 locale identifier,  in canonical form. It is both a BCP 47 language tag and a Unicode locale identifier.	Because the process maps from all BCP 47 language tags into a subset of BCP 47 language tags, the format changes are not reversible, much as a lowercase transformation of the string McGowan is not reversible.</p>
   1242 		<br>
   1243 		<p><em>Examples</em></p>
   1244 		<table>
   1245 			<tr>
   1246 			  <th style='width:10em'>BCP 47 language tag</th>
   1247 			  <th style='width:10em'>Unicode BCP 47 locale identifier</th>
   1248 				<th>Comments</th>
   1249 			</tr>
   1250 			<tr>
   1251 				<td><code>en-US</code></td>
   1252 				<td><code>en-US</code></td>
   1253 				<td>no changes</td>
   1254 			</tr>
   1255 			<tr>
   1256 			  <td><code>iw-FX</code></td>
   1257 			  <td><code>he-FR</code></td>
   1258 			  <td>BCP 47 canonicalization [1]</td>
   1259 		  </tr>
   1260 			<tr>
   1261 			  <td><code>cmn-TW</code></td>
   1262 			  <td><code>zh-TW</code></td>
   1263 			  <td>language alias [2]</td>
   1264 		  </tr>
   1265 			<tr>
   1266 				<td><code>zh-cmn-TW</code></td>
   1267 				<td><code>zh-TW</code></td>
   1268 				<td>BCP 47 canonicalization [1], then language alias [2]</td>
   1269 			</tr>
   1270 			<tr>
   1271 				<td><code>sr-CS</code></td>
   1272 				<td><code>sr-RS</code></td>
   1273 				<td>territory alias [3]</td>
   1274 			</tr>
   1275 			<tr>
   1276 				<td><code>sh</code></td>
   1277 				<td><code>sr-Latn</code></td>
   1278 				<td>multiple replacement subtags [2.1]</td>
   1279 			</tr>
   1280 			<tr>
   1281 				<td><code>sh-Cyrl</code></td>
   1282 				<td><code>sr-Cyrl</code></td>
   1283 				<td>the existing script subtag is kept, so the replacement's script subtag is not added [2.1 does not apply]</td>
   1284 			</tr>
   1285 			<tr>
   1286 				<td><code>hy-SU</code></td>
   1287 				<td><code>hy-AM</code></td>
   1288 				<td>multiple territory values [3.2]<br> <code>&lt;territoryAlias
   1289 						type=&quot;SU&quot; replacement=&quot;RU AM AZ BY EE GE KZ KG LV
   1290 						LT MD TJ TM UA UZ&quot; /&gt;</code></td>
   1291 			</tr>
   1292           <tr>
   1293 	          <td><code>i-enochian</code></td>
   1294 	          <td><code>und-x-i-enochian</code></td>
   1295 	          <td>prefix any grandfathered tags with &quot;und-x-&quot; [4]</td>
   1296           </tr>
   1297 	      <tr>
   1298 			  <td><code>x-abc</code></td>
   1299 			  <td><code>und-x-abc</code></td>
   1300 			  <td>prefix with &quot;und-&quot;, so that there is always a base language subtag [5]</td>
   1301 		  </tr>
   1302 	  </table>
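		<p>Steps 2 through 5 of the conversion can be sketched in Python as shown
			below. This is an illustration under several assumptions, not a complete
			implementation: step 1 (BCP 47 canonicalization) is assumed to have been
			done already; the alias tables hold only the sample entries that appear
			in the examples table; and <code>LIKELY_TERRITORY</code> is a
			hypothetical stand-in for a likely-subtags lookup, keyed by language
			only. A real implementation would load the alias and likely-subtags data
			from the CLDR supplemental data.</p>

```python
# Sample data only; real values come from CLDR supplemental data.
LANGUAGE_ALIAS = {"cmn": "zh", "sh": "sr-Latn"}
TERRITORY_ALIAS = {
    "CS": "RS",
    "SU": "RU AM AZ BY EE GE KZ KG LV LT MD TJ TM UA UZ",
}
LIKELY_TERRITORY = {"hy": "AM"}  # hypothetical stand-in for likely subtags
GRANDFATHERED = {"cel-gaulish", "i-default", "i-enochian", "i-mingo", "zh-min"}


def is_script(subtag):
    return len(subtag) == 4 and subtag.isalpha()


def is_region(subtag):
    return (len(subtag) == 2 and subtag.isalpha()) or (
        len(subtag) == 3 and subtag.isdigit()
    )


def to_unicode_bcp47(tag):
    """Apply steps 2-5 to a tag that is already BCP 47 canonicalized."""
    # Steps 4 and 5 are handled first because such tags have no subtags
    # that the alias steps could change.
    if tag.lower() in GRANDFATHERED:
        return "und-x-" + tag.lower()
    subtags = tag.split("-")
    if subtags[0].lower() == "x":
        return "und-" + tag

    language, rest = subtags[0].lower(), subtags[1:]
    script = rest.pop(0) if rest and is_script(rest[0]) else None
    region = rest.pop(0) if rest and is_region(rest[0]) else None

    # Step 2: language alias; extra replacement subtags fill only empty slots.
    if language in LANGUAGE_ALIAS:
        replacement = LANGUAGE_ALIAS[language].split("-")
        language = replacement[0]
        for extra in replacement[1:]:
            if is_script(extra) and script is None:
                script = extra
            elif is_region(extra) and region is None:
                region = extra

    # Step 3: territory alias, preferring the likely territory if it is listed.
    if region is not None and region.upper() in TERRITORY_ALIAS:
        candidates = TERRITORY_ALIAS[region.upper()].split()
        likely = LIKELY_TERRITORY.get(language)
        region = likely if likely in candidates else candidates[0]

    return "-".join(p for p in [language, script, region] + rest if p)


for example in ["cmn-TW", "sr-CS", "sh", "sh-Cyrl", "hy-SU", "i-enochian", "x-abc"]:
    print(example, "->", to_unicode_bcp47(example))
```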
   1303 	  <p>&nbsp;</p>
   1304 	  		<h5>
   1305 			<a name="Unicode_Locale_Identifier_CLDR_to_BCP_47"
   1306 				href="#Unicode_Locale_Identifier_CLDR_to_BCP_47">Unicode Locale Identifier: CLDR to BCP 47</a>
   1307 		</h5>
   1308 
   1309 	  <p>A Unicode CLDR locale identifier can be converted to a valid [<a
   1310 				href="#BCP47">BCP47</a>] language tag (which is also a Unicode BCP 47 locale identifier) by performing the following
   1311 	  transformation. </p>
   1312       <ol>
   1313         <li>Replace the "_" separators with "-"</li>
   1314         <li>Replace the special language identifier "root"  with the BCP
   1315           47 primary language tag "und"</li>
   1316         <li>Add an initial &quot;und&quot; primary language subtag if the first subtag is a script.</li>
   1317       </ol>
   1318       <p><em>Examples:</em></p>
   1319       <table>
   1320         <tr>
   1321           <th style='width:10em'>Unicode CLDR locale identifier</th>
   1322           <th style='width:10em'>BCP 47 language tag</th>
   1323           <th>Comments</th>
   1324         </tr>
   1325         <tr>
   1326           <td><code>en_US</code></td>
   1327           <td><code>en-US</code></td>
   1328           <td>change separator [1]</td>
   1329         </tr>
   1330         <tr>
   1331           <td><code>de_DE_u_co_phonebk</code></td>
   1332           <td><code>de-DE-u-co-phonebk</code></td>
   1333           <td>change separator [1]</td>
   1334         </tr>
   1335         <tr>
   1336           <td><code>root</code></td>
   1337           <td><code>und</code></td>
   1338           <td>change to &quot;und&quot; [2]</td>
   1339         </tr>
   1340         <tr>
   1341           <td><code>root_u_cu_usd</code></td>
   1342           <td><code>und-u-cu-usd</code></td>
   1343           <td>change separator and change to &quot;und&quot; [1, 2]</td>
   1344         </tr>
   1345         <tr>
   1346           <td><code>Latn_DE</code></td>
   1347           <td><code>und-Latn-DE</code></td>
   1348           <td>change separator and add &quot;und&quot; [1, 3]</td>
   1349         </tr>
   1350       </table><br>
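      <p>The three steps above translate almost line for line into code. The
        following Python sketch uses a hypothetical function name,
        <code>cldr_to_bcp47</code>, and assumes a well-formed input. The
        &quot;root&quot; test has to come before the script test, because
        &quot;root&quot; is itself four letters long.</p>

```python
def cldr_to_bcp47(locale_id):
    """Convert a Unicode CLDR locale identifier to a BCP 47 language tag."""
    subtags = locale_id.replace("_", "-").split("-")         # step 1
    if subtags[0].lower() == "root":                         # step 2
        subtags[0] = "und"
    elif len(subtags[0]) == 4 and subtags[0].isalpha():      # step 3
        subtags.insert(0, "und")
    return "-".join(subtags)


for example in ["en_US", "de_DE_u_co_phonebk", "root", "root_u_cu_usd", "Latn_DE"]:
    print(example, "->", cldr_to_bcp47(example))
```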
   1351       <p></p>
   1352       	  		<h5>
   1353 			<a name="Unicode_Locale_Identifier_BCP_47_to_CLDR"
   1354 				href="#Unicode_Locale_Identifier_BCP_47_to_CLDR">Unicode Locale Identifier: BCP 47 to  CLDR</a>
   1355 		</h5>
   1356 
   1357 	  <p>A Unicode BCP 47 locale identifier can be transformed into a Unicode CLDR locale identifier by performing the following transformation.</p>
   1358         <ol>
   1359           <li>the separator is changed to &quot;_&quot;</li>
   1360           <li>the primary language subtag "und" is replaced with "root"
   1361             if no script, region, or variant subtags are present.</li>
   1362         </ol>
   1363 	  <p><em>Examples:</em></p>
   1364 		<table>
   1365 		  <tr>
   1366 		    <th style='width:10em'>BCP 47 language tag</th>
   1367 		    <th style='width:10em'>Unicode CLDR locale identifier</th>
   1368 		    <th>Comments</th>
   1369 	      </tr>
   1370 		  <tr>
   1371 		    <td><code>en-US</code></td>
   1372 		    <td><code>en_US</code></td>
   1373 		    <td>changes separator [1]</td>
   1374 	      </tr>
   1375 		  <tr>
   1376 		    <td><code>und</code></td>
   1377 		    <td><code>root</code></td>
   1378 		    <td>changes to &quot;root&quot;, because no script, region, or variant subtag is
   1379 	        present [2]</td>
   1380 	      </tr>
   1381 		  <tr>
   1382 		    <td><code>und-US</code></td>
   1383 			  <td><code>und_US</code></td>
   1384 		    <td>no change to &quot;und&quot;, because a region subtag is present [1]</td>
   1385 	      </tr>
   1386 		  <tr>
   1387 		    <td nowrap><code>und-u-cu-USD</code></td>
   1388 		    <td nowrap><code>root_u_cu_usd</code></td>
   1389 		    <td>changes to &quot;root&quot;, because no script, region, or variant subtag is
   1390 		      present [1, 2]</td>
   1391 	      </tr>
   1392 		</table>
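		<p>The two steps above can be sketched in Python as follows. This is a
			minimal illustration, not a reference implementation: the function name
			<code>bcp47_to_cldr</code> is hypothetical, the input is assumed to be a
			well-formed BCP 47 tag, and no case normalization or alias resolution is
			performed. Any subtag between the language subtag and the first singleton
			is treated as a script, region, or variant subtag.</p>

```python
def bcp47_to_cldr(tag):
    """Convert a well-formed Unicode BCP 47 locale identifier to CLDR form.

    Hypothetical helper: applies only the two steps listed above.
    """
    subtags = tag.split("-")

    # Collect the subtags that follow the language subtag and precede the
    # first singleton (a one-character subtag such as "u", "t", or "x").
    # In a well-formed tag these are script, region, or variant subtags.
    core = []
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            break
        core.append(subtag)

    # Step 2: "und" becomes "root" only when no script, region, or
    # variant subtag is present.
    if subtags[0].lower() == "und" and not core:
        subtags[0] = "root"

    # Step 1: the separator is changed to "_".
    return "_".join(subtags)


print(bcp47_to_cldr("und-US"))
```

		<p>With this sketch, <code>und-US</code> keeps its language subtag because a
			region subtag is present, while <code>und</code> alone becomes
			<code>root</code>.</p>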
   1393 		<h3>
   1394 			<a name="Field_Definitions" href="#Field_Definitions">3.4
   1395 				Language Identifier Field Definitions </a>
   1396 		</h3>
   1397 		<p>
   1398 			Unicode language and locale identifier field values are provided in
   1399 			the following table. Note that some private-use BCP 47 field values
   1400 			are given specific meanings in CLDR. While field values are based on
   1401 			[<a href="#BCP47">BCP47</a>] subtag values, their validity status in
   1402 			CLDR is specified by means of machine-readable files in the <a
   1403 				href='http://unicode.org/repos/cldr/tags/latest/common/validity/'>common/validity/</a>
   1404 			subdirectory, such as language.xml. For the format of those files and
   1405 			more information, see <em><a href='#Validity_Data'>Section
   1406 					3.11 Validity Data</a></em>.
   1407 		</p>
   1408 		<table>
   1409 			<caption>
   1410 				<a name="Language_Locale_Field_Definitions"
   1411 					href="#Language_Locale_Field_Definitions">Language Identifier
   1412 					Field Definitions </a>
   1413 			</caption>
   1414 			<tr>
   1415 				<th>Field</th>
   1416 				<th>Valid values</th>
   1417 			</tr>
   1418 			<tr>
   1419 				<td><a href="#unicode_language_subtag_validity"
   1420 					name="unicode_language_subtag_validity">unicode_language_subtag</a>
   1421 					<p>
   1422 						(also known as a <i>Unicode base language code)</i>
   1423 					</p></td>
   1424 				<td>Subtags in the language.xml file (see <em>Section 3.11
   1425 						<a href="#Validity_Data">Validity Data</a>
   1426 				</em>). These are based on [<a href="#BCP47">BCP47</a>] subtag values
   1427 					marked as <b>Type: language</b>
   1428 					<p>ISO 639-3 introduces the notion of
   1429 						&quot;macrolanguages&quot;, where certain ISO 639-1 or ISO 639-2
   1430 						codes are given broad semantics, and additional codes are given
   1431 						for the narrower semantics. For backwards compatibility, Unicode
   1432 						language identifiers retain use of the narrower semantics for
   1433 						these codes. For example:</p>
   1434 					<table border="1" cellspacing="0" cellpadding="2"
   1435 						style="margin: 0.5em">
   1436 						<tr>
   1437 							<th>For</th>
   1438 							<th>Use</th>
   1439 							<th><i>Not</i></th>
   1440 						</tr>
   1441 						<tr>
   1442 							<td>Standard Chinese (Mandarin)</td>
   1443 							<td><code>zh</code></td>
   1444 							<td><code>cmn</code></td>
   1445 						</tr>
   1446 						<tr>
   1447 							<td>Standard Arabic</td>
   1448 							<td><code>ar</code></td>
   1449 							<td><code>arb</code></td>
   1450 						</tr>
   1451 						<tr>
   1452 							<td>Standard Malay</td>
   1453 							<td><code>ms</code></td>
   1454 							<td><code>zsm</code></td>
   1455 						</tr>
   1456 						<tr>
   1457 							<td>Standard Swahili</td>
   1458 							<td><code>sw</code></td>
   1459 							<td><code>swh</code></td>
   1460 						</tr>
   1461 						<tr>
   1462 							<td>Standard Uzbek</td>
   1463 							<td><code>uz</code></td>
   1464 							<td><code>uzn</code></td>
   1465 						</tr>
   1466 						<tr>
   1467 							<td>Standard Konkani</td>
   1468 							<td><code>kok</code></td>
   1469 							<td><code>knn</code></td>
   1470 						</tr>
   1471 						<tr>
   1472 							<td>Northern Kurdish</td>
   1473 							<td><code>ku</code></td>
   1474 							<td><code>kmr</code></td>
   1475 						</tr>
   1476 					</table>
   1477 					<p>
   1478 						If a language subtag matches the type attribute of a languageAlias
   1479 						element, then the replacement value is used instead. For example,
   1480 						because "swh" occurs in
   1481 						<tt>&lt;languageAlias type="swh" replacement="sw"/&gt;</tt>
   1482 						, "sw" must be used instead of "swh". Thus Unicode language
   1483 						identifiers use &quot;ar-EG&quot; for Standard Arabic (Egypt), not
   1484 						&quot;arb-EG&quot;; they use &quot;zh-TW&quot; for Mandarin
   1485 						Chinese (Taiwan), not &quot;cmn-TW&quot;.
   1486 					</p>
   1487 					<p>
   1488 						The private use codes listed as <strong>excluded</strong>
   1489 						in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em>
   1490 						will never be given specific semantics in Unicode identifiers, and
   1491 					are thus safe for use for other purposes by other applications. </p>
   1492 					<p>The CLDR provides data for normalizing language/locale
   1493 						codes, including mapping overlong codes like &quot;eng-840&quot;
   1494 						or &quot;eng-USA&quot; to the correct code &quot;en-US&quot;;
   1495 						see the
   1496 						<strong><a href="https://www.unicode.org/cldr/charts/latest/supplemental/aliases.html">Aliases</a></strong>
   1497 						Chart.</p>
   1498 					<p>The following are special language subtags:</p>
   1499                     <table class="simple" border="1" cellspacing="0" cellpadding="2">
   1500                       <tr>
   1501                         <td>&nbsp;</td>
   1502                         <td><strong>Name</strong></td>
   1503                         <td><strong>Comment</strong></td>
   1504                       </tr>
   1505                       <tr>
   1506                         <td><code>mis</code></td>
   1507                         <td>Uncoded languages</td>
   1508                         <td>The content is in a language that doesn't yet have an ISO 639 code.</td>
   1509                       </tr>
   1510                       <tr>
   1511                         <td><code>mul</code></td>
   1512                         <td>Multiple languages</td>
   1513                         <td>The content contains  more than one language or text that is simultaneously in multiple languages (such as brand names).</td>
   1514                       </tr>
   1515                       <tr>
   1516                         <td><code>zxx</code></td>
   1517                         <td>No linguistic content</td>
   1518                         <td>The content is not in any particular language (such as images, symbols, etc.)</td>
   1519                       </tr>
   1520                     </table></td>
   1521 			</tr>
   1522 			<tr>
   1523 				<td><a href="#unicode_script_subtag_validity"
   1524 					name="unicode_script_subtag_validity">unicode_script_subtag</a>
   1525 					<p>
   1526 						(also known as a <i>Unicode script code)</i>
   1527 					</p></td>
   1528 				<td>Subtags in the script.xml file (see <em>Section 3.11 <a
   1529 						href="#Validity_Data">Validity Data</a></em>). These are based on [<a
   1530 					href="#BCP47">BCP47</a>] subtag values marked as <b>Type:
   1531 						script</b>
   1532 					<p>In most cases the script is not necessary, since the
   1533 						language is customarily written in only a single script. Examples
   1534 						of cases where it is used are:</p>
   1535 					<table border="1" cellspacing="0" cellpadding="2"
   1536 						style="margin: 0.5em">
   1537 						<tr>
   1538 							<td><code>az_Arab</code></td>
   1539 							<td>Azerbaijani in Arabic script</td>
   1540 						</tr>
   1541 						<tr>
   1542 							<td><code>az_Cyrl</code></td>
   1543 							<td>Azerbaijani in Cyrillic script</td>
   1544 						</tr>
   1545 						<tr>
   1546 							<td><code>az_Latn</code></td>
   1547 							<td>Azerbaijani in Latin script</td>
   1548 						</tr>
   1549 						<tr>
   1550 							<td><code>zh_Hans</code></td>
   1551 							<td>Chinese, in simplified script (=zh, zh-Hans, zh-CN,
   1552 								zh-Hans-CN)</td>
   1553 						</tr>
   1554 						<tr>
   1555 							<td><code>zh_Hant</code></td>
   1556 							<td>Chinese, in traditional script</td>
   1557 						</tr>
   1558 					</table>
   1559 					<p>
   1560 						Unicode identifiers give specific semantics to certain Unicode Script values. For more information, see also [<a
   1561 							href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]:
   1562 					</p>
   1563 					<table cellspacing="0" cellpadding="2" border="1"
   1564 						style="margin: 0.5em">
   1565 						<tr>
   1566 						  <td><code>Qaag</code></td>
   1567 						  <td>Zawgyi</td>
   1568 						  <td colspan="2">Qaag is a special script code for identifying the non-standard use of Myanmar characters for display with the Zawgyi font. The purpose of the code is to enable migration to standard, interoperable use of Unicode by providing an identifier for Zawgyi for tagging text, applications, input methods, font tables, transformations, and other mechanisms used for migration.</td>
   1569 					  </tr>
   1570 						<tr>
   1571 							<td><code>Qaai</code></td>
   1572 							<td>Inherited</td>
   1573 							<td colspan="2"><strong>deprecated</strong>: the <em>canonicalized</em>
   1574 								form is Zinh</td>
   1575 						</tr>
   1576 					  <tr>
   1577 							<td><code>Zinh</code></td>
   1578 							<td>Inherited</td>
   1579 							<td colspan="2">&nbsp;</td>
   1580 						</tr>
   1581 						<tr>
   1582 							<td><code>Zsye</code></td>
   1583 							<td>Emoji Style</td>
   1584 							<td colspan="2">Prefer emoji style for characters that have both text
   1585 								and emoji styles available.</td>
   1586 						</tr>
   1587 						<tr>
   1588 							<td><code>Zsym</code></td>
   1589 							<td>Text Style</td>
   1590 							<td colspan="2">Prefer text style for characters that have both text and
   1591 								emoji styles available.</td>
   1592 						</tr>
   1593 						<tr>
   1594 						  <td rowspan="7"><code>Zxxx</code></td>
   1595 						  <td rowspan="7">Unwritten</td>
   1596 						  <td colspan="2">Indicates spoken or otherwise unwritten content. For example:</td>
   1597 					  </tr>
   1598 						<tr>
   1599 						  <th>Sample(s)</th>
   1600 						  <th>Description</th>
   1601 				      </tr>
   1602 						<tr>
   1603 						  <td>uz</td>
   1604 						  <td>either written or spoken content</td>
   1605 				      </tr>
   1606 						<tr>
   1607 						  <td>uz-Latn <em>or</em> uz-Arab</td>
   1608 						  <td>written-only content (particular script)</td>
   1609 				      </tr>
   1610 						<tr>
   1611 						  <td>uz-Zyyy</td>
   1612 						  <td>written-only content (unspecified script)</td>
   1613 				      </tr>
   1614 						<tr>
   1615 						  <td>uz-Zxxx</td>
   1616 						  <td>spoken-only content</td>
   1617 				      </tr>
   1618 						<tr>
   1619 						  <td>uz-Latn, uz-Zxxx</td>
   1620 						  <td>both specific written and spoken content (using a <em>language list</em>)</td>
   1621 				      </tr>
   1622 						<tr>
   1623 							<td><code>Zyyy</code></td>
   1624 							<td>Common</td>
   1625 							<td colspan="2">&nbsp;</td>
   1626 						</tr>
   1627 						<tr>
   1628 							<td><code>Zzzz</code></td>
   1629 							<td>Unknown</td>
   1630 							<td colspan="2">&nbsp;</td>
   1631 						</tr>
   1632 					</table>
   1633 					<p>The private use subtags listed as <strong>excluded</strong> in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em> will never be given
   1634 						specific semantics in Unicode identifiers, and are thus safe for
   1635 			  use for other purposes by other applications.</p></td>
   1636 			</tr>
   1637 			<tr>
   1638 				<td><a href="#unicode_region_subtag_validity"
   1639 					name="unicode_region_subtag_validity">unicode_region_subtag</a>
   1640 					<p>
   1641 						(also known as a <i>Unicode region code, </i>or<i> a Unicode
   1642 							territory code)</i>
   1643 					</p></td>
   1644 				<td>Subtags in the region.xml file (see<em> Section 3.11 <a
   1645 						href="#Validity_Data">Validity Data</a></em>). These are based on [<a
   1646 					href="#BCP47">BCP47</a>] subtag values marked as <b>Type:
   1647 						region</b>
   1648 					<p>Unicode identifiers give specific semantics to the following
   1649 						subtags:</p>
   1650 					<table border="1" cellspacing="0" cellpadding="2">
   1651 						<tr>
   1652 							<td>&nbsp;</td>
   1653 							<td><strong>Name</strong></td>
   1654 							<td><strong>Comment</strong></td>
   1655 							<td><strong> ISO 3166-1 status</strong></td>
   1656 						</tr>
   1657 						<tr>
   1658 							<td><code>QO</code></td>
   1659 							<td>Outlying Oceania</td>
   1660 							<td>countries in Oceania [009] that do not have a <a
   1661 								href="http://www.unicode.org/cldr/charts/latest/supplemental/territory_containment_un_m_49.html">subcontinent</a>.
   1662 							</td>
   1663 							<td>private use</td>
   1664 						</tr>
   1665 						<tr>
   1666 							<td><code>QU</code></td>
   1667 							<td>European Union</td>
   1668 							<td><strong>deprecated</strong>: the <em>canonicalized</em>
   1669 								form is EU</td>
   1670 							<td>private use</td>
   1671 						</tr>
   1672 						<tr>
   1673 							<td><code>UK</code></td>
   1674 							<td>United Kingdom</td>
   1675 							<td><strong>deprecated</strong>: the <em>canonicalized</em>
   1676 								form is GB</td>
   1677 							<td>exceptionally reserved</td>
   1678 						</tr>
   1679 						<tr>
   1680 						  <td><code>XA</code></td>
   1681 						  <td>Pseudo-Accents</td>
   1682 						  <td>special code indicating derived testing locale with English + added accents and lengthened strings</td>
   1683 						  <td>private use</td>
   1684 					  </tr>
   1685 						<tr>
   1686 						  <td><code>XB</code></td>
   1687 						  <td>Pseudo-Bidi</td>
   1688 						  <td>special code indicating derived testing locale with forced RTL English</td>
   1689 						  <td>private use</td>
   1690 					  </tr>
   1691 						<tr>
   1692 							<td><code>XK</code></td>
   1693 							<td>Kosovo</td>
   1694 							<td>industry practice</td>
   1695 							<td>private use</td>
   1696 						</tr>
   1697 						<tr>
   1698 							<td><code>ZZ</code></td>
   1699 							<td>Unknown or Invalid Territory</td>
   1700 							<td>used in APIs or as replacement for invalid code</td>
   1701 							<td>private use</td>
   1702 						</tr>
   1703 					</table>
   1704 					<p>The private use subtags listed as <strong>excluded</strong> in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em> will normally never be
   1705 						given specific semantics in Unicode identifiers, and are thus safe
   1706 						for use for other purposes by other applications. However, LDML
   1707 						may follow widespread industry practice in the use of some of
   1708 						these codes, such as for XK.</p>
   1709 					<p>The CLDR provides data for normalizing territory/region
   1710 						codes, including mapping overlong codes like &quot;eng-840&quot;
   1711 						or &quot;eng-USA&quot; to the correct code &quot;en-US&quot;.</p>
   1712 					<p>Special Codes:</p>
   1713 					<ul>
   1714 						<li>The territory code 'UK' has a special status in ISO, and
   1715 							is used for the domain name instead of GB. It is thus recognized
   1716 							by CLDR as being an alternate (unnormalized) form of 'GB'.</li>
   1717 						<li>The territory code '001' (the World) is used to indicate
   1718 							a standardized form, such as &quot;ar-001&quot; for Modern
   1719 							Standard Arabic.</li>
   1720 					</ul></td>
   1721 			</tr>
   1722 			<tr>
   1723 				<td><a href="#unicode_variant_subtag_validity"
   1724 					name="unicode_variant_subtag_validity">unicode_variant_subtag</a>
   1725 					<p>
   1726 						(also known as a <i>Unicode language variant code)</i>
   1727 					</p></td>
   1728 				<td>Subtags in the variant.xml file (see<em> Section 3.11
   1729 						<a href="#Validity_Data">Validity Data</a>
   1730 				</em>). These are based on [<a href="#BCP47">BCP47</a>] subtag values
   1731 					marked as <b>Type: variant</b>
   1732 					<p>
   1733 						CLDR provides data for normalizing variant codes. For the handling
   1734 						of the "POSIX" variant, see <i>Section 3.8.2, <a
   1735 							href="#Legacy_Variants">Legacy Variants</a></i>.
   1736 					</p></td>
   1737 			</tr>
   1738 		</table>
   1739 		<p>
   1740 			<i>Examples:</i>
   1741 		</p>
   1742 		<blockquote>
   1743 			<pre>en
   1744 fr_BE
   1745 zh-Hant-HK</pre>
   1746 		</blockquote>
   1747 		<p>
   1748 			<em>Deprecated</em> codes, such as QU above, are valid, but strongly
   1749 			discouraged.
   1750 		</p>
   1751 		<p>
   1752 			A locale that has only a language subtag (and optionally a script
   1753 			subtag) is called a <i>language locale</i>; one with both language
   1754 			and territory subtags is called a <i>territory locale</i> (or <i>country
   1755 				locale</i>).
   1756 		</p>
   1757 		<h3>
   1758 			<a name="Special_Codes" href="#Special_Codes">3.5 Special Codes</a>
   1759 		</h3>
   1760 
   1761 		<h4>
   1762 			<a name="Unknown_or_Invalid_Identifiers"
   1763 				href="#Unknown_or_Invalid_Identifiers">3.5.1 Unknown or Invalid
   1764 				Identifiers</a>
   1765 		</h4>
   1766 		<p>The following identifiers are used to indicate an unknown or
   1767 			invalid code in Unicode language and locale identifiers. For Unicode
   1768 			identifiers, the region code uses a private use ISO 3166 code, and
   1769 			identifiers, the region code uses a private use ISO 3166 code, and
   1770 			the Time Zone code uses an additional code; the others are defined by the
   1771 			Unicode identifiers, the meaning is that either there was no
   1772 			identifier available, or that at some point an input identifier value
   1773 			was determined to be invalid or ill-formed.</p>
   1774 		<table border="1" cellspacing="0" cellpadding="4"
   1775 			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table4">
   1776 			<tr>
   1777 				<th>Code Type</th>
   1778 				<th>Value</th>
   1779 				<th>Description in Referenced Standards</th>
   1780 			</tr>
   1781 			<tr>
   1782 				<td>Language</td>
   1783 				<td><code>und</code></td>
   1784 				<td>Undetermined language, also used for root</td>
   1785 			</tr>
   1786 			<tr>
   1787 				<td>Script</td>
   1788 				<td><code>Zzzz</code></td>
   1789 				<td>Code for uncoded script, Unknown [<a
   1790 					href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]
   1791 				</td>
   1792 			</tr>
   1793 			<tr>
   1794 				<td>Region&nbsp;&nbsp;</td>
   1795 				<td><code>ZZ</code></td>
   1796 				<td>Unknown or Invalid Territory</td>
   1797 			</tr>
   1798 			<tr>
   1799 				<td>Currency</td>
   1800 				<td><code>XXX</code></td>
   1801 				<td>The codes assigned for transactions where no currency is
   1802 					involved</td>
   1803 			</tr>
   1804 			<tr>
   1805 				<td>Time Zone</td>
   1806 				<td><code>unk</code></td>
   1807 				<td>Unknown or Invalid Time Zone</td>
   1808 			</tr>
   1809 			<tr>
   1810 				<td>Subdivision</td>
   1811 				<td><em>&lt;region&gt;</em>zzzz</td>
   1812 				<td>Unknown or Invalid Subdivision</td>
   1813 			</tr>
   1814 		</table>
   1815 		<p>When only the script or region is known, a locale ID will
   1816 			use &quot;und&quot; as the language subtag portion. Thus the locale
   1817 			tag &quot;und_Grek&quot; represents the Greek script;
   1818 			&quot;und_US&quot; represents the US territory.</p>
   1819 		<h4>
   1820 			<a name="Numeric_Codes" href="#Numeric_Codes">3.5.2 Numeric Codes</a>
   1821 		</h4>
   1822 		<p>For region codes, ISO and the UN establish a mapping to
   1823 			three-letter codes and numeric codes. However, this does not extend
   1824 			to the private use codes, which are the codes 900-999 (total: 100),
   1825 			and AAA-AAZ, QMA-QZZ, XAA-XZZ, and ZZA-ZZZ (total: 1092). Unicode identifiers
   1826 			supply a standard mapping to these: for the numeric codes, they use
   1827 			the top of the numeric private use range; for the 3-letter codes, they
   1828 			double the final letter. These are the resulting mappings for all of
   1829 			the private use region codes:</p>
   1830 		<table border="1" cellspacing="0" cellpadding="4"
   1831 			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table19">
   1832 			<tr>
   1833 				<th>Region</th>
   1834 				<th>UN/ISO Numeric</th>
   1835 				<th>ISO 3-Letter</th>
   1836 			</tr>
   1837 			<tr>
   1838 				<td><code>AA</code></td>
   1839 				<td><code>958</code></td>
   1840 				<td><code>AAA</code></td>
   1841 			</tr>
   1842 			<tr>
   1843 				<td><code>QM..QZ</code></td>
   1844 				<td><code>959..972</code></td>
   1845 				<td><code>QMM..QZZ</code></td>
   1846 			</tr>
   1847 			<tr>
   1848 				<td><code>XA..XZ</code></td>
   1849 				<td><code>973..998</code></td>
   1850 				<td><code>XAA..XZZ</code></td>
   1851 			</tr>
   1852 			<tr>
   1853 				<td><code>ZZ</code></td>
   1854 				<td><code>999</code></td>
   1855 				<td><code>ZZZ</code></td>
   1856 			</tr>
   1857 		</table>
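		<p>The table above can be expressed as a small lookup. The following Python
			sketch is illustrative only: the function name
			<code>private_use_region_mappings</code> is hypothetical, and it accepts
			only the two-letter private use region codes listed in the table.</p>

```python
import string

# Second letters of the private use block QM..QZ, in order.
Q_SECOND_LETTERS = "MNOPQRSTUVWXYZ"


def private_use_region_mappings(region):
    """Return (numeric code, 3-letter code) for a private use region code.

    Hypothetical helper following the table above; raises ValueError for
    any code outside AA, QM..QZ, XA..XZ, and ZZ.
    """
    region = region.upper()
    if len(region) != 2:
        raise ValueError("expected a two-letter region code")
    first, second = region[0], region[1]

    if region == "AA":
        numeric = 958
    elif first == "Q" and second in Q_SECOND_LETTERS:
        numeric = 959 + Q_SECOND_LETTERS.index(second)
    elif first == "X" and second in string.ascii_uppercase:
        numeric = 973 + string.ascii_uppercase.index(second)
    elif region == "ZZ":
        numeric = 999
    else:
        raise ValueError("not a private use region code: " + region)

    # The 3-letter form doubles the final letter.
    return numeric, region + second


print(private_use_region_mappings("XK"))
```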
   1858 		<p>For script codes, ISO 15924 supplies a mapping (however, the
   1859 			numeric codes are not in common use):</p>
   1860 		<table border="1" cellspacing="0" cellpadding="4"
   1861 			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table21">
   1862 			<tr>
   1863 				<th>Script</th>
   1864 				<th>Numeric</th>
   1865 			</tr>
   1866 			<tr>
   1867 				<td><code>Qaaa..Qabx</code></td>
   1868 				<td><code>900..949</code></td>
   1869 			</tr>
   1870 		</table>
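		<p>As an illustration, the script range can be computed positionally. This
			Python sketch assumes that the 50 codes Qaaa..Qabx map to 900..949 in
			alphabetical order, which the table implies but does not state per code;
			the function name <code>private_use_script_numeric</code> is
			hypothetical.</p>

```python
import string


def private_use_script_numeric(script):
    """Return the numeric code for a private use script code Qaaa..Qabx.

    Hypothetical helper; assumes the codes are numbered sequentially in
    alphabetical order starting at 900.
    """
    if len(script) != 4 or script[:2].lower() != "qa":
        raise ValueError("not a private use script code: " + script)

    letters = string.ascii_lowercase
    third, fourth = script[2].lower(), script[3].lower()
    if third not in letters or fourth not in letters:
        raise ValueError("not a private use script code: " + script)

    # Position within Qaaa..Qabx: 26 codes for Qaa?, then 24 for Qab?.
    offset = letters.index(third) * 26 + letters.index(fourth)
    if offset not in range(50):
        raise ValueError("outside the private use range: " + script)
    return 900 + offset


print(private_use_script_numeric("Qaag"))
```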
   1871 		<br>
   1872 		<h4>
   1873 			3.5.3 <a name="Private_Use" href="#Private_Use">Private Use Codes</a>
   1874 		</h4>
   1875 		<p>Private use codes fall into three groups.</p>
   1876 		<ul>
   1877 			<li><strong>defined:</strong> those that are given particular
   1878 				semantics currently in CLDR</li>
   1879 			<li><strong>reserved:</strong> those that may be given
   1880 				particular semantics in future versions of CLDR</li>
   1881 			<li><strong>excluded:</strong> those that will never be given
   1882 				particular CLDR semantics in the future, and thus can normally be
   1883 				used by applications without worrying about collisions. However,
   1884 				CLDR may follow widespread industry practice in the use of some of
   1885 				these codes, such as for XA, XB, and XK.</li>
   1886 		</ul>
   1887 		<table>
   1888 			<caption>
   1889 				<a name="Private_Use_CLDR" href="#Private_Use_CLDR">Private Use
   1890 					Codes in CLDR</a>
   1891 			</caption>
   1892 			<tr>
   1893 				<th>category</th>
   1894 				<th>status</th>
   1895 				<th>codes</th>
   1896 			</tr>
   1897 			<tr>
   1898 				<td rowspan="3">base language</td>
   1899 				<td>defined</td>
   1900 				<td>none</td>
   1901 			</tr>
   1902 			<tr>
   1903 				<td>reserved</td>
   1904 				<td>qaa..qfy</td>
   1905 			</tr>
   1906 			<tr>
   1907 				<td>excluded</td>
   1908 				<td>qfz..qtz</td>
   1909 			</tr>
   1910 			<tr>
   1911 				<td rowspan="3">script</td>
   1912 				<td>defined</td>
   1913 				<td>Qaai (obsolete), Qaag</td>
   1914 			</tr>
   1915 			<tr>
   1916 				<td>reserved</td>
   1917 				<td>Qaaa..Qaaf Qaah Qaaj..Qaap</td>
   1918 			</tr>
   1919 			<tr>
   1920 				<td>excluded</td>
   1921 				<td>Qaaq..Qabx</td>
   1922 			</tr>
   1923 			<tr>
   1924 				<td rowspan="3">region</td>
   1925 				<td>defined</td>
   1926 				<td>QO, QU, UK, XA, XB, XK, ZZ</td>
   1927 			</tr>
   1928 			<tr>
   1929 				<td>reserved</td>
   1930 				<td>AA 			QM..QN QP..QT QV..QZ</td>
   1931 			</tr>
   1932 			<tr>
   1933 				<td>excluded</td>
   1934 				<td>XC..XJ, XL..XZ</td>
   1935 			</tr>
   1936 			<tr>
   1937 				<td rowspan="3">timezone</td>
   1938 				<td>defined</td>
   1939 				<td>IANA: Etc/Unknown<br> 
   1940 					bcp47: as listed in bcp47/timezone.xml
   1941 				</td>
   1942 			</tr>
   1943 			<tr>
   1944 				<td>reserved</td>
   1945 				<td>bcp47: all non-5 letter codes not starting with x</td>
   1946 			</tr>
   1947 			<tr>
   1948 				<td>excluded</td>
   1949 				<td>bcp47: all non-5 letter codes starting with x</td>
   1950 			</tr>
   1951 		</table>
   1952 		<p>
   1953 			See also <em>Section 3.5.1 <a
   1954 				href="#Unknown_or_Invalid_Identifiers">Unknown or Invalid
   1955 					Identifiers</a></em>.
   1956 		</p>
   1957 		<p></p>
   1958 		<h3>
   1959 			<a name="Locale_Extension_Key_and_Type_Data"></a><a
   1960 				name="u_Extension" href="#u_Extension">3.6 Unicode BCP 47 U
   1961 				Extension</a>
   1962 		</h3>
   1963 		<p>
   1964 			[<a href="#BCP47">BCP47</a>] Language Tags provides a mechanism for
   1965 			extending language tags for use in various applications by extension
   1966 			subtags. Each extension subtag is identified by a single alphanumeric
   1967 			character subtag assigned by IANA.
   1968 		</p>
   1969 		<p>
   1970 			The Unicode Consortium has registered and is the maintaining
   1971 			authority for two BCP 47 language tag extensions: the extension 'u'
   1972 			for Unicode locale extension [<a href="#RFC6067">RFC6067</a>] and
   1973 			extension 't' for transformed content [<a href="#RFC6497">RFC6497</a>].
   1974 			The Unicode BCP 47 extension data defines the complete list of valid
   1975 			subtags.
   1976 		</p>
   1977 
   1978 		<p>
   1979 			These subtags are all in lowercase (the canonical casing for
   1980 			these subtags); however, subtags are case-insensitive, and casing does
   1981 			not carry any specific meaning. All subtags within the Unicode
   1982 			extensions consist of two to eight alphanumeric characters and
   1983 			meet the rule
   1984 			<code>extension</code>
   1985 			in [<a href="#BCP47">BCP47</a>].
   1986 		</p>
   1987 		<p>
   1988 			<strong>The -u- Extension.</strong> The syntax of 'u' extension
   1989 			subtags is defined by the rule
   1990 			<code>unicode_locale_extensions</code>
   1991 			in <a href="#Unicode_locale_identifier">Section 3.2 Unicode
   1992 				locale identifier</a>, except that the subtag separator
   1993 			<code>sep</code>
   1994 			must always be a hyphen '-' when the extension is used as part of a BCP
   1995 			47 language tag.
   1996 		</p>
   1997 		<p>
   1998 			A 'u' extension may contain multiple
   1999 			<code>attribute</code>s or
   2000 			<code>keyword</code>s
   2001 			as defined in <a href="#Unicode_locale_identifier">Section 3.2
   2002 				Unicode locale identifier</a>. Although the order of
   2003 			<code>attribute</code>s or
   2004 			<code>keyword</code>s
   2005 			does not matter, this specification defines the canonical form as
   2006 			below:
   2009 		</p>
   2010 		<ul>
   2011 			<li>All attributes are sorted in alphabetical order.</li>
   2012 			<li>All keywords are sorted by alphabetical order of keys.</li>
   2013 			<li>All keywords are in lowercase.</li>
   2014 			<li>All keys and types use the canonical form (from the name
   2015 				attribute; see <a href="#Unicode_Locale_Extension_Data_Files">Section
   2016 					3.6.4 U Extension Data Files</a>).
   2017 			</li>
   2018 			<li>Type value "true" is removed.</li>
   2019 		</ul>
   2020 		<p>For example, the canonical form of 'u' extension
   2021 			"u-foo-bar-nu-thai-ca-buddhist-kk-true" is
   2022 			"u-bar-foo-ca-buddhist-kk-nu-thai". The attributes "foo" and "bar" in
   2023 			this example are provided only for illustration; no attribute subtags
   2024 			are defined by the current CLDR specification.</p>
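		<p>The ordering, casing, and "true" rules above can be sketched in Python as
			follows. This is a simplified illustration: the function name
			<code>canonicalize_u_extension</code> is hypothetical, the input is assumed
			to be well-formed with no duplicate keys, and the replacement of key and
			type aliases by their canonical names is omitted because it requires the
			U extension data files.</p>

```python
def canonicalize_u_extension(extension):
    """Return a 'u' extension with attributes and keywords in canonical order.

    Hypothetical helper; does not consult the bcp47 data files, so alias
    replacement for keys and types is not performed.
    """
    # All subtags are in lowercase in the canonical form.
    subtags = extension.lower().split("-")
    if subtags[0] != "u":
        raise ValueError("not a 'u' extension: " + extension)

    attributes = []   # subtags that appear before the first key
    keywords = {}     # key -> list of type subtags
    current_key = None
    for subtag in subtags[1:]:
        if len(subtag) == 2:
            # A two-character subtag starts a new keyword.
            current_key = subtag
            keywords.setdefault(current_key, [])
        elif current_key is None:
            attributes.append(subtag)
        else:
            keywords[current_key].append(subtag)

    # Attributes are sorted alphabetically; keywords are sorted by key.
    parts = ["u"] + sorted(attributes)
    for key in sorted(keywords):
        parts.append(key)
        # The type value "true" is removed.
        if keywords[key] != ["true"]:
            parts.extend(keywords[key])
    return "-".join(parts)


print(canonicalize_u_extension("u-foo-bar-nu-thai-ca-buddhist-kk-true"))
```

		<p>Applied to the example above, the sketch yields
			"u-bar-foo-ca-buddhist-kk-nu-thai".</p>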
   2025 		<p>
   2026 			<em>See also <a
   2027 				href="http://cldr.unicode.org/index/bcp47-extension"> Unicode
   2028 					Extensions for BCP 47</a> on the CLDR site.
   2029 			</em>
   2030 		</p>
   2031 		<h4>
   2032 			<a href="#Key_And_Type_Definitions_" name="Key_And_Type_Definitions_">3.6.1
   2033 				Key And Type Definitions</a>
   2034 		</h4>
   2035 		<p>The following chart contains a set of U extension key values
   2036 			that are currently available, with a description or sampling of the U
   2037 			extension type values. Each category is associated with an XML file
   2038 			in the bcp47 directory.</p>
   2039 		<p>
   2040 			For the complete list of valid keys and types defined for Unicode
   2041 			locale extensions, see <a href="#Unicode_Locale_Extension_Data_Files">Section
   2042 				3.6.4 U Extension Data Files</a>. For information on the process for
   2043 			adding new <i>key</i>/<i>type</i>, see [<a href="#localeProject">LocaleProject</a>].
   2044 		</p>
   2045 		<p>
   2046 			Most type values are represented by a single subtag in the current
   2047 			version of CLDR. There are exceptions, such as types used for key
   2048 			"ca" (calendar) and "kr" (collation reordering). If the type is not
   2049 			included, then the type value "true" is assumed. Note that the
   2050 			default for a key with a possible &quot;true&quot; value is often
   2051 			&quot;false&quot;, but may not always be. Note also that
   2052 			"true"/"True" is not a valid script code, since <a
   2053 				href="http://www.unicode.org/iso15924/codelists.html">the ISO
   2054 				15924 Registration Authority has exceptionally reserved it</a>, which
   2055 			means that it will not be assigned for any purpose.
   2056 		</p>
   2057 		<p>The BCP 47 form for keys and types is the canonical form, and
   2058 			recommended. Other aliases are included for backwards compatibility.
   2059 	  </p>
   2060 		<table>
   2061 			<caption>
   2062 				<a name="Key_Type_Definitions" href="#Key_Type_Definitions">Key/Type
   2063 					Definitions</a>
   2064 			</caption>
   2065 			<tr>
   2066 				<th>key<br> (old key name)
   2067 				</th>
   2068 				<th>key description</th>
   2069 				<th>example type<br> (old type name)
   2070 				</th>
   2071 				<th>type description</th>
   2072 			</tr>
   2073 			<tr>
   2074 				<td colspan="4"><strong>A <a
   2075 						href="#UnicodeCalendarIdentifier" name="UnicodeCalendarIdentifier">Unicode
   2076 							Calendar Identifier</a> defines a type of calendar. The valid values
   2077 						are those <em>name</em> attribute values in the <em>type</em>
   2078 						elements of key name="ca" in bcp47/<a target="_blank"
   2079 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
   2080 			</tr>
   2081 			<tr>
   2082 				<td rowspan="10">"ca"<br> (calendar)
   2083 				</td>
   2084 				<td rowspan="10">Calendar algorithm<br> <br> <i>(For
   2085 						information on the calendar algorithms associated with the data
   2086 						used with these, see [<a href="#Calendars">Calendars</a>].)
   2087 				</i></td>
   2088 				<td>"buddhist"</td>
   2089 				<td>Thai Buddhist calendar (same as Gregorian except for the
   2090 					year)</td>
   2091 			</tr>
   2092 			<tr>
   2093 				<td>"chinese"</td>
   2094 				<td>Traditional Chinese calendar</td>
   2095 			</tr>
   2096 			<tr>
   2097 				<td colspan="2"></td>
   2098 			</tr>
   2099 			<tr>
   2100 				<td>"gregory"<br> (gregorian)
   2101 				</td>
   2102 				<td>Gregorian calendar</td>
   2103 			</tr>
   2104 			<tr>
   2105 				<td colspan="2"></td>
   2106 			</tr>
   2107 			<tr>
   2108 				<td>"islamic"</td>
   2109 				<td>Islamic calendar</td>
   2110 			</tr>
   2111 			<tr>
   2112 				<td>"islamic-civil"</td>
   2113 				<td>Islamic calendar, tabular (intercalary years
   2114 					[2,5,7,10,13,16,18,21,24,26,29] - civil epoch)</td>
   2115 			</tr>
   2116 			<tr>
   2117 				<td>"islamic-umalqura"</td>
   2118 				<td>Islamic calendar, Umm al-Qura</td>
   2119 			</tr>
   2120 			<tr>
   2121 				<td colspan="2"></td>
   2122 			</tr>
   2123 			<tr>
   2124 				<td colspan="2"><b>Note:</b> <i>Some calendar types are
   2125 						represented by two subtags. In such cases, the first subtag
   2126 						specifies a generic calendar type and the second subtag specifies
   2127 						a calendar algorithm variant. The CLDR uses generic calendar types
   2128 						(single subtag types) for tagging data when calendar algorithm
   2129 						variations within a generic calendar type are irrelevant. For
   2130 						example, type "islamic" is used for specifying Islamic calendar
   2131 						formatting data for all Islamic calendar types, including
   2132 						"islamic-civil" and "islamic-umalqura".</i></td>
   2133 			</tr>
   2134 
   2135 			<tr>
   2136 				<td colspan="4"><strong>A <a
   2137 						href="#UnicodeCurrencyFormatIdentifier"
   2138 						name="UnicodeCurrencyFormatIdentifier">Unicode Currency Format
   2139 							Identifier</a> defines a style for currency formatting. The valid
   2140 						values are those <em>name</em> attribute values in the <em>type</em>
   2141 						elements of key name="cf" in bcp47/<a target="_blank"
   2142 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/currency.xml">currency.xml</a></strong>.</td>
   2143 			</tr>
   2144 			<tr>
   2145 				<td rowspan="2">"cf"</td>
   2146 				<td rowspan="2">Currency Format style</td>
   2147 				<td>"standard"</td>
   2148 				<td>Negative numbers use the minusSign symbol (the default).</td>
   2149 			</tr>
   2150 			<tr>
   2151 				<td>"account"</td>
   2152 				<td>Negative numbers use parentheses or equivalent.</td>
   2153 			</tr>
   2154 
   2155 			<tr>
   2156 				<td colspan="4"><strong>A <a
   2157 						href="#UnicodeCollationIdentifier"
   2158 						name="UnicodeCollationIdentifier">Unicode Collation Identifier</a>
   2159 						defines a type of collation (sort order). The valid values are
   2160 						those <em>name</em> attribute values in the <em>type</em> elements
   2161 						of bcp47/<a target="_blank"
   2162 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/collation.xml">collation.xml</a></strong>.</td>
   2163 			</tr>
   2164 			<tr>
   2165 				<td colspan="4"><i>For information on each collation
   2166 						setting parameter, from <strong>ka</strong> to <strong>vt</strong>,
   2167 						see <a href="tr35-collation.html#Setting_Options">Setting
   2168 							Options</a>
   2169 				</i></td>
   2170 			</tr>
   2171 			<tr>
   2172 				<td rowspan="9">"co"<br> (collation)
   2173 				</td>
   2174 				<td rowspan="9">Collation type</td>
   2175 				<td>"standard"</td>
   2176 				<td>The default ordering for each language. For root it is
   2177 					based on the [<a href="#DUCET">DUCET</a>] (Default Unicode
   2178 					Collation Element Table): see <em><a
   2179 						href="tr35-collation.html#Root_Collation">Root Collation</a></em>. Each
   2180 					other locale is based on that, except for appropriate modifications
   2181 					to certain characters for that language.
   2182 				</td>
   2183 			</tr>
   2184 
   2185 			<tr>
   2186 				<td>"search"</td>
   2187 				<td>A special collation type dedicated to string search. It is
   2188 					not used to determine the relative order of two strings, but only
   2189 					to determine whether they should be considered equivalent for the
   2190 					specified strength, using the string search matching rules
   2191 					appropriate for the language. Compared to the normal collator for
   2192 					the language, this may add or remove primary equivalences, may make
   2193 					additional characters ignorable or change secondary equivalences,
   2194 					and may modify contractions to allow matching within them,
   2195 					depending on the desired behavior. For example, in Czech, the
   2196 					distinction between "a" and "á" is secondary for normal collation,
   2197 					but primary for search; a search for "a" should never match "á" and
   2198 					vice versa. A search collator is normally used with strength set to
   2199 					PRIMARY or SECONDARY (should be SECONDARY if using asymmetric
   2200 					search as described in the [<a
   2201 					href="http://www.unicode.org/reports/tr41/#UTS10">UCA</a>] section
   2202 					Asymmetric Search). The search collator in root supplies matching
   2203 					rules that are appropriate for most languages (and which are
   2204 					different than the root collation behavior); language-specific
   2205 					search collators may be provided to override the matching rules for
   2206 					a given language as necessary.
   2207 				</td>
   2208 			</tr>
   2209 			<tr>
   2210 				<td colspan="2"><p>
   2211 						Other keywords provide additional choices for certain locales; <i>they
   2212 							only have effect in certain locales.</i>
   2213 					</p></td>
   2214 			</tr>
   2215 			<tr>
   2216 				<td colspan="2"></td>
   2217 			</tr>
   2218 			<tr>
   2219 				<td>"phonetic"</td>
   2220 				<td>Requests a phonetic variant if available, where text is
   2221 					sorted based on pronunciation. It may interleave different scripts,
   2222 					if multiple scripts are in common use.</td>
   2223 			</tr>
   2224 			<tr>
   2225 				<td>"pinyin"</td>
   2226 				<td>Pinyin ordering for Latin and for CJK characters; that is,
   2227 					an ordering for CJK characters based on a character-by-character
   2228 					transliteration into a pinyin. (used in Chinese)</td>
   2229 			</tr>
   2230 			<tr>
   2231 				<td>"reformed"</td>
   2232 				<td>Reformed collation (such as in Swedish)</td>
   2233 			</tr>
   2234 			<tr>
   2235 				<td>"searchjl"</td>
   2236 				<td>Special collation type for a modified string search in
   2237 					which a pattern consisting of a sequence of Hangul initial
   2238 					consonants (jamo lead consonants) will match a sequence of Hangul
   2239 					syllable characters whose initial consonants match the pattern. The
   2240 					jamo lead consonants can be represented using conjoining or
   2241 					compatibility jamo. This search collator is best used at SECONDARY
   2242 					strength with an "asymmetric" search as described in the [<a
   2243 					href="http://www.unicode.org/reports/tr41/#UTS10">UCA</a>] section
   2244 					Asymmetric Search and obtained, for example, using ICU4C's usearch
   2245 					facility with attribute USEARCH_ELEMENT_COMPARISON set to value
   2246 					USEARCH_PATTERN_BASE_WEIGHT_IS_WILDCARD; this ensures that a full
   2247 					Hangul syllable in the search pattern will only match the same
   2248 					syllable in the searched text (instead of matching any syllable
   2249 					with the same initial consonant), while a Hangul initial consonant
   2250 					in the search pattern will match any Hangul syllable in the
   2251 					searched text with the same initial consonant.
   2252 				</td>
   2253 			</tr>
   2254 			<tr>
   2255 				<td colspan="2"></td>
   2256 			</tr>
   2257 
   2258 			<tr>
   2259 				<td colspan="4"><strong>A <a
   2260 						href="#UnicodeCurrencyIdentifier" name="UnicodeCurrencyIdentifier">Unicode
   2261 							Currency Identifier</a> defines a type of currency. The valid values
   2262 						are those <em>name</em> attribute values in the <em>type</em>
   2263 						elements of key name="cu" in bcp47/<a target="_blank"
   2264 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/currency.xml">currency.xml</a>.
   2265 				</strong></td>
   2266 			</tr>
   2267 			<tr>
   2268 				<td>"cu"<br> (currency)
   2269 				</td>
   2270 				<td>Currency type</td>
   2271 				<td><i>ISO 4217 code,</i>
   2272 					<p>
   2273 						<i>plus others in common use</i>
   2274 					</p></td>
   2275 				<td><p>
   2276 						Codes consisting of 3 ASCII letters that are or have been valid in
   2277 						ISO 4217, plus certain additional codes that are or have been in
   2278 						common use. The list of countries and time periods associated with
   2279 						each currency value is available in <a
   2280 							href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
   2281 							Currency Data</a>, plus the default number of decimals.
   2282 					</p>
   2283 					<p>
   2284 						The XXX code is given a broader interpretation as <em>Unknown
   2285 							or Invalid Currency</em>.
   2286 					</p></td>
   2287 			</tr>
   2288 
   2289 			<tr>
   2290 				<td colspan="4"><strong>A <a
   2291 						href="#UnicodeEmojiPresentationStyleIdentifier" name="UnicodeEmojiPresentationStyleIdentifier">Unicode
   2292 							Emoji Presentation Style Identifier</a> specifies a request for
   2293 						the preferred emoji presentation style. This can be used as part of
   2294 						the value for an HTML lang attribute, for example
   2295 						<code>&lt;html lang=&quot;sr-Latn-u-em-emoji&quot;&gt;</code>.
   2296 						The valid values are those <em>name</em> attribute values
   2297 						in the <em>type</em> elements of key name="em" in bcp47/<a
   2298 						target="_blank"
   2299 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/variant.xml">variant.xml</a></strong>.</td>
   2300 			</tr>
   2301 			<tr>
   2302 				<td rowspan="3">"em"</td>
   2303 				<td rowspan="3">Emoji presentation style</td>
   2304 				<td>"emoji"</td>
   2305 				<td>Use an emoji presentation for emoji characters if possible.</td>
   2306 			</tr>
   2307 			<tr>
   2308 				<td>"text"</td>
   2309 				<td>Use a text presentation for emoji characters if possible.</td>
   2310 			</tr>
   2311 			<tr>
   2312 				<td>"default"</td>
   2313 				<td>Use the default presentation for emoji characters as specified in UTR #51 Section 4,
   2314 					<a href="http://www.unicode.org/reports/tr51/#Presentation_Style">Presentation Style</a>.</td>
   2315 			</tr>
   2316 
   2317 			<tr>
   2318 				<td colspan="4"><strong>A <a
   2319 						href="#UnicodeFirstDayIdentifier" name="UnicodeFirstDayIdentifier">Unicode
   2320 							First Day Identifier</a> defines the preferred first day of the week
   2321 						for calendar display. Specifying "fw" in a locale identifier
   2322 						overrides the default value specified by supplemental week data
   2323 						(see Part 4 Dates, section 4.3 <a href="tr35-dates.html#Week_Data">Week
   2324 							Data</a>). The valid values are those <em>name</em> attribute values
   2325 						in the <em>type</em> elements of key name="fw" in bcp47/<a
   2326 						target="_blank"
   2327 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
   2328 			</tr>
   2329 			<tr>
   2330 				<td rowspan="4">"fw"</td>
   2331 				<td rowspan="4">First day of week</td>
   2332 				<td>"sun"</td>
   2333 				<td>Sunday</td>
   2334 			</tr>
   2335 			<tr>
   2336 				<td>"mon"</td>
   2337 				<td>Monday</td>
   2338 			</tr>
   2339 			<tr>
   2340 				<td colspan="2"></td>
   2341 			</tr>
   2342 			<tr>
   2343 				<td>"sat"</td>
   2344 				<td>Saturday</td>
   2345 			</tr>
   2346 
   2347 			<tr>
   2348 				<td colspan="4"><strong>A <a
   2349 						href="#UnicodeHourCycleIdentifier"
   2350 						name="UnicodeHourCycleIdentifier">Unicode Hour Cycle
   2351 							Identifier</a> defines the preferred time cycle. Specifying "hc" in a
   2352 						locale identifier overrides the default value specified by
   2353 						supplemental time data (see Part 4 Dates, section 4.4 <a
   2354 						href="tr35-dates.html#Time_Data">Time Data</a>). The valid values
   2355 						are those <em>name</em> attribute values in the <em>type</em>
   2356 						elements of key name="hc" in bcp47/<a target="_blank"
   2357 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
   2358 			</tr>
   2359 			<tr>
   2360 				<td rowspan="4">"hc"</td>
   2361 				<td rowspan="4">Hour cycle</td>
   2362 				<td>"h12"</td>
   2363 				<td>Hour system using 1–12; corresponds to 'h' in patterns</td>
   2364 			</tr>
   2365 			<tr>
   2366 				<td>"h23"</td>
   2367 				<td>Hour system using 0–23; corresponds to 'H' in patterns</td>
   2368 			</tr>
   2369 			<tr>
   2370 				<td>"h11"</td>
   2371 				<td>Hour system using 0–11; corresponds to 'K' in patterns</td>
   2372 			</tr>
   2373 			<tr>
   2374 				<td>"h24"</td>
   2375 				<td>Hour system using 1–24; corresponds to 'k' in patterns</td>
   2376 			</tr>
   2377 
   2378 			<tr>
   2379 				<td colspan="4"><strong>A <a
   2380 						href="#UnicodeLineBreakStyleIdentifier"
   2381 						name="UnicodeLineBreakStyleIdentifier">Unicode Line Break
   2382 							Style Identifier</a> defines a preferred line break style
   2383 						corresponding to the CSS level 3 <a
   2384 						href="https://drafts.csswg.org/css-text/#line-break-property">line-break
   2385 							option</a>. Specifying "lb" in a locale identifier overrides the
   2386 						locale's default style (which may correspond to "normal" or
   2387 						"strict"). The valid values are those <em>name</em> attribute
   2388 						values in the <em>type</em> elements of key name="lb" in bcp47/<a
   2389 						target="_blank"
   2390 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
   2391 			</tr>
   2392 			<tr>
   2393 				<td rowspan="3">"lb"</td>
   2394 				<td rowspan="3">Line break style</td>
   2395 				<td>"strict"</td>
   2396 				<td>CSS level 3 line-break=strict, e.g. treat CJ as NS</td>
   2397 			</tr>
   2398 			<tr>
   2399 				<td>"normal"</td>
   2400 				<td>CSS level 3 line-break=normal, e.g. treat CJ as ID, break
   2401 					before hyphens for ja,zh</td>
   2402 			</tr>
   2403 			<tr>
   2404 				<td>"loose"</td>
   2405 				<td>CSS level 3 line-break=loose</td>
   2406 			</tr>
   2407 
   2408 			<tr>
   2409 				<td colspan="4"><strong>A <a
   2410 						href="#UnicodeLineBreakWordIdentifier"
   2411 						name="UnicodeLineBreakWordIdentifier">Unicode Line Break Word
   2412 							Identifier</a> defines preferred line break word handling behavior
   2413 						corresponding to the CSS level 3 <a
   2414 						href="https://drafts.csswg.org/css-text/#word-break-property">word-break
   2415 							option</a>. The valid values are those <em>name</em> attribute values
   2416 						in the <em>type</em> elements of key name="lw" in bcp47/<a
   2417 						target="_blank"
   2418 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
   2419 			</tr>
   2420 			<tr>
   2421 				<td rowspan="3">"lw"</td>
   2422 				<td rowspan="3">Line break word handling</td>
   2423 				<td>"normal"</td>
   2424 				<td>CSS level 3 word-break=normal, normal script/language
   2425 					behavior for midword breaks</td>
   2426 			</tr>
   2427 			<tr>
   2428 				<td>"breakall"</td>
   2429 				<td>CSS level 3 word-break=break-all, allow midword breaks
   2430 					unless forbidden by lb setting</td>
   2431 			</tr>
   2432 			<tr>
   2433 				<td>"keepall"</td>
   2434 				<td>CSS level 3 word-break=keep-all, prohibit midword breaks
   2435 					except for dictionary breaks</td>
   2436 			</tr>
   2437 
   2438 			<tr>
   2439 				<td colspan="4"><strong>A <a
   2440 						href="#UnicodeMeasurementSystemIdentifier"
   2441 						name="UnicodeMeasurementSystemIdentifier">Unicode Measurement
   2442 							System Identifier</a> defines a preferred measurement system.
   2443 						Specifying "ms" in a locale identifier overrides the default value
   2444 						specified by supplemental measurement system data (see Part 2
   2445 						General, section 5 <a
   2446 						href="tr35-general.html#Measurement_System_Data">Measurement
   2447 							System Data</a>). The valid values are those <em>name</em> attribute
   2448 						values in the <em>type</em> elements of key name="ms" in bcp47/<a
   2449 						target="_blank"
   2450 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/measure.xml">measure.xml</a></strong>.</td>
   2451 			</tr>
   2452 			<tr>
   2453 				<td rowspan="3">"ms"</td>
   2454 				<td rowspan="3">Measurement system</td>
   2455 				<td>"metric"</td>
   2456 				<td>Metric System</td>
   2457 			</tr>
   2458 			<tr>
   2459 				<td>"ussystem"</td>
   2460 				<td>US System of measurement: feet, pints, etc.; pints are 16oz</td>
   2461 			</tr>
   2462 			<tr>
   2463 				<td>"uksystem"</td>
   2464 				<td>UK System of measurement: feet, pints, etc.; pints are 20oz</td>
   2465 			</tr>
   2466 
   2467 			<tr>
   2468 				<td colspan="4"><strong>A <a
   2469 						href="#UnicodeNumberSystemIdentifier"
   2470 						name="UnicodeNumberSystemIdentifier">Unicode Number System
   2471 							Identifier</a> defines a type of number system. The valid values are
   2472 						those <em>name</em> attribute values in the <em>type</em> elements
   2473 						of bcp47/<a target="_blank"
   2474 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/number.xml">number.xml</a>.
   2475 				</strong></td>
   2476 			</tr>
   2477 			<tr>
   2478 				<td rowspan="7">"nu"<br> (numbers)
   2479 				</td>
   2480 				<td rowspan="7">Numbering system</td>
   2481 				<td><i>Unicode script subtag</i></td>
   2482 				<td><p>
   2483 						Four-letter types indicating the primary numbering system for the
   2484 						corresponding script represented in Unicode. Unless otherwise
   2485 						specified, it is a decimal numbering system using digits
   2486 						[:GeneralCategory=Nd:]. For example, &quot;latn&quot; refers to
   2487 						the ASCII / Western digits 0-9, while &quot;taml&quot; is an
   2488 						algorithmic (non-decimal) numbering system. (The code "tamldec"
   2489 						indicates the "modern Tamil decimal digits".)<br>
   2490 					</p>
   2491 					<p class="note">
   2492 						For more information, see <a
   2493 							href="tr35-numbers.html#Numbering_Systems">Numbering Systems</a>.
   2494 					</p></td>
   2495 			</tr>
   2496 			<tr>
   2497 				<td>"arabext"</td>
   2498 				<td>Extended Arabic-Indic digits ("arab" means the base
   2499 					Arabic-Indic digits)</td>
   2500 			</tr>
   2501 			<tr>
   2502 				<td>"armnlow"</td>
   2503 				<td>Armenian lowercase numerals</td>
   2504 			</tr>
   2505 			<tr>
   2506 				<td colspan="2"></td>
   2507 			</tr>
   2508 			<tr>
   2509 				<td>"roman"</td>
   2510 				<td>Roman numerals</td>
   2511 			</tr>
   2512 			<tr>
   2513 				<td>"romanlow"</td>
   2514 				<td>Roman lowercase numerals</td>
   2515 			</tr>
   2516 			<tr>
   2517 				<td>"tamldec"</td>
   2518 				<td>Modern Tamil decimal digits</td>
   2519 			</tr>
   2520 
   2521 			<tr>
   2522 				<td colspan="4"><strong>A <a href="#RegionOverride"
   2523 						name="RegionOverride">Region Override</a> specifies an alternate
   2524 						region to use for obtaining certain region-specific default values
   2525 						(those specified by the <a href="tr35-info.html#rgScope">&lt;rgScope&gt;</a>
   2526 						element), instead of using the region specified by the <a
   2527 						href="#unicode_region_subtag">unicode_region_subtag</a> in the
   2528 						Unicode Language Identifier (or inferred from the <a
   2529 						href="#unicode_language_subtag">unicode_language_subtag</a>).
   2530 				</strong></td>
   2531 			</tr>
   2532 			<tr>
   2533 				<td rowspan="2">"rg"</td>
   2534 				<td rowspan="2">Region Override</td>
   2535 				<td>&quot;uszzzz&quot;<br> <br></td>
   2536 				<td rowspan="2">The value is a <a href="#unicode_region_subtag">unicode_region_subtag</a>
   2537 					for a regular region (not a macroregion), suffixed by "ZZZZ" (case
   2538 					is not significant). For example, en-GB-u-rg-uszzzz represents a
   2539 					locale for British English but with region-specific defaults set to
   2540 					US for items such as default currency, default calendar and week
   2541 					data, default time cycle, and default measurement system and unit
   2542 					preferences.
   2543 				</td>
   2544 			</tr>
   2545 			<tr>
   2546 				<td></td>
   2547 			</tr>
   2548 
   2549 			<tr>
   2550 				<td colspan="4"><strong>A <a
   2551 						name="unicode_subdivision_subtag_validity"></a><a
   2552 						href="#UnicodeSubdivisionIdentifier"
   2553 						name="UnicodeSubdivisionIdentifier">Unicode Subdivision
   2554 							Identifier</a> defines a regional subdivision used for locales. The
   2555 						valid values are based on the <em>subdivisionContainment</em>
   2556 						element as described in <em>Section <a
   2557 							href="#Unicode_Subdivision_Codes">3.6.5 Subdivision Codes</a></em>.
   2558 				</strong></td>
   2559 			</tr>
   2560 			<tr>
   2561 				<td rowspan="2">"sd"</td>
   2562 				<td rowspan="2">Regional Subdivision</td>
   2563 				<td>&quot;gbsct&quot;<br> <br></td>
   2564 				<td rowspan="2">A <a href="#unicode_subdivision_id">unicode_subdivision_id</a>, which is
   2565 					a <a href="#unicode_region_subtag">unicode_region_subtag</a> concatenated
   2566 					with a unicode_subdivision_suffix.<br> For example, <em>gbsct</em> is gb+sct (where sct
   2567 						represents the subdivision code for Scotland). Thus
   2568 					en-GB-u-sd-gbsct represents the language variant English as used
   2569 					in Scotland. And both en-u-sd-usca and en-US-u-sd-usca
   2570 					represent English as used in California. See
   2571 						<strong><em><a href="#Unicode_Subdivision_Codes">3.6.5
   2572 									Subdivision Codes</a></em></strong>.
   2573 				</td>
   2574 			</tr>
   2575 			<tr>
   2576 				<td></td>
   2577 			</tr>
   2578 
   2579 			<tr>
   2580 				<td colspan="4"><strong>A <a
   2581 						href="#UnicodeSentenceBreakSuppressionsIdentifier"
   2582 						name="UnicodeSentenceBreakSuppressionsIdentifier">Unicode
   2583 							Sentence Break Suppressions Identifier</a> defines a set of data to
   2584 						be used for suppressing certain sentence breaks that would
   2585 						otherwise be found by UAX #29 rules. The valid values are those <em>name</em>
   2586 						attribute values in the <em>type</em> elements of key name="ss" in
   2587 						bcp47/<a target="_blank"
   2588 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
   2589 			</tr>
   2590 			<tr>
   2591 				<td rowspan="2">"ss"</td>
   2592 				<td rowspan="2">Sentence break suppressions</td>
   2593 				<td>"none"</td>
   2594 				<td>Don't use sentence break suppressions data (the default).</td>
   2595 			</tr>
   2596 			<tr>
   2597 				<td>"standard"</td>
   2598 				<td>Use sentence break suppressions data of type "standard"</td>
   2599 			</tr>
   2600 
   2601 			<tr>
   2602 				<td colspan="4"><strong>A <a
   2603 						href="#UnicodeTimezoneIdentifier" name="UnicodeTimezoneIdentifier">Unicode
   2604 							Timezone Identifier</a> defines a timezone. The valid values are
   2605 						those name attribute values in the <em>type</em> elements of
   2606 						bcp47/<a target="_blank"
   2607 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/timezone.xml">timezone.xml</a>.
   2608 				</strong></td>
   2609 			</tr>
   2610 			<tr>
   2611 				<td>"tz"<br> (timezone)
   2612 				</td>
   2613 				<td>Time zone</td>
   2614 				<td><i>Unicode short time zone IDs</i></td>
   2615 				<td><p>
   2616 						Short identifiers defined in terms of a TZ time zone database [<a
   2617 							href="#Olson">Olson</a>] identifier in the file
   2618 						common/bcp47/timezone.xml, plus a few extra values.
   2619 					</p>
   2620 					<p>
   2621 						For more information, see <a href="#Time_Zone_Identifiers">Section
   2622 							3.6.3 Time Zone Identifiers</a>.
   2623 					</p>
   2624 					<p>CLDR provides data for normalizing timezone codes.</p></td>
   2625 			</tr>
   2626 			<tr>
   2627 				<td colspan="4"><strong>A <a
   2628 						href="#UnicodeVariantIdentifier" name="UnicodeVariantIdentifier">Unicode
   2629 							Variant Identifier</a> defines a special variant used for locales.
   2630 						The valid values are those name attribute values in the <em>type</em>
   2631 						elements of bcp47/<a target="_blank"
   2632 						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/variant.xml">variant.xml</a>.
   2633 				</strong></td>
   2634 			</tr>
   2635 			<tr>
   2636 				<td>"va"</td>
   2637 				<td>Common variant type</td>
   2638 				<td>"posix"</td>
   2639 				<td>POSIX style locale variant. For handling of the "POSIX"
   2640 					variant, see <i>Section 3.8.2, <a href="#Legacy_Variants">Legacy
   2641 							Variants</a></i>.
   2642 				</td>
   2643 			</tr>
   2644 		</table>
   2645 		<p>
   2646 			For more information on the allowed keys and types, see the specific
   2647 			elements below, and <a href="#Unicode_Locale_Extension_Data_Files">Section
   2648 				3.6.4 U Extension Data Files</a>.
   2649 		</p>
   2650 		<p>Additional keys or types might be added in future versions.
   2651 			Implementations of LDML should be robust to handle any syntactically
   2652 			valid key or type values.</p>
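		<p>As a rough illustration of that robustness requirement, the
			following Python sketch extracts the key/type pairs of the 'u'
			extension from a locale identifier without consulting any list of
			known keys. The function name <code>parse_u_extension</code> is
			hypothetical, and the sketch assumes a syntactically valid
			identifier; it ignores attributes and does not validate types
			against the bcp47 data files.</p>

```python
def parse_u_extension(locale_id):
    """Return a dict mapping each 'u' extension key to its type string.

    Assumption: locale_id is syntactically valid. Unknown keys are kept
    as-is rather than rejected.
    """
    subtags = locale_id.lower().replace("_", "-").split("-")
    if "u" not in subtags:
        return {}
    keywords = {}
    key = None
    for subtag in subtags[subtags.index("u") + 1:]:
        if len(subtag) == 1:
            break  # another singleton starts a different extension
        if len(subtag) == 2:
            key = subtag  # two-character subtags are keys
            keywords[key] = []
        elif key is not None:
            keywords[key].append(subtag)  # longer subtags belong to the type
        # subtags seen before the first key are attributes; ignored here
    return {k: "-".join(v) for k, v in keywords.items()}


print(parse_u_extension("en-GB-u-rg-uszzzz"))
print(parse_u_extension("ar-u-ca-islamic-civil-nu-arab"))
```

		<p>Types made of two subtags, such as "islamic-civil", are rejoined
			with a hyphen so that they can be compared directly with the type
			names in the table above.</p>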
   2653 		<h4>
   2654 			<a href="#Numbering System Data" name="Numbering System Data">3.6.2
   2655 				Numbering System Data </a>
   2656 		</h4>
   2657 		<p>
   2658 			LDML supports multiple numbering systems. The identifiers for those
   2659 			numbering systems are defined in the file <strong>bcp47/number.xml</strong>.
   2660 			For example, for the 'trunk' version of the data see <a
   2661 				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/number.xml">bcp47/number.xml</a>.<br>
   2662 		</p>
   2663 		<p>
   2664 			Details about those numbering systems are defined in <strong>supplemental/numberingSystems.xml</strong>.
   2665 			For example, for the 'trunk' version of the data see <a
   2666 				href="http://unicode.org/repos/cldr/tags/latest/common/supplemental/numberingSystems.xml">supplemental/numberingSystems.xml</a>.<br>
   2667 		</p>
   2668 		<p>
   2669 			LDML makes certain stability guarantees on this data:<br>
   2670 		</p>
   2671 		<ol>
   2672 			<li>Like other BCP 47 identifiers, once a numeric identifier is
   2673 				added to <strong>bcp47/number.xml</strong> or <strong>numberingSystems.xml</strong>,
   2674 				it will never be removed from either of those files.
   2675 			</li>
   2676 			<li>If an identifier has type="numeric" in numberingSystems.xml,
   2677 				then
   2678 				<ol>
   2679 					<li>It is a decimal, positional numbering system with an
   2680 						attribute digits=X, where X is a string with the 10 digits in
   2681 						order used by the numbering system.</li>
   2682 					<li>The values of the type and digits will never change.</li>
   2683 				</ol>
   2684 			</li>
   2685 		</ol>
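		<p>Because a type="numeric" system is guaranteed to be decimal and
			positional, formatting an integer reduces to a digit-for-digit
			substitution. The Python sketch below illustrates this under the
			assumption that the digits string has already been read from
			numberingSystems.xml; the helper name <code>format_with_digits</code>
			and the hard-coded Arabic-Indic digits are illustrative, not part
			of LDML.</p>

```python
# Assumed digits string for "arab": the Arabic-Indic digits U+0660..U+0669.
ARAB_DIGITS = "".join(chr(0x0660 + i) for i in range(10))


def format_with_digits(value, digits):
    """Render an integer using the 10 digits of a numeric numbering system.

    digits must hold exactly 10 characters in order from zero to nine;
    str.maketrans raises ValueError otherwise. A minus sign is left as-is.
    """
    table = str.maketrans("0123456789", digits)
    return str(value).translate(table)


print(format_with_digits(2024, ARAB_DIGITS))
print(format_with_digits(2024, "0123456789"))
```

		<p>Algorithmic systems such as "roman" cannot be handled this way
			and need the rule-based data described in Numbering Systems.</p>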
   2686 		<h4>
   2687 			<a href="#Time_Zone_Identifiers" name="Time_Zone_Identifiers">3.6.3
   2688 				Time Zone Identifiers</a>
   2689 		</h4>
   2690 		<p>
   2691 			LDML inherits time zone IDs from the tz database [<a href="#Olson">Olson</a>].
   2692 			Because these IDs from the tz database do not satisfy the BCP 47
   2693 			language subtag syntax requirements, CLDR defines short identifiers
   2694 			for use in the Unicode locale extension. The short identifiers
   2695 			are defined in the file <strong>common/bcp47/timezone.xml</strong>.
   2696 		</p>
   2697 		<p>
   2698 			The short identifiers use UN/LOCODE [<a href="#LOCODE">LOCODE</a>]
   2699 			(excluding a space character) codes where possible. For example, the
   2700 			short identifier for "America/Los_Angeles" is "uslax" (the LOCODE for
   2701 			Los Angeles, US is "US LAX"). Identifiers of length not equal to 5
   2702 			are used where there is no corresponding UN/LOCODE, such as
   2703 			"usnavajo" for "America/Shiprock", or "utcw01" for "Etc/GMT+1", so
   2704 			that they do not overlap with future UN/LOCODE.
   2705 		</p>
   2706 		<p>Although the first two letters of a short identifier may match
   2707 			an ISO 3166 two-letter country code, a user should not assume that
   2708 			the time zone belongs to the country. The first two letters in an
   2709 			identifier of length not equal to 5 have no meaning. Also, the
   2710 			identifiers are stabilized, meaning that they will not change no
   2711 			matter what changes happen in the base standard. So if Hawaii leaves
   2712 			the US and joins Canada as a new province, the short time zone
   2713 			identifier "ushnl" would not change in CLDR even if the UN/LOCODE
   2714 			changes to "cahnl" or something else.</p>
   2715 		<p>There is a special code "unk" for an Unknown or Invalid time
   2716 			zone. This can be expressed in the tz database style ID
   2717 			"Etc/Unknown", although it is not defined in the tz database.</p>
   2718 		<p>
   2719 			<b>Stability of Time Zone Identifiers</b>
   2720 		</p>
   2721 		<p>
   2722 			Although the short time zone identifiers are guaranteed to be stable,
   2723 			the preferred IDs in the tz database (such as those found in the <strong>zone.tab</strong>
   2724 			file) might change from time to time. For example, "Asia/Calcutta" was
   2725 			replaced with "Asia/Kolkata" and moved to the <strong>backward</strong>
   2726 			file in the tz database. Because CLDR contains locale data that uses a time zone
   2727 			ID from the tz database as the key, stability of the IDs is critical.
   2728 		</p>
   2729 		<p>
   2730 			To maintain the stability of "long" IDs (for those inherited from the
   2731 			tz database), a special rule is applied to the <i>alias</i> attribute in
   2732 			the &lt;type&gt; element for "tz": the first "long" ID is the CLDR
   2733 			canonical "long" time zone ID.
   2734 		</p>
   2735 		<p>For example:</p>
   2736 		<blockquote>&lt;type name="inccu" alias="Asia/Calcutta
   2737 			Asia/Kolkata" description="Kolkata, India"/&gt;</blockquote>
   2738 		<p>
   2739 			The &lt;type&gt; element above defines the short time zone ID "inccu"
   2740 			(for use in the Unicode locale extension), the corresponding <em>CLDR
   2741 				canonical "long" ID</em> "Asia/Calcutta", and an alias "Asia/Kolkata".
   2742 		</p>
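		<p>The first-alias rule can be sketched in Python as follows. The
			element is built in code to mirror the example above rather than
			read from timezone.xml, and the helper name <code>long_ids</code>
			is hypothetical.</p>

```python
import xml.etree.ElementTree as ET


def long_ids(type_elem):
    """Split a tz type element into (short ID, canonical long ID, other aliases).

    The alias attribute is a space-separated list; by the rule above its
    first entry is the CLDR canonical "long" ID.
    """
    aliases = type_elem.get("alias", "").split()
    canonical = aliases[0] if aliases else None
    return type_elem.get("name"), canonical, aliases[1:]


# Hand-built copy of the example element shown above.
sample = ET.Element(
    "type",
    name="inccu",
    alias="Asia/Calcutta Asia/Kolkata",
    description="Kolkata, India",
)
print(long_ids(sample))
```

		<p>When reading the real data file, the same function could be
			applied to each type element found under the key named "tz".</p>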
   2743 		<h4>
   2744 			<a href="#Unicode_Locale_Extension_Data_Files"
   2745 				name="Unicode_Locale_Extension_Data_Files">3.6.4 U Extension
   2746 				Data Files</a>
   2747 		</h4>
   2748 		<p>
   2749 			The 'u' extension data is stored in multiple XML files located under
   2750 			the common/bcp47 directory in CLDR. Each file contains the locale
   2751 			extension key/type values and their backward compatibility mappings
   2752 			appropriate for a particular domain. <a
   2753 				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/collation.xml">common/bcp47/collation.xml</a>
   2754 			contains key/type values for collation, including optional collation
   2755 			parameters and valid type values for each key.
   2756 		</p>
   2757 		<p>
   2758 			The 't' extension data is stored in <a
   2759 				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform.xml">common/bcp47/transform.xml</a>.
   2760 		</p>
   2761 		<p class="dtd">&lt;!ELEMENT keyword ( key* )&gt;</p>
   2762 		<p class="dtd">
   2763 			&lt;!ELEMENT key ( type* )&gt;<br> &lt;!ATTLIST key extension
   2764 			NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST key name NMTOKEN
   2765 			#REQUIRED&gt;<br> &lt;!ATTLIST key description CDATA
   2766 			#IMPLIED&gt;<br> &lt;!ATTLIST key deprecated ( true | false )
   2767 			"false"&gt;<br> &lt;!ATTLIST key preferred NMTOKEN #IMPLIED&gt;<br>
   2768 			&lt;!ATTLIST key alias NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST key valueType (single | multiple
   2769 				| incremental | any) #IMPLIED &gt;<br> &lt;!ATTLIST key since
   2770 			CDATA #IMPLIED&gt;
   2771 		</p>
   2772 		<p class="dtd">
   2773 			&lt;!ELEMENT type EMPTY&gt;<br> &lt;!ATTLIST type name NMTOKEN
   2774 			#REQUIRED&gt;<br> &lt;!ATTLIST type description CDATA
   2775 			#IMPLIED&gt;<br> &lt;!ATTLIST type deprecated ( true | false )
   2776 			"false"&gt;<br> &lt;!ATTLIST type preferred NMTOKEN #IMPLIED&gt;<br>
   2777 			&lt;!ATTLIST type alias CDATA #IMPLIED&gt;<br> &lt;!ATTLIST type
   2778 			since CDATA #IMPLIED&gt;
   2779 		</p>
   2780 		<p class="dtd">
   2781 			&lt;!ELEMENT attribute EMPTY&gt;<br> &lt;!ATTLIST attribute name
   2782 			NMTOKEN #REQUIRED&gt;<br> &lt;!ATTLIST attribute description
   2783 			CDATA #IMPLIED&gt;<br> &lt;!ATTLIST attribute deprecated ( true
   2784 			| false ) "false"&gt;<br> &lt;!ATTLIST attribute preferred
   2785 			NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST attribute since CDATA
   2786 			#IMPLIED&gt;
   2787 		</p>
		<p>The extension attribute in the &lt;key&gt; element specifies the
			BCP 47 language tag extension type. The default value of the
			extension attribute is "u" (Unicode locale extension). The
			&lt;type&gt; element is only applicable to the enclosing &lt;key&gt;.
   2792 		</p>
   2793 		<p>
   2794 			In the Unicode locale extension 'u' and
   2795 				't' data files, the common attributes for the &lt;key&gt;,
   2796 			&lt;type&gt; and &lt;attribute&gt; elements are as follows:
   2797 		</p>
   2798 		<dl>
   2799 			<dt>
   2800 				<b>name</b>
   2801 			</dt>
   2802 			<dd>
   2803 				<p>
					The key or type name used by Unicode locale extension with <a
						href="#Unicode_locale_identifier">'u' extension syntax</a> or the 't' extension syntax. When <i>alias</i>
					below is absent, this name can also be used with the old style <a
						href="#Old_Locale_Extension_Syntax">"@key=type" syntax</a>.
   2808 				</p>
   2809 				<p>
   2810 					Most type names are <strong>literal type names</strong>, which
   2811 					match exactly the same value. All of these have at least one
   2812 					lowercase letter, such as &quot;buddhist&quot;. There are a small
   2813 					number of <strong>indirect type names</strong>, such as
   2814 					&quot;RG_KEY_VALUE&quot;. These have no lowercase letters. The
   2815 					interpretation of each one is listed below.
   2816 				</p>
   2817 				<h5>
   2818 					<a name="CODEPOINTS" href="#CODEPOINTS">CODEPOINTS</a>
   2819 				</h5>
   2820 				<p>
   2821 					The type name <strong>"CODEPOINTS"</strong> is reserved for a
   2822 					variable representing Unicode code point(s). The syntax is:
   2823 				</p>
   2824 				<table border="0">
   2825 					<tr>
   2826 						<th>&nbsp;</th>
   2827 						<th><div align="center">EBNF</div></th>
   2828 						<th><div align="center">ABNF</div></th>
   2829 					</tr>
   2830 					<tr>
   2831 						<td><pre>codepoints</pre></td>
						<td><pre>= codepoint (sep codepoint)*</pre></td>
   2833 						<td><pre>= codepoint *(sep codepoint)</pre></td>
   2834 					</tr>
   2835 					<tr>
   2836 						<td><pre>codepoint</pre></td>
   2837 						<td><pre>= [0-9 A-F a-f]{4,6}</pre></td>
   2838 						<td><pre>= 4*6HEXDIG</pre></td>
   2839 					</tr>
   2840 				</table>
   2841 				<p>In addition, no codepoint may exceed 10FFFF. For example,
   2842 					"00A0", "300b", "10D40C" and "00C1-00E1" are valid, but "A0",
   2843 					"U060C" and "110000" are not.</p>
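				<p>A minimal sketch of a checker for this syntax follows. It
					is illustrative only: the function name
					<code>is_valid_codepoints</code> is hypothetical, it follows the
					ABNF column (any number of code points), and it assumes that
					<code>sep</code> is either "-" or "_".</p>

```python
import re

# One code point: 4 to 6 hexadecimal digits, as in the grammar above.
_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def is_valid_codepoints(value):
    """Return True if value matches the CODEPOINTS syntax and range limit."""
    # Assumption: sep is "-" or "_".
    for part in re.split(r"[-_]", value):
        if _CODEPOINT.fullmatch(part) is None:
            return False
        # No code point may exceed 10FFFF.
        if int(part, 16) > 0x10FFFF:
            return False
    return True
```

				<p>With this sketch, "00A0", "300b", "10D40C" and "00C1-00E1" are
					accepted, while "A0", "U060C" and "110000" are rejected.</p>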
				<p>In the current version of CLDR, the type "CODEPOINTS" is only
					used for the deprecated locale extension key "vt" (variableTop).
					The subtags forming the type for "vt" represent an arbitrary string
					of characters. There is no formal limit on the number of
					characters, although in practice anything above 1 will be rare, and
					anything longer than 4 might be useless. Repetition is allowed: for
					example, 0061-0061 ("aa") is a valid type value for "vt", since the
					sequence may be a collating element. Order is significant: 0061-0062
					("ab") is different from 0062-0061 ("ba"). Note that for
					variableTop any character sequence must be a contraction which
					yields exactly one primary weight.</p>
   2855 				<p>For example,</p>
   2856 				<blockquote>
   2857 					<p>
						<strong>en-u-vt-00A4</strong> : this indicates English, with any
						characters sorting at or below &quot;¤&quot; (at a primary level)
						considered Variable.
   2861 					</p>
   2862 				</blockquote>
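				<p>To see which characters a "vt" value stands for, the subtags
					can be decoded as in the sketch below. The helper name
					<code>decode_codepoints</code> is hypothetical, and the value is
					assumed to be hyphen-separated and already checked against the
					CODEPOINTS syntax.</p>

```python
def decode_codepoints(value):
    """Turn a CODEPOINTS value such as "0061-0062" into the string it encodes."""
    # Each subtag is the hexadecimal number of one code point, in order.
    return "".join(chr(int(part, 16)) for part in value.split("-"))


print(decode_codepoints("0061-0061"))  # aa
print(decode_codepoints("0061-0062"))  # ab
print(decode_codepoints("0062-0061"))  # ba
```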
   2863 				<p>
   2864 					By default in UCA, variable characters are ignored in sorting at a
   2865 					primary, secondary, and tertiary level. But in CLDR, they are not
   2866 					ignorable by default. For more information, see <a
   2867 						href="tr35-collation.html#Setting_Options">Collation: Section
   2868 						3.3 <em>Setting Options</em>
   2869 					</a>.
   2870 				</p>
   2871 
   2872 				<h5>
   2873 					<a name="REORDER_CODE" href="#REORDER_CODE">REORDER_CODE</a>
   2874 				</h5>
   2875 				<p>
   2876 					The type name <strong>"REORDER_CODE"</strong> is reserved for
   2877 					reordering block names (e.g. "latn", "digit" and "others") defined
   2878 					in the <i><a href="tr35-collation.html#Root_Collation">Root
   2879 							Collation</a></i>. The type "REORDER_CODE" is used for locale extension
   2880 					key "kr" (colReorder). The value of type for "kr" is represented by
   2881 					one or more reordering block names such as "latn-digit". For more
   2882 					information, see <a href="tr35-collation.html#Script_Reordering">Collation:
   2883 						Section 3.12 <em>Collation Reordering</em>
   2884 					</a>.
   2885 				</p>
   2886 				<h5>
   2887 					<a name="RG_KEY_VALUE" href="#RG_KEY_VALUE">RG_KEY_VALUE</a>
   2888 				</h5>
   2889 				<p>
   2890 					The type name <strong>"RG_KEY_VALUE"</strong> is reserved for
   2891 					region codes in the format required by the "rg" key; this is a
   2892 					region code from the idValidity data in common/validity/region.xml
   2893 					(with certain exclusions, listed below) followed by "zzzz". The
   2894 					excluded region codes are those with idStatus='unknown' and
   2895 					'macroregion'; region codes with idStatus='deprecated' should not
   2896 					be generated, and those with idStatus='private_use' are only to be
   2897 					used with prior agreement. Thus the value for the "rg" key will
   2898 					normally be a region code with idStatus='regular' followed by
   2899 					"zzzz"; this set of values is the same as the subdivision codes
   2900 					with idStatus='unknown' from the idValidity data in
   2901 					common/validity/subdivision.xml.
   2902 				</p>
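				<p>As a rough sketch of the rule above, the "rg" value can be
					derived from a region code as shown below. The function names
					and the <code>id_status_by_region</code> mapping are hypothetical;
					a real implementation would fill that mapping from
					common/validity/region.xml.</p>

```python
def rg_value(region_code):
    """Build the "rg" type value for a region code, e.g. "US" gives "uszzzz"."""
    return region_code.lower() + "zzzz"


def is_generatable_rg_region(region_code, id_status_by_region):
    """Apply the exclusions described above to one region code."""
    status = id_status_by_region.get(region_code)
    # "unknown" and "macroregion" are excluded, "deprecated" should not be
    # generated, and "private_use" needs prior agreement, so this sketch
    # only lets "regular" through.
    return status == "regular"


# Illustrative data only; the real statuses come from the validity file.
statuses = {"US": "regular", "001": "macroregion", "ZZ": "unknown"}
print(rg_value("US"))                              # uszzzz
print(is_generatable_rg_region("001", statuses))   # False
```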
   2903 				<h5>
   2904 					<a name="SUBDIVISION_CODE" href="#SUBDIVISION_CODE">SUBDIVISION_CODE</a>
   2905 				</h5>
   2906 				<p>
   2907 					The type name <strong>"SUBDIVISION_CODE"</strong> is reserved for
   2908 					subdivision codes in the format required by the "sd" key; this is a
   2909 					subdivision code from the idValidity data in
   2910 					common/validity/subdivision.xml, excluding those with
   2911 					idStatus='unknown'. Codes with idStatus='deprecated' should not be
   2912 					generated, and those with idStatus='private_use' are only to be
   2913 					used with prior agreement.
   2914 				</p>
   2915 				<h5>
   2916 					<a name="PRIVATE_USE" href="#PRIVATE_USE">PRIVATE_USE</a>
   2917 				</h5>
   2918 				<p>
   2919 					The type name <strong>"PRIVATE_USE"</strong> is reserved for
   2920 					private use types. A valid type value is composed of one or more
   2921 					subtags separated by hyphens and each subtag consists of three to
   2922 					eight ASCII alphanumeric characters. In the current version of
   2923 					CLDR, <strong>"PRIVATE_USE"</strong> is only used for transform
   2924 					extension "x0".
   2925 				</p>
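				<p>The syntax described above can be expressed as a regular
					expression. The following is a minimal sketch; the function name
					<code>is_valid_private_use</code> is hypothetical.</p>

```python
import re

# One or more hyphen-separated subtags, each 3 to 8 ASCII alphanumerics.
_PRIVATE_USE = re.compile(r"[0-9A-Za-z]{3,8}(-[0-9A-Za-z]{3,8})*")


def is_valid_private_use(value):
    """Return True if value is a well-formed PRIVATE_USE type value."""
    return _PRIVATE_USE.fullmatch(value) is not None
```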
   2926 
   2927 			</dd>
   2928 			<dt>
   2929 				<b>valueType</b>
   2930 			</dt>
   2931 			<dd>
   2932 				<p>The valueType attribute indicates how many
   2933 					subtags are valid for a given key:</p>
   2934 				<table class='simple' width="100%" border="1">
   2935 					<tbody>
   2936 						<tr>
   2937 							<th>single</th>
   2938 							<td>Either exactly one type value, or no type value (but only if the value of &quot;true&quot; would be valid). This is the default
   2939 								if no valueType attribute is present.</td>
   2940 						</tr>
   2941 						<tr>
   2942 							<th>incremental</th>
   2943 							<td>Multiple type values are allowed, but only if a prefix
   2944 								is also present, and the sequence is explicitly listed. Each
   2945 								successive type value indicates a refinement of its prefix. For
   2946 								example:<br> &lt;key name=&quot;ca&quot;
   2947 								description=&quot;Calendar algorithm key&quot;<strong>
   2948 									valueType=&quot;incremental&quot;</strong>&gt; <br>&nbsp;&nbsp;&lt;type
   2949 								name=&quot;islamic&quot; description=&quot;Islamic
   2950 								calendar&quot;/&gt;<br> &nbsp;&nbsp;&lt;type
   2951 								name=&quot;islamic-umalqura&quot; description=&quot;Islamic
   2952 								calendar, Umm al-Qura&quot;/&gt;<br> Thus <em>ca-islamic-umalqura</em>
   2953 								is valid. However, <em>ca-gregory-japanese</em> is not valid,
   2954 								because &quot;gregory-japanese&quot; is not listed as a type.
   2955 							</td>
   2956 						</tr>
   2957 						<tr>
   2958 							<th>multiple</th>
   2959 							<td>Multiple type values are allowed, but each may only
   2960 								occur once. For example:<br>&lt;key name=&quot;kr&quot;
   2961 								description=&quot;Collation reorder codes&quot; <strong>valueType=&quot;multiple&quot;</strong>&gt;<br>
   2962 								&nbsp;&nbsp;&lt;type name=&quot;REORDER_CODE&quot; /&gt;
   2963 							</td>
   2964 						</tr>
   2965 						<tr>
   2966 							<th>any</th>
   2967 							<td>Any number of type values are allowed, with none of the
   2968 								above restrictions. For example:<br> &lt;key
   2969 								extension=&quot;t&quot; name=&quot;x0&quot;<strong> </strong>description=&quot;Private
   2970 								use transform type key.&quot;<strong>
   2971 									valueType=&quot;any&quot;</strong>&gt;<br> &nbsp;&nbsp;&lt;type
   2972 								name=&quot;PRIVATE_USE&quot; /&gt;
   2973 							</td>
   2974 						</tr>
   2975 					</tbody>
   2976 				</table>
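				<p>The four cases in the table can be sketched as a single
					check. This is an illustration under stated assumptions, not a
					definitive implementation: <code>check_type_values</code> and
					<code>is_listed</code> are hypothetical names, where
					<code>is_listed</code> is a caller-supplied function that says
					whether one type name (literal or matched by an indirect type
					name) is listed for the key. For "any", the sketch assumes each
					value is still checked individually.</p>

```python
def check_type_values(values, value_type, is_listed):
    """Check the type subtags that follow one key against its valueType.

    values     -- list of type values after the key, e.g. ["islamic", "umalqura"]
    value_type -- "single", "incremental", "multiple" or "any"
    is_listed  -- function telling whether one type name is listed for the key
    """
    if value_type == "single":
        if not values:
            # No value is only allowed if "true" would be a valid value.
            return is_listed("true")
        return len(values) == 1 and is_listed(values[0])
    if value_type == "incremental":
        # The whole hyphen-joined sequence must be explicitly listed.
        return bool(values) and is_listed("-".join(values))
    if value_type == "multiple":
        # Each value must be listed and may occur only once.
        return (bool(values)
                and len(set(values)) == len(values)
                and all(is_listed(v) for v in values))
    if value_type == "any":
        return all(is_listed(v) for v in values)
    raise ValueError("unknown valueType: " + value_type)


calendar = {"gregory", "japanese", "islamic", "islamic-umalqura"}
print(check_type_values(["islamic", "umalqura"], "incremental", calendar.__contains__))  # True
print(check_type_values(["gregory", "japanese"], "incremental", calendar.__contains__))  # False
```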
   2977 			</dd>
   2978 			<dt>
   2979 				<b>description</b>
   2980 			</dt>
   2981 			<dd>
   2982 				<p>
   2983 					The description of the key, type or attribute element. There is
   2984 					also some informative text about certain keys and types in the
   2985 					Section 3.5 <a href="#Key_And_Type_Definitions_">Key And Type
   2986 						Definitions</a>.
   2987 				</p>
   2988 			</dd>
   2989 			<dt>
   2990 				<b>deprecated</b>
   2991 			</dt>
   2992 			<dd>
				<p>The deprecation status of the key, type or attribute element.
					The value "true" indicates the element is deprecated and no longer
					used in the current version of CLDR. The default value is "false".</p>
   2996 			</dd>
   2997 			<dt>
   2998 				<b>preferred</b>
   2999 			</dt>
   3000 			<dd>
   3001 				<p>The preferred value of the deprecated key, type or attribute
   3002 					element. When a key, type or attribute element is deprecated, this
   3003 					attribute is used for specifying a new canonical form if available.</p>
   3004 			</dd>
   3005 			<dt>
   3006 				<b>alias</b> (Not applicable to &lt;attribute&gt;)
   3007 			</dt>
   3008 			<dd>
   3009 				<p>The BCP 47 form is the canonical form, and recommended. Other
   3010 			  aliases are included only for backwards compatibility.</p>
   3011 			</dd>
   3012 			<dd>
   3013 				<em>Example:</em>
   3014 			</dd>
   3015 			<dd>
   3016 				<p>
   3017 					&lt;type name="phonebk" <strong>alias="phonebook"</strong>
   3018 					description="Phonebook style ordering (such as in German)"/&gt;<br>
   3019 				</p>
   3020 				The preferred term, and the only one to be used in BCP 47, is the
   3021 				name: in this example, &quot;phonebk&quot;.<br>
   3022 			</dd>
   3023 			<dd>
   3024 				<p>
   3025 					The alias is a key or type name used by Unicode locale extensions
   3026 					with the old <a href="#Old_Locale_Extension_Syntax">"@key=type"
   3027 						syntax</a>. The attribute value for type may contain multiple names
   3028 					delimited by ASCII space characters. Of those aliases, the first
   3029 					name is the preferred value.
   3030 				</p>
   3031 			</dd>
   3032 			<dt>
   3033 				<b>since</b>
   3034 			</dt>
   3035 			<dd>The version of CLDR in which this key or type was
   3036 				introduced. Absence of this attribute value implies the key or type
   3037 				was available in CLDR 1.7.2.</dd>
   3038 		</dl>
   3039 		<p>
   3040 			<em>Note: There are no values defined for the locale extension
   3041 				attribute in the current CLDR release. </em>
   3042 		</p>
   3043 		<p>For example,</p>
   3044 		<pre>
   3045 &lt;key name="co" alias="collation" description="Collation type key"&gt;
   3046   &lt;type name="pinyin" description="Pinyin ordering for Latin and for CJK characters (used in Chinese)"/&gt;
   3047 &lt;/key&gt;
   3048 
   3049 &lt;key name="ka" alias="colAlternate" description="Collation parameter key for alternate handling"&gt;
   3050   &lt;type name="noignore" alias="non-ignorable" description="Variable collation elements are not reset to ignorable"/&gt;
   3051   &lt;type name="shifted" description="Variable collation elements are reset to zero at levels one through three"/&gt;
   3052 &lt;/key&gt;
   3053 
   3054 &lt;key name="tz" alias="timezone"&gt;
   3055   ...
   3056   &lt;type name="aumel" alias="Australia/Melbourne Australia/Victoria" description="Melbourne, Australia"/&gt;
   3057   &lt;type name="aumqi" alias="Antarctica/Macquarie" description="Macquarie Island Station, Macquarie Island" since="1.8.1"/&gt;
   3058   ...
   3059 &lt;/key&gt;
   3060     </pre>
   3061 		The data above indicates:
   3062 		<ul>
   3063 			<li>type "pinyin" is valid for key "co", thus "u-co-pinyin" is a
   3064 				valid Unicode locale extension.</li>
   3065 			<li>type "pinyin" is not valid for key "ka", thus "u-ka-pinyin"
   3066 				is not a valid Unicode locale extension.</li>
   3067 			<li>type "pinyin" has no <i>alias</i>, so "zh@collation=pinyin"
   3068 				is a valid Unicode locale identifier according to the old syntax.
   3069 			</li>
   3070 			<li>type "noignore" has an alias attribute, so
   3071 				"en@colAlternate=noignore" is not a valid Unicode locale identifier
   3072 				according to the old syntax.</li>
   3073 			<li>type "aumel" is valid for key "tz", supported by CLDR 1.7.2
   3074 				(default value) or later versions.</li>
   3075 			<li>type "aumqi" is valid for key "tz", supported by CLDR 1.8.1
   3076 				or later versions.</li>
   3077 		</ul>
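		<p>The first four points can be sketched as lookups over the data.
			The tables below are a hand-copied subset of the example above
			(name mapped to alias, or <code>None</code> when there is no alias);
			the names <code>KEYS</code>, <code>TYPES</code>,
			<code>is_valid_u_pair</code> and <code>old_syntax_names</code> are
			hypothetical.</p>

```python
# Hand-copied subset of the example data: name -> alias (None if absent).
KEYS = {"co": "collation", "ka": "colAlternate"}
TYPES = {
    "co": {"pinyin": None},
    "ka": {"noignore": "non-ignorable", "shifted": None},
}


def is_valid_u_pair(key, type_name):
    """True if type_name is listed for key, so "u-key-type" is valid."""
    return type_name in TYPES.get(key, {})


def old_syntax_names(key, type_name):
    """Return the (old key, old type) pair for the "@key=type" syntax."""
    if not is_valid_u_pair(key, type_name):
        raise ValueError("type not valid for key")
    # When an alias is present it is the old-syntax name; otherwise the
    # BCP 47 name itself can be used with the old syntax.
    old_key = KEYS[key] or key
    old_type = TYPES[key][type_name] or type_name
    return old_key, old_type


print(is_valid_u_pair("co", "pinyin"))     # True
print(is_valid_u_pair("ka", "pinyin"))     # False
print(old_syntax_names("co", "pinyin"))    # ('collation', 'pinyin')
print(old_syntax_names("ka", "noignore"))  # ('colAlternate', 'non-ignorable')
```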
   3078 		<p>It is strongly recommended that all API methods accept all
   3079 			possible aliases for keywords and types, but generate the canonical
   3080 			form. For example, &quot;ar-u-ca-islamicc&quot; would be equivalent
   3081 			to &quot;ar-u-ca-islamic-civil&quot; on input, but the latter should
   3082 			be output. The one exception is where an alias would only be
   3083 			well-formed with the old syntax, such as &quot;gregorian&quot; (for
   3084 			&quot;gregory&quot;).</p>
   3085 		<h4>
   3086 			<a href="#Unicode_Subdivision_Codes" name="Unicode_Subdivision_Codes">3.6.5
   3087 				Subdivision Codes</a>
   3088 		</h4>
   3089 		<p>
			The subdivision codes designate a
				subdivision of a country or region. Subdivisions go by various names,
				such as a <em>state</em> in the United States, or a <em>province</em>
				in Canada. The codes in CLDR
			are based on ISO 3166-2 subdivision codes. The
				ISO codes have a region code followed by a hyphen, then a suffix
				consisting of one to three ASCII letters or digits.
   3097 		</p>
   3098 		<p>
   3099 			The CLDR codes are designed to work in a
   3100 				<a href='#unicode_locale_id'>unicode_locale_id</a> (BCP47), and are
   3101 				thus all lowercase, with no hyphen.
   3102 			For example, the following are valid, and mean English as used in
   3103 			California, USA.
   3104 		</p>
   3105 		<ul>
   3106 			<li>en-u-sd-<strong>usca</strong></li>
   3107 			<li>en-US-u-sd-<strong>usca</strong></li>
   3108 		</ul>
   3109 		<p>CLDR has additional subdivision codes. These
   3110 			may start with a 3-digit region code or use a suffix of 4 ASCII
   3111 			letters or digits, so they will not collide with the ISO codes.
   3112 			Subdivision codes for unknown values are the region code plus
   3113 			&quot;zzzz&quot;, such as &quot;uszzzz&quot; for an unknown
   3114 			subdivision of the US. Other codes may be added for stability.</p>
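		<p>The relationship between the ISO form and the CLDR form can be
			sketched as follows. The function names are hypothetical, and the
			sketch only rewrites the code; it does not check that the result is
			present in the validity data.</p>

```python
def iso_to_cldr_subdivision(iso_code):
    """Convert an ISO 3166-2 code such as "US-CA" to the CLDR form "usca"."""
    region, sep, suffix = iso_code.partition("-")
    if not sep or not region or not suffix:
        raise ValueError("expected REGION-SUFFIX, got " + repr(iso_code))
    # CLDR codes are all lowercase and have no hyphen.
    return (region + suffix).lower()


def unknown_subdivision(region_code):
    """Code for an unknown subdivision of a region, e.g. "US" gives "uszzzz"."""
    return region_code.lower() + "zzzz"


print(iso_to_cldr_subdivision("US-CA"))  # usca
print(unknown_subdivision("US"))         # uszzzz
```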
   3115 		<p>
			Like BCP 47, CLDR requires stable codes, which are not guaranteed for
			ISO 3166-2 (nor have the ISO 3166-2
				codes been stable in the past). If an ISO 3166-2 code is removed, it
			remains valid (though marked as deprecated) in CLDR. If an ISO 3166-2
			code is reused (for the same region), then CLDR will define a new
			equivalent code using one of these 4-character suffixes.
   3122 	  </p>
   3123 		<h5>
   3124 			<a name="Validity" href="#Validity">3.6.5.1 Validity</a>
   3125 		</h5>
   3126 		<p>
   3127 			A <a href="#unicode_subdivision_id">unicode_subdivision_id</a>
   3128 			is only valid when it is present in the
   3129 				subdivision.xml file as described in <em>Section 3.11 <a
   3130 					href="#Validity_Data">Validity Data</a></em>.
   3131 			The data is in a compressed form, and thus needs to be expanded
   3132 			before such a test is made.
   3133 		</p>
   3134 		<p>
   3135 			<em> Examples:<br>
   3136 			</em>
   3137 		</p>
   3138 		<ul>
			<li><strong>usca</strong> is valid: there is an <strong>id</strong>
				element <code>&lt;id type="subdivision"&gt; usca
					&lt;/id&gt;</code></li>
			<li><strong>ussct</strong> is invalid: there is no <strong>id</strong>
				element <code>&lt;id type="subdivision"&gt; ussct
					&lt;/id&gt;</code></li>
   3145 		</ul>
   3146 		<p>If a <a href='#unicode_locale_id'>unicode_locale_id</a> contains both a <a
   3147 				href="#unicode_region_subtag">unicode_region_subtag</a> and a <a
   3148 				href="#unicode_subdivision_id">unicode_subdivision_id</a>, it is only valid if the <a
   3149 				href="#unicode_subdivision_id">unicode_subdivision_id</a> starts with the <a
   3150 				href="#unicode_region_subtag">unicode_region_subtag</a> (case-insensitively).<br>
   3151 		</p>
		<p>It is recommended that a <a href='#unicode_locale_id'>unicode_locale_id</a> contain a <a
   3153 				href="#unicode_region_subtag">unicode_region_subtag</a> if it contains a <a
   3154 				href="#unicode_subdivision_id">unicode_subdivision_id</a> and the region would not be added by adding likely subtags. That produces better behavior if the <a
   3155 				href="#unicode_subdivision_id">unicode_subdivision_id</a> is ignored by an implementation or if the language tag is truncated.		</p>
   3156 		<p>
   3157 			Examples:<br>
   3158 		</p>
   3159 		<ul>
			<li>en-<strong>US</strong>-u-sd-<strong>us</strong>ca
				is valid: the region &quot;US&quot; matches
			the first part of "usca".</li>
			<li>en-u-sd-<strong>us</strong>ca is valid: it still works after adding likely subtags.</li>
			<li>en-<strong>CA</strong>-u-sd-<strong>gb</strong>sct is
				invalid: the region &quot;CA&quot; does not match the first part of &quot;gbsct&quot;. An implementation should disregard the subdivision id (or return an error).</li>
			<li>en-u-sd-<strong>gb</strong>sct is valid but not recommended: an implementation that ignores the <a
				href="#unicode_subdivision_id">unicode_subdivision_id</a> can get the wrong fallback behavior, or could add likely subtags and get the invalid en<strong>-Latn-US</strong>-u-sd-<strong>gb</strong>sct.</li>
   3168 		</ul>
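		<p>The region and subdivision consistency rule can be sketched as a
			simple prefix test. The function name
			<code>subdivision_matches_region</code> is hypothetical; the sketch
			covers only this one rule and assumes the subdivision id has
			already been checked against the validity data.</p>

```python
def subdivision_matches_region(region_subtag, subdivision_id):
    """Check that a subdivision id starts with the locale's region subtag.

    region_subtag may be None when the locale has no region subtag; the
    rule then does not apply and the function returns True.
    """
    if region_subtag is None:
        return True
    # The comparison is case-insensitive.
    return subdivision_id.lower().startswith(region_subtag.lower())


print(subdivision_matches_region("US", "usca"))   # True
print(subdivision_matches_region("CA", "gbsct"))  # False
print(subdivision_matches_region(None, "gbsct"))  # True
```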
   3169 		<p>
   3170 			In version 28.0, the subdivisions in the
   3171 			validity files used the ISO format, uppercase with a hyphen separating two
   3172 			components, instead of the BCP 47 format.
   3173 	  </p>
   3174 		<h3>
   3175 			<a name="t_Extension"></a><a name="BCP47_T_Extension"
   3176 				href="#BCP47_T_Extension">3.7 Unicode BCP 47 T Extension</a>
   3177 		</h3>
   3178 		<p>
   3179 			The Unicode Consortium has registered and is the maintaining
   3180 			authority for two BCP 47 language tag extensions: the extension 'u'
   3181 			for Unicode locale extension [<a href="#RFC6067">RFC6067</a>] and
   3182 			extension 't' for transformed content [<a href="#RFC6497">RFC6497</a>].
   3183 			The Unicode BCP 47 extension data defines the complete list of valid
   3184 			subtags.
		While the title of the RFC is &ldquo;Transformed Content&rdquo;, the abstract makes it clear that the scope is broader than the term "transformed" might indicate to a casual reader: including content that has been transliterated, transcribed, or
        translated, or <em>in some other way influenced by the source. It also provides for additional information used for identification.</em></p>
   3187 		<p>
   3188 			<strong>The -t- Extension.</strong> The syntax of 't' extension
   3189 			subtags is defined by the rule
   3190 			<code>unicode_locale_extensions</code>
   3191 			in <a href="#Unicode_locale_identifier"><em>Section 3.2
					Unicode locale identifier</em></a>, except that the separator of subtags
			<code>sep</code>
			must always be a hyphen '-' when the extension is used as part of a BCP
			47 language tag. For information about the registration process,
   3196 			meaning, and usage of the 't' extension, see [<a href="#RFC6497">RFC6497</a>].
   3197 		</p>
   3198 		<p>
			These subtags are all in lowercase (that is the canonical casing for
			these subtags); however, subtags are case-insensitive and casing does
			not carry any specific meaning. All subtags within the Unicode
			extensions consist of two to eight alphanumeric characters and
			meet the rule
			<code>extension</code>
			in [<a href="#BCP47">BCP47</a>].</p>
   3206 	  <p>The following keys are defined for the -t- extension:</p>
   3207 		<table class='simple'>
   3208 		  <tbody>
   3209 		    <tr>
   3210 		      <th>Keys</th>
   3211 		      <th>Description</th>
   3212 		      <th>Values in latest release</th>
   3213 	        </tr>
   3214 		    <tr>
   3215 		      <td>m0</td>
   3216 		      <td><strong>Transform extension mechanism:</strong> to reference an authority or rules for a type of transformation</td>
   3217 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform.xml">transform.xml</a></td>
   3218 	        </tr>
   3219 		    <tr>
   3220 		      <td nowrap>s0, d0 </td>
   3221 		      <td><strong>Transform source/destination:</strong> for non-languages/scripts, such as fullwidth-halfwidth conversion.</td>
   3222 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform-destination.xml">transform-destination.xml</a></td>
   3223 	        </tr>
   3224 		    <tr>
   3225 		      <td>i0</td>
   3226 		      <td><strong>Input Method Engine transform:</strong> Used to indicate an input method transformation, such as one used by 
   3227 a client-side input method. The first subfield in a sequence would 
   3228 typically be a 'platform' or vendor designation.</td>
   3229 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_ime.xml">transform_ime.xml</a></td>
   3230 	        </tr>
   3231 		    <tr>
   3232 		      <td>k0</td>
   3233 		      <td><strong>Keyboard transform:</strong> Used to indicate a keyboard transformation, such as one used by a client-side virtual keyboard. The first subfield in a sequence would typically be a 'platform' designation, representing the platform that the keyboard is intended for. The keyboard might or might not correspond to a keyboard mapping shipped by the vendor for the platform. One or more subsequent fields may occur, but are only added where needed to distinguish from others.</td>
   3234 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_keyboard.xml">transform_keyboard.xml</a></td>
   3235 	        </tr>
   3236 		    <tr>
   3237 		      <td>t0</td>
   3238 		      <td><strong>Machine Translation:</strong> Used to indicate content that has been machine translated, or a request for a particular type of machine translation of content. The first subfield in a sequence would typically be a 'platform' or vendor designation.</td>
   3239 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_mt.xml">transform_mt.xml</a></td>
   3240 	        </tr>
   3241 		    <tr>
   3242 		      <td nowrap>h0</td>
   3243 		      <td><strong>Hybrid Locale Identifiers:</strong> h0 with the value 'hybrid' indicates that the -t- value is a language that is mixed into the main  language tag to form a hybrid.  		For more information, and examples, see <em>Section 3.10.2 <a href="#Hybrid_Locale">Hybrid Locale Identifiers</a>.</em></td>
   3244 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_hybrid.xml">transform_hybrid.xml</a></td>
   3245 	        </tr>
   3246 			    <tr>
   3247 		      <td>x0</td>
   3248 		      <td><strong>Private use transform</strong></td>
   3249 		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_private_use.xml">transform_private_use.xml</a></td>
   3250 	        </tr>
   3251       </tbody>
   3252 	  </table>
   3253 		<h4>
   3254 			<a href="#Transformed_Content_Data_File"
   3255 				name="Transformed_Content_Data_File">3.7.1 T Extension Data
   3256 				Files</a>
   3257 		</h4>
		<p>The overall structure of the data files is similar to that of the U
			Extension, with the following exceptions.</p>
   3260 		<p>In the transformed content 't' data file, the name attribute in
   3261 			a &lt;key&gt; element defines a valid field separator subtag. The
   3262 			name attribute in an enclosed &lt;type&gt; element defines a valid
   3263 			field subtag for the field separator subtag. For example:</p>
   3264 		<pre>
&lt;key extension="t" name="m0"
    description="Transform extension mechanism"&gt;
  &lt;type name="ungegn"
      description="United Nations Group of Experts on Geographical Names"
      since="21"/&gt;
&lt;/key&gt;
   3271 </pre>
   3272 		The data above indicates:
   3273 		<ul>
   3274 			<li>"m0" is a valid field separator for the transformed content
   3275 				extension 't'.</li>
   3276 			<li>field subtag "ungegn" is valid for field separator "m0".</li>
   3277 			<li>field subtag "ungegn" was introduced in CLDR 21.</li>
   3278 		</ul>
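		<p>These three points can be sketched as a lookup that also honors
			the <i>since</i> attribute. The table is hand-copied from the
			example above, and the names <code>T_FIELDS</code>,
			<code>parse_version</code> and <code>is_valid_t_field</code> are
			hypothetical. Version strings are assumed to be dot-separated
			integers.</p>

```python
# Hand-copied from the example: field separator -> {field subtag: since}.
T_FIELDS = {"m0": {"ungegn": "21"}}


def parse_version(text):
    """Turn a version such as "1.8.1" into a tuple of integers for comparison."""
    return tuple(int(part) for part in text.split("."))


def is_valid_t_field(separator, field_subtag, cldr_version):
    """True if field_subtag is listed for separator in the given CLDR version."""
    since = T_FIELDS.get(separator, {}).get(field_subtag)
    if since is None:
        return False
    return parse_version(cldr_version) >= parse_version(since)


print(is_valid_t_field("m0", "ungegn", "21"))   # True
print(is_valid_t_field("m0", "ungegn", "1.9"))  # False
```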
   3279 		<p>The attributes are:</p>
   3280 		<dl>
   3281 			<dt>
   3282 				<b>name</b>
   3283 			</dt>
   3284 			<dd>
   3285 				The name of the mechanism, limited to 3-8 characters (or sequences
   3286 				of them). Any indirect type names are
   3287 					listed in 3.6.4 <a href="#Unicode_Locale_Extension_Data_Files">U
   3288 						Extension Data Files</a>.
   3289 		  </dd>
   3290 			<dt>
   3291 				<b>description</b>
   3292 			</dt>
			<dd>A description of the name, with all and only that
				information necessary to distinguish one name from
				others with which it might be confused. Descriptions are not
				intended to provide general background information.</dd>
   3297 			<dt>
   3298 				<b>since</b>
   3299 			</dt>
   3300 			<dd>Indicates the first version of CLDR where the name appears.
   3301 				(Required for new items.)</dd>
   3303 			<dt>
   3304 				<b>alias</b>
   3305 			</dt>
   3306 			<dd>
   3307 				Alternative name, not limited in number of characters. Aliases are
   3308 				intended for compatibility, not to provide all possible alternate
   3309 				names or designations. <em>(Optional)</em>
   3310 			</dd>
   3311 		</dl>
   3312 		<p>
   3313 			For information about the registration process, meaning, and usage of
   3314 			the 't' extension, see [<a href="#RFC6497">RFC6497</a>].
   3315 		</p>
   3316 		<h3>
   3317 			<a name="Compatibility_with_Older_Identifiers"
   3318 				href="#Compatibility_with_Older_Identifiers">3.8 Compatibility
   3319 				with Older Identifiers</a>
   3320 		</h3>
		<p>LDML versions before 1.7.2 used a slightly different syntax for
			variant subtags and locale extensions. Implementations of LDML may
			provide backward compatible identifier support as described in
			the following sections.</p>
   3325 
   3326 		<h4>
   3327 			<a name="Old_Locale_Extension_Syntax"
   3328 				href="#Old_Locale_Extension_Syntax">3.8.1 Old Locale Extension
   3329 				Syntax </a>
   3330 		</h4>
		<p>The LDML 1.7 and older specifications used a different syntax for
			representing Unicode locale extensions. The previous definition of
			Unicode locale extensions had the following structure:</p>
   3334 		<table border="0">
   3335 			<tr>
   3336 				<th>&nbsp;</th>
   3337 				<th><div align="center">EBNF</div></th>
   3338 				<th><div align="center">ABNF</div></th>
   3339 			</tr>
   3340 			<tr>
   3341 				<td>old_unicode_locale_extensions</td>
   3342 				<td><pre>= "@" old_key "=" old_type
   3343  (";" old_key "=" old_type)*</pre></td>
   3344 				<td><pre>= "@" old_key "=" old_type
   3345 *(";" old_key "=" old_type)</pre></td>
   3346 			</tr>
   3347 		</table>
		<p>The new specification requires keys to be two alphanumeric
			characters and types to be three to eight alphanumeric characters. As
			a result, new codes were assigned to all existing keys and some
			types. For example, a new key "co" replaced the previous key
			"collation", and a new type "phonebk" replaced the previous type
			"phonebook". However, the existing collation type "big5han" already
			satisfied the new requirement, so no new type code was assigned to
			that type. All new keys and types introduced after LDML 1.7 satisfy
			the new requirement, so they do not have aliases dedicated to the
			old syntax, except for time zone types. The conversion between old types
			and new types can be done regardless of key, with one known exception
			(old type "traditional" is mapped to new type "trad" for collation
			and "traditio" for numbering system), and this relationship will be
			maintained in future versions unless otherwise noted.</p>
		<p>
			The new specification introduced a new field,
			<code>attribute</code>,
			in addition to key/type pairs in the Unicode locale extension. When
			it is necessary to map a new Unicode locale identifier with an
			<code>attribute</code>
			field to a well-formed old locale identifier, a special key name <i>attribute</i>
			is used, with the entire sequence of
			<code>attribute</code>
			subtags in the new identifier as its value. For example, the new identifier
			<code>ja-u-xxx-yyy-ca-japanese</code>
			is mapped to the old identifier
			<code>ja@attribute=xxx-yyy;calendar=japanese</code>.
		</p>
   3377 		<p>The chart below shows some example mappings between the new
   3378 			syntax and the old syntax.</p>
   3379 
   3380 		<table>
   3381 			<caption>
   3382 				<a name="Locale_Extension_Mappings"
   3383 					href="#Locale_Extension_Mappings">Locale Extension Mappings</a>
   3384 			</caption>
   3385 			<tr>
   3386 				<th>Old (LDML 1.7 or older)</th>
   3387 				<th>New</th>
   3388 			</tr>
   3389 			<tr>
   3390 				<td>de_DE@collation=phonebook</td>
   3391 				<td>de_DE_u_co_phonebk</td>
   3392 			</tr>
   3393 			<tr>
   3394 				<td>zh_Hant_TW@collation=big5han</td>
   3395 				<td>zh_Hant_TW_u_co_big5han</td>
   3396 			</tr>
   3397 			<tr>
   3398 				<td>th_TH@calendar=gregorian;numbers=thai</td>
   3399 				<td>th_TH_u_ca_gregory_nu_thai</td>
   3400 			</tr>
   3401 			<tr>
   3402 				<td>en_US_POSIX@timezone=America/Los_Angeles</td>
   3403 				<td>en_US_u_tz_uslax_va_posix</td>
   3404 			</tr>
   3405 		</table>
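		<p>The mappings in the chart above can be sketched in code. The
			following Python fragment is a minimal illustration, not a reference
			implementation: the function name <code>old_to_new</code> and the two
			lookup tables are assumptions made for this example, and the tables
			contain only the keys and types that appear in this section. A real
			converter would load the full alias data and would also handle the
			key-dependent "traditional" exception noted earlier.</p>

```python
# Illustrative subsets only (assumption): just the keys and types used in
# the examples of this section, not the complete alias data.
OLD_TO_NEW_KEY = {
    "collation": "co",
    "calendar": "ca",
    "numbers": "nu",
    "timezone": "tz",
}
OLD_TO_NEW_TYPE = {
    "phonebook": "phonebk",
    "gregorian": "gregory",
    "America/Los_Angeles": "uslax",
}


def old_to_new(old_id, sep="_"):
    """Convert an old-syntax ID (base@key=type;key=type) to the new syntax."""
    base, _, keywords = old_id.partition("@")
    subtags = base.split("_")
    extension = {}   # new key -> new type
    attributes = []  # subtags of the special "attribute" key, if any

    # The legacy POSIX variant is expressed as the keyword "va-posix".
    if "POSIX" in subtags:
        subtags.remove("POSIX")
        extension["va"] = "posix"

    for pair in filter(None, keywords.split(";")):
        old_key, _, old_type = pair.partition("=")
        if old_key == "attribute":
            attributes = old_type.split("-")
            continue
        new_key = OLD_TO_NEW_KEY.get(old_key, old_key)
        extension[new_key] = OLD_TO_NEW_TYPE.get(old_type, old_type)

    if not extension and not attributes:
        return sep.join(subtags)

    parts = subtags + ["u"] + attributes
    for key in sorted(extension):  # keywords are emitted in key order
        parts += [key, extension[key]]
    return sep.join(parts)


print(old_to_new("en_US_POSIX@timezone=America/Los_Angeles"))
print(old_to_new("ja@attribute=xxx-yyy;calendar=japanese", sep="-"))
```

		<p>The <code>sep</code> parameter exists only because the chart above
			writes the new form with "_" while the running text uses "-"; both
			are treated as equivalent on input.</p>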
   3406 
		<p>Where an API for the old syntax is supplied a BCP 47 language
			tag, or vice versa, the recommendation is to:</p>
   3409 		<ol>
   3410 			<li>Have all methods that take the old syntax also take the new
   3411 				syntax, interpreted correctly. For example,
   3412 				&quot;zh-TW-u-co-pinyin&quot; and &quot;zh_TW@collation=pinyin&quot;
   3413 				would both be interpreted as meaning the same.</li>
   3414 			<li>Have all methods (both for old and new syntax) accept all
   3415 				possible aliases for keywords and types. For example,
   3416 				&quot;ar-u-ca-islamicc&quot; would be equivalent to
   3417 				&quot;ar-u-ca-islamic-civil&quot;.
   3418 				<ul>
   3419 					<li>The one exception is where an alias would only be
   3420 						well-formed with the old syntax, such as &quot;gregorian&quot;
   3421 						(for &quot;gregory&quot;).</li>
   3422 				</ul>
   3423 			</li>
   3424 			<li>Where an API cannot successfully accept the alternate
   3425 				syntax, throw an exception (or otherwise indicate an error) so that
   3426 				people can detect that they are using the wrong method (or wrong
   3427 				input).</li>
   3428 			<li>Provide a method that tests a purported locale ID string to
   3429 				determine its status:
   3430 				<ol>
   3431 					<li><strong>well-formed</strong> - syntactically correct</li>
   3432 					<li><strong>valid</strong> - well-formed and only uses
   3433 						registered language subtags, extensions, keywords, types...</li>
   3434 					<li><strong>canonical</strong> - valid and no deprecated codes
   3435 						or structure.</li>
   3436 				</ol>
   3437 			</li>
   3438 		</ol>
   3439 
   3440 		<h4>
   3441 			<a name="Legacy_Variants" href="#Legacy_Variants">3.8.2 Legacy
   3442 				Variants </a>
   3443 		</h4>
   3444 		<p>
			Older versions of the LDML specification allowed codes other than registered [<a
				href="#BCP47">BCP47</a>] variant subtags to be used in Unicode language
			and locale identifiers for representing variations of locale data.
			Unicode locale identifiers including such variant codes can be
			converted to the new [<a href="#BCP47">BCP47</a>] compatible
			identifiers by following the descriptions below:
   3451 		</p>
   3452 		<table>
   3453 			<caption>
   3454 				<a name="Legacy_Variant_Mappings" href="#Legacy_Variant_Mappings">Legacy
   3455 					Variant Mappings</a>
   3456 			</caption>
   3457 			<tr>
   3458 				<th>Variant Code</th>
   3459 				<th>Description</th>
   3460 			</tr>
   3461 
   3462 			<tr>
   3463 				<td>AALAND</td>
				<td>Åland, variant of "sv" Swedish used in Finland. Use "sv_AX"
					to indicate this.</td>
   3466 			</tr>
   3467 
   3468 			<tr>
   3469 				<td>BOKMAL</td>
				<td>Bokmål, variant of "no" Norwegian. Use primary language
   3471 					subtag "nb" to indicate this.</td>
   3472 			</tr>
   3473 
   3474 			<tr>
   3475 				<td>NYNORSK</td>
   3476 				<td>Nynorsk, variant of "no" Norwegian. Use primary language
   3477 					subtag "nn" to indicate this.</td>
   3478 			</tr>
   3479 
   3480 			<tr>
   3481 				<td>POSIX</td>
   3482 				<td>POSIX variation of locale data. Use Unicode locale
   3483 					extension "-u-va-posix" to indicate this.</td>
   3484 			</tr>
   3485 
   3486 			<tr>
   3487 				<td>POLYTONI</td>
   3488 				<td>Polytonic, variant of "el" Greek. Use [<a href="#BCP47">BCP47</a>]
   3489 					variant subtag "polyton" to indicate this.
   3490 				</td>
   3491 			</tr>
   3492 
   3493 			<tr>
   3494 				<td>SAAHO</td>
   3495 				<td>The Saaho variant of Afar. Use primary language subtag
					"ssy" to indicate this.</td>
   3497 			</tr>
   3498 		</table>
   3499 		<p>
   3500 			When converting to old syntax, the Unicode locale extension
   3501 			"-u-va-posix" should be converted to the "POSIX" variant, <i>not</i>
   3502 			to old extension syntax like "@va=posix". This is an exception: The
   3503 			other mappings above should not be reversed.
   3504 		</p>
   3505 
   3506 		<p>Examples:</p>
   3507 		<ul>
			<li>en_US_POSIX ↔ en-US-u-va-posix</li>
			<li>en_US_POSIX@colNumeric=yes ↔ en-US-u-kn-va-posix</li>
			<li>en-US-POSIX-u-kn-true → en-US-u-kn-va-posix</li>
			<li>en-US-POSIX-u-kn-va-posix → en-US-u-kn-va-posix</li>
   3512 		</ul>
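		<p>The Legacy Variant Mappings table can be read as a small rewrite
			procedure. The sketch below is one possible reading, under stated
			assumptions: the names <code>convert_legacy_variant</code> and
			<code>is_region</code> are invented for this example, the input is
			assumed to be an old-style identifier without an "@" keyword part, and
			"AALAND" is assumed to replace any region subtag with "AX", since the
			table says only to use "sv_AX".</p>

```python
# Legacy variants that are expressed by a different primary language subtag.
LANGUAGE_REPLACEMENTS = {"BOKMAL": "nb", "NYNORSK": "nn", "SAAHO": "ssy"}


def is_region(subtag):
    """Two letters or three digits, the shape of a region subtag."""
    return (len(subtag) == 2 and subtag.isalpha()) or (
        len(subtag) == 3 and subtag.isdigit()
    )


def convert_legacy_variant(old_id):
    """Rewrite an old ID such as 'no_NO_NYNORSK' without its legacy variant."""
    language, *rest = old_id.split("_")
    kept, keywords = [], []
    for subtag in rest:
        if subtag in LANGUAGE_REPLACEMENTS:
            language = LANGUAGE_REPLACEMENTS[subtag]
        elif subtag == "AALAND":
            # Assumption: the region given (if any) is replaced by AX.
            kept = [s for s in kept if not is_region(s)] + ["AX"]
        elif subtag == "POLYTONI":
            kept.append("polyton")  # registered BCP 47 variant subtag
        elif subtag == "POSIX":
            keywords += ["va", "posix"]  # Unicode locale extension keyword
        else:
            kept.append(subtag)
    parts = [language] + kept
    if keywords:
        parts += ["u"] + keywords
    return "-".join(parts)


for old in ("en_US_POSIX", "no_NO_NYNORSK", "el_POLYTONI", "aa_SAAHO"):
    print(old, convert_legacy_variant(old))
```

		<p>Only the POSIX rule is meant to be applied in the reverse
			direction; the sketch deliberately provides no inverse for the other
			mappings.</p>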
   3513 
   3514 		<h4>
   3515 			<a name="Relation_to_OpenI18n" href="#Relation_to_OpenI18n">3.8.3
   3516 				Relation to OpenI18n</a>
   3517 		</h4>
   3518 		<p>
   3519 			The locale id format generally follows the description in the <i>OpenI18N
   3520 				Locale Naming Guideline</i> [<a href="#NamingGuideline">NamingGuideline</a>],
			with some enhancements. The main differences from those
   3522 			guidelines are that the locale id:
   3523 		</p>
   3524 		<ol type="a">
			<li style="margin-top: 0.5em; margin-bottom: 0.5em">does not
				include a charset, since the data in LDML format always provides a
				representation of all Unicode characters (the repository is stored
				in UTF-8, although that can be transcoded to other encodings as
				well),</li>
			<li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
				ability to have a variant, as in Java,</li>
   3532 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
   3533 				ability to discriminate the written language by script (or script
   3534 				variant).</li>
   3535 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">is a
   3536 				superset of [<a href="#BCP47">BCP47</a>] codes.
   3537 			</li>
   3538 		</ol>
   3539 		<h3>
   3540 			<a name="Transmitting_Locale_Information"
   3541 				href="#Transmitting_Locale_Information">3.9 Transmitting Locale
   3542 				Information</a>
   3543 		</h3>
   3544 		<p>
   3545 			In a world of on-demand software components, with arbitrary
   3546 			connections between those components, it is important to get a sense
   3547 			of where localization should be done, and how to transmit enough
   3548 			information so that it can be done at that appropriate place.
   3549 			End-users need to get messages localized to their languages, messages
   3550 			that not only contain a translation of text, but also contain
   3551 			variables such as date, time, number formats, and currencies
   3552 			formatted according to the users&#39; conventions. The strategy for
   3553 			doing the so-called <i>JIT localization </i>is made up of two parts:
   3554 		</p>
   3555 		<ol>
   3556 			<li>Store and transmit <i>neutral-format</i> data wherever
   3557 				possible.
   3558 				<ul>
   3559 					<li>Neutral-format data is data that is kept in a standard
   3560 						format, no matter what the local user&#39;s environment is.
   3561 						Neutral-format is also (loosely) called <i>binary data</i>, even
   3562 						though it actually could be represented in many different ways,
   3563 						including a textual representation such as in XML.
   3564 					</li>
   3565 					<li>Such data should use accepted standards where possible,
   3566 						such as for currency codes.</li>
   3567 					<li>Textual data should also be in a uniform character set
   3568 						(Unicode/10646) to avoid possible data corruption problems when
   3569 						converting between encodings.</li>
   3570 				</ul>
   3571 			</li>
   3572 			<li>Localize that data as &quot;<i>close</i>&quot; to the
   3573 				end-user as possible.
   3574 			</li>
   3575 		</ol>
   3576 		<p>There are a number of advantages to this strategy. The longer
   3577 			the data is kept in a neutral format, the more flexible the entire
   3578 			system is. On a practical level, if transmitted data is
   3579 			neutral-format, then it is much easier to manipulate the data, debug
   3580 			the processing of the data, and maintain the software connections
   3581 			between components.</p>
   3582 		<p>Once data has been localized into a given language, it can be
   3583 			quite difficult to programmatically convert that data into another
   3584 			format, if required. This is especially true if the data contains a
   3585 			mixture of translated text and formatted variables. Once information
   3586 			has been localized into, say, Romanian, it is much more difficult to
   3587 			localize that data into, say, French. Parsing is more difficult than
   3588 			formatting, and may run up against different ambiguities in
   3589 			interpreting text that has been localized, even if the original
   3590 			translated message text is available (which it may not be).</p>
   3591 		<p>Moreover, the closer we are to end-user, the more we know about
   3592 			that user&#39;s preferred formats. If we format dates, for example,
   3593 			at the user&#39;s machine, then it can easily take into account any
   3594 			customizations that the user has specified. If the formatting is done
   3595 			elsewhere, either we have to transmit whatever user customizations
   3596 			are in play, or we only transmit the user&#39;s locale code, which
   3597 			may only approximate the desired format. Thus the closer the
   3598 			localization is to the end user, the less we need to ship all of the
   3599 			user&#39;s preferences around to all the places that localization
   3600 			could possibly need to be done.</p>
   3601 		<p>Even though localization should be done as close to the
   3602 			end-user as possible, there will be cases where different components
   3603 			need to be aware of whatever settings are appropriate for doing the
   3604 			localization. Thus information such as a locale code or time zone
   3605 			needs to be communicated between different components.</p>
   3606 		<h4>
   3607 			<a name="Message_Formatting_and_Exceptions"
   3608 				href="#Message_Formatting_and_Exceptions">3.9.1 Message
   3609 				Formatting and Exceptions</a>
   3610 		</h4>
   3611 		<p>
   3612 			Windows (<a
   3613 				href="http://msdn.microsoft.com/en-us/library/ms679351.aspx">FormatMessage</a>,
   3614 			<a href="http://msdn.microsoft.com/en-us/library/aa331875.aspx">String.Format</a>),
   3615 			Java (<a
   3616 				href="http://docs.oracle.com/javase/7/docs/api/java/text/MessageFormat.html">MessageFormat</a>)
   3617 			and ICU (<a
   3618 				href="http://www.icu-project.org/apiref/icu4c/classMessageFormat.html">MessageFormat</a>,
   3619 			<a href="http://www.icu-project.org/apiref/icu4c/umsg_8h.html">umsg</a>)
   3620 			all provide methods of formatting variables (dates, times, etc) and
   3621 			inserting them at arbitrary positions in a string. This avoids the
   3622 			manual string concatenation that causes severe problems for
   3623 			localization. The question is, where to do this? It is especially
   3624 			important since the original code site that originates a particular
   3625 			message may be far down in the bowels of a component, and passed up
   3626 			to the top of the component with an exception. So we will take that
   3627 			case as representative of this class of issues.
   3628 		</p>
   3629 		<p>There are circumstances where the message can be communicated
   3630 			with a language-neutral code, such as a numeric error code or
   3631 			mnemonic string key, that is understood outside of the component. If
   3632 			there are arguments that need to accompany that message, such as a
   3633 			number of files or a datetime, those need to accompany the numeric
			code so that when the localization is finally done at some point, the full
   3635 			information can be presented to the end-user. This is the best case
   3636 			for localization.</p>
   3637 		<p>More often, the exact messages that could originate from within
   3638 			the component are not known outside of the component itself; or at
   3639 			least they may not be known by the component that is finally
   3640 			displaying text to the user. In such a case, the information as to
   3641 			the user&#39;s locale needs to be communicated in some way to the
   3642 			component that is doing the localization. That locale information
   3643 			does not necessarily need to be communicated deep within the
   3644 			component; ideally, any exceptions should bundle up some
   3645 			language-neutral message ID, plus the arguments needed to format the
   3646 			message (for example, datetime), but not do the localization at the
   3647 			throw site. This approach has the advantages noted above for JIT
   3648 			localization.</p>
   3649 		<p>In addition, exceptions are often caught at a higher level;
			they do not end up being displayed to any end-user at all. Avoiding
			the localization at the throw site also avoids the cost of doing
			formatting when that formatting is not really necessary. In fact, in
   3653 			many running programs most of the exceptions that are thrown at a low
   3654 			level never end up being presented to an end-user, so this can have
   3655 			considerable performance benefits.</p>
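		<p>The approach described in this section can be illustrated with a
			short Python sketch. Everything named here is hypothetical: the
			<code>LocalizableError</code> class, the <code>CATALOGS</code> and
			<code>DATE_PATTERNS</code> tables, and the message ID
			<code>files_missing</code> are stand-ins for whatever message catalog
			and formatting services an application actually uses. The point is
			only that the throw site attaches neutral data, and formatting happens
			where the text reaches a user.</p>

```python
import datetime


class LocalizableError(Exception):
    """Carries a language-neutral message ID plus unformatted arguments."""

    def __init__(self, message_id, **arguments):
        super().__init__(message_id)
        self.message_id = message_id
        self.arguments = arguments


# Hypothetical catalogs: locale -> message ID -> pattern.
CATALOGS = {
    "en": {"files_missing": "{count} files were missing on {date}."},
    "fr": {"files_missing": "{count} fichiers manquaient le {date}."},
}
# Hypothetical date patterns per locale (real data would come from CLDR).
DATE_PATTERNS = {"en": "%m/%d/%Y", "fr": "%d/%m/%Y"}


def localize(error, locale):
    """Format the error for display; called only where a user sees it."""
    formatted = {}
    for name, value in error.arguments.items():
        if isinstance(value, datetime.date):
            formatted[name] = value.strftime(DATE_PATTERNS[locale])
        else:
            formatted[name] = value
    return CATALOGS[locale][error.message_id].format(**formatted)


def low_level_operation():
    # No localization here: only the neutral ID and raw values are attached.
    raise LocalizableError(
        "files_missing", count=3, date=datetime.date(2003, 3, 20)
    )


try:
    low_level_operation()
except LocalizableError as error:
    print(localize(error, "en"))
    print(localize(error, "fr"))
```

		<p>If the exception is caught and handled without being shown to
			anyone, <code>localize</code> is never called and no formatting cost
			is paid. The sketch ignores plural forms and number formatting, which
			a real message formatter would need to handle.</p>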
   3656 		<h3>
   3657 			<a name="Language_and_Locale_IDs" href="#Language_and_Locale_IDs">3.10
   3658 				Unicode Language and Locale IDs</a>
   3659 		</h3>
   3660 		<p>People have very slippery notions of what distinguishes a
   3661 			language code versus a locale code. The problem is that both are
   3662 			somewhat nebulous concepts.</p>
   3663 		<p>
   3664 			In practice, many people use [<a href="#BCP47">BCP47</a>] codes to
   3665 			mean locale codes instead of strictly language codes. It is easy to
   3666 			see why this came about; because [<a href="#BCP47">BCP47</a>]
   3667 			includes an explicit region (territory) code, for most people it was
   3668 			sufficient for use as a locale code as well. For example, when
			typical web software receives a [<a href="#BCP47">BCP47</a>] code,
   3670 			it will use it as a locale code. Other typical software will do the
   3671 			same: in practice, language codes and locale codes are treated
   3672 			interchangeably. Some people recommend distinguishing on the basis of
   3673 			&quot;-&quot; versus &quot;_&quot; (for example, <i>zh-TW</i> for
   3674 			language code, <i>zh_TW</i> for locale code), but in practice that
   3675 			does not work because of the free variation out in the world in the
   3676 			use of these separators. Notice that Windows, for example, uses
   3677 			&quot;-&quot; as a separator in its locale codes. So pragmatically
   3678 			one is forced to treat &quot;-&quot; and &quot;_&quot; as equivalent
   3679 			when interpreting either one on input.
   3680 		</p>
   3681 		<p>
   3682 			Another reason for the conflation of these codes is that <i>very</i>
   3683 			little data in most systems is distinguished by region alone;
   3684 			currency codes and measurement systems being some of the few.
   3685 			Sometimes date or number formats are mentioned as regional, but that
			really does not make much sense. If people see the sentence &quot;You
			will have to adjust the value to १,२३४.५६७ from ७१,२३४.५६७&quot;
   3688 			(using Indic digits), they would say that sentence is simply not
   3689 			English. Number format is far more closely associated with language
   3690 			than it is with region. The same is true for date formats: people
   3691 			would never expect to see intermixed a date in the format
			&quot;2003年4月1日&quot; (using Kanji) in text purporting to be purely
			English. There are regional differences in date and number format,
			differences which can be important, but those are different in kind
   3695 			than other language differences between regions.
   3696 		</p>
   3697 		<p>
			As far as we are concerned, <i>as a completely practical matter</i>,
			two languages are different if they require substantially different
   3700 			localized resources. Distinctions according to spoken form are
   3701 			important in some contexts, but the written form is by far and away
   3702 			the most important issue for data interchange. Unfortunately, this is
   3703 			not the principle used in [<a href="#ISO639">ISO639</a>], which has
   3704 			the fairly unproductive notion (for data interchange) that only
   3705 			spoken language matters (it is also not completely consistent about
   3706 			this, however).
   3707 		</p>
   3708 		<p>
   3709 			[<a href="#BCP47">BCP47</a>] <i><b>can</b></i> express a difference
   3710 			if the use of written languages happens to correspond to region
   3711 			boundaries expressed as [<a href="#ISO3166">ISO3166</a>] region
   3712 			codes, and has recently added codes that allow it to express some
   3713 			important cases that are not distinguished by [<a href="#ISO3166">ISO3166</a>]
   3714 			codes. These written languages include simplified and traditional
   3715 			Chinese (both used in Hong Kong S.A.R.); Serbian in Latin script;
			Azerbaijani in Arabic script, and so on.
   3717 		</p>
   3718 		<p>
   3719 			Notice also that <i>currency codes</i> are different than <i>currency
   3720 				localizations</i>. The currency localizations should largely be in the
   3721 			language-based resource bundles, not in the territory-based resource
   3722 			bundles. Thus, the resource bundle <i>en</i> contains the localized
			mappings in English for a range of different currency codes: USD →
			US$, RUR → Rub, AUD → $A and so on. Of course, some currency symbols
			are used for more than one currency, and in such cases
			specializations appear in the territory-based bundles. Continuing the
			example, <i>en_US</i> would have USD → $, while <i>en_AU</i> would
			have AUD → $. (In protocols, the currency codes should always
   3729 			accompany any currency amounts; otherwise the data is ambiguous, and
   3730 			software is forced to use the user&#39;s territory to guess at the
   3731 			currency. For some informal discussion of this, see <a
   3732 				href="http://source.icu-project.org/repos/icu/icuhtml/trunk/design/jit_localization.html">JIT
   3733 				Localization</a>.)
   3734 		</p>
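		<p>The split between language-based and territory-based bundles can
			be sketched as a two-step lookup. The Python fragment below uses only
			the currency mappings given in the paragraph above; the
			<code>BUNDLES</code> table and the function name
			<code>currency_symbol</code> are assumptions made for illustration,
			and the fallback to the bare currency code is a choice of this sketch
			rather than something the text prescribes.</p>

```python
# Language bundle "en" holds the general mappings; the territory bundles
# hold only the specializations, as described in the text above.
BUNDLES = {
    "en": {"USD": "US$", "RUR": "Rub", "AUD": "$A"},
    "en_US": {"USD": "$"},
    "en_AU": {"AUD": "$"},
}


def currency_symbol(locale_id, currency_code):
    """Look in the territory bundle first, then in the language bundle."""
    language = locale_id.split("_")[0]
    for bundle_name in (locale_id, language):
        symbol = BUNDLES.get(bundle_name, {}).get(currency_code)
        if symbol is not None:
            return symbol
    # Assumption of this sketch: fall back to the neutral code itself.
    return currency_code


print(currency_symbol("en_US", "USD"), currency_symbol("en_AU", "USD"))
```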
   3735 		<h4>
   3736 			<a name="Written_Language" href="#Written_Language">3.10.1
   3737 				Written Language</a>
   3738 		</h4>
   3739 		<p>
   3740 			Criteria for what makes a written language should be purely
   3741 			pragmatic; <i>what would copy-editors say? </i>If one gave them text
			like the following, they would respond that it is far from acceptable
   3743 			English for publication, and ask for it to be redone:
   3744 		</p>
   3745 		<ol>
   3746 			<li type="A">&quot;Theatre Center News: The date of the last
				version of this document was 2003年3月20日. A copy can be obtained for
				$50,0 or 1.234,57 грн. We would like to acknowledge contributions by
   3749 				the following authors (in alphabetical order): Alaa Ghoneim, Behdad
   3750 				Esfahbod, Ahmed Talaat, Eric Mader, Asmus Freytag, Avery Bishop, and
   3751 				Doug Felt.&quot;</li>
   3752 		</ol>
   3753 		<p>So one would change it to either B or C below, depending on
   3754 			which orthographic variant of English was the target for the
   3755 			publication:</p>
   3756 		<ol type="A" start="2">
   3757 			<li>&quot;Theater Center News: The date of the last version of
   3758 				this document was 3/20/2003. A copy can be obtained for $50.00 or
   3759 				1,234.57 Ukrainian Hryvni. We would like to acknowledge
   3760 				contributions by the following authors (in alphabetical order): Alaa
   3761 				Ghoneim, Ahmed Talaat, Asmus Freytag, Avery Bishop, Behdad Esfahbod,
   3762 				Doug Felt, Eric Mader.&quot;</li>
   3763 			<li>&quot;Theatre Centre News: The date of the last version of
   3764 				this document was 20/3/2003. A copy can be obtained for $50.00 or
   3765 				1,234.57 Ukrainian Hryvni. We would like to acknowledge
   3766 				contributions by the following authors (in alphabetical order): Alaa
   3767 				Ghoneim, Ahmed Talaat, Asmus Freytag, Avery Bishop, Behdad Esfahbod,
   3768 				Doug Felt, Eric Mader.&quot;</li>
   3769 		</ol>
   3770 		<p>
   3771 			Clearly there are many acceptable variations on this text. For
   3772 			example, copy editors might still quibble with the use of first
   3773 			versus last name sorting in the list, but clearly the first list was
   3774 			<i>not</i> acceptable English alphabetical order. And in quoting a
   3775 			name, like &quot;Theatre Centre News&quot;, one may leave it in the
   3776 			source orthography even if it differs from the publication target
			orthography. And so on. However, just as clearly, there are limits on
			what is acceptable English, and &quot;2003年3月20日&quot;, for example,
   3779 			is <i>not</i>.
   3780 		</p>
   3781 		<p>Note that the language of locale data may differ from the
   3782 			language of localized software or web sites, when those latter are
   3783 			not localized into the user&#39;s preferred language. In such cases,
   3784 			the kind of incongruous juxtapositions described above may well
   3785 			appear, but this situation is usually preferable to forcing
   3786 			unfamiliar date or number formats on the user as well.</p>
   3787 	  <h4>
   3788 			<a name="Hybrid_Locale" href="#Hybrid_Locale">3.10.2
   3789 		Hybrid Locale Identifiers</a>
   3790 		</h4>
        <p>Hybrid locales have intermixed content from 2 (or more) languages, often with one language's grammatical structure applied to words in another. These are commonly referred to with portmanteau words such as <em>Franglais, <a href="https://en.oxforddictionaries.com/definition/spanglish">Spanglish</a></em> or <em>Denglish</em>. Hybrid locales do <em>not</em> reference text simply containing two languages: a book of parallel text containing English and French, such as the following, is not Franglais:</p>
   3792       <table style='margin-left:2em; margin-right:2em'>
   3793           <tbody>
   3794             <tr>
              <td width='50%' style='font-family:serif'>On the 24th of May, 1863, my uncle, Professor Liedenbrock, rushed into his little house, No. 19 Königstrasse, one of the oldest streets in the oldest portion of the city of Hamburg</td>
              <td style='font-family:serif'>Le 24 mai 1863, un dimanche, mon oncle, le professeur Lidenbrock, revint précipitamment vers sa petite maison située au numéro 19 de Königstrasse, l'une des plus anciennes rues du vieux quartier de Hambourg</td>
   3797             </tr>
   3798           </tbody>
   3799         </table>
        <p>While text in a document can be tagged as partly in one language and partly in another, that is not the same as having a hybrid locale. There is a difference between having a Spanglish document, and a Spanish document that has some passages quoted in English. Fine-grained tagging doesn't handle grammatical combinations like Denglisch <a href="http://www.duden.de/rechtschreibung/downloaden">gedownloadet</a>, which is neither English nor German; similarly the Franglais <a href='http://www.le-dictionnaire.com/definition.php?mot=downloader'>downloadé</a>. More importantly, it doesn't work for the very common use case for a <a href="#unicode_locale_id">unicode_locale_id</a>: <i>locale selection</i>. </p>
      <p>To communicate requests for localized content and internationalization services, locales are used. When people pick a language from a menu, internally they are picking a locale (en-GB, es-419, etc.). To allow an application to support Spanglish or Hinglish locale selection, <a href="#unicode_locale_id">unicode_locale_id</a>s can represent hybrid locales using the T extension key-value 'h0-hybrid'. (For more information on the T extension, see <em>Section 3.7 <a href="#t_Extension">Unicode BCP 47 T Extension</a>.</em>)
   3802       </p>
   3803       <p>Examples:</p>
   3804       <table class='simple'>
   3805           <tbody>
   3806             <tr>
   3807               <td>hi-t-<u>en-h0-hybrid</u></td>
   3808               <td>Hinglish</td>
   3809               <td>Hindi-English hybrid locale</td>
   3810             </tr>
   3811             <tr>
   3812               <td>ta-t-<u>en-h0-hybrid</u></td>
   3813               <td>Tanglish</td>
   3814               <td>Tamil-English hybrid locale</td>
   3815             </tr>
   3816             <tr>
              <td>bn-t-<u>en-h0-hybrid</u></td>
   3818               <td>Banglish</td>
   3819               <td>Bangla-English hybrid locale</td>
   3820             </tr>
   3821              <tr><td colspan="3"></td></tr>
   3822             <tr>
   3823               <td>en-t-<u>hi-h0-hybrid</u></td>
   3824               <td>Hinglish</td>
   3825               <td>English-Hindi hybrid locale</td>
   3826             </tr>
   3827             <tr>
   3828               <td>en-t-<u>zh-h0-hybrid</u></td>
   3829               <td>Chinglish</td>
   3830               <td>English-Chinese hybrid locale</td>
   3831             </tr>
   3832 			<tr><td colspan="3"></td></tr>
   3833         </tbody>
   3834         </table>
   3835         <blockquote>
          <p><em>Note: The <a href="#unicode_language_id">unicode_language_id</a> should be the language used as the scaffold, that is, the fallback locale for internationalization services, typically the language used for more of the core vocabulary and structure in the content. Thus Hinglish should be represented as hi-t-en-h0-hybrid where Hindi is the scaffold, and as en-t-hi-h0-hybrid where English is.</em></p>
   3837         </blockquote>
      <p>The value of -t- is a full <em><a href="#unicode_language_id">unicode_language_id</a></em>, and can contain subtags for script or region where it is important to include them, as in the following. It may be useful to include the script for emphasis, even where it is the default script for the language, if it is not the same as the script of the main language tag.</p>
   3839       <table class='simple'>
   3840           <tbody>
   3841             <tr>
              <td>ru-t-<u>en-latn-gb-h0-hybrid</u></td>
   3843               <td>Runglish</td>
   3844               <td>Russian with an admixture of British English in Latin script</td>
   3845             </tr>
   3846             <tr>
   3847               <td>ru-t-<u>en-cyrl-gb-h0-hybrid</u></td>
   3848               <td>Runglish</td>
   3849               <td>Russian with an admixture of British English in Cyrillic script</td>
   3850             </tr>
   3851           </tbody>
   3852         </table>
   3853       <p>Should there ever be strong need for hybrids of more than two languages or for other purposes such as hybrid languages as the source of translated content, additional structure could be added.</p>
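      <p>As an illustration of how an application might recognize the identifiers in the tables above, the following Python sketch splits a hybrid locale identifier into its scaffold language and its source language. The function name <code>parse_hybrid</code> is an assumption of this example, and the parsing is deliberately simplified: it treats any two-character subtag made of a letter followed by a digit as a T extension field key, and it stops at the next singleton subtag.</p>

```python
import re

TKEY = re.compile(r"[a-z][0-9]")  # T extension field keys such as "h0"


def parse_hybrid(locale_id):
    """Return (scaffold, source) for a hybrid locale ID, or None otherwise."""
    subtags = locale_id.lower().split("-")
    if "t" not in subtags:
        return None
    t_index = subtags.index("t")
    scaffold = "-".join(subtags[:t_index])

    source, fields, key = [], {}, None
    for subtag in subtags[t_index + 1:]:
        if len(subtag) == 1:
            break  # another singleton (for example "u") ends the T extension
        if TKEY.fullmatch(subtag):
            key = subtag
            fields[key] = []
        elif key is None:
            source.append(subtag)  # still inside the source language ID
        else:
            fields[key].append(subtag)

    if not source or fields.get("h0") != ["hybrid"]:
        return None
    return scaffold, "-".join(source)


print(parse_hybrid("hi-t-en-h0-hybrid"))
print(parse_hybrid("ru-t-en-cyrl-gb-h0-hybrid"))
```

      <p>An identifier that has a T extension but no 'h0-hybrid' field is not treated as a hybrid locale by this sketch, so <code>parse_hybrid</code> returns <code>None</code> for it.</p>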
   3854 		<h3>
   3855 			<a name="Validity_Data" href="#Validity_Data">3.11 Validity Data</a>
   3856 		</h3>
   3857 		<p class='dtd'>
   3858 			&lt;!ELEMENT idValidity (id*) &gt;<br> &lt;!ELEMENT id ( #PCDATA
   3859 			) &gt;<br> &lt;!ATTLIST id type NMTOKEN #REQUIRED &gt; <br>
   3860 			&lt;!ATTLIST id idStatus NMTOKEN #REQUIRED &gt;
   3861 		</p>
   3862 		<p>
   3863 			The directory <a
   3864 				href='http://unicode.org/repos/cldr/tags/latest/common/validity/'>common/validity</a>
   3865 			contains machine-readable data for validating the language, region,
   3866 			script, and variant subtags, as well as currency, subdivisions and
   3867 			measure units. Each file contains a number of subtags with the
   3868 			following <strong>idStatus</strong> values:
   3869 		</p>
   3870 		<ul>
			<li><strong>regular</strong> - the standard codes used for the
				specific type of subtag</li>
			<li><strong>special</strong> - certain
				exceptional language codes like 'mul'<em> (languages only)</em></li>
			<li><strong>unknown</strong> - the code used to indicate the
   3876 				&quot;unknown&quot;, &quot;undetermined&quot; or &quot;invalid&quot;
   3877 				values. For more information, see <em>Section 3.5.1 <a
   3878 					href="#Unknown_or_Invalid_Identifiers">Unknown or Invalid
   3879 						Identifiers</a></em>.</li>
			<li><strong>macroregion</strong> - the standard codes that are
   3881 				macroregions<em> (for regions only).</em>
   3882 				<ul>
   3883 					<li>Note that some two-letter region codes are macroregions,
   3884 						and (in the future) some three-digit codes may be regular codes.</li>
   3885 					<li>For details as to which regions are contained within which
   3886 						macroregions, see the <strong>&lt;containment&gt;</strong> element
   3887 						of the supplemental data.
   3888 					</li>
   3889 				</ul></li>
			<li><strong>deprecated</strong> - codes that should not be used.
				The <strong>&lt;alias&gt;</strong> element in the supplementalMeta
				file contains more information about these codes, and which codes
				should be used instead.</li>
			<li><strong>private_use</strong> - codes that, for CLDR, are
   3895 				considered private use. Note that some BCP 47 private-use codes have
   3896 				defined CLDR semantics, and are considered regular codes. For more
   3897 				information, see <em>Section 3.5.3 <a href="#Private_Use">Private
   3898 						Use Codes</a>.
   3899 			</em></li>
   3900 		</ul>
   3901 		<p>
			The list of subtags for each idStatus uses a compact format as a
   3903 			space-delimited list of StringRanges, as defined in <em>Section
   3904 				<a href="#String_Range">5.3.4 String Range</a>.
   3905 			</em> The separator for each StringRange is a &quot;~&quot;.
   3906 		</p>
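		<p>A reader of the validity files needs to expand this compact
			format. The Python sketch below shows one way to do so, assuming the
			interpretation of a StringRange from Section 5.3.4, which is not
			reproduced here: the end string replaces the last characters of the
			start string, and each of those positions ranges independently from
			the start character to the end character. The function names are
			invented for this example, and the sample input is illustrative
			rather than copied from a validity file.</p>

```python
from itertools import product


def expand_string_range(token, separator="~"):
    """Expand one StringRange such as 'AC~G' into its list of strings."""
    start, found, end = token.partition(separator)
    if not found:
        return [token]  # a plain code, not a range
    if not 0 < len(end) <= len(start):
        raise ValueError("end of range must be non-empty and no longer than start")
    split = len(start) - len(end)
    prefix, suffix = start[:split], start[split:]
    if any(s > e for s, e in zip(suffix, end)):
        raise ValueError("each start character must not exceed its end character")
    # One column of candidate characters per position in the suffix.
    columns = [
        [chr(code) for code in range(ord(s), ord(e) + 1)]
        for s, e in zip(suffix, end)
    ]
    return [prefix + "".join(combo) for combo in product(*columns)]


def expand_id_list(text):
    """Expand a space-delimited list of codes and StringRanges."""
    result = []
    for token in text.split():
        result.extend(expand_string_range(token))
    return result


print(expand_id_list("AC~G AI AL~M"))
```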
   3907 		<p>Each measure unit is a sequence of subtags, such as
   3908 			angle-arc-minute. The first subtag provides a general category of
   3909 			the unit.</p>
   3910 		<p>
   3911 			In version 28.0, the subdivisions in the
   3912 			validity files used the ISO format, uppercase with a hyphen separating two
   3913 			components, instead of the BCP 47 format.
   3914 	  </p>
   3915 		<h2>
   3916 			<a name="Locale_Inheritance" href="#Locale_Inheritance">4 Locale
   3917 				Inheritance and Matching</a>
   3918 		</h2>
   3919 		<p>
   3920 			The XML format relies on an inheritance model, whereby the resources
   3921 			are collected into <i>bundles</i>, and the bundles organized into a
   3922 			tree. Data for the many Spanish locales does not need to be
   3923 			duplicated across all of the countries having Spanish as a national
   3924 			language. Instead, common data is collected in the Spanish language
   3925 			locale, and territory locales only need to supply differences. The
   3926 			parent of all of the language locales is a generic locale known as <i>root</i>.
   3927 			Wherever possible, the resources in the root are language &amp;
   3928 			territory neutral. For example, the collation (sorting) order in the
   3929 			root is based on the [<a href="#DUCET">DUCET</a>] (see <em><a
   3930 				href="tr35-collation.html#Root_Collation">Root Collation</a></em>). Since
   3931 			English language collation has the same ordering as the root locale,
   3932 			the &#39;en&#39; locale data does not need to supply any collation
   3933 			data, nor do the &#39;en_US&#39;, &#39;en_GB&#39; or any of the
   3934 			various other locales that use English.
   3935 		</p>
   3936 		<p>Given a particular locale id &quot;en_US_someVariant&quot;, the
   3937 			search chain for a particular resource is the following.</p>
   3938 		<blockquote>
   3939 			<pre>en_US_someVariant
   3940 en_US
   3941 en
   3942 root</pre>
   3943 		</blockquote>
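		<p>A minimal sketch of the simple truncation chain shown above, assuming
			underscore-separated locale ids. The function name
			<code>truncation_chain</code> is hypothetical and not part of CLDR, and
			the sketch ignores the parentLocale overrides described later in this
			section.</p>

```python
def truncation_chain(locale_id):
    """Return the search chain obtained by dropping trailing subtags one at a time.

    Illustrative only: real inheritance also consults parentLocale data.
    """
    parts = locale_id.split("_")
    # Longest id first, then progressively shorter prefixes.
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    chain.append("root")  # root is the parent of every language locale
    return chain


print(truncation_chain("en_US_someVariant"))
```
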
   3944 		<p>
   3945 			<em>The inheritance is often not simple truncation, as will be
   3946 				seen later in this section.</em>
   3947 		</p>
   3948 		<p>If a type and key are supplied in the locale id, then logically
   3949 			the chain from that id to the root is searched for a resource tag
   3950 			with a given type, all the way up to root. If no resource is found
   3951 			with that tag and type, then the chain is searched again without the
   3952 			type.</p>
   3953 		<p>
   3954 			Thus the data for any given locale will only contain resources that
   3955 			are different from the parent locale. For example, most territory
   3956 			locales will inherit the bulk of their data from the language locale:
   3957 			&quot;en&quot; will contain the bulk of the data, while &quot;en_IE&quot;
   3958 			will only contain a few items like currency. All data that is
   3959 			inherited from a parent is presumed to be valid, just as valid as if
   3960 			it were physically present in the file. This provides for much
   3961 			smaller resource bundles, and much simpler (and less error-prone)
   3962 			maintenance. At the script or region level, the &quot;primary&quot;
   3963 			child locale will be empty, since its parent will contain all of the
   3964 			appropriate resources for it. For more information see <i>CLDR
   3965 				Information : Section 9.3 <a href="tr35-info.html#Default_Content">Default
   3966 					Content</a>.
   3967 			</i>
   3968 		</p>
   3969 
   3970 		<p>
   3971 			Certain data items depend only on the region specified in a locale id
   3972 			(by a <a
   3973 				href="#unicode_region_subtag_validity">unicode_region_subtag</a> or
   3974 				an rg <a href="#RegionOverride">Region Override</a> key),
   3975 			and are obtained from supplemental data rather than through locale
   3976 			resources. For example:
   3977 		</p>
   3978 		<ul>
   3979 			<li>The currency for the specified region (see <a
   3980 				href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
   3981 					Currency Data</a>)
   3982 			</li>
   3983 			<li>The measurement system for the specified region (see <a
   3984 				href="tr35-general.html#Measurement_System_Data">Measurement
   3985 					System Data</a>)
   3986 			</li>
   3987 			<li>The week conventions for the specified region (see <a
   3988 				href="tr35-dates.html#Week_Data">Week Data</a>)
   3989 			</li>
   3990 		</ul>
   3991 		<p>
   3992 			(For more information on the specific
   3993 				items handled this way, see <a
   3994 				href="tr35-info.html#Territory_Based_Preferences">Territory-Based
   3995 					Preferences</a>.)
   3996 			These items will be correct for the specified region regardless of
   3997 			whether a locale bundle actually exists with the same combination of
   3998 			language and region as in the locale id. For example, suppose data is
   3999 			requested for the locale id "fr_US" and there is no bundle for that
   4000 			combination. Data obtained via locale inheritance, such as currency
   4001 			patterns and currency symbols, will be obtained from the parent
   4002 			locale "fr". However, currency amounts would be formatted by default
   4003 			using US dollars, just displayed in the manner governed by the locale
   4004 			"fr". When a locale id does not specify a region, the region-specific
   4005 			items such as those above are obtained from the likely region for the
   4006 			locale (obtained via <a href="#Likely_Subtags">Likely Subtags</a>).</p>
   4007 		<p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see Section 4.2.6 <a 
   4008 				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a>.</p>
   4009 		<h3>
   4010 			<a href="#Lookup" name="Lookup">4.1 Lookup</a>
   4011 		</h3>
   4012 
   4013 		<p>If a language has more than one script in customary modern use,
   4014 			then the CLDR file structure in common/main uses the following
   4015 			model:</p>
   4016 		<blockquote>
   4017 			<p>
   4018 				lang<br> lang_script<br> lang_script_region<br>
   4019 				lang_region<i> (aliases to lang_script_region)</i>
   4020 			</p>
   4021 		</blockquote>
   4022 		<h4>
   4023 			<a href="#Bundle_vs_Item_Lookup" name="Bundle_vs_Item_Lookup">4.1.1
   4024 				Bundle vs Item Lookup</a>
   4025 		</h4>
   4026 		<p>
   4027 			There are actually two different kinds of inheritance fallback: <em>resource&nbsp;bundle&nbsp;lookup</em>
   4028 			and <em>resource&nbsp;item&nbsp;lookup</em>. For the former, a
   4029 			process is looking to find the first, best resource bundle it can;
   4030 			for the latter, it is fallback&nbsp;within&nbsp;bundles on individual
   4031 			items, like the translated name for the region &quot;CN&quot; in
   4032 			Breton.
   4033 		</p>
   4034 		<p>
   4035 			These are closely related, but distinct, processes. They are
   4036 			illustrated in the table <a href="#Lookup-Differences">Lookup
   4037 				Differences</a>, where &quot;key&quot; stands for zero or more key/type
   4038 			pairs. Logically speaking, when looking up an item for a given
   4039 			locale, you first do a resource bundle lookup to find the best bundle
   4040 			for the locale, then you do an inherited item lookup starting with
   4041 			that resource bundle.
   4042 		</p>
   4043 		<p>
   4044 			The table <a href="#Lookup-Differences">Lookup Differences</a> uses
   4045 			the naïve resource bundle lookup for illustration. More sophisticated
   4046 			systems will get far better results for resource bundle lookup if
   4047 			they use the algorithm described in <em>Section 4.4 <a
   4048 				href="#LanguageMatching">Language Matching</a></em>. That algorithm takes
   4049 			into account both the user&#39;s desired locale(s) and the application&#39;s
   4050 			supported locales, in order to get the best match.
   4051 		</p>
   4052 		<p>
   4053 			If the naïve resource bundle lookup is used, the desired locale needs
   4054 			to be canonicalized using 4.3 <a href="#Likely_Subtags">Likely
   4055 				Subtags</a> and the supplemental alias information, so that locales that
   4056 			CLDR considers identical are treated as such. Thus eng-Latn-GB should
   4057 			be mapped to en-GB, and cmn-TW mapped to zh-Hant-TW.
   4058 		</p>
   4059 		<p>For the purposes of CLDR, everything with the &lt;ldml&gt; dtd
   4060 			is treated logically as if it is one resource bundle, even if the
   4061 			implementation separates data into separate physical resource
   4062 			bundles. For example, suppose that there is a main XML file for Nama
   4063 			(naq), but there are no &lt;unit&gt; elements for it because the
   4064 			units are all inherited from root. If the &lt;unit&gt; elements are
   4065 			separated into a separate data tree for modularity in the
   4066 			implementation, the Nama &lt;unit&gt; resource bundle would be empty.
   4067 			However, for purposes of resource-bundle lookup the resource bundle
   4068 			lookup still stops at naq.xml.</p>
   4069 
   4070 		<div id="iqaw2" style="margin-top: 0px; margin-bottom: 0px;">
   4071 			<table class='simple' id="a1bn" border="1" cellpadding="3" cellspacing="0">
   4072 				<caption>
   4073 					<a href="#Lookup-Differences" name="Lookup-Differences">Lookup
   4074 						Differences</a>
   4075 				</caption>
   4076 				<tbody id="iqaw3">
   4077 					<tr id="x40y0">
   4078 						<th id="x40y1" style="vertical-align: top;" nowrap>Lookup
   4079 							Type</th>
   4080 						<th id="x40y3" style="vertical-align: top;" nowrap>Example</th>
   4081 						<th id="x40y5" style="vertical-align: top;">Comments</th>
   4082 					</tr>
   4083 					<tr id="iqaw4">
   4084 						<td id="iqaw5" style="vertical-align: top;" nowrap>
   4085 							<p id="rkc40">
   4086 								<strong>Resource bundle</strong> lookup
   4087 							</p>
   4088 						</td>
   4089 						<td id="iqaw7" style="vertical-align: top;" nowrap>
   4090 							<p>se-FI&nbsp;</p>
   4091 							<p>se&nbsp; </p>
   4092 							<p>
   4093 								<em>default-locale*&nbsp;&nbsp;</em>
   4094 							</p>
   4095 							<p>root</p>
   4096 						</td>
   4097 						<td id="rkc41" style="vertical-align: top;">
   4098 							<p>* The default-locale may have its own inheritance chain;
   4099 								for example, it may be &quot;en-GB &rarr; en&quot;. In that
   4100 								case, the chain is expanded by inserting the chain, resulting
   4101 								in:</p>
   4102 							<blockquote>
   4103 								<p>se-FI </p>
   4104 								<p>se </p>
   4105 								<p>fi </p>
   4106 								<p>
   4107 									<em>en-GB </em>
   4108 								</p>
   4109 								<p>
   4110 									<em>en </em>
   4111 								</p>
   4112 								<p>root</p>
   4113 							</blockquote>
   4114 						</td>
   4115 					</tr>
   4116 					<tr id="iqaw9">
   4117 						<td id="iqaw10" style="vertical-align: top;" nowrap>
   4118 							<p>
   4119 								<strong>Inherited item</strong> lookup
   4120 							</p>
   4121 						</td>
   4122 						<td id="iqaw12" style="vertical-align: top;" nowrap>
   4123 							<p>se-FI+key&nbsp;</p>
   4124 							<p>se+key </p>
   4125 							<p>
   4126 								<em>root_alias*+key&nbsp;</em>
   4127 							</p>
   4128 							<p>&nbsp;root+key</p>
   4129 						</td>
   4130 						<td id="rkc43" style="vertical-align: top;">
   4131 							<p>* If there is a root_alias to another key or locale, then
   4132 								insert that entire chain. For example, suppose that months for
   4133 								another calendar system have a root alias to Gregorian months.
   4134 								In that case, the root alias would change the key, and retry
   4135 								from se-FI downward. This can happen multiple times.</p>
   4136 							<blockquote>
   4137 								<p>se-FI+key&nbsp;</p>
   4138 								<p>se+key </p>
   4139 								<p>root_alias*+key </p>
   4140 								<p>
   4141 									<em>se-FI+key2&nbsp;</em>
   4142 								</p>
   4143 								<p>
   4144 									<em>se+key2 </em>
   4145 								</p>
   4146 								<p>root_alias*+key2 </p>
   4147 								<p>root+key2</p>
   4148 							</blockquote>
   4149 						</td>
   4150 					</tr>
   4151 				</tbody>
   4152 			</table>
   4153 		</div>
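		<p>The two lookups in the table can be sketched as follows. This is an
			illustration under simplifying assumptions, not the CLDR or ICU
			implementation: parents are found by plain truncation of
			hyphen-separated ids, bundle data is a dictionary of dictionaries, and
			root aliases are modeled as a key-to-key map. All function and
			parameter names are hypothetical.</p>

```python
def parent(locale):
    """Truncation parent (parentLocale overrides are ignored in this sketch)."""
    if locale == "root":
        return None
    head, sep, _ = locale.rpartition("-")
    return head if sep else "root"


def chain(locale):
    """The locale and its ancestors, excluding root."""
    out = []
    while locale is not None and locale != "root":
        out.append(locale)
        locale = parent(locale)
    return out


def bundle_lookup(requested, available, default_locale=None):
    """Find the first available bundle; the default locale's chain precedes root."""
    candidates = chain(requested)
    if default_locale is not None:
        candidates += chain(default_locale)
    candidates.append("root")
    for candidate in candidates:
        if candidate in available:
            return candidate
    return "root"


def item_lookup(bundle, key, data, root_aliases):
    """Look up one item; the default locale is never consulted here."""
    seen = set()
    while key not in seen:  # guard against alias cycles
        seen.add(key)
        for loc in chain(bundle):
            if key in data.get(loc, {}):
                return data[loc][key]
        if key in root_aliases:
            # A root alias changes the key and restarts from the original bundle.
            key = root_aliases[key]
            continue
        return data.get("root", {}).get(key)
    return None
```

		<p>For services such as collation, a caller of this sketch would pass
			<code>default_locale=None</code>, so that a request for a missing
			locale falls back directly to root.</p>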
   4154 		<p>Both the resource bundle inheritance and the inherited item
   4155 			inheritance use the parentLocale data, where available, instead of
   4156 			simple truncation.</p>
   4157 		<p>The fallback is a bit different for these two cases; internal
   4158 			aliases and keys are not involved in the bundle lookup, and the
   4159 			default locale is not involved in the item lookup. If the
   4160 			default-locale were used in the resource-item lookup, then strange
   4161 			results would occur. For example, suppose that the default locale is
   4162 			Swedish, and there is a Nama locale but no specific inherited item
   4163 			for collation. If the default-locale were used in resource-item
   4164 			lookup, it would produce odd and unexpected results for Nama sorting.
   4165 		</p>
   4166 		<p>The default locale is not even always used in resource bundle
   4167 			inheritance. For the following services, the fallback is always
   4168 			directly to the root locale rather than through default locale.</p>
   4169 		<ul>
   4170 			<li>collation</li>
   4171 			<li>break iteration</li>
   4172 			<li>case mapping</li>
   4173 			<li>transliteration
   4174 				<ul>
   4175 					<li>The lookup for transliteration is yet more complicated
   4176 						because of the interplay of source and target locales: see <em>Part
   4177 							2 General, Section 10.1 <a
   4178 							href="http://www.unicode.org/reports/tr35/tr35-general.html#Inheritance">Inheritance.</a>
   4179 					</em>
   4180 					</li>
   4181 				</ul>
   4182 			</li>
   4183 		</ul>
   4184 		<p>
   4185 			Thus if there is no Akan locale, for example, asking for a collation
   4186 			for Akan should produce the root collation, <em>not the Swedish
   4187 				collation.</em>
   4188 		</p>
   4189 		<p>The inherited item lookup must remain stable, because the
   4190 			resources are built with a certain fallback in mind; changing the
   4191 			core fallback order can render the bundle structure incoherent.</p>
   4192 		<p>
   4193 			Resource bundle lookup, on the other hand, is more flexible; changes
   4194 			in the view of the &quot;best&quot; match between the input request
   4195 			and the output bundle are more tolerable when they represent overall
   4196 			improvements for users. For more information, see <i> <a
   4197 				href="#Fallback_Elements">A.1 Element fallback</a></i>.
   4198 		</p>
   4199 		<p>
   4200 			Where the LDML inheritance relationship does not match a target
   4201 			system, such as POSIX, the data logically should be fully resolved in
   4202 			converting to a format for use by that system, by adding <i>all</i>
   4203 			inherited data to each locale data set.
   4204 		</p>
   4205 		<p>
   4206 			For a more complete description of how inheritance applies to data,
   4207 			and the use of keywords, see <i><a
   4208 				href="#Inheritance_and_Validity">Section 4.2 Inheritance and Validity</a></i>.
   4209 		</p>
   4210 		<p>
   4211 			The locale data does not contain general character properties that
   4212 			are derived from the <i>Unicode Character Database</i> [<a
   4213 				href="http://unicode.org/reports/tr41/#UAX44">UAX44</a>]. That data
   4214 			being common across locales, it is not duplicated in the bundles.
   4215 			Constructing a POSIX locale from the CLDR data requires use of UCD
   4216 			data. In addition, POSIX locales may also specify the character
   4217 			encoding, which requires the data to be transformed into that target
   4218 			encoding.
   4219 		</p>
   4220 		<p>
   4221 			<b>Warning: </b>If a locale has a different script than its parent
   4222 			(for example, sr_Latn), then special attention must be paid to make
   4223 			sure that all inheritance is covered. For example, auxiliary exemplar
   4224 			characters may need to be empty (&quot;[]&quot;) to block
   4225 			inheritance.
   4226 		</p>
   4227 		<p>
   4228 			<strong>Empty Override:</strong> There is one special value reserved
   4229 			in LDML to indicate that a child locale is to have no value for a
   4230 			path, even if the parent locale has a value for that path. That value
   4231 			is &quot;&quot;. For example, if there is no phrase for &quot;two
   4232 			days ago&quot; in a language, that can be indicated with:
   4233 		</p>
   4234 		<pre>&lt;field type="day"&gt;
   4235   &lt;relative type="-2"&gt;&lt;/relative&gt;
   4236 </pre>
   4237 		<h4>
   4238 			<a name="Multiple_Inheritance"></a><a name="Lateral_Inheritance"
   4239 				href="#Lateral_Inheritance">4.1.2 Lateral Inheritance </a>
   4240 		</h4>
   4241 		<p>
   4242 			In clearly specified instances, resources may inherit from within the
   4243 			same locale. For example, currency format symbols inherit from the
   4244 			number format symbols; the Buddhist calendar inherits from the
   4245 			Gregorian calendar. This <i>only</i> happens where documented in this
   4246 			specification. In these special cases, the inheritance functions as
   4247 			normal, up to the root. If the data is not found along that path,
   4248 			then a second search is made, logically changing the
   4249 			element/attribute to the alternate values.
   4250 		</p>
   4251 		<p>
   4252 			For example, for the locale &quot;en_US&quot; the month data in
   4253 			&lt;calendar type=&quot;<span style="color: blue">buddhist</span>&quot;&gt;
   4254 			inherits first from &lt;calendar type=&quot;<span
   4255 				style="color: blue">buddhist</span>&quot;&gt; in &quot;en&quot;,
   4256 			then in &quot;root&quot;. If not found there, then it inherits from
   4257 			&lt;calendar type=&quot;<span style="color: blue">gregorian</span>&quot;&gt;
   4258 			in &quot;en_US&quot;, then &quot;en&quot;, then in &quot;root&quot;.
   4259 		</p>
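		<p>A minimal sketch of the two-pass search described above, assuming the
			locale data is held as a dictionary from locale id to a dictionary of
			paths. The function name and the data layout are hypothetical; the
			point is only that the full locale chain is exhausted for the original
			path before the alternate path is tried.</p>

```python
def lookup_with_lateral(locale_chain, data, path, lateral_path):
    """Search the whole locale chain for `path`, then again for `lateral_path`.

    Returns (locale, path_used, value), or None if neither pass finds a value.
    """
    for candidate_path in (path, lateral_path):
        for locale in locale_chain:
            if candidate_path in data.get(locale, {}):
                return locale, candidate_path, data[locale][candidate_path]
    return None


BUDDHIST = '//ldml/dates/calendars/calendar[@type="buddhist"]/months'
GREGORIAN = '//ldml/dates/calendars/calendar[@type="gregorian"]/months'

# Only Gregorian months exist, and only in "en": the second pass finds them.
data = {"en": {GREGORIAN: "January ..."}}
print(lookup_with_lateral(["en_US", "en", "root"], data, BUDDHIST, GREGORIAN))
```
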
   4260 		<p>There is one special case, for items with a &quot;count&quot;
   4261 			parameter (used to select a plural form). In that case, the
   4262 			inheritance works as follows:</p>
   4263 		<p>If there is no value for a path, and that path has a
   4264 			[@count=&quot;x&quot;] attribute and value, then:</p>
   4265 		<ol>
   4266 			<li>If &quot;x&quot; is anything but &quot;other&quot;, it falls
   4267 				back to [@count=&quot;other&quot;], within the same locale.</li>
   4268 			<li>In the special case of currencies, if the
   4269 				[@count=&quot;other&quot;] value is missing, it falls back to the
   4270 				path that is completely missing the count item.</li>
   4271 			<li>If there is no value within the same locale, the same
   4272 				process is used in the parent locale, and so on.</li>
   4273 		</ol>
   4274 		<p>
   4275 			<em>Examples:</em>
   4276 		</p>
   4277 		<table class='simple' border="1" cellpadding="3" cellspacing="0" id="a1bn3">
   4278 			<caption>
   4279 				<a name="Count_Fallback_normal" href="#Count_Fallback_normal">Count
   4280 					Fallback: normal</a>
   4281 			</caption>
   4282 			<tbody>
   4283 				<tr>
   4284 					<th nowrap style="vertical-align: top;">Locale</th>
   4285 					<th nowrap style="vertical-align: top;">Path</th>
   4286 				</tr>
   4287 				<tr>
   4288 					<td nowrap style="vertical-align: top;">fr-CA</td>
   4289 					<td nowrap id="iqaw" style="vertical-align: top;"><code>
   4290 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
   4291 						</code></td>
   4292 				</tr>
   4293 				<tr>
   4294 					<td nowrap style="vertical-align: top;">fr-CA</td>
   4295 					<td nowrap id="iqaw16" style="vertical-align: top;"><code>
   4296 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
   4297 						</code></td>
   4298 				</tr>
   4299 				<tr>
   4300 					<td nowrap style="vertical-align: top;">fr</td>
   4301 					<td nowrap id="iqaw19" style="vertical-align: top;"><code>
   4302 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
   4303 						</code></td>
   4304 				</tr>
   4305 				<tr>
   4306 					<td nowrap style="vertical-align: top;">fr</td>
   4307 					<td nowrap id="iqaw18" style="vertical-align: top;"><code>
   4308 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
   4309 						</code></td>
   4310 				</tr>
   4311 				<tr>
   4312 					<td nowrap style="vertical-align: top;">root</td>
   4313 					<td nowrap id="iqaw21" style="vertical-align: top;"><code>
   4314 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
   4315 						</code></td>
   4316 				</tr>
   4317 				<tr>
   4318 					<td nowrap style="vertical-align: top;">root</td>
   4319 					<td nowrap id="iqaw20" style="vertical-align: top;"><code>
   4320 							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
   4321 						</code></td>
   4322 				</tr>
   4323 			</tbody>
   4324 		</table>
   4325 		<p>Note that there may be an alias in root that changes the path
   4326 			and starts again from the requested locale, such as:</p>
   4327 		<p>
   4328 			<code>
   4329 				&lt;unitLength type=&quot;<strong>narrow</strong>&quot;&gt;<br>
   4330 				&lt;alias source=&quot;locale&quot;
   4331 				path=&quot;../unitLength[@type='<strong>short</strong>']&quot;/&gt;<br>
   4332 				&lt;/unitLength&gt;
   4333 			</code>
   4334 		</p>
   4335 		<table class='simple' border="1" cellpadding="3" cellspacing="0" id="a1bn2">
   4336 			<caption>
   4337 				<a name="Count_Fallback_currency" href="#Count_Fallback_currency">Count
   4338 					Fallback: currency</a>
   4339 			</caption>
   4340 			<tbody>
   4341 				<tr>
   4342 					<th nowrap style="vertical-align: top;">Locale</th>
   4343 					<th nowrap style="vertical-align: top;">Path</th>
   4344 				</tr>
   4345 				<tr>
   4346 					<td nowrap style="vertical-align: top;">fr-CA</td>
   4347 					<td nowrap id="iqaw11" style="vertical-align: top;"><code>
   4348 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
   4349 						</code></td>
   4350 				</tr>
   4351 				<tr>
   4352 					<td nowrap style="vertical-align: top;">fr-CA</td>
   4353 					<td nowrap id="iqaw6" style="vertical-align: top;"><code>
   4354 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
   4355 						</code></td>
   4356 				</tr>
   4357 				<tr>
   4358 					<td nowrap style="vertical-align: top;">fr-CA</td>
   4359 					<td nowrap id="iqaw8" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
   4360 				</tr>
   4361 				<tr>
   4362 					<td nowrap style="vertical-align: top;">fr</td>
   4363 					<td nowrap id="iqaw15" style="vertical-align: top;"><code>
   4364 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
   4365 						</code></td>
   4366 				</tr>
   4367 				<tr>
   4368 					<td nowrap style="vertical-align: top;">fr</td>
   4369 					<td nowrap id="iqaw14" style="vertical-align: top;"><code>
   4370 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
   4371 						</code></td>
   4372 				</tr>
   4373 				<tr>
   4374 					<td nowrap style="vertical-align: top;">fr</td>
   4375 					<td nowrap id="iqaw13" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
   4376 				</tr>
   4377 				<tr>
   4378 					<td nowrap style="vertical-align: top;">root</td>
   4379 					<td nowrap id="iqaw25" style="vertical-align: top;"><code>
   4380 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
   4381 						</code></td>
   4382 				</tr>
   4383 				<tr>
   4384 					<td nowrap style="vertical-align: top;">root</td>
   4385 					<td nowrap id="iqaw24" style="vertical-align: top;"><code>
   4386 							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
   4387 						</code></td>
   4388 				</tr>
   4389 				<tr>
   4390 					<td nowrap style="vertical-align: top;">root</td>
   4391 					<td nowrap id="iqaw23" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
   4392 				</tr>
   4393 			</tbody>
   4394 		</table>
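		<p>The order of the rows in the two tables above can be generated
			mechanically. The following sketch assumes the locale chain has
			already been computed; the function and parameter names are
			hypothetical, and root aliases are not modeled.</p>

```python
def count_fallback_paths(locale_chain, base_path, count, is_currency=False):
    """List (locale, path) pairs in the order they are tried for a count lookup."""
    steps = []
    for locale in locale_chain:
        # 1. The requested count value.
        steps.append((locale, f'{base_path}[@count="{count}"]'))
        # 2. Anything but "other" falls back to "other" within the same locale.
        if count != "other":
            steps.append((locale, f'{base_path}[@count="other"]'))
        # 3. Currencies additionally fall back to the path with no count at all.
        if is_currency:
            steps.append((locale, base_path))
        # 4. Otherwise the same process continues in the parent locale.
    return steps


path = '//ldml/numbers/currencies/currency[@type="CAD"]/displayName'
for locale, tried in count_fallback_paths(["fr-CA", "fr", "root"], path, "x", is_currency=True):
    print(locale, tried)
```
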
   4395 		<br>
   4396 		<h4>
   4397 			<a name="Parent_Locales" href="#Parent_Locales">4.1.3 Parent
   4398 				Locales</a>
   4399 		</h4>
   4400 		<p class="dtd">
   4401 			&lt;!ELEMENT parentLocales ( parentLocale* ) &gt;<br>
   4402 			&lt;!ELEMENT parentLocale EMPTY &gt;<br> &lt;!ATTLIST
   4403 			parentLocale parent NMTOKEN #REQUIRED
   4404 			&gt;<br> &lt;!ATTLIST parentLocale locales NMTOKENS #REQUIRED &gt;
   4405 		</p>
   4406 		<p>In some cases, the normal truncation inheritance does not
   4407 			function well. This happens when:</p>
   4408 		<ol>
   4409 			<li>The child locale is of a different script. In this case,
   4410 				mixing elements from the parent into the child data results in a
   4411 				mishmash.</li>
   4412 			<li>A large number of child locales behave similarly, and
   4413 				differently from the truncation parent.</li>
   4414 		</ol>
   4415 		<p>
   4416 			The <span class="element">parentLocale</span> element is used to
   4417 			override the normal inheritance when accessing CLDR data.
   4418 		</p>
   4419 		<p>For case 1, the children are script locales, and the parent is
   4420 			&quot;root&quot;. For example:</p>
   4421 		<pre> &lt;parentLocale parent=&quot;root&quot; locales=&quot;az_Cyrl ha_Arab &hellip; zh_Hant&quot;/&gt;</pre>
   4422 		<p>For case 2, the children and parent share the same primary
   4423 			language, but the region is changed. For example:</p>
   4424 		<pre> &lt;parentLocale parent=&quot;es_419&quot; locales=&quot;es_AR es_BO &hellip; es_UY es_VE&quot;/&gt;</pre>
   4425 		<p>Collation data, however, is an exception. Since collation rules
   4426 			do not truly inherit data from the parent, the parentLocale element
   4427 			is not necessary and not used for collation. Thus, for a locale like
   4428 			zh_Hant in the example above, the parentLocale element would dictate
   4429 			the parent as &quot;root&quot; when referring to main locale data,
   4430 			but for collation data, the parent locale would still be
   4431 			&quot;zh&quot;, even though the parentLocale element is present for
   4432 			that locale.</p>
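		<p>A sketch of how the parentLocale data might be applied, using only
			the locales named in the two examples above (the real supplemental
			data lists many more). The names <code>PARENT_LOCALES</code>,
			<code>parent_of</code> and <code>inheritance_chain</code> are
			hypothetical. The <code>for_collation</code> flag models the collation
			exception by ignoring the overrides.</p>

```python
# Excerpt only: parent -> children, taken from the examples in the text.
PARENT_LOCALES = {
    "root": ["az_Cyrl", "ha_Arab", "zh_Hant"],
    "es_419": ["es_AR", "es_BO", "es_UY", "es_VE"],
}
# Invert to child -> parent for direct lookup.
OVERRIDES = {child: parent
             for parent, children in PARENT_LOCALES.items()
             for child in children}


def parent_of(locale, for_collation=False):
    """Return the parent locale, or None for root."""
    if locale == "root":
        return None
    if not for_collation and locale in OVERRIDES:
        return OVERRIDES[locale]
    head, sep, _ = locale.rpartition("_")
    return head if sep else "root"


def inheritance_chain(locale, for_collation=False):
    """The locale followed by each successive parent, ending with root."""
    chain = []
    while locale is not None:
        chain.append(locale)
        locale = parent_of(locale, for_collation)
    return chain


print(inheritance_chain("zh_Hant_TW"))
print(inheritance_chain("zh_Hant_TW", for_collation=True))
```
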
   4433 		<p>
   4434 			Since parentLocale information is not localizable on a per locale
   4435 			basis, the parentLocale information is contained in CLDR&#39;s <a
   4436 				href="tr35-info.html">supplemental data.</a>
   4437 		</p>
   4438 		<p>
   4439 			When a <span class="element">parentLocale</span> element is used to
   4440 			override normal inheritance, the following invariants must always be
   4441 			true:
   4442 		</p>
   4443 		<ol>
   4444 			<li>If X is the parentLocale of Y, then either X is the root
   4445 				locale, or X has the same base language code as Y. For example, the
   4446 				parent of &quot;en&quot; cannot be &quot;fr&quot;, and the parent of
   4447 				&quot;en_YY&quot; cannot be &quot;fr&quot; or &quot;fr_XX&quot;.</li>
   4448 			<li>If X is the parentLocale of Y, Y must not be a base language
   4449 				locale. For example, the parent of &quot;en&quot; cannot be
   4450 				&quot;en_XX&quot;.</li>
   4451 			<li>There can never be cycles, such as: X parent of Y ... parent
   4452 				of X.</li>
   4453 		</ol>
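		<p>The three invariants above lend themselves to a mechanical check.
			The sketch below assumes the overrides are given as a child-to-parent
			dictionary and that locales without an override use truncation; the
			function names and message wording are hypothetical.</p>

```python
def base_language(locale):
    """The first subtag of an underscore-separated locale id."""
    return locale.split("_")[0]


def check_parent_locales(overrides):
    """Return a list of human-readable invariant violations (empty if none)."""
    problems = []

    def parent_of(locale):
        if locale in overrides:
            return overrides[locale]
        head, sep, _ = locale.rpartition("_")
        return head if sep else "root"

    for child, parent in overrides.items():
        # Invariant 1: the parent is root or shares the base language.
        if parent != "root" and base_language(parent) != base_language(child):
            problems.append(f"{child}: parent {parent} has a different base language")
        # Invariant 2: a base language locale must not be given a parent.
        if "_" not in child:
            problems.append(f"{child}: base language locale has an explicit parent")
        # Invariant 3: walking up from the child must reach root without a cycle.
        seen = set()
        current = child
        while current != "root":
            if current in seen:
                problems.append(f"{child}: cycle through {current}")
                break
            seen.add(current)
            current = parent_of(current)
    return problems


print(check_parent_locales({"zh_Hant": "root", "es_AR": "es_419"}))
```
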
   4454 		<h3>
   4455 			<a name="Inheritance_and_Validity" href="#Inheritance_and_Validity">4.2
   4456 				Inheritance and Validity</a>
   4457 		</h3>
   4458 		<p>The following describes in more detail how to determine the
   4459 			exact inheritance of elements, and the validity of a given element in
   4460 			LDML.</p>
   4461 		<h4>
   4462 			<a name="Definitions" href="#Definitions">4.2.1 Definitions</a>
   4463 		</h4>
   4464 		<p>
   4465 			<i>Blocking</i> elements are those whose subelements do not inherit
   4466 			from parent locales. For example, a &lt;collation&gt; element is a
   4467 			blocking element: everything in a &lt;collation&gt; element is
   4468 			treated as a single lump of data, as far as inheritance is concerned.
   4469 			For more information, see <a href="#Valid_Attribute_Values">Section
   4470 				5.5 Valid Attribute Values</a>.
   4471 		</p>
   4472 		<p>
   4473 			Attributes that serve to distinguish multiple elements at the same
   4474 			level are called <i>distinguishing</i> attributes. For example, the <i>type</i>
   4475 			attribute distinguishes different elements in lists of translations,
   4476 			such as:
   4477 		</p>
   4478 		<pre>&lt;language type=&quot;aa&quot;&gt;Afar&lt;/language&gt;
   4479 &lt;language type=&quot;ab&quot;&gt;Abkhazian&lt;/language&gt;</pre>
   4480 		<p>
   4481 			Distinguishing attributes affect inheritance; two elements with
   4482 			different distinguishing attributes are treated as different for
   4483 			purposes of inheritance. For more information, see <a
   4484 				href="#Valid_Attribute_Values">Section 5.5 Valid Attribute
   4485 				Values</a>. Other attributes are called nondistinguishing (or
   4486 			informational) attributes. These carry separate information, and do
   4487 			not affect inheritance.
   4488 		</p>
   4489 		<p>
   4490 			For any element in an XML file, <i>an element chain</i> is a resolved
   4491 			[<a href="#XPath">XPath</a>] leading from the root to an element,
   4492 			with attributes on each element in alphabetical order. So in, say, <a
   4493 				href="http://unicode.org/cldr/data/common/main/el.xml">http://unicode.org/cldr/data/common/main/el.xml</a>
   4494 			we may have:
   4495 		</p>
   4496 		<pre>&lt;ldml&gt;
   4497   &lt;identity&gt;
   4498     &lt;version number=&quot;1.1&quot; /&gt;
   4499     &lt;language type=&quot;el&quot; /&gt;
   4500   &lt;/identity&gt;
   4501   &lt;localeDisplayNames&gt;
   4502     &lt;languages&gt;
   4503       &lt;language type=&quot;ar&quot;&gt;&lt;/language&gt;
   4504 ...</pre>
   4505 		<p>Which gives the following element chains (among others):</p>
   4506 		<ul>
   4507 			<li>//ldml/identity/version[@number=&quot;1.1&quot;]</li>
   4508 			<li>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</li>
   4509 		</ul>
   4510 		<p>
   4511 			An element chain A is an <i>extension</i> of an element chain B if B
   4512 			is equivalent to an initial portion of A. For example, #2 below is an
   4513 			extension of #1. (Equivalent, depending on the tree, may not be
   4514 			&quot;identical to&quot;. See below for an example.)
   4515 		</p>
   4516 		<ol>
   4517 			<li>//ldml/localeDisplayNames</li>
   4518 			<li>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</li>
   4519 		</ol>
   4520 		<p>
   4521 			An LDML file can be thought of as an ordered list of <i>element
   4522 				pairs</i>: &lt;element chain, data&gt;, where the element chains are all
   4523 			the chains for the end-nodes. (This works because of restrictions on
   4524 			the structure of LDML, including that it does not allow mixed
   4525 			content.) The ordering is the ordering that the element chains are
   4526 			found in the file, and thus determined by the DTD.
   4527 	  </p>
   4528 		<p>For example, some of those pairs would be the following. Notice
   4529 			that the first has the null string as element contents.</p>
   4530 		<ul>
   4531 			<li><b>&lt;</b>//ldml/identity/version[@number=&quot;1.1&quot;]<b>,
   4532 			</b>&quot;&quot;<b>&gt;</b></li>
   4533 			<li><b>&lt;</b>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]<b>,
   4534 			</b>&quot;&quot;<b>&gt;</b></li>
   4535 		</ul>
   4536 		<blockquote>
   4537 			<p>
   4538 				<b>Note: </b>There are two exceptions to this:
   4539 			</p>
   4540 			<ol>
   4541 				<li>Blocking nodes and their contents are treated as a single
   4542 					end node.</li>
   4543 				<li>In terms of computing inheritance, the element pair
   4544 					consists of the element chain plus all distinguishing attributes;
   4545 					the value consists of the value (if any) plus any nondistinguishing
   4546 					attributes.</li>
   4547 			</ol>
   4548 			<blockquote>
   4549 				<p>Thus instead of the element pair being (a) below, it is (b):</p>
   4550 				<ol type="a">
   4551 					<li><b>&lt;</b>//ldml/dates/calendars/calendar[@type=&#39;gregorian&#39;]/week/weekendStart[@day=&#39;sun&#39;][@time=&#39;00:00&#39;]<b>,</b><br>
   4552 						<b>&quot;&quot;&gt;</b></li>
   4553 					<li><b>&lt;</b>//ldml/dates/calendars/calendar[@type=&#39;gregorian&#39;]/week/weekendStart<b>,</b><br>
   4554 						[@day=&#39;sun&#39;][@time=&#39;00:00&#39;]<b>&gt;</b></li>
   4555 				</ol>
   4556 			</blockquote>
   4557 		</blockquote>
   4558 		<p>
			Two LDML element chains are <i>equivalent</i> when they would be
			identical if all attributes and their values were removed, except
			for distinguishing attributes. Thus the following are equivalent:
   4562 		</p>
   4563 		<ul>
   4564 			<li><code>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</code></li>
   4565 			<li><code>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;][@draft=&quot;unconfirmed&quot;]</code></li>
   4566 		</ul>
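		<p>The equivalence and extension tests can be sketched in Python as
			follows. This is only an illustration: the names
			<code>parse_chain</code>, <code>normalize</code>, <code>equivalent</code>
			and <code>is_extension</code> are hypothetical, the set of
			distinguishing attributes is assumed to contain only <i>type</i>, and
			an extension is assumed to be strictly longer than the chain it
			extends.</p>

```python
import re

# Assumption: only "type" is treated as distinguishing here. The real set of
# distinguishing attributes comes from the LDML metadata, not from this sketch.
DISTINGUISHING = {"type"}

# One step of a chain: "/name" followed by zero or more [@attr="value"] tests.
STEP = re.compile(r"""/([A-Za-z][\w.-]*)((?:\[@[\w:.-]+=(?:"[^"]*"|'[^']*')\])*)""")
ATTR = re.compile(r"""\[@([\w:.-]+)=(?:"([^"]*)"|'([^']*)')\]""")


def parse_chain(chain):
    """Split '//ldml/a[@x="1"]/b' into [(name, {attr: value}), ...]."""
    if not chain.startswith("//"):
        raise ValueError("element chain must start with //")
    steps, pos = [], 1  # start at the second slash so each step begins with "/"
    while pos < len(chain):
        match = STEP.match(chain, pos)
        if match is None:
            raise ValueError("cannot parse chain at offset %d" % pos)
        attrs = {name: dq or sq for name, dq, sq in ATTR.findall(match.group(2))}
        steps.append((match.group(1), attrs))
        pos = match.end()
    return steps


def normalize(chain):
    """Drop non-distinguishing attributes so that chains can be compared."""
    return tuple(
        (name, frozenset((k, v) for k, v in attrs.items() if k in DISTINGUISHING))
        for name, attrs in parse_chain(chain)
    )


def equivalent(a, b):
    """True if the chains match once non-distinguishing attributes are removed."""
    return normalize(a) == normalize(b)


def is_extension(a, b):
    """True if b is equivalent to a proper initial portion of a (assumed strict)."""
    na, nb = normalize(a), normalize(b)
    return len(na) > len(nb) and na[:len(nb)] == nb
```

		<p>With these definitions, the two chains listed above compare as
			equivalent, and the <i>language</i> chain is an extension of
			<code>//ldml/localeDisplayNames</code>.</p>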
   4567 		<p>
			For any locale ID, a <i>locale chain</i> is an ordered list starting
   4569 			with the root and leading down to the ID. For example:
   4570 		</p>
   4571 		<blockquote>
   4572 			<p>&lt;root, de, de_DE, de_DE_xxx&gt;</p>
   4573 		</blockquote>
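		<p>A minimal Python sketch of building such a chain is shown below. It
			assumes that parents are found purely by dropping trailing subtags
			separated by &quot;_&quot;; the function name
			<code>locale_chain</code> is hypothetical, and explicit parent
			overrides in the supplemental data are not handled.</p>

```python
def locale_chain(locale_id):
    """Return the chain from root down to locale_id, e.g. root, de, de_DE.

    Assumption: the parent of a locale is obtained by truncating the last
    subtag. Explicit parent overrides are outside the scope of this sketch.
    """
    if locale_id == "root":
        return ["root"]
    parts = locale_id.split("_")
    return ["root"] + ["_".join(parts[:n]) for n in range(1, len(parts) + 1)]


print(locale_chain("de_DE_xxx"))
```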
   4574 		<h4>
   4575 			<a name="Resolved_Data_File" href="#Resolved_Data_File">4.2.2
   4576 				Resolved Data File</a>
   4577 		</h4>
		<p>To produce a fully resolved locale data file from CLDR for a
   4579 			locale ID L, you start with L, and successively add unique items from
   4580 			the parent locales until you get up to root. More formally, this can
   4581 			be expressed as the following procedure.</p>
   4582 		<ol>
   4583 			<li>Let Result be initially L.</li>
   4584 			<li>For each Li in the locale chain for L, starting at L and
   4585 				going up to root:
   4586 				<ol>
   4587 					<li>Let Temp be a copy of the pairs in the LDML file for Li</li>
   4588 					<li>Replace each alias in Temp by the resolved list of pairs
   4589 						it points to.
   4590 						<ol>
   4591 							<li>The resolved list of pairs is obtained by recursively
   4592 								applying this procedure.</li>
   4593 							<li>That alias now blocks any inheritance from the parent.
   4594 								(See <i><a href="#Common_Elements">Section 5.1 Common
   4595 										Elements</a></i> for an example.)
   4596 							</li>
   4597 						</ol>
   4598 					</li>
   4599 					<li>For each element pair P in Temp:
   4600 						<ol>
   4601 							<li>If P does not contain a blocking element, and Result
   4602 								does not have an element pair Q with an equivalent element
   4603 								chain, add P to Result.</li>
   4604 						</ol>
   4605 					</li>
   4606 				</ol>
   4607 			</li>
   4608 		</ol>
   4609 		<p>
   4610 			<b>Notes:</b>
   4611 		</p>
   4612 		<ul>
   4613 			<li>When adding an element pair to a result, it has to go in the
   4614 				right order for it to be valid according to the DTD.</li>
   4615 			<li>The identity element and its children are unaffected by
   4616 				resolution.</li>
   4617 			<li>The LDML data must be constructed so as to avoid circularity
   4618 				in step 2.2.</li>
   4619 		</ul>
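		<p>The core of the procedure (step 2.3) can be sketched in Python as
			follows. This is a simplified illustration under several assumptions:
			each file is modeled as an ordered list of (element chain, value)
			pairs whose chains are already reduced to distinguishing attributes,
			alias replacement (step 2.2) is omitted, blocking is delegated to a
			hypothetical <code>is_blocking</code> predicate, and the result is not
			re-sorted into DTD order. The names <code>truncation_chain</code> and
			<code>resolve</code> and the sample data are invented for the
			example.</p>

```python
def truncation_chain(locale_id):
    """Locale chain from locale_id up to root, by dropping trailing subtags."""
    parts = locale_id.split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    return chain if locale_id == "root" else chain + ["root"]


def resolve(locale_id, files, is_blocking=lambda pair: False):
    """Collect pairs from locale_id up to root, keeping the most specific one.

    files maps a locale ID to a list of (element_chain, value) pairs.
    Aliases are not expanded and DTD ordering is not restored in this sketch.
    """
    result, seen = [], set()
    for loc in truncation_chain(locale_id):
        for pair in files.get(loc, []):
            chain_key = pair[0]
            if is_blocking(pair) or chain_key in seen:
                continue  # blocked, or a more specific locale already has it
            seen.add(chain_key)
            result.append(pair)
    return result


# Hypothetical data, for illustration only.
FILES = {
    "root": [("//ldml/a", "root-a"), ("//ldml/b", "root-b")],
    "de": [("//ldml/a", "de-a")],
    "de_DE": [("//ldml/c", "de_DE-c")],
}
print(resolve("de_DE", FILES))
```

		<p>Here the value for <code>//ldml/a</code> comes from <i>de</i> rather
			than from root, because the more specific locale is visited first.</p>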
   4620 		<h4>
   4621 			<a name="Valid_Data" href="#Valid_Data">4.2.3 Valid Data</a>
   4622 		</h4>
   4623 		<p>
   4624 			The attribute <i>draft=&quot;x&quot; </i>in LDML means that the data
   4625 			has not been approved by the subcommittee. (For more information, see
   4626 			<a href="http://cldr.unicode.org/index/process">Process</a>).
   4627 			However, some data that is not explicitly marked as <i>draft </i>may
   4628 			be implicitly <i>draft</i>, either because it inherits it from a
   4629 			parent, or from an enclosing element.
   4630 		</p>
   4631 		<p>
   4632 			<b>Example 2. </b>Suppose that new locale data is added for af
   4633 			(Afrikaans). To indicate that all of the data is <i>unconfirmed</i>,
   4634 			the attribute can be added to the top level.
   4635 		</p>
   4636 		<p>
   4637 			<code>
   4638 				&lt;ldml version=&quot;1.1&quot; draft=&quot;unconfirmed&quot;&gt;<br>
   4639 				&nbsp;&lt;identity&gt;<br> &nbsp; &lt;version
   4640 				number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language
   4641 				type=&quot;af&quot; /&gt; <br> &nbsp;&lt;/identity&gt;<br>
   4642 				&nbsp;&lt;characters&gt;...&lt;/characters&gt;<br>
   4643 				&nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>
   4644 				&lt;/ldml&gt;
   4645 			</code>
   4646 		</p>
   4647 		<p>
   4648 			Any data can be added to that file, and the status will all be draft=<i>unconfirmed</i>.
			Once an item is vetted (<i>whether it is inherited or explicitly
				in the file</i>), its status can be changed to <i>approved</i>. This
			can be done by leaving draft=&quot;unconfirmed&quot; on the
			enclosing element and marking the child with
			draft=&quot;approved&quot;, such as:
   4654 		</p>
   4655 		<p>
   4656 			<code>
   4657 				&lt;ldml version=&quot;1.1&quot; draft=&quot;unconfirmed&quot;&gt;<br>
   4658 				&nbsp;&lt;identity&gt;<br> &nbsp; &lt;version
   4659 				number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language
   4660 				type=&quot;af&quot; /&gt; <br> &nbsp;&lt;/identity&gt;<br>
   4661 				&nbsp;&lt;characters
   4662 				draft=&quot;approved&quot;&gt;...&lt;/characters&gt;<br>
   4663 				&nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>
   4664 				&nbsp;&lt;dates/&gt;<br> &nbsp;&lt;numbers/&gt;<br>
   4665 				&nbsp;&lt;collations/&gt;<br> &lt;/ldml&gt;
   4666 			</code>
   4667 		</p>
   4668 		<p>
   4669 			However, normally the draft attributes should be canonicalized, which
   4670 			means they are pushed down to leaf nodes as described in <i><a
   4671 				href="#Canonical_Form">Section 5.6 Canonical Form</a></i>. If an LDML
			file does have draft attributes that are not on leaf nodes, the file
   4673 			should be interpreted as if it were the canonicalized version of that
   4674 			file.
   4675 		</p>
   4676 		<p>More formally, here is how to determine whether data for an
   4677 			element chain E is implicitly or explicitly draft, given a locale L.
			Items 1, 2, and 4 are simply formalizations of what is in LDML
			already. Item 3 adds the new element.</p>
   4680 		<h4>
   4681 			<a name="Checking_for_Draft_Status" href="#Checking_for_Draft_Status">4.2.4
   4682 				Checking for Draft Status</a>
   4683 		</h4>
   4684 		<ol>
   4685 			<li><b>Parent Locale Inheritance</b>
   4686 				<ol>
   4687 					<li>Walk through the locale chain until you find a locale ID
   4688 						L&#39; with a data file D. (L&#39; may equal L).</li>
   4689 					<li>Produce the fully resolved data file D&#39; for D.</li>
   4690 					<li>In D&#39;, find the first element pair whose element chain
   4691 						E&#39; is either equivalent to or an extension of E.</li>
   4692 					<li>If there is no such E&#39;, return <i>true</i></li>
   4693 					<li>If E&#39; is not equivalent to E, truncate E&#39; to the
   4694 						length of E.</li>
   4695 				</ol></li>
   4696 			<li><b>Enclosing Element Inheritance</b>
   4697 				<ol>
   4698 					<li>Walk through the elements in E&#39;, from back to front.
   4699 						<ol>
   4700 							<li>If you ever encounter draft=<i>x</i>, return <i>x</i></li>
   4701 						</ol>
   4702 					</li>
   4703 					<li>If L&#39; = L, return <i>false</i></li>
   4704 				</ol></li>
   4705 			<li><b>Missing File Inheritance</b>
   4706 				<ol>
   4707 					<li>Otherwise, walk again through the elements in E&#39;, from
   4708 						back to front.
   4709 						<ol>
   4710 							<li>If you encounter a validSubLocales attribute
   4711 								(deprecated):
   4712 								<ol>
   4713 									<li>If L is in the attribute value, return <i>false</i></li>
   4714 									<li>Otherwise return <i>true</i></li>
   4715 								</ol>
   4716 							</li>
   4717 						</ol>
   4718 					</li>
   4719 				</ol></li>
   4720 			<li><b>Otherwise</b>
   4721 				<ol>
   4722 					<li>Return <i>true</i></li>
   4723 				</ol></li>
   4724 		</ol>
		<p>The validSubLocales in the most specific locale file (the one
			farthest from the root file) &quot;wins&quot; through the full
			resolution step (data from more specific files replacing data from
			less specific ones).</p>
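		<p>The enclosing element inheritance of item 2 can be illustrated with
			a short Python sketch. It covers only that item, works on a single
			in-memory file, and uses the standard
			<code>xml.etree.ElementTree</code> module; the function name
			<code>enclosing_draft</code> is hypothetical, and the sample document
			is a shortened copy of the Afrikaans example with placeholder
			content.</p>

```python
import xml.etree.ElementTree as ET

# Shortened sample with placeholder "..." content, for illustration only.
SAMPLE = """<ldml version="1.1" draft="unconfirmed">
 <identity>
  <version number="1.1"/>
  <language type="af"/>
 </identity>
 <characters draft="approved">...</characters>
 <localeDisplayNames>...</localeDisplayNames>
</ldml>"""


def enclosing_draft(root, target):
    """Walk from target back toward the root and return the first draft value.

    Returns None when neither the element nor any enclosing element carries
    a draft attribute.
    """
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if "draft" in node.attrib:
            return node.attrib["draft"]
        node = parent_of.get(node)
    return None


root = ET.fromstring(SAMPLE)
print(enclosing_draft(root, root.find("characters")))
print(enclosing_draft(root, root.find("localeDisplayNames")))
```

		<p>The <i>characters</i> element reports its own value, while
			<i>localeDisplayNames</i> picks up the value from the enclosing
			<i>ldml</i> element.</p>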
   4729 		<h4>
   4730 			<a name="Keyword_and_Default_Resolution"
   4731 				href="#Keyword_and_Default_Resolution">4.2.5 Keyword and Default
   4732 				Resolution</a>
   4733 		</h4>
   4734 		<p>When accessing data based on keywords, the following process is
   4735 			used. Consider the following example:</p>
   4736 		<ul>
   4737 			<li>The locale &#39;de&#39; has collation types A, B, C, and no
   4738 				&lt;default&gt; element</li>
   4739 			<li>The locale &#39;de_CH&#39; has &lt;default
   4740 				type=&#39;B&#39;&gt;</li>
   4741 		</ul>
   4742 		<p>Here are the searches for various combinations.</p>
   4743 		<table class='simple' border="1" cellpadding="0" cellspacing="0">
   4744 			<tr>
   4745 				<td><strong>User Input</strong></td>
   4746 				<td><strong>Lookup in Locale</strong></td>
   4747 				<td><strong>For</strong></td>
   4748 				<td><strong>Comment</strong></td>
   4749 			</tr>
   4750 			<tr>
   4751 				<td rowspan="3">de_CH<br> <em>no keyword</em></td>
   4752 				<td>de_CH</td>
   4753 				<td>default collation type</td>
   4754 				<td>finds &quot;B&quot;</td>
   4755 			</tr>
   4756 			<tr>
   4757 				<td>de_CH</td>
   4758 				<td>collation type=B</td>
   4759 				<td>not found</td>
   4760 			</tr>
   4761 			<tr>
   4762 				<td>de</td>
   4763 				<td>collation type=B</td>
   4764 				<td><em>found</em></td>
   4765 			</tr>
   4766 			<tr>
   4767 				<td rowspan="4">de<br> <em>no keyword</em></td>
   4768 				<td>de</td>
   4769 				<td>default collation type</td>
   4770 				<td>not found</td>
   4771 			</tr>
   4772 			<tr>
   4773 				<td>root</td>
   4774 				<td>default collation type</td>
   4775 				<td>finds &quot;standard&quot;</td>
   4776 			</tr>
   4777 			<tr>
   4778 				<td>de</td>
   4779 				<td>collation type=standard</td>
   4780 				<td>not found</td>
   4781 			</tr>
   4782 			<tr>
   4783 				<td>root</td>
   4784 				<td>collation type=standard</td>
   4785 				<td><i>found</i></td>
   4786 			</tr>
   4787 			<tr>
   4788 				<td>de_u_co_A</td>
   4789 				<td>de</td>
   4790 				<td>collation type=A</td>
   4791 				<td><i>found</i></td>
   4792 			</tr>
   4793 			<tr>
   4794 				<td rowspan="2">de_u_co_standard</td>
   4795 				<td>de</td>
   4796 				<td>collation type=standard</td>
   4797 				<td>not found</td>
   4798 			</tr>
   4799 			<tr>
   4800 				<td>root</td>
   4801 				<td>collation type=standard</td>
   4802 				<td><i>found</i></td>
   4803 			</tr>
   4804 			<tr>
   4805 				<td rowspan="6">de_u_co_foobar</td>
   4806 				<td>de</td>
   4807 				<td>collation type=foobar</td>
   4808 				<td>not found</td>
   4809 			</tr>
   4810 			<tr>
   4811 				<td>root</td>
   4812 				<td>collation type=foobar</td>
   4813 				<td>not found, starts looking for default</td>
   4814 			</tr>
   4815 			<tr>
   4816 				<td>de</td>
   4817 				<td>default collation type</td>
   4818 				<td>not found</td>
   4819 			</tr>
   4820 			<tr>
   4821 				<td>root</td>
   4822 				<td>default collation type</td>
   4823 				<td>finds &quot;standard&quot;</td>
   4824 			</tr>
   4825 			<tr>
   4826 				<td>de</td>
   4827 				<td>collation type=standard</td>
   4828 				<td>not found</td>
   4829 			</tr>
   4830 			<tr>
   4831 				<td>root</td>
   4832 				<td>collation type=standard</td>
   4833 				<td><i>found</i></td>
   4834 			</tr>
   4835 		</table>
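		<p>The search order shown in the table can be sketched in Python as
			follows. This is an illustration under stated assumptions: the locale
			and the collation keyword are passed separately, parents are found by
			truncation, and the names <code>DATA</code>,
			<code>chain_to_root</code> and <code>resolve_collation</code> are
			hypothetical. The data mirrors the &#39;de&#39; and &#39;de_CH&#39;
			example, with root assumed to define the default
			&quot;standard&quot;.</p>

```python
# Data mirroring the example: 'de' has types A, B, C and no default;
# 'de_CH' has only a default of B; root provides "standard".
DATA = {
    "root": {"default": "standard", "types": {"standard"}},
    "de": {"default": None, "types": {"A", "B", "C"}},
    "de_CH": {"default": "B", "types": set()},
}


def chain_to_root(locale_id):
    """Most specific locale first, root last (truncation only)."""
    parts = locale_id.split("_")
    return ["_".join(parts[:n]) for n in range(len(parts), 0, -1)] + ["root"]


def resolve_collation(locale_id, keyword=None, data=DATA):
    """Return (collation type, locale that supplies it), or None."""
    chain = chain_to_root(locale_id)

    def find_type(wanted):
        for loc in chain:
            if wanted in data.get(loc, {}).get("types", ()):
                return loc
        return None

    def find_default():
        for loc in chain:
            default = data.get(loc, {}).get("default")
            if default:
                return default
        return None

    # 1. An explicitly requested type is searched along the whole chain.
    if keyword is not None:
        loc = find_type(keyword)
        if loc is not None:
            return keyword, loc
    # 2. Otherwise find the default type, then search for that type.
    default = find_default()
    if default is None:
        return None
    return default, find_type(default)


print(resolve_collation("de_CH"))
print(resolve_collation("de", keyword="foobar"))
```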
   4836 		<p>Examples of &quot;search&quot; collator lookup; 'de' has a
   4837 			language-specific version, but 'en' does not:</p>
   4838 		<table class='simple' border="1" cellpadding="0" cellspacing="0">
   4839 			<tr>
   4840 				<td><strong>User Input</strong></td>
   4841 				<td><strong>Lookup in Locale</strong></td>
   4842 				<td><strong>For</strong></td>
   4843 				<td><strong>Comment</strong></td>
   4844 			</tr>
   4845 			<tr>
   4846 				<td rowspan="2">de_CH_u_co_search</td>
   4847 				<td>de_CH</td>
   4848 				<td>collation type=search</td>
   4849 				<td>not found</td>
   4850 			</tr>
   4851 			<tr>
   4852 				<td>de</td>
   4853 				<td>collation type=search</td>
   4854 				<td><i>found</i></td>
   4855 			</tr>
   4856 			<tr>
   4857 				<td rowspan="3">en_US_u_co_search</td>
   4858 				<td>en_US</td>
   4859 				<td>collation type=search</td>
   4860 				<td>not found</td>
   4861 			</tr>
   4862 			<tr>
   4863 				<td>en</td>
   4864 				<td>collation type=search</td>
   4865 				<td>not found</td>
   4866 			</tr>
   4867 			<tr>
   4868 				<td>root</td>
   4869 				<td>collation type=search</td>
   4870 				<td><i>found</i></td>
   4871 			</tr>
   4872 		</table>
   4873 		<p>Examples of lookup for Chinese collation types. Note:</p>
   4874 		<ul>
   4875 			<li>All of the Chinese-specific collation types are provided in
   4876 				the 'zh' locale</li>
   4877 			<li>For 'zh' the &lt;default&gt; element specifies
   4878 				&quot;pinyin&quot;; for 'zh_Hant' the &lt;default&gt; element
   4879 				specifies &quot;stroke&quot;. However any of the available Chinese
   4880 				collation types can be explicitly requested for any Chinese locale.</li>
   4881 		</ul>
   4882 		<table class='simple' border="1" cellpadding="0" cellspacing="0">
   4883 			<tr>
   4884 				<td><strong>User Input</strong></td>
   4885 				<td><strong>Lookup in Locale</strong></td>
   4886 				<td><strong>For</strong></td>
   4887 				<td><strong>Comment</strong></td>
   4888 			</tr>
   4889 			<tr>
   4890 				<td rowspan="3">zh_Hant<br> <em>no keyword</em></td>
   4891 				<td>zh_Hant</td>
   4892 				<td>default collation type</td>
   4893 				<td>finds &quot;stroke&quot;</td>
   4894 			</tr>
   4895 			<tr>
   4896 				<td>zh_Hant</td>
   4897 				<td>collation type=stroke</td>
   4898 				<td>not found</td>
   4899 			</tr>
   4900 			<tr>
   4901 				<td>zh</td>
   4902 				<td>collation type=stroke</td>
   4903 				<td><i>found</i></td>
   4904 			</tr>
   4905 			<tr>
   4906 				<td rowspan="3">zh_Hant_HK_u_co_pinyin</td>
   4907 				<td>zh_Hant_HK</td>
   4908 				<td>collation type=pinyin</td>
   4909 				<td>not found</td>
   4910 			</tr>
   4911 			<tr>
   4912 				<td>zh_Hant</td>
   4913 				<td>collation type=pinyin</td>
   4914 				<td>not found</td>
   4915 			</tr>
   4916 			<tr>
   4917 				<td>zh</td>
   4918 				<td>collation type=pinyin</td>
   4919 				<td><i>found</i></td>
   4920 			</tr>
   4921 			<tr>
   4922 				<td rowspan="2">zh<br> <em>no keyword</em></td>
   4923 				<td>zh</td>
   4924 				<td>default collation type</td>
   4925 				<td>finds &quot;pinyin&quot;</td>
   4926 			</tr>
   4927 			<tr>
   4928 				<td>zh</td>
   4929 				<td>collation type=pinyin</td>
   4930 				<td><i>found</i></td>
   4931 			</tr>
   4932 		</table>
   4933 		<blockquote>
   4934 			<p>
				<b>Note: </b>It is an invariant that the default in root for a given
				element must always be a value that exists in root. So you
				cannot have the following in root:
   4938 			</p>
   4939 		</blockquote>
   4940 		<p>
   4941 			<code>
   4942 				&lt;someElements&gt;<br> &nbsp; &lt;default
   4943 				type=&#39;a&#39;/&gt;<br> &nbsp; &lt;someElement
   4944 				type=&#39;b&#39;&gt;...&lt;/someElement&gt;<br> &nbsp;
   4945 				&lt;someElement type=&#39;c&#39;&gt;...&lt;/someElement&gt;<br>
   4946 				<b>&nbsp; &lt;!-- no &#39;a&#39; --&gt;</b><br>
   4947 				&lt;/someElements&gt;
   4948 			</code>
   4949 		</p>
   4950 		<p>For identifiers, such as language codes, script codes, region
   4951 			codes, variant codes, types, keywords, currency symbols or currency
			display names, the default value is the identifier itself whenever
   4953 			no value is found in the root. Thus if there is no display name for
   4954 			the region code &#39;QA&#39; in root, then the display name is simply
   4955 			&#39;QA&#39;.	  </p>
   4956 
   4957 		<h4>
   4958 		  <a name="Inheritance_vs_Related" href="#Inheritance_vs_Related">4.2.6 Inheritance vs Related Information</a>
   4959 		</h4>
   4960 	    <p>There are related types of data and processing that are easy to confuse:</p>
   4961 		  <table class='simple'>
   4962 		    <tr>
   4963 		      <td rowspan="4"><p><strong>Inheritance</strong></p></td>
   4964 		      <td colspan="2">Part of the internal mechanism used by CLDR to organize and manage locale data.
   4965 		        This is used to share common resources, and ease maintenance, and provide the best fallback behavior in the absence of data. <em>Should not be used for locale matching or likely subtags.</em></td>
   4966 	        </tr>
   4967 		    <tr>
   4968 		      <td><em>Example:</em></td>
		      <td>parent(en_AU) &rArr; en_001<br>
	          parent(en_001) &rArr; en<br>
	          parent(en) &rArr; root</td>
   4972 	        </tr>
   4973 		    <tr>
   4974 		      <td><em>Data: </em></td>
   4975 		      <td>supplementalData.xml &lt;parentLocale&gt;</td>
   4976 	        </tr>
   4977 		    <tr>
   4978 		      <td><em>Spec:</em></td>
   4979 		      <td><strong>Section <a href="#Inheritance_and_Validity">4.2 Inheritance and Validity</a></strong></td>
   4980 	        </tr>
   4981 		    <tr>
   4982 		      <td rowspan="4"><strong>DefaultContent</strong></td>
   4983 		      <td colspan="2">Part of the internal mechanism used by CLDR to manage locale data. A particular sublocale is designated the defaultContent for a parent, so that the parent exhibits consistent behavior.  <em>Should not be used for locale matching or likely subtags.</em></td>
   4984 	        </tr>
   4985 		    <tr>
   4986 		      <td><em>Example:</em></td>
		      <td>addLikelySubtags(sr-ME) &rArr; sr-Latn-ME, minimize(de-Latn-DE) &rArr; de</td>
   4988 	        </tr>
   4989 		    <tr>
   4990 		      <td><em>Data: </em></td>
   4991 		      <td>supplementalMetadata.xml &lt;defaultContent&gt;</td>
   4992 	        </tr>
   4993 		    <tr>
   4994 		      <td><em>Spec:</em></td>
		      <td><strong>Part 6: Section 9.3 <a href="tr35-info.html#Default_Content">Default Content</a>
   4996 		      </strong></td>
   4997 	        </tr>
   4998    		    <tr>
   4999 		      <td rowspan="4"><strong>LikelySubtags</strong></td>
   5000 		      <td colspan="2">Provides most likely full subtag (script and region) in the absence of other information. A core component of LocaleMatching.</td>
   5001 	        </tr>
   5002 		    <tr>
   5003 		      <td><em>Example:</em></td>
		      <td>addLikelySubtags(zh) &rArr; zh-Hans-CN<br>
		        addLikelySubtags(zh-TW) &rArr; zh-Hant-TW <br>
minimize(zh-Hant, favorRegion) &rArr; zh-TW</td>
   5007 	        </tr>
   5008 		    <tr>
   5009 		      <td><em>Data: </em></td>
   5010 		      <td>likelySubtags.xml &lt;likelySubtags&gt;</td>
   5011 	        </tr>
   5012 		    <tr>
   5013 		      <td><em>Spec:</em></td>
   5014 		      <td><strong>Section <a href="#Likely_Subtags">4.3 Likely
   5015 			  Subtags</a></strong></td>
   5016 	        </tr>
   5017 		    <tr>
   5018 		      <td rowspan="4"><strong>LocaleMatching</strong></td>
		      <td colspan="2">Provides the best match for the user's language(s) among an application's supported languages.</td>
   5020 	        </tr>
   5021 		    <tr>
   5022 		      <td><em>Example:</em></td>
		      <td>bestLocale(userLangs=&lt;en, fr&gt;, appLangs=&lt;fr-CA, ru&gt;) &rArr; fr-CA</td>
   5024 	        </tr>
   5025 		    <tr>
   5026 		      <td><em>Data: </em></td>
   5027 		      <td>languageInfo.xml &lt;languageMatching&gt;</td>
   5028 	        </tr>
   5029 		    <tr>
   5030 		      <td><em>Spec:</em></td>
   5031 		      <td><strong>Section 
   5032               <a href="#LanguageMatching">4.4 Language Matching</a></strong></td>
   5033 	        </tr>
   5034         </table>
   5035 
   5036 
   5037 		<h3>
   5038 		  <a name="Likely_Subtags" href="#Likely_Subtags">4.3 Likely
   5039 				Subtags</a>
   5040 		</h3>
   5041 		<p class="dtd">
   5042 			&lt;!ELEMENT likelySubtag EMPTY &gt;<br> &lt;!ATTLIST
   5043 			likelySubtag from NMTOKEN #REQUIRED&gt;<br> &lt;!ATTLIST
   5044 			likelySubtag to NMTOKEN #REQUIRED&gt;
   5045 		</p>
   5046 		<p>There are a number of situations where it is useful to be able
   5047 			to find the most likely language, script, or region. For example,
   5048 			given the language &quot;zh&quot; and the region &quot;TW&quot;, what
   5049 			is the most likely script? Given the script &quot;Thai&quot; what is
   5050 			the most likely language or region? Given the region TW, what is the
   5051 			most likely language and script?</p>
   5052 		<p>Conversely, given a locale, it is useful to find out which
   5053 			fields (language, script, or region) may be superfluous, in the sense
   5054 			that they contain the likely tags. For example, &quot;en_Latn&quot;
   5055 			can be simplified down to &quot;en&quot; since &quot;Latn&quot; is
   5056 			the likely script for &quot;en&quot;; &quot;ja_Jpan_JP&quot; can be
   5057 			simplified down to &quot;ja&quot;.</p>
   5058 		<p>
   5059 			The <i>likelySubtag</i> supplemental data provides default
   5060 			information for computing these values. This data is based on the
			default content data, the population data, and the
   5062 			suppress-script data in [<a href="#BCP47">BCP47</a>]. It is
   5063 			heuristically derived, and may change over time.
   5064 		</p>
   5065 	  <p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see <strong><em>Section 4.2.6 <a 
   5066 				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a></em></strong>.</p>
   5067 	  <p>
   5068 			To look up data in the table, see if a locale matches one of the <b>from</b>
   5069 			attribute values. If so, fetch the corresponding <b>to</b> attribute
   5070 			value. For example, the Chinese data looks like the following:
   5071 		</p>
   5072 		<blockquote>
   5073 			<p class="example">
   5074 				&lt;likelySubtag from=&quot;zh&quot; to=&quot;zh_Hans_CN&quot;/&gt;<br>
   5075 				&lt;likelySubtag from=&quot;zh_HK&quot;
   5076 				to=&quot;zh_Hant_HK&quot;/&gt;<br> &lt;likelySubtag
   5077 				from=&quot;zh_Hani&quot; to=&quot;zh_Hani_CN&quot;/&gt;<br>
   5078 				&lt;likelySubtag from=&quot;zh_Hant&quot;
   5079 				to=&quot;zh_Hant_TW&quot;/&gt;<br> &lt;likelySubtag
   5080 				from=&quot;zh_MO&quot; to=&quot;zh_Hant_MO&quot;/&gt;<br>
   5081 				&lt;likelySubtag from=&quot;zh_TW&quot;
   5082 				to=&quot;zh_Hant_TW&quot;/&gt;
   5083 			</p>
   5084 		</blockquote>
   5085 		<p>So looking up &quot;zh_TW&quot; returns &quot;zh_Hant_TW&quot;,
   5086 			while looking up &quot;zh&quot; returns &quot;zh_Hans_CN&quot;.</p>
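		<p>As an illustration, the <b>from</b>/<b>to</b> data can be loaded
			into a dictionary with Python&#39;s standard
			<code>xml.etree.ElementTree</code> module. The sketch below embeds the
			Chinese entries shown above in a <i>likelySubtags</i> wrapper element;
			the variable names are hypothetical, and reading the actual CLDR file
			is left out.</p>

```python
import xml.etree.ElementTree as ET

# The Chinese entries from the example, wrapped so they form one document.
XML = """<likelySubtags>
  <likelySubtag from="zh" to="zh_Hans_CN"/>
  <likelySubtag from="zh_HK" to="zh_Hant_HK"/>
  <likelySubtag from="zh_Hani" to="zh_Hani_CN"/>
  <likelySubtag from="zh_Hant" to="zh_Hant_TW"/>
  <likelySubtag from="zh_MO" to="zh_Hant_MO"/>
  <likelySubtag from="zh_TW" to="zh_Hant_TW"/>
</likelySubtags>"""

# Map each "from" value to its "to" value.
likely = {
    element.get("from"): element.get("to")
    for element in ET.fromstring(XML).iter("likelySubtag")
}

print(likely["zh_TW"])
print(likely["zh"])
```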
   5087 		<p>In more detail, the data is designed to be used in the
   5088 			following operations.</p>
   5089 		<p>
			Note that as of CLDR v24, any field present in the 'from' field is
			also present in the 'to' field, so an input field will not change in
			the &quot;Add Likely Subtags&quot; operation. The data and operations can
   5093 			also be used with language tags using [<a href="#BCP47">BCP47</a>]
   5094 			syntax, with the appropriate changes. In addition, certain common
   5095 			'denormalized' language subtags such as 'iw' (for 'he') may occur in
   5096 			both the 'from' and 'to' fields. This allows for implementations that
   5097 			use those denormalized subtags to use the data with only minor
   5098 			changes to the operations.
   5099 		</p>
   5100 		<p>&nbsp;</p>
   5101 		<p>
   5102 			<i><b>Add Likely Subtags: </b></i><em>Given a source locale X,
   5103 				to return a locale Y where the empty subtags have been filled in by
				the most likely subtags.</em> This is written as X &rArr; Y (&quot;X maximizes
   5105 			to Y&quot;).
   5106 		</p>
   5107 		<p>
   5108 			A subtag is called <em>empty</em> if it is a missing script or region
   5109 			subtag, or it is a base language subtag with the value
   5110 			&quot;und&quot;. In the description below, a subscript on a subtag <em>x</em>
   5111 			indicates which tag it is from: <em>x<sub>s</sub></em> is in the
			source, <em>x<sub>m</sub></em> is in a match, and <em>x<sub>r</sub></em>
   5113 			is in the final result.
   5114 		</p>
   5115 		<p>This operation is performed in the following way.</p>
   5116 		<ol>
   5117 			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><strong>Canonicalize.</strong>
   5118 				<ol>
   5119 					<li>Make sure the input locale is in canonical form: uses the
   5120 						right separator, and has the right casing.</li>
   5121 					<li style="margin-top: 0.5em; margin-bottom: 0.5em">Replace
   5122 						any deprecated subtags with their canonical values using the
   5123 						&lt;alias&gt; data in supplemental metadata. Use the first value
   5124 						in the replacement list, if it exists. Language tag replacements
						may have multiple parts, such as &quot;sh&quot; &rArr;
						&quot;sr_Latn&quot; or &quot;mo&quot; &rArr; &quot;ro_MD&quot;. In such a
						case, the original script and/or region are retained if there is
						one. Thus &quot;sh_Arab_AQ&quot; &rArr; &quot;sr_Arab_AQ&quot;, not
   5129 						&quot;sr_Latn_AQ&quot;.</li>
   5130 					<li>If the tag is grandfathered (see &lt;variable
   5131 						id=&quot;$grandfathered&quot; type=&quot;choice&quot;&gt; in the
   5132 						supplemental data), then return it.</li>
   5133 					<li>Remove the script code &#39;Zzzz&#39; and the region code
   5134 						&#39;ZZ&#39; if they occur.</li>
   5135 					<li>Get the components of the cleaned-up source tag <em>(language<sub>s</sub>,
   5136 							script<sub>s</sub>,
   5137 					</em>and<em> region<sub>s</sub></em>), plus any variants and extensions.
   5138 					</li>
   5139 				</ol></li>
   5140 			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><strong>Lookup.
   5141 			</strong>Lookup each of the following in order, and stop on the first match:
   5142 				<ol>
   5143 					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_script<sub>s</sub>_region<sub>s</sub></em></li>
   5144 
   5145 					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_region<sub>s</sub></em></li>
   5146 
   5147 					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_script<sub>s</sub></em></li>
   5148 					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em><em>language<sub>s</sub></em></em></li>
   5149 					<li>und<em>_script<sub>s</sub></em></li>
   5150 				</ol></li>
   5151 			<li><strong>Return</strong>
   5152 				<ol>
					<li>If there is no match, either return
   5154 						<ol>
   5155 							<li>an error value, or</li>
   5156 							<li>the match for &quot;und&quot; (in APIs where a valid
   5157 								language tag is required).</li>
   5158 						</ol>
   5159 					</li>
   5160 					<li>Otherwise there is a match = <span
   5161 						style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>m</sub>_script<sub>m</sub>_region<sub>m</sub></em></span></li>
   5162 					<li>Let x<sub>r</sub> = x<sub>s</sub> if x<sub>s</sub> is not
   5163 						empty, and x<sub>m</sub> otherwise.
   5164 					</li>
   5165 					<li>R<span style="margin-top: 0.5em; margin-bottom: 0.5em">eturn
   5166 							the language tag composed of <em>language<sub>r</sub> _
   5167 								script<sub>r</sub> _ region<sub>r</sub></em> + variants + extensions
   5168 					</span>.
   5169 					</li>
   5170 				</ol></li>
   5171 		</ol>
   5172 		<p>The lookup can be optimized. For example, if any of the tags in
   5173 			Step 2 are the same as previous ones in that list, they do not need
   5174 			to be tested.</p>
   5175 		<p>
			<i>Example 1:</i>
   5177 		</p>
   5178 		<ul>
   5179 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5180 				<p>Input is ZH-ZZZZ-SG.</p>
   5181 			</li>
   5182 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5183 				<p>Normalize to zh_SG.</p>
   5184 			</li>
   5185 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5186 				<p>Lookup in table. No match.</p>
   5187 			</li>
   5188 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5189 				<p>Lookup zh, and get the match (zh_Hans_CN). Substitute SG, and
   5190 					return zh_Hans_SG.</p>
   5191 			</li>
   5192 		</ul>
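		<p>The Add Likely Subtags steps can be sketched in Python as shown
			below. This is a simplified illustration, not a complete
			implementation: deprecated-subtag replacement and the grandfathered
			check are omitted, the table holds only the Chinese entries from this
			section plus und_TW, a failed lookup returns <code>None</code> as the
			error value, and the names <code>LIKELY</code>, <code>parse</code>
			and <code>add_likely_subtags</code> are hypothetical.</p>

```python
import re

# Sample table only: the Chinese entries from this section, plus und_TW.
LIKELY = {
    "zh": "zh_Hans_CN", "zh_HK": "zh_Hant_HK", "zh_Hani": "zh_Hani_CN",
    "zh_Hant": "zh_Hant_TW", "zh_MO": "zh_Hant_MO", "zh_TW": "zh_Hant_TW",
    "und_TW": "zh_Hant_TW",
}


def parse(tag):
    """Step 1 (partial): fix separator and casing, drop Zzzz and ZZ."""
    parts = re.split(r"[-_]", tag)
    lang = parts.pop(0).lower()
    script = region = ""
    if parts and len(parts[0]) == 4 and parts[0].isalpha():
        script = parts.pop(0).title()
    if parts and ((len(parts[0]) == 2 and parts[0].isalpha())
                  or (len(parts[0]) == 3 and parts[0].isdigit())):
        region = parts.pop(0).upper()
    if script == "Zzzz":
        script = ""
    if region == "ZZ":
        region = ""
    return lang, script, region, parts  # parts: variants and extensions


def add_likely_subtags(tag, table=LIKELY):
    lang, script, region, rest = parse(tag)
    # Step 2: candidate keys in lookup order, skipping repeated keys.
    combos = [(lang, script, region), (lang, region), (lang, script), (lang,)]
    if script:
        combos.append(("und", script))
    keys = []
    for combo in combos:
        key = "_".join(part for part in combo if part)
        if key not in keys:
            keys.append(key)
    match = next((table[key] for key in keys if key in table), None)
    if match is None:
        return None  # Step 3.1: error value
    # Step 3.3: keep non-empty source subtags, fill the rest from the match.
    m_lang, m_script, m_region, _ = parse(match)
    result = [lang if lang != "und" else m_lang,
              script or m_script, region or m_region] + rest
    return "_".join(part for part in result if part)


print(add_likely_subtags("ZH-ZZZZ-SG"))
```

		<p>For the input of Example 1, the lookup of zh_SG fails, zh matches
			zh_Hans_CN, and the source region SG is kept in the result.</p>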
   5193 		<p>To find the most likely language for a country, or language for
   5194 			a script, use &quot;und&quot; as the language subtag. For example,
   5195 			looking up &quot;und_TW&quot; returns zh_Hant_TW.</p>
		<p>A goal of the algorithm is that if X &rArr; Y, and X' results from
			replacing an empty subtag in X by the corresponding subtag in Y,
			then X' &rArr; Y. For example, if und_AF &rArr; fa_Arab_AF, then:</p>
   5199 		<ul>
			<li>fa_Arab_AF &rArr; fa_Arab_AF</li>
			<li>und_Arab_AF &rArr; fa_Arab_AF</li>
			<li>fa_AF &rArr; fa_Arab_AF</li>
   5203 		</ul>
   5204 		<p>There are a small number of exceptions to this goal in the
			current data, where X &isin; {und_Bopo, und_Brai, und_Cakm, und_Limb,
   5206 			und_Shaw}.</p>
   5207 		<p>
   5208 			<b><i>Remove</i></b><i><b> Likely Subtags: </b>Given a locale,
   5209 				remove any fields that Add Likely Subtags would add.</i>
   5210 		</p>
   5211 		<p>The reverse operation removes fields that would be added by the
   5212 			first operation.</p>
   5213 		<ol>
   5214 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">First get
   5215 				max = AddLikelySubtags(inputLocale). If an error is signaled, return
   5216 				it.</li>
   5217 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">Remove the
   5218 				variants from max.</li>
   5219 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">Then for <i>trial</i>
   5220 				in {language, language _ region, language _ script}
   5221 				<ul>
   5222 					<li style="margin-top: 0.5em; margin-bottom: 0.5em">If
   5223 						AddLikelySubtags(<i>trial</i>) = max, then return <i>trial</i> +
   5224 						variants.
   5225 					</li>
   5226 				</ul>
   5227 			</li>
   5228 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">If you do
   5229 				not get a match, return max + variants.</li>
   5230 		</ol>
   5231 		<p>Example:</p>
   5232 		<ul>
   5233 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5234 				<p>Input is zh_Hant. Maximize to get zh_Hant_TW.</p>
   5235 			</li>
   5236 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5237 				<p>zh =&gt; zh_Hans_CN. No match, so continue.</p>
   5238 			</li>
   5239 			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
   5240 				<p>zh_TW =&gt; zh_Hant_TW. Matches, so return zh_TW.</p>
   5241 			</li>
   5242 		</ul>
   5243 		<p>A variant of this favors the script over the region, thus using
   5244 			{language, language_script, language_region} in the above. If that
   5245 			variant is used, then the result in this example would be zh_Hant
   5246 			instead of zh_TW.		</p>
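		<p>The Remove Likely Subtags steps, including the script-favoring
			variant, can be sketched in Python as follows. The block carries its
			own compact maximization helper so that it can run on its own. It
			assumes canonical input using &quot;_&quot;, a three-entry sample
			table, and that everything after the region is treated as variants;
			the names <code>split_tag</code>, <code>join_tag</code>,
			<code>maximize</code> and <code>minimize</code> are hypothetical.</p>

```python
# Sample table only: three of the Chinese entries from this section.
LIKELY = {"zh": "zh_Hans_CN", "zh_Hant": "zh_Hant_TW", "zh_TW": "zh_Hant_TW"}


def split_tag(tag):
    """(language, script, region, rest) for a canonical tag like zh_Hant_TW."""
    parts = tag.split("_")
    lang, script, region = parts.pop(0), "", ""
    if parts and len(parts[0]) == 4 and parts[0].isalpha():
        script = parts.pop(0)
    if parts and ((len(parts[0]) == 2 and parts[0].isalpha())
                  or (len(parts[0]) == 3 and parts[0].isdigit())):
        region = parts.pop(0)
    return lang, script, region, parts


def join_tag(*parts):
    return "_".join(part for part in parts if part)


def maximize(tag):
    """Compact Add Likely Subtags; returns None when nothing matches."""
    lang, script, region, rest = split_tag(tag)
    keys = [join_tag(lang, script, region), join_tag(lang, region),
            join_tag(lang, script), lang]
    if script:
        keys.append(join_tag("und", script))
    for key in keys:
        if key in LIKELY:
            m_lang, m_script, m_region, _ = split_tag(LIKELY[key])
            return join_tag(lang if lang != "und" else m_lang,
                            script or m_script, region or m_region, *rest)
    return None


def minimize(tag, favor="region"):
    max_tag = maximize(tag)                      # step 1
    if max_tag is None:
        return None
    lang, script, region, variants = split_tag(max_tag)
    max_core = join_tag(lang, script, region)    # step 2: variants removed
    trials = [lang, join_tag(lang, region), join_tag(lang, script)]
    if favor == "script":
        trials = [lang, join_tag(lang, script), join_tag(lang, region)]
    for trial in trials:                         # step 3
        if maximize(trial) == max_core:
            return join_tag(trial, *variants)
    return join_tag(max_core, *variants)         # step 4


print(minimize("zh_Hant"))
print(minimize("zh_Hant", favor="script"))
```

		<p>With the sample table, the region-favoring order yields zh_TW for
			zh_Hant, and the script-favoring order yields zh_Hant, matching the
			example above.</p>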
   5247 		<h3>
   5248 			<a name="LanguageMatching" href="#LanguageMatching">4.4 Language
   5249 				Matching</a>
   5250 		</h3>
   5251 		<p class="dtd">
   5252 			&lt;!ELEMENT languageMatching ( languageMatches* ) &gt;<br>
   5253 			&lt;!ELEMENT languageMatches ( paradigmLocales*, matchVariable*, languageMatch* ) &gt;<br>
   5254 		&lt;!ATTLIST languageMatches type NMTOKEN #REQUIRED &gt;</p>
   5255 		<p class="dtd">&lt;!ELEMENT languageMatch EMPTY &gt;<br> &lt;!ATTLIST
   5256 		  languageMatch desired CDATA #REQUIRED &gt;<br> &lt;!ATTLIST
   5257 		  languageMatch supported CDATA #REQUIRED &gt;<br> &lt;!ATTLIST
   5258 		  languageMatch percent NMTOKEN #REQUIRED &gt;<br>
   5259           &lt;!ATTLIST languageMatch distance NMTOKEN #IMPLIED &gt;<br>
   5260            &lt;!ATTLIST languageMatch oneway ( true | false ) #IMPLIED &gt;</p>
   5263 		<p class="dtd">&lt;!ELEMENT paradigmLocales EMPTY &gt;<br>
   5264 		  &lt;!ATTLIST paradigmLocales locales NMTOKENS #REQUIRED &gt;
   5265 	    </p>
   5266 		<p>
   5267 			Implementers are often faced with the issue of how to match the
   5268 			user's requested languages with their product's supported languages.
   5269 			For example, suppose that a product supports {ja-JP, de, zh-TW}. If
   5270 			the user understands written American English, German, French, Swiss
   5271 			German, and Italian, then <strong>de</strong> would be the best
   5272 			match; if s/he understands only Chinese (zh), then zh-TW would be the
   5273 			best match.
   5274 		</p>
   5275 		<p>The standard truncation-fallback algorithm does not work well
   5276 			when faced with the complexities of natural language. The language
   5277 			matching data is designed to fill that gap. Stated in those terms,
   5278 			language matching can have the effect of a more complex fallback,
   5279 			such as:</p>
   5280 		<p>
   5281 			sr-Cyrl-RS<br> sr-Cyrl<br> sr-Latn-RS<br> sr-Latn<br>
   5282 			sr<br> hr-Latn<br> hr
   5283 		</p>
   5284 		<p>Language matching is used to find the best supported locale ID
   5285 			given a requested list of languages. The requested list could come
			from different sources, such as the user's list of preferred
			languages in the OS Settings, or from a browser Accept-Language list.
			For example, if my native tongue is English, I can understand Swiss
			German and German, my French is rusty but usable, and my Italian is basic,
   5290 			ideally an implementation would allow me to select {gsw, de, fr} as
   5291 			my preferred list of languages, skipping Italian because my
   5292 			comprehension is not good enough for arbitrary content.</p>
   5293 		<p>Language Matching can also be used to get fallback data elements. In
   5294 		  many cases, there may not be full data for a particular locale. For
   5295 		  example, for a Breton speaker, the best fallback if data is
   5296 		  unavailable might be French. That is, suppose we have found a Breton
   5297 		  bundle, but it does not contain translation for the key &quot;CN&quot;
   5298 		  (for the country China). It is best to return &quot;chine&quot;,
		  rather than falling back to the value in a default language such as
		  Russian and getting &quot;Китай&quot;. The language matching data can be
   5301 		  used to get the closest fallback locales (of those supported) to a
   5302 		  given language.
   5303 </p>
   5304 	  <p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see <strong><em>Section 4.2.6 <a 
   5305 				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a></em></strong>.</p>		<p>
			When such fallback is used for inherited item lookup, the normal
			order of inheritance applies, except that
			before using any data from <strong>root</strong>, the data for the
			fallback locales is used if available. Language matching does
			not interact with the fallback of resources <em>within the
   5311 				locale-parent chain</em>. For example, suppose that we are looking for
   5312 			the value for a particular path <strong>P</strong> in <strong>nb-NO</strong>.
   5313 			In the absence of aliases, normally the following lookup is used.
   5314 		</p>
   5315 		<blockquote>
   5316 			<p>
				<strong>nb-NO</strong> &rarr; <strong>nb</strong> &rarr; <strong>root</strong>
   5318 			</p>
   5319 		</blockquote>
   5320 		<p>
   5321 			That is, we first look in <strong>nb-NO</strong>. If there is no
   5322 			value for <strong>P</strong> there, then we look in <strong>nb</strong>.
   5323 			If there is no value for <strong>P</strong> there, we return the
   5324 			value for <strong>P</strong> in root (or a code value, if there is
   5325 			nothing there). Remember that if there is an alias element along this
   5326 			path, then the lookup may restart with a different path in <strong>nb-NO</strong>
   5327 			(or another locale).
   5328 		</p>
   5329 		<p>
   5330 			However, suppose that <strong>nb-NO</strong> has the fallback values
   5331 			<strong>[nn da sv en]</strong>, derived from language matching. In
			that case, an implementation <em>may</em> progressively look up each
   5333 			of the listed locales, with the appropriate substitutions, returning
   5334 			the first value that is not found in <strong>root</strong>. This
   5335 			follows roughly the following pseudocode:
   5336 		</p>
   5337 		<ul>
   5338 			<li>value = lookup(P, nb-NO); if (locationFound != root) return
   5339 				value;</li>
   5340 			<li>value = lookup(P, nn-NO); if (locationFound != root) return
   5341 				value;</li>
   5342 			<li>value = lookup(P, da-NO); if (locationFound != root) return
   5343 				value;</li>
   5344 			<li>value = lookup(P, sv-NO); if (locationFound != root) return
   5345 				value;</li>
   5346 			<li>value = lookup(P, en-NO); return value;</li>
   5347 		</ul>
   5348 		<p>
   5349 			The locales in the fallback list are not used recursively. For
   5350 			example, for the lookup of a path in nb-NO, if <strong>fr</strong>
   5351 			were a fallback value for <strong>da</strong>, it would not matter
   5352 			for the above process. Only the original language matters.
   5353 		</p>
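		<p>A minimal Python sketch of this lookup follows. The BUNDLES data, the
			function names, and the way the location found is returned alongside the
			value are all assumptions made for illustration; aliases are ignored.</p>

```python
# Hypothetical in-memory resource bundles: locale -> {path: value}.
BUNDLES = {
    "root": {"P": "code-value"},
    "nb-NO": {},
    "nb": {},
    "nn": {},
    "da": {"P": "Danish value"},
    "sv": {"P": "Swedish value"},
    "en": {"P": "English value"},
}


def parent_chain(locale):
    """Truncation chain ending in root, e.g. nb-NO, nb, root."""
    parts = locale.split("-")
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    return chain + ["root"]


def lookup(path, locale):
    """Normal inheritance: return (value, location_found)."""
    for location in parent_chain(locale):
        value = BUNDLES.get(location, {}).get(path)
        if value is not None:
            return value, location
    return None, "root"


def lookup_with_fallback(path, locale, fallback_languages):
    """Try the locale, then each fallback language with the same remaining subtags."""
    _, _, rest = locale.partition("-")  # "NO" for nb-NO
    candidates = [locale]
    for language in fallback_languages:
        candidates.append(f"{language}-{rest}" if rest else language)
    value = None
    for candidate in candidates:
        value, found_in = lookup(path, candidate)
        if found_in != "root":
            return value
    # Nothing better than root was found: return the last (root) result.
    return value


print(lookup_with_fallback("P", "nb-NO", ["nn", "da", "sv", "en"]))  # Danish value
```

		<p>Note that the fallback list is applied only to the original locale; the
			sketch never consults a fallback list for nn, da, sv, or en.</p>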
   5354 		<p>The language matching data is intended to be used according to
   5355 			the following algorithm. This is a logical description, and can be
   5356 			optimized for production in many ways. In this algorithm, the
   5357 			languageMatching data is interpreted as an ordered list.</p>
		<p>The language matching algorithm takes a list of a user's
			desired languages, and a list of the application's supported
			languages.</p>
   5361 		<ul>
			<li>Set the best weighted distance BWD to &infin;</li>
   5363 			<li>Set the best desired language BD to null</li>
   5364 			<li>For each desired language D
   5365 				<ul>
   5366 					<li>Compute a discount factor F, based on the position in the
   5367 						list.
   5368 						<ul>
   5369 							<li>This discount factor is up to the implementation, but is
   5370 								typically a positive value that increases according to how far D
   5371 								is from the start of the desired language list.</li>
   5372 						</ul>
   5373 					</li>
   5374 					<li>For each supported language S
   5375 						<ul>
   5376 							<li>Find the matching distance MD as described below.</li>
							<li>Compute the weighted distance WD as F + MD</li>
							<li>If WD &lt; BWD
   5379 								<ul>
   5380 									<li>BWD = WD</li>
   5381 									<li>BD = D</li>
   5382 								</ul>
   5383 							</li>
   5384 						</ul>
   5385 					</li>
   5386 				</ul>
   5387 			</li>
   5388 			<li>If the BWD is less than a threshold, return BD.
   5389 				<ul>
   5390 					<li>The threshold is implementation-defined, typically set to
   5391 						greater than a default region difference, and less than a default
   5392 						script difference.</li>
   5393 				</ul>
   5394 			</li>
   5395 			<li>Otherwise return a default supported language (like
   5396 				English).</li>
   5397 		</ul>
   5398 		<p>To find the matching distance MD between any two languages,
   5399 			perform the following steps.</p>
   5400 		<ol>
   5401 			<li>Maximize each language using Section 4.3 <a
   5402 				href="#Likely_Subtags">Likely Subtags</a>.
   5403 				<ul>
   5404 					<li>und is a special case: see below.</li>
   5405 				</ul>
   5406 			</li>
   5407 			<li>Set the match-distance MD to 0</li>
   5408 			<li>For each subtag in the list, starting from the end: region,
   5409 				script, base-language
   5410 				<ol>
   5411 					<li>If respective subtags in each language tag are identical,
   5412 						remove the subtag from each (logically) and continue.</li>
   5413 					<li>Traverse the languageMatching data until a match is found.
   5414 						<ul>
   5415 							<li>* matches any field.</li>
   5416 							<li>If the oneway flag is false, then the match is
   5417 								symmetric.</li>
   5418 						</ul>
   5419 					</li>
   5420 					<li>Add 100 minus the <strong>percent</strong> attribute value
   5421 						to MD.
   5422 					</li>
   5423 					<li>Remove the subtag from each (logically)</li>
   5424 				</ol>
   5425 			</li>
   5426 			<li>Return MD</li>
   5427 		</ol>
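		<p>The distance computation can be sketched in Python as below. The RULES
			list is a hypothetical, heavily reduced rule set using the default
			values shown in this section plus an nn/nb rule taken from the nn-DE
			versus nb-FR comparison; LIKELY is a toy stand-in for the likely-subtags
			data. The special handling of und is left out.</p>

```python
# (desired pattern, supported pattern, percent, oneway), in priority order.
RULES = [
    ("nn", "nb", 96, False),
    ("*", "*", 1, False),
    ("*-*", "*-*", 20, False),
    ("*-*-*", "*-*-*", 96, False),
]

# Toy likely-subtags data: language -> (language, script, region).
LIKELY = {"nn": ("nn", "Latn", "NO"), "nb": ("nb", "Latn", "NO")}


def maximize(tag):
    """Return [language, script, region], filling gaps from the toy table."""
    parts = tag.split("-")
    language = parts[0]
    script = next((p for p in parts[1:] if len(p) == 4), None)
    region = next((p for p in parts[1:] if len(p) != 4), None)
    _, likely_script, likely_region = LIKELY[language]
    return [language, script or likely_script, region or likely_region]


def field_match(pattern, fields):
    """True if the pattern has as many fields as the tag and each field matches."""
    wanted = pattern.split("-")
    return len(wanted) == len(fields) and all(
        w == "*" or w == f for w, f in zip(wanted, fields))


def rule_percent(desired, supported):
    """Percent of the first rule that matches, honoring the oneway flag."""
    for d_pattern, s_pattern, percent, oneway in RULES:
        if field_match(d_pattern, desired) and field_match(s_pattern, supported):
            return percent
        if (not oneway and field_match(d_pattern, supported)
                and field_match(s_pattern, desired)):
            return percent
    raise LookupError("no matching rule")


def matching_distance(desired_tag, supported_tag):
    """Sum (100 - percent) over differing subtags: region, script, language."""
    desired, supported = maximize(desired_tag), maximize(supported_tag)
    md = 0
    while desired:
        if desired[-1] != supported[-1]:
            md += 100 - rule_percent(desired, supported)
        desired, supported = desired[:-1], supported[:-1]
    return md


print(matching_distance("nn-DE", "nb-FR"))  # 8, i.e. a 92% match
```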
   5428 		<p>
   5429 			It is typically useful to set the discount factor between successive
   5430 			elements of the desired languages list to be slightly greater than
   5431 			the default region difference. That avoids the following problem:<br>
   5432 		</p>
   5433 		<p>
   5434 			<em>Supported languages:</em> "de, fr, ja"<br>
   5435 		</p>
   5436 		<p>
   5437 			<em>User's desired languages:</em> "de-AT, fr"
   5438 		</p>
		<p>This user would expect to get "de", not "fr". In practice, when
			users select a list of preferred languages, they don't include all
			the regional variants ahead of their second base language. Yet while
			the user's list of desired languages really doesn't tell us the priority
			ranking among their languages, normally the fall-off between the
			user's languages is substantially greater than that between regional
			variants. But unless F is greater than the distance between de-AT and
			de-DE, the user's second-choice language would be returned.</p>
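		<p>The effect of the discount factor can be illustrated with a short
			Python sketch of the selection loop. Everything named here is an
			assumption made for illustration: toy_distance is a crude stand-in for
			the matching distance MD, the step of 5 and threshold of 40 are example
			values, and the function returns the supported language together with
			the best desired language.</p>

```python
import math


def toy_distance(desired, supported):
    """Stand-in for MD: 0 if equal, 4 if only the region differs, else 99."""
    if desired == supported:
        return 0
    if desired.split("-")[0] == supported.split("-")[0]:
        return 4
    return 99


def best_match(desired_list, supported_list, distance=toy_distance,
               discount_step=5, threshold=40, default="en"):
    """Return (best desired, best supported), or (None, default) if nothing is close."""
    bwd, bd, bs = math.inf, None, None
    for position, desired in enumerate(desired_list):
        f = position * discount_step  # grows with distance from the list start
        for supported in supported_list:
            wd = f + distance(desired, supported)
            if wd < bwd:
                bwd, bd, bs = wd, desired, supported
    if bwd < threshold:
        return bd, bs
    return None, default


supported = ["de", "fr", "ja"]
print(best_match(["de-AT", "fr"], supported))                   # ('de-AT', 'de')
print(best_match(["de-AT", "fr"], supported, discount_step=0))  # ('fr', 'fr')
```

		<p>With no discount, the exact match on the second-choice language wins;
			with a step slightly larger than the regional distance, the first
			choice is kept.</p>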
   5447 		<p>The base language subtag &quot;und&quot; is a special case.
   5448 			Suppose we have the following situation:</p>
   5449 		<ul>
   5450 			<li>desired languages: {und, it}</li>
   5451 			<li>supported languages: {en, it}</li>
   5452 			<li>resulting language: en<br>
   5453 			</li>
   5454 		</ul>
		<p>This unexpected result arises because 'und' has a special function in
			BCP 47: it stands in for 'no supplied base language'. If likely subtags
			were applied, und would be maximized to en-Latn-US and would therefore
			match en exactly. To prevent this from happening, if the desired base
			language is und, the language matcher should not apply likely subtags
			to it.</p>
		<p>Example:</p>
		<p>Suppose that nn-DE and nb-FR are being compared.
			They are first maximized to nn-Latn-DE and nb-Latn-FR, respectively.
			The list is searched. The first match for the regions is with
			&quot;*-*-*&quot;, for a match of 96%. The languages are truncated to
			nn-Latn and nb-Latn; the scripts are identical, so they are truncated
			again to nn and nb. The first match for the base languages is also for
			a value of 96%, so the distance is 4 + 4 = 8 and the result is 92%.</p>
		<p>Note that language matching is orthogonal to how closely
   5467 			two languages are related linguistically. For example, Breton is more
   5468 			closely related to Welsh than to French, but French is the better
   5469 			match (because it is more likely that a Breton reader will understand
   5470 			French than Welsh). This also illustrates that the matches are often
   5471 			asymmetric: it is not likely that a French reader will understand
   5472 			Breton.</p>
   5473 		<p>The &quot;*&quot; acts as a wild card, as shown in the
   5474 			following example:</p>
   5475 		<p class="example">
   5476 			&lt;languageMatch desired=&quot;es-*-ES&quot;
   5477 			supported=&quot;es-*-ES&quot; percent=&quot;100&quot;/&gt;<br>
   5478 			&lt;!-- Latin American Spanishes are closer to each other.
   5479 			Approximate by having es-ES be further from everything else.--&gt;
   5480 		</p>
   5481 		<p>&nbsp;</p>
   5482 		<p class="example">&lt;languageMatch desired=&quot;es-*-ES&quot;
   5483 			supported=&quot;es-*-*&quot; percent=&quot;93&quot;/&gt;</p>
   5484 		<p class="example">
   5485 			<br> &lt;languageMatch desired=&quot;*&quot;
   5486 			supported=&quot;*&quot; percent=&quot;1&quot;/&gt;<br> &lt;!--
   5487 			[Default value - must be at end!] Normally there is no comprehension
   5488 			of different languages.--&gt;
   5489 		</p>
   5490 		<p class="example">
   5491 			<br> &lt;languageMatch desired=&quot;*-*&quot;
   5492 			supported=&quot;*-*&quot; percent=&quot;20&quot;/&gt;<br>
   5493 			&lt;!-- [Default value - must be at end!] Normally there is little
   5494 			comprehension of different scripts.--&gt;
   5495 		</p>
   5496 		<p class="example">
   5497 			<br> &lt;languageMatch desired=&quot;*-*-*&quot;
   5498 			supported=&quot;*-*-*&quot; percent=&quot;96&quot;/&gt;<br>
   5499 			&lt;!-- [Default value - must be at end!] Normally there are small
   5500 			differences across regions.--&gt;
   5501 		</p>
   5502 		<p>When the language+region is not matched, and there is otherwise
   5503 			no reason to pick among the supported regions for that language, then
   5504 			some measure of geographic &quot;closeness&quot; can be used. The
   5505 			results may be more understandable by users. Looking for en-SK, for
			example, should fall back to something within Europe (e.g. en-GB) in
			preference to something far away and unrelated (e.g. en-SG). Such a
   5508 			closeness metric does not need to be exact; a small amount of data
   5509 			can be used to give an approximate distance between any two regions.
   5510 			However, any such data must be used carefully; although Hong Kong is
   5511 			closer to India than to the UK, it is unlikely that en-IN would be a
   5512 			better match to en-HK than en-GB would.</p>
   5513 
   5514 		<h4><a name="EnhancedLanguageMatching" href="#EnhancedLanguageMatching">4.4.1 Enhanced Language Matching</a></h4>
   5515 		<p>The enhanced format for language matching adds  structure to enable better matching of languages. It is distinguished by having a suffix &quot;_new&quot; on the type, as in the example below. The extended structure allows matching to  take into account broad similarities that would give better results. For example, for English the regions that are or inherit from US (AS|GU|MH|MP|PR|UM|VI|US) form a &ldquo;cluster&rdquo;. Each region in that cluster should be closer to each other than to any other region. And a region outside the cluster should be closer to another region outside that cluster than to one inside. We get this issue with the &ldquo;world languages&rdquo; like English, Spanish, Portuguese, Arabic, etc.</p>
   5516 		<p><em>Example:</em></p>
		<pre> &lt;languageMatches type=&quot;written_new&quot;&gt;
	&lt;paradigmLocales locales=&quot;en en-GB es es-419 pt-BR pt-PT&quot;/&gt;
	&lt;matchVariable id=&quot;$enUS&quot; value=&quot;AS+GU+MH+MP+PR+UM+US+VI&quot;/&gt;
	&lt;matchVariable id=&quot;$cnsar&quot; value=&quot;HK+MO&quot;/&gt;
	&lt;matchVariable id=&quot;$americas&quot; value=&quot;019&quot;/&gt;
	&lt;matchVariable id=&quot;$maghreb&quot; value=&quot;MA+DZ+TN+LY+MR+EH&quot;/&gt;
	&lt;languageMatch desired=&quot;no&quot; supported=&quot;nb&quot; distance=&quot;1&quot;/&gt; &lt;!-- no &rArr; nb --&gt;
	&lt;languageMatch desired=&quot;ar_*_$maghreb&quot; supported=&quot;ar_*_$maghreb&quot; distance=&quot;4&quot;/&gt;
		&lt;!-- ar; *; $maghreb &rArr; ar; *; $maghreb --&gt;
	&lt;languageMatch desired=&quot;ar_*_$!maghreb&quot; supported=&quot;ar_*_$!maghreb&quot; distance=&quot;4&quot;/&gt;
		&lt;!-- ar; *; $!maghreb &rArr; ar; *; $!maghreb --&gt;
</pre>
<p>The <strong>matchVariable</strong> allows a rule to match multiple regions, as illustrated by <strong>$maghreb</strong>. The syntax is simple: it allows + for <em>union</em> and - for <em>set difference</em>, but no precedence. So A+B-A+D is interpreted as (((A+B)-A)+D), not as (A+B)-(A+D). The variable <strong>id</strong> has a value of the form [$][a-zA-Z0-9]+. If $X is defined, then $!X automatically means all those regions that are not in $X.</p>
<p dir="ltr">When the set is interpreted, macroregions are (logically) transformed into a list of their contents, so &ldquo;053+GB&rdquo; &rarr; &ldquo;AU+GB+NF+NZ&rdquo;. This is done recursively, so 009 &rarr; &ldquo;053+054+057+061+QO&rdquo; &rarr; &ldquo;AU+NF+NZ+FJ+NC+PG+SB+VU...&rdquo;. Note that we use 019 for all of the Americas in the variables above, because en-US should be in the same cluster as es-419 and its contents.</p>
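<p>The left-to-right evaluation and the expansion of macroregions can be sketched in Python as follows. The CONTAINS table is a hypothetical, partial stand-in for the territory containment data, the function names are invented for this sketch, and the $!X complement is not covered.</p>

```python
import re

# Partial containment data for illustration only: macroregion -> contents.
CONTAINS = {"053": ["AU", "NF", "NZ"]}


def expand(code):
    """Recursively replace a macroregion by the regions it contains."""
    if code not in CONTAINS:
        return {code}
    regions = set()
    for child in CONTAINS[code]:
        regions |= expand(child)
    return regions


def eval_match_variable(value):
    """Evaluate 'A+B-C' strictly left to right: + is union, - is set difference."""
    tokens = re.findall(r"[+-]|[^+-]+", value)
    result = expand(tokens[0])
    for operator, operand in zip(tokens[1::2], tokens[2::2]):
        if operator == "+":
            result |= expand(operand)
        else:
            result -= expand(operand)
    return result


print(sorted(eval_match_variable("A+B-A+D")))  # ['B', 'D']
print(sorted(eval_match_variable("053+GB")))   # ['AU', 'GB', 'NF', 'NZ']
```

<p>Because there is no precedence, A+B-A+D yields {B, D}; reading it as (A+B)-(A+D) would instead give {B}.</p>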
   5524 <p>In the rules, the percent value (100..0) is replaced by a <strong>distance</strong> value, which is the inverse (0..100).</p>
   5525 <p dir="ltr">These new variables and rules divide up the world into clusters, where items in the same clusters (for specific languages) get the normal regional difference, and items in different clusters get different weights.</p>
   5526 <br>
   5527 <p dir="ltr">Each cluster can have one or more associated <strong>paradigmLocales</strong>. These are locales that are preferred within a cluster. So when matching desired=[en-SA] against [en-GU en en-IN en-GB], the value en-GB is returned. Both of {en-GU en} are in a different cluster. While {en-IN en-GB} are in the same cluster, and the same distance from en-SA, the preference is given to en-GB because it is in the paradigm locales. It would be possible to express this in rules, but using this mechanism handles these very common cases without bulking up the tables.<br>
   5528 </p>
   5529 <p dir="ltr">The <strong>paradigmLocales</strong>  also allow matching to macroregions. For example, desired=[es-419] should match to {es-MX} more closely than to {es}, and vice versa: {es-MX} should match more closely to {es-419} than to {es}. But es-MX should match more closely to es-419 than to any of the other es-419 sublocales. In general, in the absence of other distance data, there is a &lsquo;paradigm&rsquo; in each cluster that the others should match more closely to: en(-US), en-GB, es(-ES), es-419, ru(-RU)... </p>
   5530 
   5531 		<h2>
   5532 			<a name="XML_Format" href="#XML_Format">5 XML Format</a>
   5533 		</h2>
   5534 		<p>There are two kinds of data that can be expressed in LDML:
   5535 			language-dependent data and supplementary data. In either case, data
   5536 			can be split across multiple files, which can be in multiple
   5537 			directory trees.</p>
   5538 		<p>For example, the language-dependent data for Japanese in CLDR
   5539 			is present in the following files:</p>
   5540 		<ul>
   5541 			<li>common/collation/ja.xml</li>
   5542 			<li>common/main/ja.xml</li>
   5543 			<li>common/rbnf/ja.xml</li>
   5544 			<li>common/segmentations/ja.xml</li>
   5545 		</ul>
   5546 		<p>Data for cased languages such as French are in files like:</p>
   5547 		<ul>
   5548 			<li>common/casing/fr.xml</li>
   5549 		</ul>
   5550 		<p>The status of the data is the same, whether or not data is
   5551 			split. That is, for the purpose of validation and lookup, all of the
			data for the above ja.xml files is treated as if it were in a single
   5553 			file. These files have the &lt;ldml&gt; root element and use
   5554 			ldml.dtd. The file name must match the identity element. For example,
   5555 			the &lt;ldml&gt; file pa_Arab_PK.xml must contain the following
   5556 			elements:</p>
   5557 		<pre>
   5558 			<strong>&lt;ldml&gt;</strong><br> 	&lt;identity&gt;<br> 		<br> 		<strong>&lt;language type=&quot;pa&quot;/&gt;<br> 		&lt;script type=&quot;Arab&quot;/&gt;<br> 		&lt;territory type=&quot;PK&quot;/&gt;</strong><br> 	&lt;/identity&gt;
   5559 </pre>
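		<p>A check of this constraint might look like the following Python sketch.
			The function names are hypothetical, the sample document is reduced to
			the identity element, and the assumption that the locale ID is the
			identity subtags joined by underscores is inferred from the
			pa_Arab_PK.xml example.</p>

```python
import xml.etree.ElementTree as ET

SAMPLE = """<ldml>
  <identity>
    <language type="pa"/>
    <script type="Arab"/>
    <territory type="PK"/>
  </identity>
</ldml>"""


def locale_from_identity(xml_text):
    """Build a locale ID such as pa_Arab_PK from the identity element."""
    identity = ET.fromstring(xml_text).find("identity")
    parts = []
    for name in ("language", "script", "territory", "variant"):
        element = identity.find(name)
        if element is not None:
            parts.append(element.get("type"))
    return "_".join(parts)


def file_name_matches(file_name, xml_text):
    """True if the file name is the identity's locale ID plus '.xml'."""
    return file_name == locale_from_identity(xml_text) + ".xml"


print(file_name_matches("pa_Arab_PK.xml", SAMPLE))  # True
print(file_name_matches("pa_PK.xml", SAMPLE))       # False
```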
   5560 		<p>Supplemental data can have different root elements, currently:
   5561 			ldmlBCP47, supplementalData, keyboard, and platform. Keyboard and
   5562 			platform files are considered distinct. The ldmlBCP47 files and
   5563 			supplementalData files that have the same root are all logically part
   5564 			of the same file; they are simply split into separate files for
   5565 			convenience. Implementations may split the files in different ways,
   5566 			also for their convenience. The files in /properties are also
   5567 			supplemental data files, but are structured like UCD properties.</p>
   5568 
   5569 		<p>For example, supplemental data relating to Japan or the
   5570 			Japanese writing are in:</p>
   5571 		<ul>
   5572 			<li>common/supplemental/ (in many files, such as
   5573 				supplementalData.xml)</li>
   5574 			<li>common/transforms/Hiragana-Katakana.xml</li>
   5575 			<li>common/transforms/Hiragana-Latin.xml</li>
   5576 			<li>common/properties/scriptMetadata.txt</li>
   5577 			<li>common/bcp47/calendar.xml</li>
   5578 			<li>uca/allkeys_CLDR.txt (sorting)</li>
   5579 			<li>/keyboards/chromeos/ja-t-k0-chromeos.xml</li>
   5580 			<li>...</li>
   5581 		</ul>
   5582 		<p>Like the &lt;ldml&gt; files, the keyboard file names must match
   5583 			internal data: in particular, the locale attribute on the keyboard
   5584 			element must have a value that corresponds to the file name, such as
   5585 			&lt;keyboard locale=&quot;af-t-k0-android&quot;&gt; for the file
   5586 			af-t-k0-android.xml.</p>
   5587 		<p>
   5588 			The following sections describe the structure of the XML format for
   5589 			language-dependent data. The more precise syntax is in the ldml.dtd
   5590 			file<i>; however, the DTD does not describe all the constraints
   5591 				on the structure.</i>
   5592 		</p>
   5593 		<p>To start with, the root element is &lt;ldml&gt;, with the
   5594 			following DTD entry:</p>
   5595 		<p class='dtd'>
   5596 			&lt;!ELEMENT ldml
   5597 			(identity,(alias|(fallback*,localeDisplayNames?,layout?,contextTransforms?,characters?,<br>
   5598 			delimiters?,measurement?,dates?,numbers?,units?,listPatterns?,collations?,posix?,<br>
   5599 			segmentations?,rbnf?,annotations?,metadata?,references?,special*)))&gt;
   5600 		</p>
   5601 
   5602 		<p>The XML structure is stable over releases. Elements and
   5603 			attributes may be deprecated: they are retained in the DTD but their
   5604 			usage is strongly discouraged. In most cases, an alternate structure
   5605 			is provided for expressing the information. There is only one
   5606 			exception: newer DTDs cannot be used with version 1.1 files, without
   5607 			some modification.</p>
   5608 		<p>In general, all translatable text in this format is in element
   5609 			contents, while attributes are reserved for types and non-translated
   5610 			information (such as numbers or dates). The reason that attributes
   5611 			are not used for translatable text is that spaces are not preserved,
   5612 			and we cannot predict where spaces may be significant in translated
   5613 			material.</p>
   5614 		<p>
   5615 			There are two kinds of elements in LDML: <i>rule</i> elements and <i>structure</i>
   5616 			elements. For structure elements, there are restrictions to allow for
   5617 			effective inheritance and processing:
   5618 		</p>
   5619 		<ol>
   5620 			<li>There is no &quot;mixed&quot; content: if an element has
   5621 				textual content, then it cannot contain any elements.</li>
   5622 			<li>The [<a href="#XPath">XPath</a>] leading to the content is
   5623 				unique; no two different pieces of textual content have the same [<a
   5624 				href="#XPath">XPath</a>].
   5625 			</li>
   5626 		</ol>
   5627 		<p>
   5628 			Rule elements do not have this restriction, but also do not inherit,
   5629 			except as an entire block. The rule elements are listed in
   5630 			serialElements in the supplemental metadata. See also <i><a
   5631 				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
   5632 					Validity</a></i>. For more technical details, see <a
   5633 				href="http://cldr.unicode.org/development/updating-dtds">Updating-DTDs</a>.
   5634 		</p>
   5635 		<p>
   5636 			Note that the data in examples given below is purely illustrative,
   5637 			and does not match any particular language. For a more detailed
   5638 			example of this format, see [<a href="#LDML">Example</a>]. There is
   5639 			also a DTD for this format, but <i>remember that the DTD alone is
   5640 				not sufficient to understand the semantics, the constraints,
   5641 				nor&nbsp; the interrelationships between the different elements and
   5642 				attributes</i>. You may wish to have copies of each of these to hand as
   5643 			you proceed through the rest of this document.
   5644 		</p>
   5645 		<p>In particular, all elements allow for draft versions to coexist
   5646 			in the file at the same time. Thus most elements are marked in the
   5647 			DTD as allowing multiple instances. However, unless an element is
   5648 			listed as a serialElement, or has a distinguishing attribute, it can
   5649 			only occur once as a subelement of a given element. Thus, for
   5650 			example, the following is illegal even though allowed by the DTD:</p>
   5651 		<p>
   5652 			&lt;languages&gt;<br> &nbsp; &lt;language
   5653 			type=&quot;aa&quot;&gt;...&lt;/language&gt;<br> &nbsp;
   5654 			&lt;language type=&quot;aa&quot;&gt;..&lt;/language&gt;
   5655 		</p>
		<p>There must be only one instance of these per parent, unless
			there are other distinguishing attributes (such as an alt attribute).</p>
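		<p>Since the DTD cannot express this constraint, a tool has to check it
			separately. The Python sketch below shows one way to do so; the
			function name is hypothetical, and treating only type and alt as
			distinguishing attributes is a simplifying assumption.</p>

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: an illustrative subset of the distinguishing attributes.
DISTINGUISHING = ("type", "alt")


def duplicate_children(parent):
    """Return (tag, type, alt) keys that occur more than once under parent."""
    keys = Counter(
        (child.tag,) + tuple(child.get(name) for name in DISTINGUISHING)
        for child in parent)
    return [key for key, count in keys.items() if count > 1]


illegal = ET.fromstring(
    '<languages><language type="aa">x</language>'
    '<language type="aa">y</language></languages>')
legal = ET.fromstring(
    '<languages><language type="aa">x</language>'
    '<language type="aa" alt="short">y</language></languages>')

print(duplicate_children(illegal))  # [('language', 'aa', None)]
print(duplicate_children(legal))    # []
```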
   5658 		<p>In general, LDML data should be in NFC format. However, certain
   5659 			elements may need to contain characters that are not in NFC,
   5660 			including exemplars, transforms, segmentations, and
   5661 			p/s/t/i/pc/sc/tc/ic rules in collation. These elements must not be
   5662 			normalized (either to NFC or NFD), or their meaning may be changed.
   5663 			Thus LDML documents must not be normalized as a whole. To prevent
   5664 			problems with normalization, no element value can start with a
   5665 			combining slash (U+0338 COMBINING LONG SOLIDUS OVERLAY).</p>
   5666 		<p>
			Lists, such as <span class="attribute">singleCountries</span>, are
			space-delimited. That means that their items are separated by one or
			more XML whitespace characters. Examples include:
   5670 		</p>
   5671 		<ul>
   5672 			<li>singleCountries</li>
   5673 			<li>preferenceOrdering</li>
   5674 			<li>references</li>
   5675 		</ul>
   5676 		<h3>
   5677 			<a name="Common_Elements" href="#Common_Elements">5.1 Common
   5678 				Elements</a>
   5679 		</h3>
   5680 		<p>At any level in any element, two special elements are allowed.</p>
   5681 		<h4>
   5682 			<a name="special" href="#special">5.1.1 Element special</a>
   5683 		</h4>
   5684 		<p>
   5685 			This element is designed to allow for arbitrary additional annotation
   5686 			and data that is product-specific. It has one required attribute <span
   5687 				class="attribute">xmlns</span>, which specifies the XML <a
   5688 				href="http://www.w3.org/TR/REC-xml-names/">namespace</a> of the
			special data. For example, the following uses the version 1.0 POSIX
			special element.
   5691 		</p>
   5692 		<pre>&lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldml.dtd</span>&quot; [
   5693     &lt;!ENTITY % posix SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldmlPOSIX.dtd</span>&quot;&gt;
   5694 <span style="color: blue">%posix;</span>
   5695 ]&gt;
   5696 &lt;ldml&gt;
   5697 ...
   5698 &lt;special xmlns:posix=&quot;<span style="color: blue">http://www.opengroup.org/regproducts/xu.htm</span>&quot;&gt;
   5699         <span style="color: green">&lt;!-- old abbreviations for pre-GUI days --&gt;</span>
   5700         &lt;posix:messages&gt;
   5701             &lt;posix:yesstr&gt;<span style="color: blue">Yes</span>&lt;/posix:yesstr&gt;
   5702             &lt;posix:nostr&gt;<span style="color: blue">No</span>&lt;/posix:nostr&gt;
   5703             &lt;posix:yesexpr&gt;<span style="color: blue">^[Yy].*</span>&lt;/posix:yesexpr&gt;
   5704             &lt;posix:noexpr&gt;<span style="color: blue">^[Nn].*</span>&lt;/posix:noexpr&gt;
   5705         &lt;/posix:messages&gt;
   5706     &lt;/special&gt;
   5707 &lt;/ldml&gt;
   5708 </pre>
   5709 		<h5>
   5710 			<a name="Sample_Special_Elements" href="#Sample_Special_Elements">5.1.1.1
   5711 				Sample Special Elements</a>
   5712 		</h5>
   5713 		<p>
   5714 			The elements in this section are <i><b>not</b></i> part of the Locale
   5715 			Data Markup Language 1.0 specification. Instead, they are special
   5716 			elements used for application-specific data to be stored in the
			Common Locale Repository. They may change or be removed in future
			versions of this document, and are present here more as examples of
   5719 			how to extend the format. (Some of these items may move into a future
   5720 			version of the Locale Data Markup Language specification.)
   5721 		</p>
   5722 		<ul>
   5723 			<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</a></li>
   5724 			<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd">http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</a></li>
   5725 		</ul>
   5726 		<p>The above examples are old versions: consult the documentation
   5727 			for the specific application to see which should be used.</p>
   5728 		<p>These DTDs use namespaces and the special element. To include
   5729 			one or more, use the following pattern to import the special DTDs
   5730 			that are used in the file:</p>
   5731 		<pre>&lt;?xml version=&quot;<span style="color: blue">1.0</span>&quot; encoding=&quot;<span
   5732 				style="color: blue">UTF-8</span>&quot; ?&gt;
   5733 &lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>&quot; [
   5734     &lt;!ENTITY % <span style="color: blue">icu</span> SYSTEM &quot;<span
   5735 				style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>&quot;&gt;
   5736     &lt;!ENTITY % <span style="color: blue">openOffice</span> SYSTEM &quot;<span
   5737 				style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</span>&quot;&gt;
   5738 <span style="color: blue">%icu;
   5739 %openOffice;
   5740 </span>]&gt;</pre>
   5741 		<p>Thus to include just the ICU DTD, one uses:</p>
   5742 		<pre>&lt;?xml version=&quot;<span style="color: blue">1.0</span>&quot; encoding=&quot;<span
   5743 				style="color: blue">UTF-8</span>&quot; ?&gt;
   5744 &lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>&quot; [
   5745     &lt;!ENTITY % icu SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>&quot;&gt;
   5746 <span style="color: blue">%icu;
   5747 </span>]&gt;</pre>
   5748 		<blockquote>
   5749 			<p>
   5750 				<b>Note: </b>A previous version of this document contained a special
   5751 				element for <a
   5752 					href="http://www.open-std.org/jtc1/sc22/wg20/docs/n897-14652w25.pdf">ISO
   5753 					TR 14652</a> compatibility data. That element has been withdrawn,
				pending further investigation, since 14652 is a Type 1
   5755 				TR: &quot;when the required support cannot be obtained for the
   5756 				publication of an International Standard, despite repeated
   5757 				effort&quot;. See the ballot comments on <a
   5758 					href="http://www.open-std.org/jtc1/sc22/wg20/docs/n948-J1N6769-14652.pdf">14652
   5759 					Comments</a> for details on the 14652 defects. For example, most of
   5760 				these patterns make little provision for substantial changes in
   5761 				format when elements are empty, so are not particularly useful in
   5762 				practice. Compare, for example, the mail-merge capabilities of
   5763 				production software such as Microsoft Word or OpenOffice.
   5764 			</p>
   5765 			<p>
   5766 				<b>Note: </b>While the CLDR specification guarantees backwards
   5767 				compatibility, the definition of specials is up to other
   5768 				organizations. Any assurance of backwards compatibility is up to
   5769 				those organizations.
   5770 			</p>
   5771 		</blockquote>
   5772 		<p>
   5773 			A number of the elements above can have extra information for <a
   5774 				name="OpenOffice" href="#OpenOffice">openoffice.org</a>, such as the
   5775 			following example:
   5776 		</p>
   5777 		<pre>    &lt;special xmlns:openOffice=&quot;<span
   5778 				style="color: blue">http://www.openoffice.org</span>&quot;&gt;
   5779         &lt;openOffice:search&gt;
   5780             &lt;openOffice:searchOptions&gt;
   5781                 &lt;openOffice:transliterationModules&gt;<span
   5782 				style="color: blue">IGNORE_CASE</span>&lt;/openOffice:transliterationModules&gt;
   5783             &lt;/openOffice:searchOptions&gt;
   5784         &lt;/openOffice:search&gt;
   5785     &lt;/special&gt;
   5786 </pre>
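<p>As an illustration only, a consumer could read such a special element with a namespace-aware XML parser. The sketch below uses Python's standard <code>xml.etree.ElementTree</code>; the helper name <code>transliteration_modules</code> is hypothetical, and the namespace URI is the one shown in the example above.</p>

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above; the prefix itself is arbitrary.
OPENOFFICE_NS = "http://www.openoffice.org"

SAMPLE = """
<special xmlns:openOffice="http://www.openoffice.org">
    <openOffice:search>
        <openOffice:searchOptions>
            <openOffice:transliterationModules>IGNORE_CASE</openOffice:transliterationModules>
        </openOffice:searchOptions>
    </openOffice:search>
</special>
"""


def transliteration_modules(special_xml):
    """Return the text of every openOffice:transliterationModules element."""
    root = ET.fromstring(special_xml)
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    tag = "{%s}transliterationModules" % OPENOFFICE_NS
    return [element.text for element in root.iter(tag)]


print(transliteration_modules(SAMPLE))
```

<p>Matching on the namespace URI rather than on the <code>openOffice:</code> prefix keeps the lookup valid even if a file binds a different prefix to the same URI.</p>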
   5787 		<h4>
   5788 			<a name="Alias_Elements" href="#Alias_Elements">5.1.2 Element
   5789 				alias</a>
   5790 		</h4>
   5791 		<p class="dtd">
   5792 			&lt;!ELEMENT alias (special*) &gt;<br> &lt;!ATTLIST alias source
   5793 			NMTOKEN #REQUIRED &gt;<br> &lt;!ATTLIST alias path CDATA
   5794 			#IMPLIED&gt;
   5795 		</p>
   5796 		<p>The contents of any element in root can be replaced by an
   5797 			alias, which points to the path where the data can be found.</p>
   5798 		<p>Aliases will only ever appear in root with the form
   5799 			//ldml/.../alias[@source=&quot;locale&quot;][@path=&quot;...&quot;].</p>
   5800 		<p>Consider the following example in root:</p>
		<pre>
&lt;calendar type=&quot;gregorian&quot;&gt;
  &lt;months&gt;
    &lt;default choice=&quot;format&quot;/&gt;
    &lt;monthContext type=&quot;format&quot;&gt;
      &lt;default choice=&quot;wide&quot;/&gt;
      &lt;monthWidth type=&quot;abbreviated&quot;&gt;
        <strong>&lt;alias source=&quot;locale&quot; path=&quot;../monthWidth[@type='wide']&quot;/&gt;</strong>
      &lt;/monthWidth&gt;</pre>
   5803 		<p>
   5804 			If the locale &quot;de_DE&quot; is being accessed for a month name
   5805 			for format/abbreviated, then a resource bundle at &quot;de_DE&quot;
			will be searched for a resource element at that path. If not
   5807 			found there, then the resource bundle at &quot;de&quot; will be
   5808 			searched, and so on. When the alias is found in root, then the search
   5809 			is restarted, but searching for format/<strong>wide</strong> element
   5810 			instead of format/abbreviated.
   5811 		</p>
   5812 		<p>
   5813 			If the <b>path</b> attribute is present, then its value is an [<a
   5814 				href="#XPath">XPath</a>] that points to a different node in the
   5815 			tree. For example:
   5816 		</p>
   5817 		<pre>&lt;alias source=&quot;locale&quot; path=&quot;../monthWidth[@type=&#39;wide&#39;]&quot;/&gt;</pre>
   5818 		<p>
   5819 			The default value if the path is not present is the same position in
   5820 			the tree. All of the attributes in the [<a href="#XPath">XPath</a>]
			must be <i>distinguishing</i> attributes. For more details, see <a
   5822 				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
   5823 				Validity</a>.
   5824 		</p>
   5825 		<p>
   5826 			There is a special value for the source attribute, the constant <b>source=&quot;locale&quot;</b>.
   5827 			This special value is equivalent to the locale being resolved. For
			example, consider the following table, where locale data for
   5829 			&#39;de&#39; is being resolved:
   5830 		</p>
   5831 		<div align="center">
   5832 			<center>
   5833 				<table border="1" cellpadding="0" cellspacing="1">
   5834 					<caption>
   5835 						<a name="Inheritance_with_source_locale_"
   5836 							href="#Inheritance_with_source_locale_">Inheritance with
   5837 							source=&quot;locale&quot;</a>
   5838 					</caption>
   5839 					<tr>
   5840 						<th>Root</th>
   5841 						<th>de</th>
   5842 						<th bgcolor="#C0C0C0">Resolved</th>
   5843 					</tr>
   5844 					<tr>
   5845 						<td><code>
   5846 								&lt;x&gt;<br> &nbsp; &lt;a&gt;1&lt;/a&gt;<br> &nbsp;
   5847 								&lt;b&gt;2&lt;/b&gt;<br> &nbsp; &lt;c&gt;3&lt;/c&gt;<br>
   5848 								<br> &lt;/x&gt;
   5849 							</code></td>
   5850 						<td><code>
   5851 								&lt;x&gt;<br> &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
   5852 								&nbsp;&lt;b&gt;12&lt;/b&gt;<br> <br>
   5853 								&nbsp;&lt;d&gt;14&lt;/d&gt;<br> &lt;/x&gt;
   5854 							</code></td>
   5855 						<td bgcolor="#C0C0C0"><code>
   5856 								&lt;x&gt;<br> &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
   5857 								&nbsp;&lt;b&gt;12&lt;/b&gt;<br> &nbsp;<span
   5858 									style="background-color: #FFFF00"><span
   5859 									class="inherited"><span style="font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>
   5860 								&nbsp;&lt;d&gt;14&lt;/d&gt;<br> &lt;/x&gt;
   5861 							</code></td>
   5862 					</tr>
   5863 					<tr>
   5864 						<td><code>
   5865 								&lt;y&gt;<br> &nbsp;&lt;alias source=&quot;locale&quot;
								path=&quot;../x&quot;/&gt;<br> &lt;/y&gt;
   5867 							</code></td>
   5868 						<td><code>
   5869 								&lt;y&gt;<br> <br> &nbsp;&lt;b&gt;22&lt;/b&gt;<br>
   5870 								<br> <br> &nbsp;&lt;e&gt;25&lt;/e&gt;<br>
   5871 								&lt;/y&gt;
   5872 							</code></td>
   5873 						<td bgcolor="#C0C0C0"><code>
   5874 								&lt;y&gt;<br> &nbsp;<span style="background-color: #FFFF00"><span
   5875 									class="inherited"><span style="font-weight: 400;">&lt;a&gt;11&lt;/a&gt;</span></span></span><br>
   5876 								&nbsp;&lt;b&gt;22&lt;/b&gt;<br> &nbsp;<span
   5877 									style="background-color: #FFFF00"><span
   5878 									class="inherited"><span style="font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>
   5879 								&nbsp;<span style="background-color: #FFFF00"><span
   5880 									class="inherited"><span style="font-weight: 400;">&lt;d&gt;14&lt;/d&gt;</span></span></span><br>
   5881 								&nbsp;&lt;e&gt;25&lt;/e&gt;<br> &lt;/y&gt;
   5882 							</code></td>
   5883 					</tr>
   5884 				</table>
   5885 			</center>
   5886 		</div>
   5887 		<p>The first row shows the inheritance within the &lt;x&gt;
   5888 			element, whereby &lt;c&gt; is inherited from root. The second shows
   5889 			the inheritance within the &lt;y&gt; element, whereby &lt;a&gt;,
			&lt;c&gt;, and &lt;d&gt; are also inherited from root, but through an
			alias there. The alias in root is logically replaced not by the
   5892 			elements in root itself, but by elements in the &#39;target&#39;
   5893 			locale.</p>
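<p>The table above can be reproduced with a small model. This is a sketch under simplifying assumptions, not the CLDR resolution algorithm: locales are plain dictionaries, the <code>"alias"</code> key and the names <code>lookup</code> and <code>resolve</code> are hypothetical, and an alias names a sibling container directly instead of carrying an XPath.</p>

```python
ALIAS = "alias"  # hypothetical marker key for an alias entry in this toy model

DATA = {
    "root": {"x": {"a": "1", "b": "2", "c": "3"}, "y": {ALIAS: "x"}},
    "de": {"x": {"a": "11", "b": "12", "d": "14"}, "y": {"b": "22", "e": "25"}},
}
PARENTS = {"de": "root", "root": None}


def lookup(locale, container, key):
    """Find one value, walking from the locale being resolved towards root."""
    start = locale
    while locale is not None:
        node = DATA[locale].get(container, {})
        if key in node:
            return node[key]
        if ALIAS in node:
            # source="locale": restart at the locale being resolved,
            # but look in the container the alias points to.
            return lookup(start, node[ALIAS], key)
        locale = PARENTS[locale]
    return None


def resolve(locale, container, keys="abcde"):
    """Build the resolved view of one container for the given locale."""
    found = ((key, lookup(locale, container, key)) for key in keys)
    return {key: value for key, value in found if value is not None}


print(resolve("de", "x"))
print(resolve("de", "y"))
```

<p>In the second call, &lt;a&gt; and &lt;d&gt; come from the &lt;x&gt; element of 'de' rather than from root, which is the behavior the table highlights.</p>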
   5894 		<p>
   5895 			For more details on data resolution, see <a
   5896 				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
   5897 				Validity</a>.
   5898 		</p>
   5899 		<p>
   5900 			Aliases must be resolved recursively. An alias may point to another
   5901 			path that results in another alias being found, and so on. For
   5902 			example, looking up Thai buddhist abbreviated months for the locale <strong>xx-YY</strong>
   5903 			may result in the following chain of aliases being followed:
   5904 		</p>
   5905 		<blockquote>
   5906 			<p>../../calendar[@type=&quot;buddhist&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;abbreviated&quot;]
   5907 			</p>
			<p>xx-YY &rarr; xx &rarr; root // finds alias that changes path to:</p>
   5909 			<p>../../calendar[@type=&quot;gregorian&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;abbreviated&quot;]
   5910 			</p>
			<p>xx-YY &rarr; xx &rarr; root // finds alias that changes path to:</p>
   5912 			<p>../../calendar[@type=&quot;gregorian&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;wide&quot;]
   5913 			</p>
			<p>xx-YY &rarr; xx // finds value here</p>
   5915 		</blockquote>
   5916 		<p>It is an error to have a circular chain of aliases. That is, a
   5917 			collection of LDML XML documents must not have situations where a
   5918 			sequence of alias lookups (including inheritance and lateral
   5919 			inheritance) can be followed indefinitely without terminating.</p>
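<p>A minimal sketch of recursive alias resolution with a guard against circular chains follows. It assumes a toy data model: paths are shortened to plain strings such as <code>"buddhist/abbreviated"</code>, the <code>Alias</code> class and <code>resolve</code> function are hypothetical names, and lateral inheritance is not modeled.</p>

```python
class Alias:
    """Toy stand-in for <alias source="locale" path="..."/>."""

    def __init__(self, path):
        self.path = path


PARENTS = {"xx-YY": "xx", "xx": "root", "root": None}

DATA = {
    "root": {
        "buddhist/abbreviated": Alias("gregorian/abbreviated"),
        "gregorian/abbreviated": Alias("gregorian/wide"),
        "loop/a": Alias("loop/b"),  # deliberately circular, for illustration
        "loop/b": Alias("loop/a"),
    },
    "xx": {"gregorian/wide": "value from xx"},
    "xx-YY": {},
}


def resolve(locale, path):
    """Follow aliases until a value is found; reject circular chains."""
    seen = set()
    while True:
        if path in seen:
            raise ValueError("circular alias chain at " + path)
        seen.add(path)
        # Walk the inheritance chain for the current path.
        current, entry = locale, None
        while current is not None:
            entry = DATA[current].get(path)
            if entry is not None:
                break
            current = PARENTS[current]
        if isinstance(entry, Alias):
            path = entry.path  # restart from the original locale
            continue
        return entry


print(resolve("xx-YY", "buddhist/abbreviated"))
```

<p>Because every restart begins at the same locale, tracking the visited paths is enough to detect a cycle in this model; a fuller implementation would also have to account for lateral inheritance.</p>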
   5920 		<h4>
   5921 			<a name="Element_displayName" href="#Element_displayName">5.1.3
   5922 				Element displayName</a>
   5923 		</h4>
   5924 		<p>Many elements can have a display name. This is a translated
   5925 			name that can be presented to users when discussing the particular
   5926 			service. For example, a number format, used to format numbers using
			the conventions of that locale, can have a translated name for
   5928 			presentation in GUIs.</p>
   5929 		<pre>  &lt;numberFormat&gt;
   5930     &lt;displayName&gt;<span style="color: blue">Prozentformat</span>&lt;/displayName&gt;
   5931 ...
  &lt;/numberFormat&gt;</pre>
   5933 		<p>
   5934 			Where present, the display names must be unique; that is, two
			distinct codes would not get the same display name.&nbsp; (There is
   5936 			one exception to this: in time zones, where parsing results would
   5937 			give the same GMT offset, the standard and daylight display names can
   5938 			be the same across different time zone IDs.) Any translations should
   5939 			follow customary practice for the locale in question. For more
   5940 			information, see [<a href="#DataFormats">Data Formats</a>].
   5941 		</p>
   5942 		<h4>
   5943 			<a name="Escaping_Characters" href="#Escaping_Characters">5.1.4
   5944 				Escaping Characters</a>
   5945 		</h4>
   5946 		<p>Unfortunately, XML does not have the capability to contain all
   5947 			Unicode code points. Due to this, in certain instances extra syntax
   5948 			is required to represent those code points that cannot be otherwise
   5949 			represented in element content. The escaping syntax is only defined
   5950 			on a few types of elements, such as in collation or exemplar sets,
   5951 			and uses the appropriate syntax for that type.</p>
   5952 		<p>The element &lt;cp&gt;, which was formerly used for this
   5953 			purpose, has been deprecated.</p>
   5954 
   5955 		<h3>
   5956 			<a name="Common_Attributes" href="#Common_Attributes">5.2 Common
   5957 				Attributes</a>
   5958 		</h3>
   5959 		<h4>
   5960 			<a name="Attribute_type" href="#Attribute_type">5.2.1 Attribute
   5961 				type</a>
   5962 		</h4>
   5963 		<p>
   5964 			The attribute <i>type</i> is also used to indicate an alternate
   5965 			resource that can be selected with a matching type=option in the
   5966 			locale id modifiers, or be referenced by a default element. For
   5967 			example:
   5968 		</p>
   5969 		<pre>&lt;ldml&gt;
   5970   ...
   5971   &lt;currencies&gt;
   5972     &lt;currency&gt;<span style="color: blue">...</span>&lt;/currency&gt;
   5973     &lt;currency type=&quot;<span style="color: blue">preEuro</span>&quot;&gt;<span
   5974 				style="color: blue">...</span>&lt;/currency&gt;
   5975   &lt;/currencies&gt;
   5976 &lt;/ldml&gt;</pre>
   5977 		<h4>
   5978 			<a name="Attribute_draft" href="#Attribute_draft">5.2.2 Attribute
   5979 				draft</a>
   5980 		</h4>
   5981 		<p>
   5982 			If this attribute is present, it indicates the status of all the data
   5983 			in this element and any subelements (unless they have a contrary <i>draft</i>
   5984 			value), as per the following:
   5985 		</p>
   5986 		<ul>
			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>approved</i>:
				fully approved by the technical committee (equals the CLDR 1.3 value
				of <i>false</i>, or an absent <i>draft</i> attribute). This does not
				mean that the data is guaranteed to be error-free; it is the best
				judgment of the committee.</li>
   5992 			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>contributed</i>:
   5993 				partially approved by the technical committee.</li>
   5994 			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>provisional</i>:
   5995 				partially confirmed. Implementations may choose to accept the
   5996 				provisional data, especially if there is no translated alternative.</li>
   5997 			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>unconfirmed</i>:
   5998 				no confirmation available.</li>
   5999 		</ul>
   6000 		<p>
   6001 			For more information on precisely how these values are computed for
   6002 			any given release, see <a
   6003 				href="http://cldr.unicode.org/index/process#TOC-Data-Submission-and-Vetting-Process">Data
   6004 				Submission and Vetting Process</a> on the CLDR website.
   6005 		</p>
   6006 		<p>
   6007 			The draft attribute should only occur on &quot;leaf&quot; elements, and is deprecated elsewhere. For a more
   6008 			formal description of how elements are inherited, and what their
   6009 			draft status is, see <i><a href="#Inheritance_and_Validity">Section
   6010 					4.2 Inheritance and Validity</a></i>.
   6011 		</p>
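<p>The way a <i>draft</i> value applies to subelements unless they carry a contrary value can be sketched as below. This is an illustration, not part of the specification: the function name <code>effective_draft</code> and the sample data are hypothetical, and the sample places <i>draft</i> on a non-leaf element (the deprecated usage) only to make the inheritance visible.</p>

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<months draft="provisional">
    <month type="1">Jan</month>
    <month type="2" draft="unconfirmed">Feb</month>
    <month type="3" draft="false">Mar</month>
</months>
"""

# CLDR 1.3 spelling mentioned in the text: draft="false" equals "approved".
LEGACY = {"false": "approved"}


def effective_draft(xml_text):
    """Return (type, text, status) for every leaf element."""
    results = []

    def walk(element, inherited):
        # A contrary value on the element overrides the inherited status.
        status = element.get("draft", inherited)
        status = LEGACY.get(status, status)
        if len(element) == 0:  # leaf element
            results.append((element.get("type"), element.text, status))
        for child in element:
            walk(child, status)

    # An absent draft attribute at the top is treated as "approved".
    walk(ET.fromstring(xml_text), "approved")
    return results


for row in effective_draft(SAMPLE):
    print(row)
```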
   6012 		<h4>
   6013 			<a name="alt_attribute" href="#alt_attribute">5.2.3 Attribute alt</a>
   6014 		</h4>
   6015 		<p>
   6016 			This attribute labels an alternative value for an element. The value
			is a <i>descriptor</i> that indicates what kind of alternative it is,
			and takes one of the following forms:
   6019 		</p>
   6020 		<ul>
   6021 			<li><i>variantname</i> meaning that the value is a variant of
   6022 				the normal value, and may be used in its place in certain
   6023 				circumstances. If a variant value is absent for a particular locale,
   6024 				the normal value is used. The variant mechanism should only be used
   6025 				when such a fallback is acceptable.</li>
   6026 			<li><span style="color: blue">proposed</span>, optionally
   6027 				followed by a number, indicating that the value is a proposed
   6028 				replacement for an existing value.</li>
   6029 			<li><i>variantname</i><span style="color: blue">-proposed</span>,
   6030 				optionally followed by a number, indicating that the value is a
   6031 				proposed replacement variant value.</li>
   6032 		</ul>
   6033 		<p>
   6034 			&quot;<span style="color: blue">proposed</span>&quot; should only be
   6035 			present if the draft status is not &quot;approved&quot;. It indicates
   6036 			that the data is proposed replacement data that has been added
   6037 			provisionally until the differences between it and the other data can
   6038 			be vetted. For example, suppose that the translation for September
   6039 			for some language is &quot;Settembru&quot;, and a bug report is filed
			that it should be &quot;Settembro&quot;. The new data can be
   6041 			entered in, but marked as <i>alt=&quot;proposed&quot;</i> until it is
   6042 			vetted.
   6043 		</p>
   6044 		<pre>...
   6045 &lt;month type=&quot;9&quot;&gt;Settembru&lt;/month&gt;
   6046 &lt;month type=&quot;9&quot; draft=&quot;unconfirmed&quot; alt=&quot;proposed&quot;&gt;Settembro&lt;/month&gt;
   6047 &lt;month type=&quot;10&quot;&gt;...</pre>
   6048 		<p>Now assume another bug report comes in, saying that the correct
   6049 			form is actually &quot;Settembre&quot;. Another alternative can be
   6050 			added:</p>
   6051 		<pre>...
   6052 &lt;month type=&quot;9&quot; draft=&quot;unconfirmed&quot; alt=&quot;proposed2&quot;&gt;Settembre&lt;/month&gt;
   6053 ...</pre>
   6054 		<p>
   6055 			The values for <i>variantname</i> at this time include &quot;<span
   6056 				style="color: blue">variant</span>&quot;, &quot;<span
   6057 				style="color: blue">list</span>&quot;, &quot;<span
   6058 				style="color: blue">email</span>&quot;, &quot;<span
   6059 				style="color: blue">www</span>&quot;, &quot;<span
   6060 				class="attributeValue">short</span>&quot;, and &quot;<span
   6061 				style="color: blue">secondary</span>&quot;.
   6062 		</p>
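<p>The three forms of the <i>alt</i> descriptor can be told apart with a few lines of code. The sketch below is illustrative: <code>parse_alt</code> is a hypothetical helper, and it assumes that the optional number is written as decimal digits directly after "proposed", as in the "proposed2" example above.</p>

```python
import re

# "proposed", optionally followed by a number (digits assumed, as in "proposed2").
PROPOSED_RE = re.compile(r"proposed([0-9]*)")


def parse_alt(value):
    """Return (variantname or None, is_proposed, proposal number or None)."""
    head, _, tail = value.rpartition("-")
    match = PROPOSED_RE.fullmatch(tail)
    if match:
        number = int(match.group(1)) if match.group(1) else None
        return (head or None, True, number)
    # Not a proposal: the whole value is the variant name.
    return (value, False, None)


for sample in ("variant", "proposed", "proposed2", "short-proposed"):
    print(sample, parse_alt(sample))
```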
   6063 		<p>
   6064 			For a more complete description of how draft applies to data, see <i><a
   6065 				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
   6066 					Validity</a></i>.
   6067 		</p>
   6068 		<p class="element2">
   6069 			Attribute <a name="references_attribute" href="#references_attribute">references</a>
   6070 		</p>
   6071 		<p>The value of this attribute is a token representing a reference
   6072 			for the information in the element, including standards that it may
			conform to. See the &lt;references&gt; element. (In older versions of CLDR, the value
   6074 			of the attribute was freeform text. That format is deprecated.)</p>
   6075 		<p>
   6076 			<i>Example:</i>
   6077 		</p>
   6078 		<p class="example">&lt;territory type=&quot;UM&quot;
			references=&quot;R222&quot;&gt;USAs yttre &ouml;ar&lt;/territory&gt;</p>
   6080 		<p>The reference element may be inherited. Thus, for example, R222
   6081 			may be used in sv_SE.xml even though it is not defined there, if it
   6082 			is defined in sv.xml.</p>
   6083 		<p>&lt;... allow=&quot;verbatim&quot; ...&gt; (deprecated)</p>
   6084 		<p>This attribute was originally intended for use in marking
   6085 			display names whose capitalization differed from what was indicated
   6086 			by the now-deprecated &lt;inText&gt; element (perhaps, for example,
   6087 			because the names included a proper noun). It was never supported in
   6088 			the dtd and is not needed for use with the new
   6089 			&lt;contextTransforms&gt; element.</p>
   6090 		<h3>
   6091 			<a name="Common_Structures" href="#Common_Structures">5.3 Common
   6092 				Structures</a>
   6093 		</h3>
   6094 		<h4>
   6095 			<a name="Date_Ranges" href="#Date_Ranges">5.3.1 Date and Date
   6096 				Ranges</a>
   6097 		</h4>
   6098 		<p>
			When attributes specify date ranges, it is usually done with
   6100 			attributes <i>from</i> and <i>to</i>. The <i>from</i> attribute
   6101 			specifies the starting point, and the <i>to</i> attribute specifies
   6102 			the end point. The deprecated <i>time</i> attribute was formerly used
   6103 			to specify time with the deprecated weekEndStart and weekEndEnd
   6104 			elements, which were themselves inherently <i>from</i> or <i>to</i>.
   6105 		</p>
   6106 		<p>
   6107 			The data format is a restricted ISO 8601 format, restricted to the
   6108 			fields <i>year, month, day, hour, minute, </i>and<i> second</i> in
   6109 			that order, with &quot;-&quot; used as a separator between date
   6110 			fields, a space used as the separator between the date and the time
   6111 			fields, and &quot;:&quot; used as a separator between the time
   6112 			fields. If the minute or minute and second are absent, they are
   6113 			interpreted as zero. If the hour is also missing, then it is
   6114 			interpreted based on whether the attribute is <i>from</i> or <i>to</i>.
   6115 		</p>
   6116 		<ul>
   6117 			<li>
   6118 				<p class="note">
   6119 					<i>from</i> defaults to &quot;00:00:00&quot; (midnight at the start
   6120 					of the day).
   6121 				</p>
   6122 			</li>
   6123 			<li>
   6124 				<p class="note">
   6125 					<i>to </i>defaults to &quot;24:00:00&quot; (midnight at the end of
   6126 					the day).
   6127 				</p>
   6128 			</li>
   6129 		</ul>
   6130 		<p class="note">
   6131 			That is, Friday at 24:00:00 is the same time as Saturday at 00:00:00.
   6132 			Thus when the hour is missing, the <i>from and to</i> are interpreted
   6133 			inclusively: the range includes all of the day mentioned.
   6134 		</p>
   6135 		<p class="note">For example, the following are equivalent:</p>
   6136 		<table style="margin-top: 0.5em; margin-bottom: 0.5em" id="table25">
   6137 			<tr>
   6138 				<td>&lt;usesMetazone from=&quot;1991-10-27&quot;
   6139 					to=&quot;2006-04-02&quot; .../&gt;</td>
   6140 			</tr>
   6141 			<tr>
   6142 				<td>&lt;usesMetazone from=&quot;1991-10-27 00:00:00&quot;
   6143 					to=&quot;2006-04-02 24:00:00&quot; .../&gt;</td>
   6144 			</tr>
   6145 			<tr>
   6146 				<td>&lt;usesMetazone from=&quot;1991-10-<font color="#FF0000"><b>26
   6147 							24</b></font>:00:00&quot; to=&quot;2006-04-<font color="#FF0000"><b>03
   6148 							00</b></font>:00:00&quot; .../&gt;
   6149 				</td>
   6150 			</tr>
   6151 		</table>
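<p>The defaulting rules can be checked with a short sketch. It assumes that the year, month, and day are always present, and <code>parse_bound</code> is a hypothetical helper name. Because Python's <code>datetime</code> does not accept hour 24, the time fields are added to the date as an offset.</p>

```python
from datetime import datetime, timedelta


def parse_bound(text, is_to):
    """Parse a from/to attribute value into a datetime (no time zone applied)."""
    date_part, _, time_part = text.partition(" ")
    year, month, day = (int(field) for field in date_part.split("-"))
    if time_part:
        fields = [int(field) for field in time_part.split(":")]
        fields += [0] * (3 - len(fields))  # absent minute or second means zero
    else:
        # Missing hour: "from" starts the day, "to" ends it.
        fields = [24, 0, 0] if is_to else [0, 0, 0]
    hour, minute, second = fields
    # Adding an offset lets 24:00:00 roll over to 00:00:00 of the next day.
    return datetime(year, month, day) + timedelta(
        hours=hour, minutes=minute, seconds=second
    )


print(parse_bound("1991-10-27", is_to=False))
print(parse_bound("2006-04-02", is_to=True))
```

<p>With this reading, the three usesMetazone forms in the table above yield identical start and end instants.</p>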
   6152 
   6153 		<p>
			If the <i>from</i> attribute is missing, the range is assumed to start
			as far back in time as there is data for; if the <i>to</i> attribute is
			missing, the range continues onwards, with no known end point.
   6157 		</p>
		<p>The dates and times are specified in local time, unless
			otherwise noted. (In particular, the metazone values are in UTC, also
			known as GMT.)</p>
   6161 		<h4>
   6162 			<a name="Text_Directionality" href="#Text_Directionality">5.3.2
   6163 				Text Directionality</a>
   6164 		</h4>
   6165 		<p>The content of certain elements, such as date or number
   6166 			formats, may consist of several sub-elements with an inherent order
   6167 			(for example, the year, month, and day for dates). In some cases, the
   6168 			order of these sub-elements may be changed depending on the
   6169 			bidirectional context in which the element is embedded.</p>
   6170 		<p>For example, short date formats in languages such as Arabic may
   6171 			contain neutral or weak characters at the beginning or end of the
   6172 			element content. In such a case, the overall order of the
   6173 			sub-elements may change depending on the surrounding text.</p>
   6174 		<p>Element content whose display may be affected in this way
   6175 			should include an explicit direction mark, such as U+200E
   6176 			LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK, at the beginning or
   6177 			end of the element content, or both.</p>
   6178 		<h4>
   6179 			<a name="Unicode_Sets" href="#Unicode_Sets">5.3.3 Unicode Sets</a>
   6180 		</h4>
   6181 		<p>
   6182 			Some attribute values or element contents use <em>UnicodeSet</em>
   6183 			notation. A UnicodeSet represents a finite set of Unicode code points
   6184 			and strings, and is defined by lists of code points and strings,
   6185 			Unicode property sets, and set operators, all bounded by square
   6186 			brackets. In this context, a code point means a string consisting of
   6187 			exactly one code point.
   6188 		</p>
   6189 		<p>
			A UnicodeSet implements the semantics in <i>UTS
   6191 			#18: Unicode Regular Expressions</i> [<a
   6192 				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>] Levels 1 &amp; 2 that are relevant to determining sets of characters. Note however that it may deviate from the syntax provided in [<a
   6193 				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], which is illustrative rather than a requirement. There is one exception to the supported semantics, Section <a href="http://unicode.org/reports/tr18/#RL2.6">RL2.6</a> <em>Wildcards in Property Values</em>. That feature can be supported in clients such as ICU by implementing a hook as is done in the <a href="https://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{name=/APPLE/}">online UnicodeSet utilities</a>.</p>
   6194 		<p>A UnicodeSet may be cited in specifications
   6195 				outside of the domain of LDML. In such a case, the specification may
   6196 				specify a subset of the syntax provided here.</p>
   6197 	  <p>The following provides EBNF syntax for a UnicodeSet:</p>
   6198 	  <div align='center'>
   6199 		<table class='simple'>
   6200 <tr>
   6201   <th>Symbol</th>
   6202   <th>Expression</th>
   6203   <th>Examples</th>
   6204 </tr>
   6205 <tr><th>root</th>
   6206 	<td><code>= prop <br>| '[-]' <br>| '[' [\-\^]? s seq+ ']'</code></td>
   6207 	<td>\p{x=y},<br>
   6208 	  [abc]</td>
   6209 </tr>
   6210 <tr><th>seq</th>
   6211 	<td><code>= root (s [\&amp;\-] s root)* s <br>| range s</code></td>
   6212 	<td>[abc]-[cde], a	  <br></td>
   6213 </tr>
   6214 <tr><th>range</th>
   6215 	<td><code>= char ('-' char)? <br>| '{' (s char)+ s '}'</code></td>
   6216 	<td>a, a-c, {abc}</td>
   6217 </tr>
   6218 <tr><th>prop</th>
   6219 	<td><code>= '\\' [pP] '{' propName ([=] s value1+)? '}' <br>| '[:' '^'? propName ([=] s value2+)? ':]'</code></td>
   6220 	<td>\p{x=y}, [:x=y:]<br></td>
   6221 </tr>
   6222 <tr><th>propName</th>
   6223 	<td><code>= s [A-Za-z0-9] [A-Za-z0-9_\x20]* s</code></td>
   6224 	<td>General_Category,<br>
   6225 	  General Category</td>
   6226 </tr>
   6227 <tr><th>value1</th>
   6228 	<td><code>= [^\}] <br>
   6229 	  | '\\' quoted </code></td>
   6230 	<td>Lm,<br>
   6231 	  \n,<br>
   6232 	  \}</td>
   6233 </tr>
   6234 <tr><th>value2</th>
   6235 	<td><code>= [^:] <br>
   6236 	  | '\\' quoted</code></td>
   6237 	<td>Lm,<br>
   6238       \n,<br>
   6239       \:</td>
   6240 </tr>
   6241 <tr><th>char</th>
	<td><code>= [^\&amp; \- \[ \] \\ \} \{ [:Pat_WS:]] <br>
   6243 	  | '\\' quoted</code></td>
   6244 	<td>a, b, c, \n</td>
   6245 </tr>
   6246 <tr><th>quoted</th>
   6247 <td><code>= 'u' (hex{4} | bracketedHex) <br>  
	| 'x' (hex{2} | bracketedHex) <br>  | 'U00' ('0' hex{5} | '10' hex{4}) <br>| 'N{' propName '}' <br>| [\u0000-\U0010FFFF]</code></td>
   6249 <td>&nbsp;</td>
   6250 </tr>
   6251 <tr><th>bracketedHex</th>
   6252 	<td><code>= '{' s hexCodePoint (s hexCodePoint)* s '}'</code></td>
   6253 	<td>{61 2019 62}</td>
   6254 </tr>
   6255 <tr><th>hexCodePoint</th>
   6256 	<td><code>= hex{1,5} | '10' hex{4}</code></td>
   6257 	<td>&nbsp;</td>
   6258 </tr>
   6259 <tr><th>hex</th>
   6260 	<td><code>= [0-9A-Fa-f]</code></td>
   6261 	<td>&nbsp;</td>
   6262 </tr>
   6263 <tr><th>s</th>
   6264 	<td><code>= [:Pattern_White_Space:]*</code></td>
   6265 	<td>optional whitespace</td>
   6266 </tr>
   6267 	</table>
   6268 </div>
   6269 		<p>Some constraints on UnicodeSet syntax are not captured by this EBNF. Notably, property names and values are restricted to those supported by the implementation.</p>
   6270 		<p>The syntax characters are listed in the table below:</p>
   6271 		<table>
   6272 		  <tbody>
   6273 		    <tr>
   6274 		      <th>Char</th>
   6275 		      <th>Hex</th>
   6276 		      <th>Name</th>
   6277 		      <th>Usage</th>
   6278 	        </tr>
   6279 		    <tr>
   6280 		      <td>$</td>
   6281 		      <td>U+0024</td>
   6282 		      <td>DOLLAR SIGN</td>
   6283 		      <td>Equivalent of \uFFFF (This is for implementations that return \uFFFF when accessing before the first or after the last character)</td>
   6284 	        </tr>
   6285 		    <tr>
   6286 		      <td>&amp;</td>
   6287 		      <td>U+0026</td>
   6288 		      <td>AMPERSAND</td>
   6289 		      <td>Intersecting UnicodeSets</td>
   6290 	        </tr>
   6291 			    <tr>
   6292 		      <td>-</td>
   6293 		      <td>U+002D</td>
   6294 		      <td>HYPHEN-MINUS</td>
   6295 			      <td>Ranges of characters; also set difference.</td>
   6296         </tr>
   6297 	    <tr>
   6298 		      <td>:</td>
   6299 		      <td>U+003A</td>
   6300 		      <td>COLON</td>
   6301 		      <td>POSIX-style property syntax</td>
   6302 	        </tr>
   6303 		    <tr>
   6304 		      <td>[</td>
   6305 		      <td>U+005B</td>
   6306 		      <td>LEFT SQUARE BRACKET</td>
   6307 		      <td>Grouping; POSIX property syntax</td>
   6308 	        </tr>
   6309 		    <tr>
   6310 		      <td>]</td>
   6311 		      <td>U+005D</td>
   6312 		      <td>RIGHT SQUARE BRACKET</td>
   6313 		      <td>Grouping; POSIX property syntax</td>
   6314 	        </tr>
   6315 		    <tr>
   6316 		      <td>\</td>
   6317 		      <td>U+005C</td>
   6318 		      <td>REVERSE SOLIDUS</td>
   6319 		      <td>Escaping</td>
   6320 	        </tr>
   6321 		    <tr>
   6322 		      <td>^</td>
   6323 		      <td>U+005E</td>
   6324 		      <td>CIRCUMFLEX ACCENT</td>
		      <td>POSIX negation syntax</td>
   6326 	        </tr>
   6327 		    <tr>
   6328 		      <td>{</td>
   6329 		      <td>U+007B</td>
   6330 		      <td>LEFT CURLY BRACKET</td>
   6331 		      <td>Strings in set; Perl property syntax</td>
   6332 	        </tr>
   6333 		    <tr>
   6334 		      <td>}</td>
   6335 		      <td>U+007D</td>
   6336 		      <td>RIGHT CURLY BRACKET</td>
   6337 			      <td>Strings in set; Perl property syntax</td>
   6338         </tr>
   6339 		    <tr>
   6340 		      <td>&nbsp;</td>
   6341 		      <td>U+0020 U+0009..U+000D U+0085<br>
   6342 	          U+200E U+200F<br>
   6343 	          U+2028 U+2029</td>
   6344 		      <td>ASCII whitespace,<br>
   6345 	          LRM, RLM,<br>
   6346 	          LINE/PARAGRAPH SEPARATOR</td>
   6347 		      <td>Ignored except when escaped</td>
   6348 	        </tr>
   6349 	      </tbody>
   6350 	  </table>
   6351 	  <br>
   6352 		<h5>
   6353 			<a href="#Lists_of_Code_Points" name="Lists_of_Code_Points">5.3.3.1
   6354 				Lists of Code Points</a>
   6355 		</h5>
   6356 		<p>
   6357 			Lists are a sequence of strings that may include ranges, which are
   6358 			indicated by a &#39;-&#39; between two code points, as in
   6359 			&quot;a-z&quot;. The sequence<em> start-end</em> specifies the range
   6360 			of all code points from the start to end, inclusive, in Unicode
   6361 			order. For example, <b>[a c d-f m]</b> is equivalent to <b>[a c d
   6362 				e f m]</b>. Whitespace can be freely used for clarity, as <b>[a c
   6363 				d-f m]</b> means the same as <b>[acd-fm]</b>.
   6364 		</p>
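<p>To make the list notation concrete, the following sketch expands a very small subset of UnicodeSet syntax: single code points, code point ranges, and strings in curly braces. It is not a UnicodeSet implementation. The name <code>expand_simple_set</code> is hypothetical, and escapes, properties, nested sets, set operators, string ranges, and the special characters ^ and $ are rejected rather than interpreted.</p>

```python
# Pattern_White_Space code points, as listed in the syntax character table.
PATTERN_WHITE_SPACE = set("\t\n\x0b\x0c\r \x85\u200e\u200f\u2028\u2029")
# Syntax characters this sketch refuses to interpret.
UNSUPPORTED = set("&-[]\\{}^$")


def expand_simple_set(pattern):
    """Expand a set made only of code points, ranges, and {strings}."""
    if len(pattern) < 2 or pattern[0] != "[" or pattern[-1] != "]":
        raise ValueError("expected a pattern of the form [...]")
    # Whitespace is ignored, so drop it before scanning.
    body = [c for c in pattern[1:-1] if c not in PATTERN_WHITE_SPACE]
    result = set()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "{":
            end = body.index("}", i)  # raises ValueError if unterminated
            result.add("".join(body[i + 1:end]))
            i = end + 1
        elif ch in UNSUPPORTED:
            raise ValueError("unsupported syntax character: %r" % ch)
        elif i + 2 < len(body) and body[i + 1] == "-":
            last = body[i + 2]
            if last in UNSUPPORTED or ord(last) < ord(ch):
                raise ValueError("invalid range %s-%s" % (ch, last))
            # start-end covers every code point in between, inclusive.
            result.update(chr(cp) for cp in range(ord(ch), ord(last) + 1))
            i += 3
        else:
            result.add(ch)
            i += 1
    return result


print(sorted(expand_simple_set("[a c d-f m]")))
```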
   6365 		<p>
   6366 			A string with multiple code points is represented in a list by being
   6367 			surrounded by curly braces, such as in <strong>[a-z {ch}]</strong>.
   6368 			It can be used with the range notation, as described in <em>Section
   6369 				<a href="#String_Range">5.3.4 String Range</a>
   6370 			</em>. There is an additional restriction on string ranges in a
			UnicodeSet: the number of code points in the first string of the range
   6372 			must be identical to the number in the second. Thus [{ab}-{c}] and
   6373 			[{ab}-c] are invalid.
   6374 		</p>
   6375 		<p>In UnicodeSets, there are two ways to quote syntax code points:
   6376 		</p>
   6377 		<p>
   6378 			<a name="Backslash_Escapes"></a>Outside of single quotes, certain
   6379 			backslashed code point sequences can be used to quote code points:
   6380 		</p>
   6381 	  <table class='simple'>
   6382 			<tr>
   6383 			  <td>\x{h...h}<br>
   6384 	          \u{h...h}</td>
   6385 			  <td>list of 1-6 hex digits ([0-9A-Fa-f]), separated by spaces</td>
   6386 	    </tr>
   6387 			<tr>
   6388 			  <td>\xhh</td>
   6389 			  <td>1-2 hex digits</td>
   6390 	    </tr>
   6391 			<tr>
   6392 				<td>\uhhhh</td>
   6393 				<td>Exactly 4 hex digits</td>
   6394 			</tr>
   6395 			<tr>
   6396 				<td>\Uhhhhhhhh</td>
   6397 				<td>Exactly 8 hex digits</td>
   6398 			</tr>
   6399 			<tr>
   6400 				<td>\a</td>
   6401 				<td>U+0007 (BEL / ALERT)</td>
   6402 			</tr>
   6403 			<tr>
   6404 				<td>\b</td>
   6405 				<td>U+0008 (BACKSPACE)</td>
   6406 			</tr>
   6407 			<tr>
   6408 				<td>\t</td>
   6409 				<td>U+0009 (TAB / CHARACTER TABULATION)</td>
   6410 			</tr>
   6411 			<tr>
   6412 				<td>\n</td>
   6413 				<td>U+000A (LINE FEED)</td>
   6414 			</tr>
   6415 			<tr>
   6416 				<td>\v</td>
   6417 				<td>U+000B (LINE TABULATION)</td>
   6418 			</tr>
   6419 			<tr>
   6420 				<td>\f</td>
   6421 				<td>U+000C (FORM FEED)</td>
   6422 			</tr>
   6423 			<tr>
   6424 				<td>\r</td>
   6425 				<td>U+000D (CARRIAGE RETURN)</td>
   6426 			</tr>
   6427 			<tr>
   6428 				<td>\\</td>
   6429 				<td>U+005C (BACKSLASH / REVERSE SOLIDUS)</td>
   6430 			</tr>
   6431 			<tr>
   6432 				<td>\N{name}</td>
   6433 				<td>The Unicode code point named &quot;name&quot;.</td>
   6434 			</tr>
   6435 			<tr>
   6436 				<td>\p{},\P{}</td>
   6437 				<td>Unicode property (see below)</td>
   6438 			</tr>
   6439 		</table><br>
   6440 	  <p>Anything else following a backslash is mapped to itself, except
   6441 			the property syntax described below, or in an environment where it is
   6442 			defined to have some special meaning.		</p>
   6443 	  <p>
   6444 		  Any code point formed as the result of a backslash escape loses any
   6445 			special meaning and is treated as a literal. In particular, note that
   6446 			\x, \u and \U escapes create literal code points. (In contrast, Java
   6447 			treats Unicode escapes as just a way to represent arbitrary code
   6448 			points in an ASCII source file, and any resulting code points are <i><b>not</b></i>
   6449 		  tagged as literals.)
   6450 		</p>
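		<p>As a rough illustration of the escape table above, the following
			Python sketch resolves backslash escapes to literal code points. The
			function name <code>unescape</code> is hypothetical, the property
			escapes \p{} and \P{} are deliberately out of scope, and the 1-6 digit
			limit inside braces is not enforced; it is not the reference
			implementation.</p>

```python
import re
import unicodedata

# Single-letter escapes from the table above.
SIMPLE = {"a": "\u0007", "b": "\u0008", "t": "\u0009", "n": "\u000A",
          "v": "\u000B", "f": "\u000C", "r": "\u000D"}

# Alternatives are tried in order, so the braced list form is checked
# before the fixed-width \xhh and \uhhhh forms.
ESCAPE = re.compile(
    r"\\(?:[xu]\{(?P<list>[0-9A-Fa-f ]+)\}"
    r"|x(?P<x>[0-9A-Fa-f]{1,2})"
    r"|u(?P<u4>[0-9A-Fa-f]{4})"
    r"|U(?P<u8>[0-9A-Fa-f]{8})"
    r"|N\{(?P<name>[^}]+)\}"
    r"|(?P<other>.))",
    re.DOTALL,
)


def unescape(text):
    """Resolve backslash escapes to literal code points (hypothetical helper)."""
    def repl(match):
        if match.group("list") is not None:
            # Space-separated list of hex code points, e.g. \u{61 2019 62}.
            return "".join(chr(int(h, 16)) for h in match.group("list").split())
        for key in ("x", "u4", "u8"):
            if match.group(key) is not None:
                return chr(int(match.group(key), 16))
        if match.group("name") is not None:
            return unicodedata.lookup(match.group("name"))
        other = match.group("other")
        # Anything else following a backslash maps to itself.
        return SIMPLE.get(other, other)
    return ESCAPE.sub(repl, text)
```

		<p>A parser built on this idea would also need to remember which code
			points came from escapes, since those are treated as literals rather
			than as syntax.</p>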
   6451 		<p>
   6452 			Unicode property sets are defined as described in <i>UTS
   6453 				#18: Unicode Regular Expressions</i> [<a
   6454 				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], Level
   6455 			1 and RL2.5, including the syntax where given. For an example of a
   6456 			concrete implementation of this, see [<a href="#ICUUnicodeSet">ICUUnicodeSet</a>].
   6457 		</p>
   6458 		<h5>
   6459 			<a href="#Unicode_Properties" name="Unicode_Properties">5.3.3.2
   6460 				Unicode Properties</a>
   6461 		</h5>
   6462 
   6463 		<p>
   6464 			Briefly, Unicode property sets are specified by any Unicode property
   6465 			and a value of that property, such as <b>[:General_Category=Letter:]</b>
   6466 			for Unicode letters or <b>\p{uppercase}</b> for the set of upper case
   6467 			letters in Unicode. The property names are defined by the
   6468 			PropertyAliases.txt file and the property values by the
   6469 			PropertyValueAliases.txt file. For more information, see [<a
   6470 				href="http://unicode.org/reports/tr41/#UAX44">UAX44</a>]. The syntax
   6471 			for specifying the property sets is an extension of either POSIX or
   6472 			Perl syntax, by the addition of &quot;=&lt;value&gt;&quot;. For
   6473 			example, you can match letters by using the POSIX-style syntax:
   6474 		</p>
   6475 		<p>
   6476 			<b>[:General_Category=Letter:]</b>
   6477 		</p>
   6478 		<p>or by using the Perl-style syntax</p>
   6479 		<p>
   6480 			<b>\p{General_Category=Letter}</b>.
   6481 		</p>
   6482 		<p>
   6483 			Property names and values are case-insensitive, and whitespace,
   6484 			&quot;-&quot;, and &quot;_&quot; are ignored. The property name can
   6485 			be omitted for the <strong>General_Category</strong> and <strong>Script</strong>
   6486 			properties, but is required for other properties. If the property
   6487 			value is omitted, it is assumed to represent a boolean property with
   6488 			the value &quot;true&quot;. Thus <b>[:Letter:]</b> is equivalent to <b>[:General_Category=Letter:]</b>,
   6489 			and <b>[:Wh-ite-s pa_ce:]</b> is equivalent to <b>[:Whitespace=true:]</b>.
   6490 		</p>
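		<p>The matching and defaulting rules above can be sketched in Python
			as follows. The helper names and the two small value tables are
			assumptions made for illustration; real data would come from
			PropertyAliases.txt and PropertyValueAliases.txt.</p>

```python
def loose(name):
    """Fold case and drop whitespace, '-' and '_' (hypothetical helper)."""
    return "".join(
        ch for ch in name.lower() if not ch.isspace() and ch not in "-_"
    )


# Tiny illustrative tables, not the real alias data.
GENERAL_CATEGORY_VALUES = {"letter", "l", "lu", "uppercaseletter"}
SCRIPT_VALUES = {"latin", "latn", "greek", "grek"}


def resolve(expr):
    """Return (property, value) for the text inside a property expression."""
    if "=" in expr:
        prop, value = expr.split("=", 1)
        return loose(prop), loose(value)
    token = loose(expr)
    # The property name may be omitted for General_Category and Script.
    if token in GENERAL_CATEGORY_VALUES:
        return "generalcategory", token
    if token in SCRIPT_VALUES:
        return "script", token
    # No value given: treat as a boolean property with the value "true".
    return token, "true"
```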
   6491 		<p>
   6492 			The table below shows the two kinds of syntax: POSIX and Perl style.
   6493 			Also, the table shows the &quot;Negative&quot; version, which is a
   6494 			property that excludes all code points of a given kind. For example,
   6495 			<b>[:^Letter:]</b> matches all code points that are not <b>[:Letter:]</b>.
   6496 		</p>
   6497 		<table>
   6498 			<tr>
   6499 				<th>&nbsp;</th>
   6500 				<th>Positive</th>
   6501 				<th>Negative</th>
   6502 			</tr>
   6503 			<tr>
   6504 				<td>POSIX-style Syntax</td>
   6505 				<td>[:type=value:]</td>
   6506 				<td>[:^type=value:]</td>
   6507 			</tr>
   6508 			<tr>
   6509 				<td>Perl-style Syntax</td>
   6510 				<td>\p{type=value}</td>
   6511 				<td>\P{type=value}</td>
   6512 			</tr>
   6513 		</table>
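		<p>As an informal sketch, both syntaxes in the table can be recognized
			with one regular expression. The function name
			<code>parse_property</code> and the returned tuple shape are
			assumptions for this example only.</p>

```python
import re

PROPERTY = re.compile(
    r"\[:(?P<posix_neg>\^?)(?P<posix>[^:\]]+):\]"      # [:type=value:] or [:^type=value:]
    r"|\\(?P<perl_kind>[pP])\{(?P<perl>[^}]+)\}"        # \p{type=value} or \P{type=value}
)


def parse_property(text):
    """Return (name, value_or_None, negated) for a single property expression."""
    match = PROPERTY.fullmatch(text)
    if match is None:
        raise ValueError("not a property expression: " + text)
    if match.group("posix") is not None:
        body = match.group("posix")
        negated = match.group("posix_neg") == "^"
    else:
        body = match.group("perl")
        negated = match.group("perl_kind") == "P"
    name, sep, value = body.partition("=")
    return name, (value if sep else None), negated
```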
   6514 		<h5>
   6515 			<a href="#Boolean_Operations" name="Boolean_Operations">5.3.3.3
   6516 				Boolean Operations</a>
   6517 		</h5>
   6518 
   6519 		<p>The low-level lists or properties can then be freely combined
   6520 			with the normal set operations (union, inverse, difference, and
   6521 			intersection):</p>
   6522 		<ul>
   6523 			<li>To union two sets, simply concatenate them. For example, <b>[[:letter:]
   6524 					[:number:]]</b></li>
   6525 			<li>To intersect two sets, use the &#39;&amp;&#39; operator. For
   6526 				example, <b>[[:letter:] &amp; [a-z]] </b>
   6527 			</li>
   6528 			<li>To take the set-difference of two sets, use the &#39;-&#39;
   6529 				operator. For example, <b>[[:letter:] - [a-z]]</b>
   6530 			</li>
   6531 			<li>To invert a set, place a &#39;^&#39; immediately after the
   6532 				opening &#39;[&#39;. For example, <b>[^a-z]</b>. In any other
   6533 				location, the &#39;^&#39; does not have a special meaning. The
   6534 				inversion [^X] is equivalent to [[\x{0}-\x{10FFFF}]-[X]]. Thus
   6535 				multi-code point strings are discarded.
   6536 			</li>
   6537 			<li>Symmetric difference (~) is not supported.</li>
   6538 		</ul>
   6539 		<p>
   6540 			The binary operators &#39;&amp;&#39;, &#39;-&#39;, and the implicit
   6541 			union have equal precedence and bind left-to-right. Thus <b>[[:letter:]-[a-z]-[\u0100-\u01FF]]</b>
   6542 			is equal to <b>[[[:letter:]-[a-z]]-[\u0100-\u01FF]]</b>. Another
   6543 			example is the set <b>[[ace][bdf] - [abc][def]]</b>, which is not the
   6544 			empty set, but instead equal to <b>[[[[ace] [bdf]] - [abc]]
   6545 				[def]]</b>, which equals <b>[[[abcdef] - [abc]] [def]]</b>, which equals
   6546 			<b>[[def] [def]]</b>, which equals <b>[def]</b>.
   6547 		</p>
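		<p>The left-to-right binding can be checked with ordinary Python sets.
			This is only a sketch: the <code>evaluate</code> helper and its
			operator labels are hypothetical, and the sets hold single characters
			only.</p>

```python
from functools import reduce


def evaluate(first, *steps):
    """Fold (operator, operand) steps left to right with equal precedence."""
    ops = {
        "union": lambda a, b: a | b,
        "&": lambda a, b: a & b,
        "-": lambda a, b: a - b,
    }
    return reduce(lambda acc, step: ops[step[0]](acc, step[1]), steps, set(first))


# [[ace][bdf] - [abc][def]] read left to right:
result = evaluate(
    "ace",
    ("union", set("bdf")),
    ("-", set("abc")),
    ("union", set("def")),
)

# Grouping the right-hand side first would instead give the empty set.
grouped = (set("ace") | set("bdf")) - (set("abc") | set("def"))
```

		<p>Here <code>result</code> is the set of d, e, and f, while
			<code>grouped</code> is empty, which is why explicit brackets matter
			when a different grouping is intended.</p>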
   6548 		<p>
   6549 			<strong>One caution:</strong> the &#39;&amp;&#39; and &#39;-&#39;
   6550 			operators operate between sets. That is, they must be immediately
   6551 			preceded and immediately followed by a set. For example, the pattern
   6552 			<b>[[:Lu:]-A]</b> is illegal, since it is interpreted as the set <b>[:Lu:]</b>
   6553 			followed by the incomplete range <b>-A</b>. To specify the set of
   6554 			upper case letters except for &#39;A&#39;, enclose the &#39;A&#39; in
   6555 			brackets: <b>[[:Lu:]-[A]]</b>.
   6556 		</p>
   6557 		<h5>
   6558 			<a href="#UnicodeSet_Examples" name="UnicodeSet_Examples">5.3.3.4
   6559 				UnicodeSet Examples</a>
   6560 		</h5>
   6561 		<p>The following table summarizes the syntax that can be used.</p>
   6562 		<table style="margin-top: 0.5em; margin-bottom: 0.5em" id="table18">
   6563 			<tr>
   6564 				<th>Example</th>
   6565 				<th>Description</th>
   6566 			</tr>
   6567 			<tr>
   6568 				<td nowrap>[a]</td>
   6569 				<td>The set containing &#39;a&#39; alone</td>
   6570 			</tr>
   6571 			<tr>
   6572 				<td nowrap>[a-z]</td>
   6573 				<td>The set containing &#39;a&#39; through &#39;z&#39; and all
   6574 					letters in between, in Unicode order.<br> Thus it is the same
   6575 					as [\u0061-\u007A].
   6576 				</td>
   6577 			</tr>
   6578 			<tr>
   6579 				<td nowrap>[^a-z]</td>
   6580 				<td>The set containing all code points but &#39;a&#39; through
   6581 					&#39;z&#39;.<br> Thus it is the same as [\u0000-\u0060
   6582 					\u007B-\x{10FFFF}].
   6583 				</td>
   6584 			</tr>
   6585 			<tr>
   6586 				<td nowrap>[[pat1][pat2]]</td>
   6587 				<td>The union of sets specified by pat1 and pat2</td>
   6588 			</tr>
   6589 			<tr>
   6590 				<td nowrap>[[pat1]&amp;[pat2]]</td>
   6591 				<td>The intersection of sets specified by pat1 and pat2</td>
   6592 			</tr>
   6593 			<tr>
   6594 				<td nowrap>[[pat1]-[pat2]]</td>
   6595 				<td>The asymmetric difference of sets specified by pat1 and
   6596 					pat2</td>
   6597 			</tr>
   6598 			<tr>
   6599 				<td nowrap>[a {ab} {ac}]</td>
   6600 				<td>The code point &#39;a&#39; and the multi-code point strings
   6601 					&quot;ab&quot; and &quot;ac&quot;</td>
   6602 			</tr>
   6603 			<tr>
   6604 			  <td nowrap>[x\u{61 2019 62}y]</td>
   6605 			  <td>Equivalent to [x\u0061\u2019\u0062y] (= [xa&#x2019;by])</td>
   6606 		  </tr>
   6607 			<tr>
   6608 				<td nowrap>[{ax}-{bz}]</td>
   6609 				<td>The set containing [{ax} {ay} {az} {bx} {by} {bz}], using
   6610 					the range syntax to get all the strings from {ax} to {bz} as
   6611 					described in <em>Section <a href="#String_Range">5.3.4
   6612 							String Range</a></em>.
   6613 				</td>
   6614 			</tr>
   6615 			<tr>
   6616 				<td nowrap>[:Lu:]</td>
   6617 				<td>The set of code points with a given property value, as
   6618 					defined by PropertyValueAliases.txt. In this case, these are the
   6619 					Unicode upper case letters. The long form for this is <b>[:General_Category=Uppercase_Letter:]</b>.
   6620 				</td>
   6621 			</tr>
   6622 			<tr>
   6623 				<td nowrap>[:L:]</td>
   6624 				<td>The set of code points belonging to all Unicode categories
   6625 					starting with &#39;L&#39;, that is, <b>[[:Lu:][:Ll:][:Lt:][:Lm:][:Lo:]]</b>.
   6626 					The long form for this is <b>[:General_Category=Letter:]</b>.
   6627 				</td>
   6628 			</tr>
   6629 		</table>
   6630 		<br>
   6631 		<h4>
   6632 			<a name="String_Range" href="#String_Range">5.3.4 String Range</a>
   6633 		</h4>
   6634 		<p>A String Range is a compact format for specifying a list of
   6635 			strings.</p>
   6636 		<p>
   6637 			<strong>Syntax:<br>
   6638 			</strong>
   6639 		</p>
   6640 		<blockquote>
   6641 			<p>
   6642 				X <em>sep</em> Y<br>
   6643 			</p>
   6644 		</blockquote>
   6645 		<p>The separator and the format of strings X, Y may vary depending
   6646 			on the domain. For example,</p>
   6647 		<ul>
   6648 			<li>for the validity files the separator is ~,</li>
   6649 			<li>for UnicodeSet the separator is
   6650 				-, and any multi-codepoint string is
   6651 				enclosed in {}.
   6652 			</li>
   6653 		</ul>
   6654 		<p>
   6655 			<strong>Validity:<br>
   6656 			</strong>
   6657 		</p>
   6658 		<blockquote>
   6659 			<p>
   6660 				A string range X <em>sep</em> Y is valid iff len(X) &ge; len(Y) &gt; 0,
   6661 				where len(X) is the length of X in code points.
   6662 			</p>
   6663 			<p>
   6664 				<em>There may be additional, domain-specific requirements for
   6665 					validity of the expansion of the string range.</em>
   6666 			</p>
   6667 		</blockquote>
   6668 		<p>
   6669 			<strong>Interpretation:<br>
   6670 			</strong>
   6671 		</p>
   6672 		<ol>
   6673 			<li>Break X into P and S, where len(S) = len(Y)
   6674 				<ul>
   6675 					<li>Note that P will be an empty string if the lengths of X
   6676 						and Y are equal.</li>
   6677 				</ul>
   6678 			</li>
   6679 			<li>Form the combinations of all P+(s..y)+(s..y)+...(s..y)
   6680 				<ul>
   6681 					<li>s is the first code point in S, etc.</li>
   6682 				</ul>
   6683 			</li>
   6684 		</ol>
   6685 		<p>
   6686 			<strong>Examples:</strong>
   6687 		</p>
   6688 		<table>
   6689 			<tbody>
   6690 				<tr>
   6691 					<td>ab-ad</td>
   6692 					<td>&rarr;</td>
   6693 					<td>ab ac ad</td>
   6694 				</tr>
   6695 				<tr>
   6696 					<td>ab-d</td>
   6697 					<td>&rarr;</td>
   6698 					<td>ab ac ad</td>
   6699 				</tr>
   6700 				<tr>
   6701 					<td>ab-cd</td>
   6702 					<td>&rarr;</td>
   6703 					<td>ab ac ad bb bc bd cb cc cd</td>
   6704 				</tr>
   6705 				<tr>
   6706 					<td>👦🏻-👦🏿</td>
   6707 					<td>&rarr;</td>
   6708 					<td>👦🏻 👦🏼 👦🏽 👦🏾 👦🏿</td>
   6709 				</tr>
   6710 				<tr>
   6711 					<td>👦🏻-🏿</td>
   6712 					<td>&rarr;</td>
   6713 					<td>👦🏻 👦🏼 👦🏽 👦🏾 👦🏿</td>
   6714 				</tr>
   6715 			</tbody>
   6716 		</table>
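		<p>The interpretation steps can be sketched in Python, where string
			length is already counted in code points. The function name is
			hypothetical, and the validity check assumes that X must be at least
			as long as Y, as the split of X into P and S requires.</p>

```python
from itertools import product


def expand_string_range(x, y):
    """Expand the string range X sep Y into its list of strings."""
    if not (len(x) >= len(y) > 0):
        raise ValueError("invalid string range: need len(X) >= len(Y) > 0")
    split = len(x) - len(y)
    prefix, suffix = x[:split], x[split:]
    # One inclusive code point range per position of S and Y.
    columns = [range(ord(s), ord(t) + 1) for s, t in zip(suffix, y)]
    return [prefix + "".join(map(chr, combo)) for combo in product(*columns)]
```

		<p>With this sketch, <code>expand_string_range("ab", "cd")</code>
			yields the nine strings listed for ab-cd in the table above. Any
			domain-specific requirements, such as the equal-length restriction
			in UnicodeSet, would be checked separately.</p>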
   6717 		<br>
   6718 		<h3>
   6719 			<a name="Identity_Elements" href="#Identity_Elements">5.4
   6720 				Identity Elements</a>
   6721 		</h3>
   6722 		<p class="dtd">&lt;!ELEMENT identity (alias | (version,
   6723 			generation?, language, script?, territory?, variant?, special*) )
   6724 			&gt;</p>
   6725 		<p>The identity element contains information identifying the
   6726 			target locale for this data, and general information about the
   6727 			version of this data.</p>
   6728 		<p class="element2">
   6729 			&lt;version number=&quot;<u>$</u>Revision: 1.227 <u>$</u>&quot;&gt;
   6730 		</p>
   6731 		<p>The version element provides, in an attribute, the version of
   6732 			this file.&nbsp; The contents of the element can contain textual
   6733 			notes about the changes between this version and the last. For
   6734 			example:</p>
   6735 		<blockquote>
   6736 			<pre>&lt;version number=&quot;<span style="color: blue">1.1</span>&quot;&gt;<span
   6737 					style="color: blue">Various notes and changes in version 1.1</span>&lt;/version&gt;</pre>
   6738 			<p>This is not to be confused with the version attribute on the
   6739 				ldml element, which tracks the dtd version.</p>
   6740 		</blockquote>
   6741 		<p class="element2">
   6742 			&lt;generation date=&quot;<u>$</u>Date: 2007/07/17 23:41:16 <u>$</u>&quot;
   6743 			/&gt;
   6744 		</p>
   6745 		<p>The generation element is now deprecated. It was used to
   6746 			contain the last modified date for the data. This could be in two
   6747 			formats: ISO 8601 format, or CVS format (illustrated by the example
   6748 			above).</p>
   6749 		<p class="element2">
   6750 			&lt;language type=&quot;<span style="color: blue">en</span>&quot;/&gt;
   6751 		</p>
   6752 		<p>The language code is the primary part of the specification of
   6753 			the locale id, with values as described above.</p>
   6754 		<p class="element2">
   6755 			&lt;script type=&quot;<span style="color: blue">Latn</span>&quot;
   6756 			/&gt;
   6757 		</p>
   6758 		<p>The script code may be used in the identification of written
   6759 			languages, with values described above.</p>
   6760 		<p class="element2">
   6761 			&lt;territory type=&quot;<span style="color: blue">US</span>&quot;/&gt;
   6762 		</p>
   6763 		<p>The territory code is a common part of the specification of the
   6764 			locale id, with values as described above.</p>
   6765 		<p class="element2">
   6766 			&lt;variant type=&quot;<span class="attributeValue">NYNORSK</span>&quot;/&gt;
   6767 		</p>
   6768 		<p>The variant code is the tertiary part of the specification of
   6769 			the locale id, with values as described above.</p>
   6770 
   6771 		<p>
   6772 			When combined according to the rules described in <i> <a
   6773 				href="#Unicode_Language_and_Locale_Identifiers">Section 3,
   6774 					Unicode Language and Locale Identifiers</a></i>, the language element,
   6775 			along with any of the optional script, territory, and variant
   6776 			elements, must identify a known, stable locale identifier. Otherwise,
   6777 			it is an error.
   6778 		</p>
   6779 		<h3>
   6780 			<a name="Valid_Attribute_Values" href="#Valid_Attribute_Values">5.5
   6781 				Valid Attribute Values</a>
   6782 		</h3>
   6783 		<p>The valid attribute values, as well as other validity
   6784 			information, are contained in the supplementalMetadata.xml file. (Some,
   6785 			but not all, of this information could have been represented in XML
   6786 			Schema or a DTD.) Most of this is primarily for internal tool use.</p>
   6787 
   6788 		<p>The &lt;elementOrder&gt; and &lt;attributeOrder&gt; elements
   6789 			are now deprecated, since the information regarding element and
   6790 			attribute ordering is now contained in the DTD.</p>
   6791 		<p>
   6792 			<i>The suppress elements are those that are suppressed in
   6793 				canonicalization.</i>
   6794 		</p>
   6795 		<p>
   6796 			<i>The serialElements are those that do not inherit, and may have
   6797 				ordering</i>
   6798 		</p>
   6799 		<blockquote>
   6800 			<pre>&lt;serialElements&gt;attributeValues base comment extend first_non_ignorable first_primary_ignorable
   6801 first_secondary_ignorable first_tertiary_ignorable first_trailing first_variable i ic languagePopulation
   6802 last_non_ignorable last_primary_ignorable last_secondary_ignorable last_tertiary_ignorable last_trailing
   6803 last_variable optimize p pc reset rules s sc settings suppress_contractions t tRule tc variable x
   6804 &lt;/serialElements&gt;</pre>
   6805 		</blockquote>
   6806 		<p>
   6807 			<i>The validity elements give the possible attribute values. They
   6808 				are in the format of a series of variables, followed by
   6809 				attributeValues. </i>
   6810 		</p>
   6811 		<blockquote>
   6812 			<pre>&lt;variable id=&quot;$calendar&quot; type=&quot;choice&quot;&gt;
   6813 buddhist coptic ethiopic ethiopic-amete-alem chinese gregorian hebrew indian islamic islamic-civil
   6814 japanese arabic civil-arabic thai-buddhist persian roc&lt;/variable&gt;</pre>
   6815 		</blockquote>
   6816 		<p>The types indicate the style of match:</p>
   6817 		<ul>
   6818 			<li>choice: for a list of possible values</li>
   6819 			<li>regex: for a regular expression match</li>
   6820 			<li>notDoneYet: for items without matching criteria</li>
   6821 			<li>locale: for locale IDs</li>
   6822 			<li>list: for a space-delimited list of values</li>
   6823 			<li>path: for a valid [<a href="#XPath">XPath</a>]
   6824 			</li>
   6825 		</ul>
   6826 		<p>If the attribute order=&quot;given&quot; is supplied, it
   6827 			indicates the order of elements when canonicalizing (see below).</p>
   6828 		<p>The variable values are intended for internal testing, and the
   6829 			definition and usage may change between releases. They do not
   6830 			necessarily include all valid elements. For example, for primary
   6831 			language codes, they include the subset that occur in CLDR locale
   6832 			data. They are intended for a particular version of CLDR, and may
   6833 			omit codes that were present in earlier versions, such as deprecated
   6834 			codes.</p>
   6835 		<p>The &lt;deprecated&gt; element lists elements, attributes, and
   6836 			attribute values that are deprecated. If any deprecatedItems element
   6837 			contains more than one attribute, then only the listed combinations
   6838 			are deprecated. Thus the following means not that the draft attribute
   6839 			is deprecated, but that the true and false values for that attribute
   6840 			are:</p>
   6841 		<blockquote>
   6842 			<pre>&lt;deprecatedItems attributes=&quot;draft&quot; values=&quot;true false&quot;/&gt; </pre>
   6843 		</blockquote>
   6844 		<p>
   6845 			Similarly, the following means that the <i>type</i> attribute is
   6846 			deprecated, but only for the listed elements:
   6847 		</p>
   6848 		<blockquote>
   6849 			<pre>&lt;deprecatedItems elements=&quot;abbreviationFallback default ... preferenceOrdering&quot; attributes=&quot;type&quot;/&gt; </pre>
   6850 		</blockquote>
   6851 		<p class="dtd">
   6852 			&lt;!ELEMENT blockingItems EMPTY &gt;<br> &lt;!ATTLIST
   6853 			blockingItems elements NMTOKENS #IMPLIED &gt;
   6854 		</p>
   6855 		<p>
   6856 			The blockingItems were used to indicate which elements (and their child elements)
   6857 			do not inherit. For example, because supplementalData is a blocking
   6858 			item, all paths containing the element <span class="element">supplementalData</span>
   6859 			do not inherit. However, <strong>the &lt;blockingItems&gt; element is now deprecated,</strong>
   6860 			having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling.
   6861 		</p>
   6862 		<pre class="dtd">&lt;!ELEMENT distinguishingItems EMPTY &gt;
   6863 &lt;!ATTLIST distinguishingItems exclude ( true | false ) #IMPLIED &gt;
   6864 &lt;!ATTLIST distinguishingItems elements NMTOKENS #IMPLIED &gt;
   6865 &lt;!ATTLIST distinguishingItems attributes NMTOKENS #IMPLIED &gt;</pre>
   6866 		<p>
   6867 			The distinguishing items were used to indicate which combinations of elements and
   6868 			attributes (in unblocked environments) are <i>distinguishing</i> in
   6869 			performing inheritance. For example, the attribute type is
   6870 			distinguishing <i>except</i> in combination with certain elements,
   6871 			such as in the following. However, <strong>the &lt;distinguishingItems&gt; element is now deprecated,</strong>
   6872 			having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling.
   6873 		</p>
   6874 		<pre>&lt;distinguishingItems
   6875   exclude=&quot;true&quot;
   6876   elements=&quot;default measurementSystem mapping abbreviationFallback preferenceOrdering&quot;
   6877   attributes=&quot;type&quot;/&gt;
   6878 </pre>
   6879 		<h3>
   6880 			<a name="Canonical_Form" href="#Canonical_Form">5.6 Canonical
   6881 				Form</a>
   6882 		</h3>
   6883 		<p>The following are restrictions on the format of LDML files to
   6884 			allow for easier parsing and comparison of files.</p>
   6885 		<p>Peer elements have consistent order. That is, if the DTD or
   6886 			this specification requires the following order in an element foo:</p>
   6887 		<pre>&lt;foo&gt;
   6888   &lt;pattern&gt;
   6889   &lt;somethingElse&gt;
   6890 &lt;/foo&gt;</pre>
   6891 		<p>It can never require the reverse order in a different element
   6892 			bar.</p>
   6893 		<pre>&lt;bar&gt;
   6894   &lt;somethingElse&gt;
   6895   &lt;pattern&gt;
   6896 &lt;/bar&gt;</pre>
   6897 		<p>Note that there was one case that had to be corrected in order
   6898 			to make this true. For that reason, pattern occurs twice under
   6899 			currency:</p>
   6900 		<pre class="dtd">&lt;!ELEMENT currency (alias | (pattern*, displayName?, symbol?, pattern*,
   6901 decimal?, group?, special*)) &gt;</pre>
   6902 		<p>
   6903 			<a href="http://www.w3.org/TR/REC-xml/">XML</a> files can have a wide
   6904 			variation in textual form, while representing precisely the same
   6905 			data. Putting the LDML files in the repository into a canonical
   6906 			form allows the simple diff tools in wide use (including those in
   6907 			CVS) to detect differences when vetting changes, without those tools
   6908 			being confused. This is not a requirement on other uses of LDML; it is
   6909 			simply a way to manage repository data more easily.
   6910 		</p>
   6911 		<h4>
   6912 			<a name="Content" href="#Content">5.6.1 Content</a>
   6913 		</h4>
   6914 		<ol>
   6915 			<li>All start elements are on their own line, indented by <i>depth</i>
   6916 				tabs.
   6917 			</li>
   6918 			<li>All end elements (except for leaf nodes) are on their own
   6919 				line, indented by <i>depth</i> tabs.
   6920 			</li>
   6921 			<li>Any leaf node with empty content is in the form
   6922 				&lt;foo/&gt;.</li>
   6923 			<li>There are no blank lines except within comments or content.</li>
   6924 			<li>Spaces are used within a start element. There are no extra
   6925 				spaces within elements.
   6926 				<ul>
   6927 					<li><code>&lt;version number=&quot;1.2&quot;/&gt;</code>, not
   6928 						<code>&lt;version&nbsp; number = &quot;1.2&quot; /&gt;</code></li>
   6929 					<li><code>&lt;/identity&gt;</code>, not <code>&lt;/identity
   6930 							&gt;</code></li>
   6931 				</ul>
   6932 			</li>
   6933 			<li>All attribute values use double quote (&quot;), not single
   6934 				(&#39;).</li>
   6935 			<li>There are no CDATA sections, and no escapes except those
   6936 				absolutely required.
   6937 				<ul>
   6938 					<li>no &amp;apos; since it is not necessary</li>
   6939 					<li>no &#39;&amp;#x61;&#39;, it would be just &#39;a&#39;</li>
   6940 				</ul>
   6941 			</li>
   6942 			<li>All attributes with defaulted values are suppressed.</li>
   6943 			<li>The draft and alt=&quot;proposed.*&quot; attributes are only
   6944 				on leaf elements.</li>
   6945 			<li>The tzid are canonicalized in the following way:
   6946 				<ol>
   6947 					<li type="a">All tzids as of CLDR 1.1 (2004.06.08) in
   6948 						zone.tab are canonical.</li>
   6949 					<li>After that point, the first time a tzid is introduced,
   6950 						that is the canonical form.</li>
   6951 				</ol>
   6952 				<p>
   6953 					That is, new IDs are added, but existing ones keep the original
   6954 					form. The <i>TZ</i> timezone database keeps a set of equivalences
   6955 					in the &quot;backward&quot; file. These are used to map other tzids
   6956 					to the canonical form. For example, when
   6957 					<code>America/Argentina/Catamarca</code>
   6958 					was introduced as the new name for the previous
   6959 					<code>America/Catamarca</code>
   6960 					, a link was added in the backward file.
   6961 				</p>
   6962 				<p>
   6963 					<code>Link America/Argentina/Catamarca America/Catamarca</code>
   6964 				</p>
   6965 			</li>
   6966 		</ol>
   6967 		<p>
   6968 			<i>Example:</i>
   6969 		</p>
   6970 		<pre>&lt;ldml draft=&quot;unconfirmed&quot;&gt;
   6971 	&lt;identity&gt;
   6972 		&lt;version number=&quot;1.2&quot;/&gt;
   6973 		&lt;language type=&quot;en&quot;/&gt;
   6974 		&lt;territory type=&quot;AS&quot;/&gt;
   6975 	&lt;/identity&gt;
   6976 	&lt;numbers&gt;
   6977 		&lt;currencyFormats&gt;
   6978 			&lt;currencyFormatLength&gt;
   6979 				&lt;currencyFormat&gt;
   6980 					&lt;pattern&gt;#,##0.00;(#,##0.00)&lt;/pattern&gt;
   6981 				&lt;/currencyFormat&gt;
   6982 			&lt;/currencyFormatLength&gt;
   6983 		&lt;/currencyFormats&gt;
   6984 	&lt;/numbers&gt;
   6985 &lt;/ldml&gt;</pre>
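		<p>The tzid rule in the last item of the list above can be sketched as
			follows. This is a minimal illustration: the function names are
			hypothetical, the set of canonical IDs is assumed to be supplied by
			the caller, and chains of links are not followed.</p>

```python
def parse_links(text):
    """Yield (target, link_name) pairs from 'Link' lines of a tz 'backward' file."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 3 and fields[0] == "Link":
            yield fields[1], fields[2]


def canonicalizer(links, canonical_ids):
    """Build a function mapping a tzid to the linked ID that is canonical."""
    mapping = {}
    for target, link_name in links:
        if link_name in canonical_ids:
            mapping[target] = link_name
        elif target in canonical_ids:
            mapping[link_name] = target
    return lambda tzid: mapping.get(tzid, tzid)


backward = "# sample data\nLink America/Argentina/Catamarca America/Catamarca\n"
canonical = canonicalizer(parse_links(backward), {"America/Catamarca"})
```

		<p>Because <code>America/Catamarca</code> was introduced first, the
			sketch maps the newer <code>America/Argentina/Catamarca</code> back
			to it and leaves unrelated IDs unchanged.</p>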
   6986 		<h4>
   6987 			<a name="Ordering" href="#Ordering">5.6.2 Ordering</a>
   6988 		</h4>
   6989 		<p>An element is ordered first by the element name, and then if
   6990 			the element names are identical, by the sorted set of attribute-value
   6991 			pairs. For the latter, compare the first pair in each (in sorted
   6992 			order by attribute pair). If not identical, go to the second pair,
   6993 			and so on.</p>
   6994 		<p>Elements and attributes are ordered according to their order in
   6995 			the respective DTDs. Attribute value comparison is a bit more
   6996 			complicated, and may depend on the attribute and type. This is
   6997 			currently done with specific ordering tables.</p>
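		<p>A sort key along these lines might look like the following Python
			sketch. The two order lists are hypothetical stand-ins for the
			orderings taken from the DTD, and attribute values are compared as
			plain strings, which simplifies the attribute-specific ordering
			tables mentioned above.</p>

```python
# Hypothetical DTD-derived orderings, for illustration only.
ELEMENT_ORDER = ["version", "language", "script", "territory", "variant"]
ATTRIBUTE_ORDER = ["type", "alt", "draft"]


def sort_key(element):
    """Key for an element given as (name, {attribute: value})."""
    name, attributes = element
    # Sort the attribute-value pairs by the attribute's DTD position.
    pairs = sorted(attributes.items(), key=lambda kv: ATTRIBUTE_ORDER.index(kv[0]))
    return (
        ELEMENT_ORDER.index(name),
        [(ATTRIBUTE_ORDER.index(attr), value) for attr, value in pairs],
    )


elements = [
    ("territory", {"type": "US"}),
    ("language", {"type": "fr"}),
    ("language", {"type": "en", "draft": "unconfirmed"}),
    ("language", {"type": "en"}),
]
ordered = sorted(elements, key=sort_key)
```

		<p>Element names are compared first; only when they are identical are
			the attribute-value pairs compared, pair by pair.</p>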
   6998 		<p>
   6999 			Any future additions to the DTD must be structured so as to allow
   7000 			compatibility with this ordering. See also <a
   7001 				href="#Valid_Attribute_Values">Section 5.5 Valid Attribute
   7002 				Values.</a>
   7003 		</p>
   7004 
   7005 		<h4>
   7006 			<a name="Comments" href="#Comments">5.6.3 Comments</a>
   7007 		</h4>
   7008 		<ol>
   7009 			<li>Comments are of the form &lt;!-- <i>stuff</i> --&gt;.
   7010 			</li>
   7011 			<li>They are logically attached to a node. There are 4 kinds:
   7012 				<ol>
   7013 					<li>Inline comments always appear after a leaf node, at the end
   7014 						of the same line. These are a single line.</li>
   7015 					<li>Preblock comments always precede the attachment node, and
   7016 						are indented on the same level.</li>
   7017 					<li>Postblock comments always follow the attachment node, and
   7018 						are indented on the same level.</li>
   7019 					<li>Final comment, after &lt;/ldml&gt;</li>
   7020 				</ol>
   7021 			</li>
   7022 			<li>Multiline comments (except the final comment) have each line
   7023 				after the first indented to one deeper level.</li>
   7024 		</ol>
   7025 		<p>
   7026 			<b>Examples:</b>
   7027 		</p>
   7028 		<pre>&lt;eraAbbr&gt;
   7029 	&lt;era type=&quot;0&quot;&gt;BC&lt;/era&gt; &lt;!-- might add alternate BDE in the future --&gt;
   7030 ...
   7031 &lt;timeZoneNames&gt;
   7032 	&lt;!-- Note: zones that do not use daylight time need further work --&gt;
   7033 	&lt;zone type=&quot;America/Los_Angeles&quot;&gt;
   7034 	...
   7035 	&lt;!-- Note: the following is known to be sparse,
   7036 		and needs to be improved in the future --&gt;
   7037 	&lt;zone type=&quot;Asia/Jerusalem&quot;&gt;</pre>
   7038 
   7039     <h3>
   7040 			<a name="DTD_Annotations" href="#DTD_Annotations">5.7 DTD Annotations</a>
   7041 	  </h3>
   7042 				<p>The information in a standard DTD is insufficient for use in CLDR. To make up for that, DTD annotations are added. These are of the form<br>
   7043 				&lt;!--@ ...--&gt;<br>
   7044 				and are included below the !ELEMENT or !ATTLIST line that they apply to. The current annotations are:</p>
   7045 				<table>
   7046                 <tr><th>Type</th><th>Description</th></tr>
   7047                 <tr>
   7048                   <td>&lt;!--@VALUE--&gt;</td>
   7049                   <td>The attribute is not distinguishing, and is treated like an element value</td></tr>
   7050                 <tr>
   7051                   <td>&lt;!--@METADATA--&gt;</td>
   7052                   <td>The attribute is a comment on the data, like the draft status. It is not typically used in implementations.</td>
   7053                 </tr>
   7054                 <tr>
   7055                   <td>&lt;!--@ORDERED--&gt;</td>
   7056                   <td>The element's children are ordered, and do not inherit.</td>
   7057                 </tr>
   7058                 <tr>
   7059                   <td>&lt;!--@DEPRECATED--&gt;</td>
   7060                   <td>The element or attribute is deprecated, and should not be used.</td>
   7061                 </tr>
   7062                 <tr>
   7063                   <td>&lt;!--@DEPRECATED: attribute-value1, attribute-value2--&gt;</td>
   7064                   <td>The attribute values are deprecated, and should not be used. Spaces
   7065                   	between tokens are not significant.</td>
   7066                 </tr>
   7067                 </table>
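				<p>For illustration, annotations of this form could be pulled out of
					DTD text with a short Python sketch like the one below. The function
					name and the sample declaration for an element <code>foo</code> are
					made up for the example and do not come from the CLDR tooling.</p>

```python
import re

# Matches <!--@KIND--> and <!--@KIND: value1, value2-->.
ANNOTATION = re.compile(r"<!--\s*@(?P<kind>[A-Z]+)(?::(?P<args>.*?))?\s*-->")


def annotations(dtd_text):
    """Return a list of (kind, [values]) for each annotation comment found."""
    found = []
    for match in ANNOTATION.finditer(dtd_text):
        args = match.group("args")
        # Spaces between tokens are not significant, so strip each value.
        values = [v.strip() for v in args.split(",")] if args else []
        found.append((match.group("kind"), values))
    return found


sample = (
    "<!ATTLIST foo draft ( true | false | approved ) #IMPLIED >\n"
    "<!--@METADATA-->\n"
    "<!--@DEPRECATED: true, false-->\n"
)
```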
   7068 
   7069 				<p> There is additional information in the attributeValueValidity.xml
   7070 					file that is used internally for testing. For example, the following
   7071 					line indicates that the 'currency' element in the ldml dtd must have
   7072 					values from the bcp47 'cu' type.</p>
   7073 				<p class='example'> &lt;attributeValues dtds='ldml' elements='currency'
   7074 					attributes='type'&gt;$_bcp47_cu&lt;/attributeValues&gt;</p>
   7075 				<p>The element values may be literals, regular expressions, or variables
   7076 					(some of which are set programmatically according to other CLDR data,
   7077 					such as the above). However, the information at this point does not
   7078 					cover all attribute values, is used only for testing, and should not
   7079 					be used in implementations since the structure may change without
   7080 					notice.</p>
   7081 
   7082 		<h2>
   7083 			<a name="Property_Data" href="#Property_Data">6 Property Data</a>
   7084 		</h2>
   7085 		<p>Some data in CLDR does not use an XML format, but rather a
   7086 			semicolon-delimited format derived from that of the Unicode Character
   7087 			Database. That is because the data is more likely to be parsed by
   7088 			implementations that already parse UCD data. Those files are present
   7089 			in the common/properties directory.</p>
   7090 		<p>Each file has a header that explains the format and usage of
   7091 			the data.</p>
   7092 		<h3><a name="Script_Metadata" href="#Script_Metadata">6.1 Script Metadata</a></h3>
   7093 		<p><code>scriptMetadata.txt</code>: </p>
   7094 		<p>This file provides general information about scripts that may be useful to implementations processing text. The information is the best currently available, and may change between versions of CLDR. The format is similar to that of a Unicode Character Database property file, and is documented in the header of the data file.</p>
   7095 		<h3><a name="Extended_Pictographic" href="#Extended_Pictographic">6.2 Extended Pictographic</a>        </h3>
   7096 		<p><code>ExtendedPictographic.txt</code></p>
   7097 	  <p>This file was used to define the ExtendedPictographic data used for future-proofing emoji behavior, especially in segmentation. As of Emoji version 11.0, the set of Extended_Pictographic is incorporated into the emoji data files found at <a href="https://unicode.org/Public/emoji/">unicode.org/Public/emoji/</a>.</p>
   7098 
   7099 
   7100 
   7101 
   7102 
   7103 
   7104 
   7105 
   7106 
   7107 
   7108 
   7109 
   7110 
   7111 
   7112 
   7113 
   7114 		
   7115 	  <h3><a name="Labels.txt" href="#Labels.txt">6.3 Labels.txt</a>        </h3>
   7116 		<p><code>labels.txt</code>: </p>
   7117 		  <p>This file provides general information about associations of labels to characters that may be useful to implementations of character-picking applications. The information is the best currently available, and may change between versions of CLDR. The format is similar to that of a Unicode Character Database property file, and is documented in the header of the data file.</p>
   7118 		  <p>Initially, the contents are focused on emoji, but may be expanded in the future to other types of characters. Note that a character may have multiple labels.</p>
   7119 
   7120         <h2>
   7121 			<a name="Format_Parse_Issues" href="#Format_Parse_Issues">7
   7122 				Issues in Formatting and Parsing</a>
   7123 		</h2>
   7124 		<h3>
   7125 			<a name="Lenient_Parsing" href="#Lenient_Parsing">7.1 Lenient Parsing</a>
   7126 		</h3>
   7127 		<h4>
   7128 			<a name="Motivation" href="#Motivation">7.1.1 Motivation</a>
   7129 		</h4>
   7130 		<p>User input is frequently messy. Attempting to parse it by
   7131 			matching it exactly against a pattern is likely to be unsuccessful,
   7132 			even when the meaning of the input is clear to a human being. For
   7133 			example, for a date pattern of &quot;MM/dd/yy&quot;, the input
   7134 			&quot;June 1, 2006&quot; will fail.</p>
   7135 		<p>The goal of lenient parsing is to accept user input whenever it
   7136 			is possible to decipher what the user intended. Doing so requires
   7137 			using patterns as data to guide the parsing process, rather than an
   7138 			exact template that must be matched. This informative section
   7139 			suggests some heuristics that may be useful for lenient parsing of
   7140 			dates, times, and numbers.</p>
   7141 		<h4>
   7142 			<a name="Loose_Matching" href="#Loose_Matching">7.1.2 Loose Matching</a>
   7143 		</h4>
   7144 		<p>Loose matching ignores attributes of the strings being compared
   7145 			that are not important to matching. It involves the following steps:</p>
   7146 		<ul>
   7147 			<li>Remove &quot;.&quot; from currency symbols and other fields
   7148 				used for matching, and also from the input string unless:
   7149 				<ul>
   7150 					<li>&quot;.&quot; is in the decimal set, and</li>
   7151 					<li>its position in the input string is immediately before a
   7152 						decimal digit</li>
   7153 				</ul>
   7154 			</li>
   7155 			<li>Ignore all format characters: in particular, ignore any
   7156 				RLM, LRM or ALM used to control BIDI formatting.</li>
   7157 			<li>Ignore all characters in [:Zs:] unless they occur between
   7158 				letters. (In the heuristics below, even those between letters are
   7159 				ignored except to delimit fields)</li>
   7160 			<li>Map all characters in [:Dash:] to U+002D HYPHEN-MINUS</li>
   7161 			<li>Use the data in the &lt;character-fallback&gt; element to
   7162 				map equivalent characters (for example, curly to straight
   7163 				apostrophes). Other apostrophe-like characters should also be
   7164 				treated as equivalent, especially if the character actually used in
   7165 				a format may be unavailable on some keyboards. For example:
   7166 				<ul>
   7167 					<li>U+02BB MODIFIER LETTER TURNED COMMA (&#x02BB;) might be typed
   7168 						instead as U+2018 LEFT SINGLE QUOTATION MARK (&#x2018;).</li>
   7169 					<li>U+02BC MODIFIER LETTER APOSTROPHE (&#x02BC;) might be typed
   7170 						instead as U+2019 RIGHT SINGLE QUOTATION MARK (&#x2019;), U+0027
   7171 						APOSTROPHE, etc.</li>
   7172 					<li>U+05F3 HEBREW PUNCTUATION GERESH (&#x05F3;) might be typed
   7173 						instead as U+0027 APOSTROPHE.</li>
   7174 				</ul>
   7175 			</li>
   7176 			<li>Apply mappings particular to the domain (i.e., for dates or
   7177 				for numbers, discussed in more detail below)</li>
   7178 			<li>Apply case folding (possibly including language-specific
   7179 				mappings such as Turkish i)</li>
   7180 			<li>Normalize to NFKC; thus <i>no-break space</i> will map to <i>
   7181 					space</i>; half-width <i>katakana</i> will map to full-width.
   7182 			</li>
   7183 		</ul>
   7184 		<p>Loose matching involves (logically) applying the above
   7185 			transform to both the input text and to each of the field elements
   7186 			used in matching, before applying the specific heuristics below. For
   7187 			example, if the input number text is &quot; - NA f. 1,000.00&quot;,
   7188 			then it is mapped to &quot;-naf1,000.00&quot; before processing. The
   7189 			currency signs are also transformed, so &quot;NA f.&quot; is
   7190 			converted to &quot;naf&quot; for purposes of matching. As with other
   7191 			Unicode algorithms, this is a logical statement of the process;
   7192 			actual implementations can optimize, such as by applying the
   7193 			transform incrementally during matching.</p>
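		<p>The transform described above can be sketched as follows. This is a minimal, non-normative illustration: the function name <code>loose_key</code>, its parameters, and the small apostrophe table are assumptions made for the example, [:Dash:] is approximated by general category Pd plus U+2212, and the &lt;character-fallback&gt; data and domain-specific mappings are omitted.</p>

```python
import unicodedata

# Small illustrative subset of apostrophe-like characters, folded to U+0027.
APOSTROPHE_LIKE = {"\u2018", "\u2019", "\u02BB", "\u02BC", "\u05F3"}


def is_dash(ch):
    # Approximation of [:Dash:]: general category Pd plus U+2212 MINUS SIGN.
    return unicodedata.category(ch) == "Pd" or ch == "\u2212"


def loose_key(text, decimal_set=".", keep_spaces_between_letters=False):
    """Return a loosely normalized form of text for matching purposes."""
    out = []
    for i, ch in enumerate(text):
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        category = unicodedata.category(ch)
        if ch == ".":
            # Keep "." only when it is a decimal separator right before a digit.
            if not ("." in decimal_set and next_ch.isdecimal()):
                continue
        elif category == "Cf":
            continue  # format characters such as RLM, LRM, ALM
        elif category == "Zs":
            between_letters = prev_ch.isalpha() and next_ch.isalpha()
            if not (keep_spaces_between_letters and between_letters):
                continue
        elif is_dash(ch):
            ch = "-"
        elif ch in APOSTROPHE_LIKE:
            ch = "'"
        out.append(ch)
    # Case folding first, then NFKC, in the order the list gives them.
    return unicodedata.normalize("NFKC", "".join(out).casefold())
```

		<p>With the defaults, which drop even the spaces between letters as the heuristics do, <code>loose_key(" - NA f. 1,000.00")</code> yields <code>-naf1,000.00</code> and <code>loose_key("NA f.")</code> yields <code>naf</code>, matching the example in the text.</p>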
   7194 		<h3>
   7195 			<a name="Invalid_Patterns" href="#Invalid_Patterns">7.2 Handling
   7196 				Invalid Patterns</a>
   7197 		</h3>
   7198 		<p>Processes sometimes encounter invalid number or
   7199 			date patterns, such as a number pattern with &quot;&#xA4;&#xA4;&#xA4;&#xA4;&#xA4;&#xA4;&quot; (valid pattern
   7200 			character but invalid length in current CLDR), a date pattern with
   7201 			&quot;nn&quot; (invalid pattern character in current CLDR), or a date pattern
   7202 			with &quot;MMMMMM&quot; (invalid length in current CLDR). The recommended
   7203 			behavior for handling such an invalid pattern field is:</p>
   7204 		<ul>
   7205 			<li>For a field using a currently-invalid length for a valid
   7206 				pattern character:
   7207 				<ul>
   7208 					<li>In <strong>formatting, </strong>emit U+FFFD REPLACEMENT
   7209 						CHARACTER for the invalid field.
   7210 					</li>
   7211 					<li>In <strong>parsing, </strong>the field may be parsed as if
   7212 						it had a valid length.
   7213 					</li>
   7214 				</ul>
   7215 			</li>
   7216 			<li>For a pattern that contains a currently-invalid pattern
   7217 				character (applies only to date patterns, for which A-Za-z are
   7218 				reserved as pattern characters but not all defined as valid):
   7219 				<ul>
   7220 					<li>Produce an error (set an error code or throw an exception)
   7221 						when an attempt is made to create a formatter with such a pattern
   7222 						or to apply such a pattern to an existing formatter.</li>
   7223 				</ul>
   7224 			</li>
   7225 		</ul>
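		<p>The recommended behavior for date patterns might be sketched as follows. The table <code>MAX_LENGTH</code> is a deliberately tiny, illustrative subset rather than the actual CLDR field-length data, the function names are hypothetical, and quoted literal text in patterns is not handled.</p>

```python
import re

# Illustrative subset only: maximum field length per pattern character
# (None means no upper limit). Real implementations need the full table.
MAX_LENGTH = {"y": None, "M": 5, "d": 2, "E": 6}

FIELD = re.compile(r"([A-Za-z])\1*")  # a run of one repeated ASCII letter


def check_pattern(pattern):
    """Raise an error if the pattern uses an undefined pattern character."""
    for run in FIELD.finditer(pattern):
        if run.group(1) not in MAX_LENGTH:
            raise ValueError("invalid pattern character: " + run.group(1))


def format_fields(pattern, render):
    """Format each field with render(letter, length); invalid lengths give U+FFFD."""
    check_pattern(pattern)

    def replace(run):
        letter, length = run.group(1), len(run.group(0))
        limit = MAX_LENGTH[letter]
        if limit is not None and length > limit:
            return "\uFFFD"  # valid character, currently-invalid length
        return render(letter, length)

    return FIELD.sub(replace, pattern)
```

		<p>Here a pattern containing "nn" is rejected when the formatter is created, while "MMMMMM" is accepted but formats as U+FFFD REPLACEMENT CHARACTER.</p>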
   7226 		<h2>
   7227 			<a name="Deprecated_Structure" href="#Deprecated_Structure">Annex A
   7228 				Deprecated Structure</a>
   7229 		</h2>
   7230 		<p>The deprecated elements, attributes, and values are listed in
   7231 			the supplementalMetadata.xml file, under &lt;deprecatedItems&gt;.
   7232 			While such structure remains valid LDML, its use is strongly
   7233 			discouraged, and it is no longer used in CLDR.</p>
   7234 		<p>The remainder of this section describes selected cases of
   7235 			deprecated structure that were present in previous versions of CLDR.
   7236 		</p>
   7237 		<h3>
   7238 			<a name="Fallback_Elements" href="#Fallback_Elements">A.1 Element
   7239 				fallback</a>
   7240 		</h3>
   7241 		<p class="dtd">&lt;!ELEMENT fallback (#PCDATA) &gt;</p>
   7242 		<p>
   7243 			The fallback element is deprecated. Implementations should use
   7244 			instead the information in <em><a href="#LanguageMatching">Section
   7245 					4.4 Language Matching</a></em> for doing language fallback.
   7246 		</p>
   7247 		<h3>
   7248 			<a name="BCP47_Keyword_Mapping" href="#BCP47_Keyword_Mapping">A.2
   7249 				BCP 47 Keyword Mapping</a>
   7250 		</h3>
   7251 
   7252 		<p>
   7253 			<b>Note:</b> <i>This structure is deprecated and replaced with <a
   7254 				href="#Unicode_Locale_Extension_Data_Files">Section 3.6.4 U
   7255 					Extension Data Files</a>.
   7256 			</i>
   7257 		</p>
   7258 
   7259 		<p class="dtd">
   7260 			&lt;!ELEMENT bcp47KeywordMappings ( mapKeys?, mapTypes* ) &gt;<br>
   7261 			&lt;!ELEMENT mapKeys ( keyMap* ) &gt;<br> &lt;!ELEMENT keyMap
   7262 			EMPTY &gt;<br> &lt;!ATTLIST keyMap type NMTOKEN #REQUIRED &gt;<br>
   7263 			&lt;!ATTLIST keyMap bcp47 NMTOKEN #REQUIRED &gt;<br>
   7264 			&lt;!ELEMENT mapTypes ( typeMap* ) &gt;<br> &lt;!ATTLIST
   7265 			mapTypes type NMTOKEN #REQUIRED &gt;<br> &lt;!ELEMENT typeMap
   7266 			EMPTY &gt;<br> &lt;!ATTLIST typeMap type CDATA #REQUIRED &gt;<br>
   7267 			&lt;!ATTLIST typeMap bcp47 NMTOKEN #REQUIRED &gt;<br>
   7268 		</p>
   7269 		<p>
   7270 			This section defines mappings between old Unicode locale identifier
   7271 			key/type values and their BCP 47 'u' extension subtag
   7272 			representations. The 'u' extension syntax described in <a
   7273 				href="#u_Extension">Section 3.6 Unicode BCP 47 U Extension</a>
   7274 			restricts a key to two ASCII alphanumerics and a type to three to
   7275 			eight ASCII alphanumerics. A key or a type which does not meet that
   7276 			syntax requirement is converted according to the mapping data defined
   7277 			by the mapKeys or mapTypes elements. For example, a keyword
   7278 			"collation=phonebook" is converted to BCP 47 'u' extension subtags
   7279 			"co-phonebk" by the mapping data below:
   7280 		</p>
   7281 		<pre>    &lt;mapKeys&gt;
   7282         ...
   7283         &lt;keyMap type="collation" bcp47="co"/&gt;
   7284         ...
   7285     &lt;/mapKeys&gt;
   7286     &lt;mapTypes type="collation"&gt;
   7287         ...
   7288         &lt;typeMap type="phonebook" bcp47="phonebk"/&gt;
   7289         ...
   7290     &lt;/mapTypes&gt;
   7291 	</pre>
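		<p>As a rough illustration of how such mapping data could be applied, the sketch below parses the example data and converts an old-style keyword. The wrapper element is taken from the DTD above; the function names and the pass-through behavior for unmapped keys and types are assumptions made for this example.</p>

```python
import xml.etree.ElementTree as ET

# The example mapping data from the text, with the elided entries left out.
MAPPING_XML = """
<bcp47KeywordMappings>
  <mapKeys>
    <keyMap type="collation" bcp47="co"/>
  </mapKeys>
  <mapTypes type="collation">
    <typeMap type="phonebook" bcp47="phonebk"/>
  </mapTypes>
</bcp47KeywordMappings>
"""


def load_mappings(xml_text):
    """Return (key map, per-key type maps) read from the mapping XML."""
    root = ET.fromstring(xml_text)
    keys = {k.get("type"): k.get("bcp47") for k in root.iter("keyMap")}
    types = {}
    for group in root.iter("mapTypes"):
        table = types.setdefault(group.get("type"), {})
        for entry in group.iter("typeMap"):
            table[entry.get("type")] = entry.get("bcp47")
    return keys, types


def to_bcp47(keyword, keys, types):
    """Convert 'key=type' to 'key-type' subtags, using mappings where present."""
    key, _, value = keyword.partition("=")
    bcp_key = keys.get(key, key)  # assumption: unmapped keys pass through
    bcp_type = types.get(key, {}).get(value, value)
    return bcp_key + "-" + bcp_type
```

		<p>With this data, <code>to_bcp47("collation=phonebook", *load_mappings(MAPPING_XML))</code> produces <code>co-phonebk</code>.</p>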
   7292 		<h3>
   7293 			<a name="Choice_Patterns" href="#Choice_Patterns">A.3 Choice
   7294 				Patterns</a>
   7295 		</h3>
   7296 		<p>
   7297 			<b>Note:</b> <i>This structure is deprecated and replaced with
   7298 				count attributes.</i>
   7299 		</p>
   7300 		<p>A choice pattern is a string that chooses among a number of
   7301 			strings, based on numeric value. It has the following form:</p>
   7302 		<p>
   7303 			&lt;choice_pattern&gt; = &lt;choice&gt; ( &#39;|&#39; &lt;choice&gt;
   7304 			)*<br> &lt;choice&gt; =
   7305 			&lt;number&gt;&lt;relation&gt;&lt;string&gt;<br> &lt;number&gt;
   7306 			= (&#39;+&#39; | &#39;-&#39;)? (<font size="3">&#39;&#x221E;&#39; |
   7307 				[0-9]+ (&#39;.&#39; [0-9]+)?)<br> &lt;relation&gt; =
   7308 				&#39;&lt;&#39; | &#39;&#x2264;&#39;
   7309 			</font>
   7310 		</p>
   7311 		<p>The interpretation of a choice pattern is that given a number
   7312 			N, the pattern is scanned from right to left, for each choice
   7313 			evaluating &lt;number&gt; &lt;relation&gt; N. The first choice that
   7314 			matches results in the corresponding string. If no match is found,
   7315 			then the first string is used. For example:</p>
   7316 		<table border="1" cellpadding="0" cellspacing="0">
   7317 			<tr>
   7318 				<td width="33%">Pattern</td>
   7319 				<td width="33%">N</td>
   7320 				<td width="34%">Result</td>
   7321 			</tr>
   7322 			<tr>
   7323 				<td width="33%" rowspan="4">0&#x2264;Rf|1&#x2264;Ru|1&lt;Re</td>
   7324 				<td width="33%">-&#x221E;, -3, -1, -0.000001
   7325 				</td>
   7326 				<td width="34%">Rf (defaulted to first string)</td>
   7327 			</tr>
   7328 			<tr>
   7329 				<td width="33%">0, 0.01, 0.9999</td>
   7330 				<td width="34%">Rf</td>
   7331 			</tr>
   7332 			<tr>
   7333 				<td width="33%">1</td>
   7334 				<td width="34%">Ru</td>
   7335 			</tr>
   7336 			<tr>
   7337 				<td width="33%">1.00001, 5, 99, &#x221E;</td>
   7338 				<td width="34%">Re</td>
   7339 			</tr>
   7340 		</table>
   7341 		<p>Quoting is done using &#39; characters, as in date or number
   7342 			formats.</p>
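		<p>The interpretation above might be sketched as follows, assuming the two relations are '&lt;' and U+2264 LESS-THAN OR EQUAL TO and that U+221E INFINITY denotes infinity. The function names are hypothetical, and quoting with ' characters is not handled in this simplified version.</p>

```python
import math
import re

# sign, number (U+221E or decimal digits), relation ('<' or U+2264), string
CHOICE = re.compile(r"([+-]?)(\u221E|[0-9]+(?:\.[0-9]+)?)([<\u2264])(.*)", re.S)


def parse_choices(pattern):
    """Split a choice pattern into (number, relation, string) triples."""
    choices = []
    for part in pattern.split("|"):
        match = CHOICE.fullmatch(part)
        if match is None:
            raise ValueError("malformed choice: " + part)
        sign, digits, relation, text = match.groups()
        value = math.inf if digits == "\u221E" else float(digits)
        if sign == "-":
            value = -value
        choices.append((value, relation, text))
    return choices


def select(pattern, n):
    """Scan right to left; the first choice whose relation holds for n wins."""
    choices = parse_choices(pattern)
    for value, relation, text in reversed(choices):
        matched = value < n if relation == "<" else value <= n
        if matched:
            return text
    return choices[0][2]  # no match: default to the first string
```

		<p>For the pattern in the table, <code>select</code> returns Rf for -3 and for 0.5, Ru for 1, and Re for 5.</p>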
   7343 		<h3>
   7344 			<a name="Element_default" href="#Element_default">A.4 Element
   7345 				default</a>
   7346 		</h3>
   7347 		<p>
   7348 			<b>Note:</b> <i>This structure is deprecated. </i> Use replacement
   7349 			structure instead, for example:
   7350 		</p>
   7351 		<ul>
   7352 			<li>For &lt;collations&gt;, now use the &lt;defaultCollation&gt;
   7353 				element.</li>
   7354 			<li>For &lt;calendars&gt;, the default calendar type for a
   7355 				locale is now specified by <i><a
   7356 					href="tr35-dates.html#Calendar_Preference_Data">Calendar
   7357 						Preference Data</a></i>.
   7358 			</li>
   7359 		</ul>
   7360 		<p>In some cases, a number of elements are present. The default
   7361 			element can be used to indicate which of them is the default, in the
   7362 			absence of other information. The value of the choice attribute must
   7363 			match the value of the type attribute for the selected item.</p>
   7364 		<pre>&lt;timeFormats&gt;
   7365   &lt;default choice=&quot;<span style="color: red">medium</span>&quot; /&gt;
   7366   &lt;timeFormatLength type=&quot;<span style="color: blue">full</span>&quot;&gt;
   7367     &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
   7368       &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
   7369 				style="color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
   7370     &lt;/timeFormat&gt;
   7371   &lt;/timeFormatLength&gt;
   7372   &lt;timeFormatLength type=&quot;<span style="color: blue">long</span>&quot;&gt;
   7373     &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
   7374       &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
   7375 				style="color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
   7376     &lt;/timeFormat&gt;
   7377   &lt;/timeFormatLength&gt;
   7378   &lt;timeFormatLength type=&quot;<span style="color: red">medium</span>&quot;&gt;
   7379     &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
   7380       &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
   7381 				style="color: blue">h:mm:ss a</span>&lt;/pattern&gt;
   7382     &lt;/timeFormat&gt;
   7383   &lt;/timeFormatLength&gt;
   7384 ...</pre>
   7385 		<p>Like all other elements, the &lt;default&gt; element is
   7386 			inherited. Thus, it can also refer to inherited resources. For
   7387 			example, suppose that the above resources are present in fr, and that
   7388 			in fr_BE we have the following:</p>
   7389 		<pre>&lt;timeFormats&gt;
   7390   &lt;default choice=&quot;<span style="color: red">long</span>&quot;/&gt;
   7391 &lt;/timeFormats&gt;</pre>
   7392 		<p>In that case, the default time format for fr_BE would be the
   7393 			inherited &quot;long&quot; resource from fr. Now suppose that we had
   7394 			in fr_CA:</p>
   7395 		<pre>  &lt;timeFormatLength type=&quot;<span style="color: red">medium</span>&quot;&gt;
   7396     &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
   7397       &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
   7398 				style="color: blue">...</span>&lt;/pattern&gt;
   7399     &lt;/timeFormat&gt;
   7400   &lt;/timeFormatLength&gt;
   7401     </pre>
   7402 		<p>In this case, the &lt;default&gt; is inherited from fr, and has
   7403 			the value &quot;medium&quot;. It thus refers to this new
   7404 			&quot;medium&quot; pattern in this resource bundle.</p>
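		<p>The inheritance behavior in these examples can be modeled with a small lookup sketch. The dictionary layout and function names are hypothetical, and the fr_CA pattern is kept as the elided "..." from the example; only the resolution order (look in the locale, then its parent) is the point.</p>

```python
# Hypothetical in-memory model of the three example locales.
LOCALES = {
    "root": {"parent": None, "timeFormats": {}},
    "fr": {
        "parent": "root",
        "default": "medium",
        "timeFormats": {
            "full": "h:mm:ss a z",
            "long": "h:mm:ss a z",
            "medium": "h:mm:ss a",
        },
    },
    "fr_BE": {"parent": "fr", "default": "long", "timeFormats": {}},
    "fr_CA": {"parent": "fr", "timeFormats": {"medium": "..."}},
}


def chain(locale):
    """Yield the data for a locale and then for each of its ancestors."""
    while locale is not None:
        yield LOCALES[locale]
        locale = LOCALES[locale]["parent"]


def inherited(locale, getter):
    """Return the first non-missing value found along the parent chain."""
    for data in chain(locale):
        value = getter(data)
        if value is not None:
            return value
    return None


def default_time_format(locale):
    """Resolve the default choice, then the pattern it refers to."""
    choice = inherited(locale, lambda d: d.get("default"))
    pattern = inherited(locale, lambda d: d["timeFormats"].get(choice))
    return choice, pattern
```

		<p>In this model, fr_BE resolves to the "long" pattern inherited from fr, while fr_CA inherits the "medium" default from fr but picks up its own "medium" pattern.</p>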
   7405 		<h3>
   7406 			<a name="Deprecated_Common_Attributes"
   7407 				href="#Deprecated_Common_Attributes">A.5 Deprecated Common
   7408 				Attributes</a>
   7409 		</h3>
   7410 		<h4>
   7411 			<a name="Attribute_standard" href="#Attribute_standard">A.5.1 Attribute standard</a>
   7412 		</h4>
   7413 		<p class="element2">
   7414 			<b>Note: </b>This attribute is deprecated. Instead, use a reference
   7415 			element with the attribute standard=&quot;true&quot;.
   7416 		</p>
   7417 		<p>The value of this attribute is a list of strings representing
   7418 			standards: international, national, organization, or vendor
   7419 			standards. The presence of this attribute indicates that the data in
   7420 			this element is compliant with the indicated standards. Where
   7421 			possible, for uniqueness, the string should be a URL that represents
   7422 			that standard. The strings are separated by commas; leading or
   7423 			trailing spaces on each string are not significant. Examples:</p>
   7424 		<p>
   7425 			<code>
   7426 				&lt;collation standard=&quot;<span style="color: blue">MSA
   7427 					200:2002</span>&quot;&gt;<br> ...<br> &lt;dateFormatStyle
   7428 				standard=&quot;<span style="color: blue">http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26780&amp;ICS1=1&amp;ICS2=140&amp;ICS3=30</span>&quot;&gt;
   7429 			</code>
   7430 		</p>
   7431 
   7432 		<h4>
   7433 			<a name="Attribute_draft_nonLeaf" href="#Attribute_draft_nonLeaf">A.5.2
   7434 				Attribute draft in non-leaf elements</a>
   7435 		</h4>
   7436 		<p>The draft attribute is deprecated except in
   7437 			leaf elements (elements that do not have any subelements).</p>
   7438 
   7439 		<h3>
   7440 			<a name="Element_base" href="#Element_base">A.6 Element base</a>
   7441 		</h3>
   7442 		<p>
   7443 			<b>Note:</b> <i>This element is deprecated.</i> Use the collation
   7444 			&lt;import&gt; element instead.
   7445 		</p>
   7446 		<p>
   7447 			The optional base element
   7448 			<code>
   7449 				&lt;base&gt;<span style="color: blue">...</span>&lt;/base&gt;
   7450 			</code>
   7451 			contains an alias element that points to another data source that
   7452 			defines a <i>base </i>collation. If present, it indicates that the
   7453 			settings and rules in the collation are modifications applied on <i>top
   7454 				of the</i> respective elements in the base collation. That is, any
   7455 			successive settings, where present, override what is in the base as
   7456 			described in <a href="tr35-collation.html#Setting_Options">Setting
   7457 				Options</a>. Any successive rules are concatenated to the end of the
   7458 			rules in the base. The result of multiple rules applying to the same
   7459 			characters is covered in <a href="tr35-collation.html#Orderings">Orderings</a>.
   7460 		</p>
   7461 
   7462 		<h3>
   7463 			<a name="Element_rules" href="#Element_rules">A.7 Element rules</a>
   7464 		</h3>
   7465 		<p>
   7466 			<b>Note:</b> <i>The XML collation syntax is deprecated; this
   7467 				includes the &lt;rules&gt; element and its subelements, except that
   7468 				the &lt;import&gt; element has been moved up to be a subelement of
   7469 				&lt;collation&gt;.</i> Use the basic collation syntax with the <a
   7470 				href="tr35-collation.html#Rules">&lt;cr&gt; element</a> instead.
   7471 		</p>
   7472 		<p class="dtd">&lt;!ELEMENT rules (alias | ( ( reset | import ), (
   7473 			reset | import | p | pc | s | sc | t | tc | i | ic | x)* )) &gt;</p>
   7474 
   7475 		<h3>
   7476 			<a name="Deprecated_subelements_of_dates"
   7477 				href="#Deprecated_subelements_of_dates">A.8 Deprecated
   7478 				subelements of &lt;dates&gt;</a>
   7479 		</h3>
   7480 		<ul>
   7481 			<li>&lt;localizedPatternChars&gt;</li>
   7482 			<li>&lt;dateRangePattern&gt;, replaced by
   7483 				&lt;intervalFormats&gt;.</li>
   7484 		</ul>
   7485 
   7486 		<h3>
   7487 			<a name="Deprecated_subelements_of_calendars"
   7488 				href="#Deprecated_subelements_of_calendars">A.9 Deprecated
   7489 				subelements of &lt;calendars&gt;</a>
   7490 		</h3>
   7491 		<ul>
   7492 			<li>&lt;monthNames&gt; and &lt;monthAbbr&gt;; month name forms
   7493 				are specified in the &lt;months&gt; element. The older monthNames,
   7494 				monthAbbr are equivalent to: using the months element with the
   7495 				context type=&quot;<span style="color: blue">format</span>&quot; and
   7496 				the width type=&quot;<span style="color: blue">wide</span>&quot;
   7497 				(for ...Names) and type=&quot;<span style="color: blue">narrow</span>&quot;
   7498 				(for ...Abbr), respectively.
   7499 			</li>
   7500 			<li>&lt;dayNames&gt; and &lt;dayAbbr&gt;; weekday name forms are
   7501 				specified in the &lt;days&gt; element. The older dayNames, dayAbbr
   7502 				are equivalent to: using the days element with the context
   7503 				type=&quot;<span style="color: blue">format</span>&quot; and the
   7504 				width type=&quot;<span style="color: blue">wide</span>&quot; (for
   7505 				...Names) and type=&quot;<span style="color: blue">narrow</span>&quot;
   7506 				(for ...Abbr), respectively.
   7507 			</li>
   7508 			<li><a name="week" href="#week">&lt;week&gt;</a> is deprecated
   7509 				in the main LDML files, because the data is more appropriately
   7510 				organized as connected to territories, not to linguistic data. Use
   7511 				the supplemental &lt;weekData&gt; element instead.</li>
   7512 			<li>&lt;am&gt; and &lt;pm&gt;; these are now included as part of
   7513 				the &lt;dayPeriods&gt; element</li>
   7514 			<li>&lt;fields&gt; is deprecated as a subelement of
   7515 				&lt;calendars&gt;; instead, a &lt;fields&gt; element should be
   7516 				located just under a &lt;dates&gt; element. See <a
   7517 				href="tr35-dates.html#Calendar_Fields">Calendar Fields</a>.
   7518 			</li>
   7519 		</ul>
   7520 
   7521 		<h3>
   7522 			<a name="Deprecated_subelements_of_timeZoneNames"
   7523 				href="#Deprecated_subelements_of_timeZoneNames">A.10 Deprecated
   7524 				subelements of &lt;timeZoneNames&gt;</a>
   7525 		</h3>
   7526 		<ul>
   7527 			<li>&lt;hoursFormat&gt; e.g. &quot;{0}/{1}&quot; for
   7528 				&quot;-0800/-0700&quot;</li>
   7529 			<li><a name="fallbackRegionFormat" href="#fallbackRegionFormat">&lt;fallbackRegionFormat&gt;</a>
   7530 				(deprecated), e.g. &quot;{0} Time ({1})&quot; for &quot;United
   7531 				States Time (New York)&quot;</li>
   7532 			<li>&lt;abbreviationFallback&gt;</li>
   7533 			<li>&lt;preferenceOrdering&gt;, a preference ordering among
   7534 				modern zones; use metazones instead.</li>
   7535 			<li>&lt;singleCountries&gt;, use <a
   7536 				href="tr35-dates.html#Primary_Zones">Primary Zones</a></li>
   7537 		</ul>
   7538 
   7539 		<h3>
   7540 			<a name="Deprecated_subelements_of_zone_metazone"
   7541 				href="#Deprecated_subelements_of_zone_metazone">A.11 Deprecated
   7542 				subelements of &lt;zone&gt; and &lt;metazone&gt;</a>
   7543 		</h3>
   7544 		<ul>
   7545 			<li>&lt;commonlyUsed&gt;, formerly used to indicate whether a
   7546 				zone was commonly used in the locale.</li>
   7547 		</ul>
   7548 
   7549 		<h3>
   7550 			<a name="Renamed_attribute_values_for_contextTransformUsage"
   7551 				href="#Renamed_attribute_values_for_contextTransformUsage">A.12
   7552 				Renamed attribute values for &lt;contextTransformUsage&gt; element</a>
   7553 		</h3>
   7554 		<p>
   7555 			The &lt;contextTransformUsage&gt; element was introduced in CLDR 21.
   7556 			The values for its <em>type</em> attribute are documented in <a
   7557 				href="tr35-general.html#contextTransformUsage_type_attribute_values">
   7558 				&lt;contextTransformUsage&gt; type attribute values</a>. In CLDR 25,
   7559 			some of these values were renamed from their previous values for
   7560 			improved clarity:
   7561 		</p>
   7562 		<ul>
   7563 			<li>"type" was renamed to "keyValue"</li>
   7564 			<li>"displayName" was renamed to "currencyName"</li>
   7565 			<li>"displayName-count" was renamed to "currencyName-count"</li>
   7566 			<li>"tense" was renamed to "relative"</li>
   7567 		</ul>
   7568 
   7569 		<h3>
   7570 			<a name="Deprecated_subelements_of_segmentations"
   7571 				href="#Deprecated_subelements_of_segmentations">A.13 Deprecated
   7572 				subelements of &lt;segmentations&gt;</a>
   7573 		</h3>
   7574 		<ul>
   7575 			<li>&lt;exceptions&gt; and &lt;exception&gt; were deprecated
   7576 				and replaced with &lt;suppressions&gt; and &lt;suppression&gt;.</li>
   7577 		</ul>
   7578 		<h3>
   7579 			<a name="Element_cp" href="#Element_cp">A.14 Element cp</a>
   7580 		</h3>
   7581 		<p>The cp element was used to escape characters that cannot be
   7582 			represented in XML, even with NCRs. These escapes were only allowed
   7583 			in certain elements, according to the DTD.</p>
   7584 		<p>However, this mechanism is very clumsy, and was replaced by
   7585 			specialized syntax.</p>
   7586 		<table>
   7587 			<tr>
   7588 				<th>Code Point</th>
   7589 				<th>XML Example</th>
   7590 			</tr>
   7591 			<tr>
   7592 				<td><code>U+0000</code></td>
   7593 				<td><code>&lt;cp hex=&quot;0&quot;&gt;</code></td>
   7594 			</tr>
   7595 		</table>
   7596 		<p>&nbsp;</p>
   7597 		<h3>
   7598 			<a name="validSubLocales" href="#validSubLocales">A.15 Attribute
   7599 				validSubLocales</a>
   7600 		</h3>
   7601 		<p>
   7602 			The attribute <i>validSubLocales</i> allowed sublocales in a given
   7603 			tree to be treated as though a file for them were present when there
   7604 			was not one. It only had an effect for locales that inherit from the
   7605 			current file where a file is missing.
   7606 		</p>
   7607 		<p>
   7608 			<b>Example 1. </b>Suppose that in a particular LDML tree, there are
   7609 			no region locales for German: that is, there is a de.xml file,
   7610 			but no de_AT.xml, de_CH.xml, or de_DE.xml file. Then no elements
   7611 			are valid for any of those region locales. If we want to mark one of
   7612 			those locales as having valid elements, then we introduce an empty
   7613 			file, such as the following.
   7614 		</p>
   7615 		<p>
   7616 			<code>
   7617 				&lt;ldml version=&quot;1.1&quot;&gt;<br> &nbsp;&lt;identity&gt;<br>
   7618 				&nbsp; &lt;version number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language type=&quot;de&quot; /&gt; <br> &nbsp;
   7619 				&lt;territory type=&quot;AT&quot; /&gt; <br>
   7620 				&nbsp;&lt;/identity&gt;<br> &lt;/ldml&gt;
   7621 			</code>
   7622 		</p>
   7623 		<p>
   7624 			With the <i>validSubLocales</i> attribute, instead of adding the
   7625 			empty files for de_AT.xml, de_CH.xml, and de_DE.xml, in the de file
   7626 			we could add to the parent locale a list of the child locales that
   7627 			should behave as if files were present.
   7628 		</p>
   7629 		<p>
   7630 			<code>
   7631 				&lt;ldml version=&quot;1.1&quot; validSubLocales=&quot;de_AT de_CH
   7632 				de_DE&quot;&gt;<br> &nbsp;&lt;identity&gt;<br> &nbsp;
   7633 				&lt;version number=&quot;1.1&quot; /&gt; <br> &nbsp;
   7634 				&lt;language type=&quot;de&quot; /&gt; <br>
   7635 				&nbsp;&lt;/identity&gt;<br> ...<br> &lt;/ldml&gt;
   7636 			</code>
   7637 		</p>
   7638 		<p>
   7639 			Now that the <i>validSubLocales</i> attribute has been deprecated, it
   7640 			is recommended to simply add empty files to specify which sublocales
   7641 			are valid. This convention is used throughout the CLDR.
   7642 		</p>
   7643 		<h3>
   7644 			<a name="postCodeElements" href="#postCodeElements">A.16 Elements
   7645 				postalCodeData, postCodeRegex</a>
   7646 		</h3>
   7647 		<p>The postal code validation data has been deprecated. Please see
   7648 			other services that are kept up to date, such as:</p>
   7649 		<ul>
   7650 			<li><a href="http://i18napis.appspot.com/address/data/US">http://i18napis.appspot.com/address/data/US</a></li>
   7651 			<li><a href="http://i18napis.appspot.com/address/data/CH">http://i18napis.appspot.com/address/data/CH</a></li>
   7652 			<li>...</li>
   7653 		</ul>
   7654 		<p>
   7655 			See <a href="tr35-info.html#Postal_Code_Validation">Postal Code
   7656 				Validation</a>
   7657 		</p>
   7658 
   7659 		<h3>
   7660 			<a name="telephoneCodeData" href="#telephoneCodeData">A.17 Element
   7661 				telephoneCodeData</a>
   7662 		</h3>
   7663 		<p>The element &lt;telephoneCodeData&gt; and its subelements have
   7664 			been deprecated and the data removed.</p>
   7665 
   7666 		<hr>
   7667 		<h2>
   7668 			<a name="Links_to_Other_Parts" href="#Links_to_Other_Parts">Annex B
   7669 				Links to Other Parts</a>
   7670 		</h2>
   7671 		<p>
   7672 			The LDML specification is split into several <a href="#Parts">parts</a>
   7673 			by topic, with one HTML document per part. The following tables
   7674 			provide redirects for links to specific topics. Please update your
   7675 			links and bookmarks.
   7676 		</p>
   7677 
   7678 		<p>Part 1 Links: Core (this document): No redirects needed.</p>
   7679 
   7680 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   7681 			<caption>
   7682 				<a href="#Part_2_Links" name="Part_2_Links">Part 2 Links</a>: <a
   7683 					href="tr35-general.html">General</a> (display names &amp;
   7684 				transforms, etc.)
   7685 			</caption>
   7686 			<tr>
   7687 				<th>Old section</th>
   7688 				<th>Section in new part</th>
   7689 			</tr>
   7690 			<tr>
   7691 				<td>5.4 <a name="Display_Name_Elements"
   7692 					href="#Display_Name_Elements">Display Name Elements</a></td>
   7693 				<td>1 <a href="tr35-general.html#Display_Name_Elements">Display
   7694 						Name Elements</a></td>
   7695 			</tr>
   7696 			<tr>
   7697 				<td>5.5 <a name="Layout_Elements" href="#Layout_Elements">Layout
   7698 						Elements</a></td>
   7699 				<td>2 <a href="tr35-general.html#Layout_Elements">Layout
   7700 						Elements</a></td>
   7701 			</tr>
   7702 			<tr>
   7703 				<td>5.6 <a name="Character_Elements" href="#Character_Elements">Character
   7704 						Elements</a></td>
   7705 				<td>3 <a href="tr35-general.html#Character_Elements">Character
   7706 						Elements</a></td>
   7707 			</tr>
   7708 			<tr>
   7709 				<td>5.6.1 <a name="ExemplarSyntax" href="#ExemplarSyntax">Exemplar
   7710 						Syntax</a></td>
   7711 				<td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
   7712 						Syntax</a></td>
   7713 			</tr>
   7714 			<tr>
   7715 				<td>5.6.2 Restrictions</td>
   7716 				<td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
   7717 						Syntax</a></td>
   7718 			</tr>
   7719 			<tr>
   7720 				<td>5.6.3 Mapping</td>
   7721 				<td>3.2 <a href="tr35-general.html#Character_Mapping">Mapping</a></td>
   7722 			</tr>
   7723 			<tr>
   7724 				<td>5.6.4 <a name="IndexLabels" href="#IndexLabels">Index
   7725 						Labels</a></td>
   7726 				<td>3.3 <a href="tr35-general.html#IndexLabels">Index
   7727 						Labels</a></td>
   7728 			</tr>
   7729 			<tr>
   7730 				<td>5.6.5 Ellipsis</td>
   7731 				<td>3.4 <a href="tr35-general.html#Ellipsis">Ellipsis</a></td>
   7732 			</tr>
   7733 			<tr>
   7734 				<td>5.6.6 More Information</td>
   7735 				<td>3.5 <a href="tr35-general.html#Character_More_Info">More
   7736 						Information</a></td>
   7737 			</tr>
   7738 			<tr>
   7739 				<td>5.7 <a name="Delimiter_Elements" href="#Delimiter_Elements">Delimiter
   7740 						Elements</a></td>
   7741 				<td>4 <a href="tr35-general.html#Delimiter_Elements">Delimiter
   7742 						Elements</a></td>
   7743 			</tr>
   7744 			<tr>
   7745 				<td>C.6 <a name="Measurement_System_Data"
   7746 					href="#Measurement_System_Data">Measurement System Data</a></td>
   7747 				<td>5 <a href="tr35-general.html#Measurement_System_Data">Measurement
   7748 						System Data</a></td>
   7749 			</tr>
   7750 			<tr>
   7751 				<td>5.8 <a name="Measurement_Elements"
   7752 					href="#Measurement_Elements">Measurement Elements (deprecated)</a></td>
   7753 				<td>5.1 <a href="tr35-general.html#Measurement_Elements">Measurement
   7754 						Elements (deprecated)</a></td>
   7755 			</tr>
   7756 			<tr>
   7757 				<td>5.11 <a name="Unit_Elements" href="#Unit_Elements">Unit
   7758 						Elements</a></td>
   7759 				<td>6 <a href="tr35-general.html#Unit_Elements">Unit
   7760 						Elements</a></td>
   7761 			</tr>
   7762 			<tr>
   7763 				<td>5.12 <a name="POSIX_Elements" href="#POSIX_Elements">POSIX
   7764 						Elements</a></td>
   7765 				<td>7 <a href="tr35-general.html#POSIX_Elements">POSIX
   7766 						Elements</a></td>
   7767 			</tr>
   7768 			<tr>
   7769 				<td>5.13 <a name="Reference_Elements"
   7770 					href="#Reference_Elements">Reference Element</a></td>
   7771 				<td>8 <a href="tr35-general.html#Reference_Elements">Reference
   7772 						Element</a></td>
   7773 			</tr>
   7774 			<tr>
   7775 				<td>5.15 <a name="Segmentations" href="#Segmentations">Segmentations</a></td>
   7776 				<td>9 <a href="tr35-general.html#Segmentations">Segmentations</a></td>
   7777 			</tr>
   7778 			<tr>
   7779 				<td>5.15.1 <a name="Segmentation_Inheritance"
   7780 					href="#Segmentation_Inheritance">Segmentation Inheritance</a></td>
   7781 				<td>9.1 <a href="tr35-general.html#Segmentation_Inheritance">Segmentation
   7782 						Inheritance</a></td>
   7783 			</tr>
   7784 			<tr>
   7785 				<td>5.16 <a name="Transforms" href="#Transforms">Transforms</a></td>
   7786 				<td>10 <a href="tr35-general.html#Transforms">Transforms</a></td>
   7787 			</tr>
   7788 			<tr>
   7789 				<td>N <a name="Transform_Rules" href="#Transform_Rules">Transform
   7790 						Rules</a></td>
   7791 				<td>10.3 <a href="tr35-general.html#Transform_Rules_Syntax">Transform
   7792 						Rules Syntax</a></td>
   7793 			</tr>
   7794 			<tr>
   7795 				<td>5.18 <a name="ListPatterns" href="#ListPatterns">List
   7796 						Patterns</a></td>
   7797 				<td>11 <a href="tr35-general.html#ListPatterns">List
   7798 						Patterns</a></td>
   7799 			</tr>
   7800 			<tr>
   7801 				<td>C.20 <a name="List_Gender" href="#List_Gender">Gender
   7802 						of Lists</a></td>
   7803 				<td>11.1 <a href="tr35-general.html#List_Gender">Gender of
   7804 						Lists</a></td>
   7805 			</tr>
   7806 			<tr>
   7807 				<td>5.19 <a name="Context_Transform_Elements"
   7808 					href="#Context_Transform_Elements">ContextTransform Elements</a></td>
   7809 				<td>12 <a href="tr35-general.html#Context_Transform_Elements">ContextTransform
   7810 						Elements</a></td>
   7811 			</tr>
   7816 		</table>
   7817 
   7818 
   7819 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   7820 			<caption>
   7821 				<a href="#Part_3_Links" name="Part_3_Links">Part 3 Links</a>: <a
   7822 					href="tr35-numbers.html">Numbers</a> (number &amp; currency
   7823 				formatting)
   7824 			</caption>
   7825 			<tr>
   7826 				<th>Old section</th>
   7827 				<th>Section in new part</th>
   7828 			</tr>
   7829 			<tr>
   7830 				<td>C.13 <a name="Numbering_Systems" href="#Numbering_Systems">Numbering
   7831 						Systems</a></td>
   7832 				<td>1 <a href="tr35-numbers.html#Numbering_Systems">Numbering
   7833 						Systems</a></td>
   7834 			</tr>
   7835 			<tr>
   7836 				<td>5.10 <a name="Number_Elements" href="#Number_Elements">Number
   7837 						Elements</a></td>
   7838 				<td>2 <a href="tr35-numbers.html#Number_Elements">Number
   7839 						Elements</a></td>
   7840 			</tr>
   7841 			<tr>
   7842 				<td>5.10.1 <a name="Number_Symbols" href="#Number_Symbols">Number
   7843 						Symbols</a></td>
   7844 				<td>2.3 <a href="tr35-numbers.html#Number_Symbols">Number
   7845 						Symbols</a></td>
   7846 			</tr>
   7847 			<tr>
   7848 				<td>G <a name="Number_Format_Patterns"
   7849 					href="#Number_Format_Patterns">Number Format Patterns</a></td>
   7850 				<td>3 <a href="tr35-numbers.html#Number_Format_Patterns">Number
   7851 						Format Patterns</a></td>
   7852 			</tr>
   7853 			<tr>
   7854 				<td>5.10.2 <a name="Currencies" href="#Currencies">Currencies</a></td>
   7855 				<td>4 <a href="tr35-numbers.html#Currencies">Currencies</a></td>
   7856 			</tr>
   7857 			<tr>
   7858 				<td>C.1 <a name="Supplemental_Currency_Data"
   7859 					href="#Supplemental_Currency_Data">Supplemental Currency Data</a></td>
   7860 				<td>4.1 <a href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
   7861 						Currency Data</a></td>
   7862 			</tr>
   7863 			<tr>
   7864 				<td>C.11 <a name="Language_Plural_Rules"
   7865 					href="#Language_Plural_Rules">Language Plural Rules</a></td>
   7866 				<td>5 <a href="tr35-numbers.html#Language_Plural_Rules">Language
   7867 						Plural Rules</a></td>
   7868 			</tr>
   7869 			<tr>
   7870 				<td>5.17 <a name="Rule-Based_Number_Formatting"
   7871 					href="#Rule-Based_Number_Formatting">Rule-Based Number
   7872 						Formatting</a></td>
   7873 				<td>6 <a href="tr35-numbers.html#Rule-Based_Number_Formatting">Rule-Based
   7874 						Number Formatting</a></td>
   7875 			</tr>
   7876 		</table>
   7877 
   7878 
   7879 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   7880 			<caption>
   7881 				<a href="#Part_4_Links" name="Part_4_Links">Part 4 Links</a>: <a
   7882 					href="tr35-dates.html">Dates</a> (date, time, time zone formatting)
   7883 			</caption>
   7884 			<tr>
   7885 				<th>Old section</th>
   7886 				<th>Section in new part</th>
   7887 			</tr>
   7888 			<tr>
   7889 				<td><a name="Date_Elements" href="#Date_Elements">5.9 Date
   7890 						Elements</a></td>
   7891 				<td>1 <a
   7892 					href="tr35-dates.html#Overview_Dates_Element_Supplemental">Overview:
   7893 						Dates Element, Supplemental Date and Calendar Information</a></td>
   7894 			</tr>
   7895 			<tr>
   7896 				<td><a name="Calendar_Elements" href="#Calendar_Elements">5.9.1
   7897 						Calendar Elements</a></td>
   7898 				<td>2 <a href="tr35-dates.html#Calendar_Elements">Calendar
   7899 						Elements</a></td>
   7900 			</tr>
   7901 			<tr>
   7902 				<td><a name="months_days_quarters_eras"
   7903 					href="#months_days_quarters_eras">Elements months, days,
   7904 						quarters, eras</a></td>
   7905 				<td>2.1 <a href="tr35-dates.html#months_days_quarters_eras">Elements
   7906 						months, days, quarters, eras</a></td>
   7907 			</tr>
   7908 			<tr>
   7909 				<td><a name="monthPatterns_cyclicNameSets"
   7910 					href="#monthPatterns_cyclicNameSets">Elements monthPatterns,
   7911 						cyclicNameSets</a></td>
   7912 				<td>2.2 <a href="tr35-dates.html#monthPatterns_cyclicNameSets">Elements
   7913 						monthPatterns, cyclicNameSets</a></td>
   7914 			</tr>
   7915 			<tr>
   7916 				<td><a name="dayPeriods" href="#dayPeriods">Element
   7917 						dayPeriods</a></td>
   7918 				<td>2.3 <a href="tr35-dates.html#dayPeriods">Element
   7919 						dayPeriods</a></td>
   7920 			</tr>
   7921 			<tr>
   7922 				<td><a name="dateFormats" href="#dateFormats">Element
   7923 						dateFormats</a></td>
   7924 				<td>2.4 <a href="tr35-dates.html#dateFormats">Element
   7925 						dateFormats</a></td>
   7926 			</tr>
   7927 			<tr>
   7928 				<td><a name="timeFormats" href="#timeFormats">Element
   7929 						timeFormats</a></td>
   7930 				<td>2.5 <a href="tr35-dates.html#timeFormats">Element
   7931 						timeFormats</a></td>
   7932 			</tr>
   7933 			<tr>
   7934 				<td><a name="dateTimeFormats" href="#dateTimeFormats">Element
   7935 						dateTimeFormats</a></td>
   7936 				<td>2.6 <a href="tr35-dates.html#dateTimeFormats">Element
   7937 						dateTimeFormats</a></td>
   7938 			</tr>
   7939 			<tr>
   7940 				<td><a name="Calendar_Fields" href="#Calendar_Fields">5.9.2
   7941 						Calendar Fields</a></td>
   7942 				<td>3 <a href="tr35-dates.html#Calendar_Fields">Calendar
   7943 						Fields</a></td>
   7944 			</tr>
   7945 			<tr>
   7946 				<td>5.9.3 <a name="Timezone_Names" href="#Timezone_Names">Time
   7947 						Zone Names</a></td>
   7948 				<td>5 <a href="tr35-dates.html#Time_Zone_Names">Time Zone
   7949 						Names</a></td>
   7950 			</tr>
   7951 			<tr>
   7952 				<td><a name="Supplemental_Calendar_Data"
   7953 					href="#Supplemental_Calendar_Data">C.5 Supplemental Calendar
   7954 						Data</a></td>
   7955 				<td>4 <a href="tr35-dates.html#Supplemental_Calendar_Data">Supplemental
   7956 						Calendar Data</a></td>
   7957 			</tr>
   7958 			<tr>
   7959 				<td><a name="Supplemental_Timezone_Data"
   7960 					href="#Supplemental_Timezone_Data">C.7 Supplemental Time Zone
   7961 						Data</a></td>
   7962 				<td>6 <a href="tr35-dates.html#Supplemental_Time_Zone_Data">Supplemental
   7963 						Time Zone Data</a></td>
   7964 			</tr>
   7965 			<tr>
   7966 				<td><a name="Calendar_Preference_Data"
   7967 					href="#Calendar_Preference_Data">C.15 Calendar Preference Data</a></td>
   7968 				<td>4.2 <a href="tr35-dates.html#Calendar_Preference_Data">Calendar
   7969 						Preference Data</a></td>
   7970 			</tr>
   7971 			<tr>
   7972 				<td><a name="DayPeriodRules" href="#DayPeriodRules">C.17
   7973 						DayPeriod Rules</a></td>
   7974 				<td>4.5 <a href="tr35-dates.html#Day_Period_Rules">Day
   7975 						Period Rules</a></td>
   7976 			</tr>
   7977 			<tr>
   7978 				<td><a name="Date_Format_Patterns" href="#Date_Format_Patterns">Appendix
   7979 						F: Date Format Patterns</a></td>
   7980 				<td>8 <a href="tr35-dates.html#Date_Format_Patterns">Date
   7981 						Format Patterns</a></td>
   7982 			</tr>
   7983 			<tr>
   7984 				<td><a name="Date_Field_Symbol_Table"
   7985 					href="#Date_Field_Symbol_Table">Date Field Symbol Table</a></td>
   7986 				<td><a href="tr35-dates.html#Date_Field_Symbol_Table">Date
   7987 						Field Symbol Table</a></td>
   7988 			</tr>
   7989 			<tr>
   7990 				<td><a name="Localized_Pattern_Characters"
   7991 					href="#Localized_Pattern_Characters">F.1 Localized Pattern
   7992 						Characters (deprecated)</a></td>
   7993 				<td>8.1 <a href="tr35-dates.html#Localized_Pattern_Characters">Localized
   7994 						Pattern Characters (deprecated)</a></td>
   7995 			</tr>
   7996 			<tr>
   7997 				<td><a name="Time_Zone_Fallback" href="#Time_Zone_Fallback">Appendix
   7998 						J: Time Zone Display Names</a></td>
   7999 				<td>7 <a href="tr35-dates.html#Using_Time_Zone_Names">Using
   8000 						Time Zone Names</a></td>
   8001 			</tr>
   8002 			<tr>
   8003 				<td><a name="fallbackFormat" href="#fallbackFormat"><b>fallbackFormat</b>:</a></td>
   8004 				<td><a href="tr35-dates.html#fallbackFormat"><b>fallbackFormat</b>:</a></td>
   8005 			</tr>
   8006 			<tr>
   8007 				<td>O.4 Parsing Dates and Times</td>
   8008 				<td>9 <a href="tr35-dates.html#Parsing_Dates_Times">Parsing
   8009 						Dates and Times</a></td>
   8010 			</tr>
   8011 		</table>
   8012 
   8013 
   8014 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   8015 			<caption>
   8016 				<a href="#Part_5_Links" name="Part_5_Links">Part 5 Links</a>: <a
   8017 					href="tr35-collation.html">Collation</a> (sorting, searching,
   8018 				grouping)
   8019 			</caption>
   8020 			<tr>
   8021 				<th>Old section</th>
   8022 				<th>Section in new part</th>
   8023 			</tr>
   8024 			<tr>
   8025 				<td>5.14 <a name="Collation_Elements"
   8026 					href="#Collation_Elements">Collation Elements</a></td>
   8027 				<td>3 <a href="tr35-collation.html#Collation_Tailorings">Collation
   8028 						Tailorings</a></td>
   8029 			</tr>
   8030 			<tr>
   8031 				<td>5.14.1 <a name="Collation_Version"
   8032 					href="#Collation_Version">Version</a></td>
   8033 				<td>3.1 <a href="tr35-collation.html#Collation_Version">Version</a></td>
   8034 			</tr>
   8035 			<tr>
   8036 				<td>5.14.2 <a name="Collation_Element"
   8037 					href="#Collation_Element">Collation Element</a></td>
   8038 				<td>3.2 <a href="tr35-collation.html#Collation_Element">Collation
   8039 						Element</a></td>
   8040 			</tr>
   8041 			<tr>
   8042 				<td>5.14.3 <a name="Setting_Options" href="#Setting_Options">Setting
   8043 						Options</a></td>
   8044 				<td>3.3 <a href="tr35-collation.html#Setting_Options">Setting
   8045 						Options</a></td>
   8046 			</tr>
   8047 			<tr>
   8048 				<td>Table <a name="Collation_Settings"
   8049 					href="#Collation_Settings">Collation Settings</a></td>
   8050 				<td>Table <a href="tr35-collation.html#Collation_Settings">Collation
   8051 						Settings</a></td>
   8052 			</tr>
   8053 			<tr>
   8054 				<td>5.14.4 <a name="Rules" href="#Rules">Collation Rule
   8055 						Syntax</a></td>
   8056 				<td>3.4 <a href="tr35-collation.html#Rules">Collation Rule
   8057 						Syntax</a></td>
   8058 			</tr>
   8059 			<tr>
   8060 				<td>5.14.5 <a name="Orderings" href="#Orderings">Orderings</a></td>
   8061 				<td>3.5 <a href="tr35-collation.html#Orderings">Orderings</a></td>
   8062 			</tr>
   8063 			<tr>
   8064 				<td>5.14.6 <a name="Contractions" href="#Contractions">Contractions</a></td>
   8065 				<td>3.6 <a href="tr35-collation.html#Contractions">Contractions</a></td>
   8066 			</tr>
   8067 			<tr>
   8068 				<td>5.14.7 <a name="Expansions" href="#Expansions">Expansions</a></td>
   8069 				<td>3.7 <a href="tr35-collation.html#Expansions">Expansions</a></td>
   8070 			</tr>
   8071 			<tr>
   8072 				<td>5.14.8 <a name="Context_Before" href="#Context_Before">Context
   8073 						Before</a></td>
   8074 				<td>3.8 <a href="tr35-collation.html#Context_Before">Context
   8075 						Before</a></td>
   8076 			</tr>
   8077 			<tr>
   8078 				<td>5.14.9 <a name="Placing_Characters_Before_Others"
   8079 					href="#Placing_Characters_Before_Others">Placing Characters
   8080 						Before Others</a></td>
   8081 				<td>3.9 <a
   8082 					href="tr35-collation.html#Placing_Characters_Before_Others">Placing
   8083 						Characters Before Others</a></td>
   8084 			</tr>
   8085 			<tr>
   8086 				<td>5.14.10 <a name="Logical_Reset_Positions"
   8087 					href="#Logical_Reset_Positions">Logical Reset Positions</a></td>
   8088 				<td>3.10 <a href="tr35-collation.html#Logical_Reset_Positions">Logical
   8089 						Reset Positions</a></td>
   8090 			</tr>
   8091 			<tr>
   8092 				<td>5.14.11 <a name="Special_Purpose_Commands"
   8093 					href="#Special_Purpose_Commands">Special-Purpose Commands</a></td>
   8094 				<td>3.11 <a href="tr35-collation.html#Special_Purpose_Commands">Special-Purpose
   8095 						Commands</a></td>
   8096 			</tr>
   8097 			<tr>
   8098 				<td>5.14.12 <a name="Script_Reordering"
   8099 					href="#Script_Reordering">Collation Reordering</a></td>
   8100 				<td>3.12 <a href="tr35-collation.html#Script_Reordering">Collation
   8101 						Reordering</a></td>
   8102 			</tr>
   8103 			<tr>
   8104 				<td>5.14.13 <a name="Case_Parameters" href="#Case_Parameters">Case
   8105 						Parameters</a></td>
   8106 				<td>3.13 <a href="tr35-collation.html#Case_Parameters">Case
   8107 						Parameters</a></td>
   8108 			</tr>
   8109 			<tr>
   8110 				<td>Definition: <a name="UncasedExceptions"
   8111 					href="#UncasedExceptions">UncasedExceptions</a></td>
   8112 				<td>removed: see 3.13 <a
   8113 					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
   8114 			</tr>
   8115 			<tr>
   8116 				<td>Definition: <a name="LowerExceptions"
   8117 					href="#LowerExceptions">LowerExceptions</a></td>
   8118 				<td>removed: see 3.13 <a
   8119 					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
   8120 			</tr>
   8121 			<tr>
   8122 				<td>Definition: <a name="UpperExceptions"
   8123 					href="#UpperExceptions">UpperExceptions</a></td>
   8124 				<td>removed: see 3.13 <a
   8125 					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
   8126 			</tr>
   8127 			<tr>
   8128 				<td>5.14.14 <a name="Visibility" href="#Visibility">Visibility</a></td>
   8129 				<td>3.14 <a href="tr35-collation.html#Visibility">Visibility</a></td>
   8130 			</tr>
   8131 		</table>
   8132 
   8133 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   8134 			<caption>
   8135 				<a href="#Part_6_Links" name="Part_6_Links">Part 6 Links</a>: <a
   8136 					href="tr35-info.html">Supplemental</a> (supplemental data)
   8137 			</caption>
   8138 			<tr>
   8139 				<th>Old section</th>
   8140 				<th>Section in new part</th>
   8141 			</tr>
   8142 
   8143 			<tr>
   8144 				<td>C <a name="Supplemental_Data" href="#Supplemental_Data">Supplemental
   8145 						Data</a></td>
   8146 				<td>Introduction <a href="tr35-info.html#Supplemental_Data">Supplemental
   8147 						Data</a></td>
   8148 			</tr>
   8149 
   8150 			<tr>
   8151 				<td>C.2 <a name="Supplemental_Territory_Containment"
   8152 					href="#Supplemental_Territory_Containment">Supplemental
   8153 						Territory Containment</a></td>
   8154 				<td>1.1 <a
   8155 					href="tr35-info.html#Supplemental_Territory_Containment">Supplemental
   8156 						Territory Containment</a></td>
   8157 			</tr>
   8158 			<tr>
   8159 				<td>C.4 <a name="Supplemental_Territory_Information"
   8160 					href="#Supplemental_Territory_Information">Supplemental
   8161 						Territory Information</a></td>
   8162 				<td>1.2 <a
   8163 					href="tr35-info.html#Supplemental_Territory_Information">Supplemental
   8164 						Territory Information</a></td>
   8165 			</tr>
   8166 			<tr>
   8167 				<td>C.3 <a name="Supplemental_Language_Data"
   8168 					href="#Supplemental_Language_Data">Supplemental Language Data</a></td>
   8169 				<td>2 <a href="tr35-info.html#Supplemental_Language_Data">Supplemental
   8170 						Language Data</a></td>
   8171 			</tr>
   8172 			<tr>
   8173 				<td>C.9 <a name="Supplemental_Code_Mapping"
   8174 					href="#Supplemental_Code_Mapping">Supplemental Code Mapping</a></td>
   8175 				<td>4 <a href="tr35-info.html#Supplemental_Code_Mapping">Supplemental
   8176 						Code Mapping</a></td>
   8177 			</tr>
   8178 			<tr>
   8179 				<td>C.12 <a name="Telephone_Code_Data"
   8180 					href="#Telephone_Code_Data">Telephone Code Data</a></td>
   8181 				<td>5 <a href="tr35-info.html#Telephone_Code_Data">Telephone
   8182 						Code Data</a></td>
   8183 			</tr>
   8184 			<tr>
   8185 				<td>C.14 <a name="Postal_Code_Validation"
   8186 					href="#Postal_Code_Validation">Postal Code Validation</a></td>
   8187 				<td>6 <a href="tr35-info.html#Postal_Code_Validation">Postal
   8188 						Code Validation</a></td>
   8189 			</tr>
   8190 			<tr>
   8191 				<td>C.8 <a name="Supplemental_Character_Fallback_Data"
   8192 					href="#Supplemental_Character_Fallback_Data">Supplemental
   8193 						Character Fallback Data</a></td>
   8194 				<td>7 <a
   8195 					href="tr35-info.html#Supplemental_Character_Fallback_Data">Supplemental
   8196 						Character Fallback Data</a></td>
   8197 			</tr>
   8198 			<tr>
   8199 				<td>M <a name="Coverage_Levels" href="#Coverage_Levels">Coverage
   8200 						Levels</a></td>
   8201 				<td>8 <a href="tr35-info.html#Coverage_Levels">Coverage
   8202 						Levels</a></td>
   8203 			</tr>
   8204 			<tr>
   8205 				<td>5.20 <a name="Metadata_Elements"
   8206 					href="tr35-info.html#Metadata_Elements">Metadata Elements</a></td>
   8207 				<td>10 <a href="tr35-info.html#Metadata_Elements">Locale
   8208 						Metadata Element</a></td>
   8209 			</tr>
   8210 			<tr>
   8211 				<td>P <a name="Appendix_Supplemental_Metadata"
   8212 					href="tr35-info.html#Appendix_Supplemental_Metadata">Supplemental
   8213 						Metadata</a><br> P.1 <a name="Supplemental_Alias_Information"
   8214 					href="tr35-info.html#Supplemental_Alias_Information">Supplemental
   8215 						Alias Information</a><br> P.2 <a
   8216 					name="Supplemental_Deprecated_Information"
   8217 					href="tr35-info.html#Supplemental_Deprecated_Information">Supplemental
   8218 						Deprecated Information</a><br> P.3 <a name="Default_Content"
   8219 					href="tr35-info.html#Default_Content">Default Content</a>
   8220 				</td>
   8221 				<td>9 <a href="tr35-info.html#Appendix_Supplemental_Metadata">Supplemental
   8222 						Metadata</a> <br> 9.1 <a
   8223 					href="tr35-info.html#Supplemental_Alias_Information">Supplemental
   8224 						Alias Information</a><br> 9.2 <a
   8225 					href="tr35-info.html#Supplemental_Deprecated_Information">Supplemental
   8226 						Deprecated Information</a><br> 9.3 <a
   8227 					href="tr35-info.html#Default_Content">Default Content</a>
   8228 				</td>
   8229 			</tr>
   8230 		</table>
   8231 
   8232 		<table cellspacing="0" cellpadding="2" border="1" width="100%">
   8233 			<caption>
   8234 				<a href="#Part_7_Links" name="Part_7_Links">Part 7 Links</a>: <a
   8235 					href="tr35-keyboards.html">Keyboards</a> (keyboard mappings)
   8236 			</caption>
   8237 			<tr>
   8238 				<th>Old section</th>
   8239 				<th>Section in new part</th>
   8240 			</tr>
   8241 
   8242 			<tr>
   8243 				<td>S <a name="Keyboards" href="#Keyboards">Keyboards</a></td>
   8244 				<td>1 <a href="tr35-keyboards.html#Keyboards">Keyboards</a></td>
   8245 			</tr>
   8246 
   8247 			<tr>
   8248 				<td>S <a name="Goals_and_Nongoals" href="#Goals_and_Nongoals">Goals
   8249 						and Nongoals</a></td>
   8250 				<td><a href="tr35-keyboards.html#Goals_and_Nongoals">Goals
   8251 						and Nongoals</a></td>
   8252 			</tr>
   8253 
   8254 			<tr>
   8255 				<td>S <a name="File_and_Dir_Structure"
   8256 					href="#File_and_Dir_Structure">File and Directory Structure</a></td>
   8257 				<td><a href="tr35-keyboards.html#File_and_Dir_Structure">File
   8258 						and Directory Structure</a></td>
   8259 			</tr>
   8260 
   8261 			<tr>
   8262 				<td>S <a name="Element_Heirarchy_Layout_File"
   8263 					href="#Element_Heirarchy_Layout_File">Element Hierarchy -
   8264 						Layout File</a></td>
   8265 				<td><a href="tr35-keyboards.html#Element_Heirarchy_Layout_File">Element
   8266 						Hierarchy - Layout File</a></td>
   8267 			</tr>
   8268 
   8269 			<tr>
   8270 				<td>S <a name="Element_Heirarchy_Platform_File"
   8271 					href="#Element_Heirarchy_Platform_File">Element Hierarchy -
   8272 						Platform File</a></td>
   8273 				<td><a
   8274 					href="tr35-keyboards.html#Element_Heirarchy_Platform_File">Element
   8275 						Hierarchy - Platform File</a></td>
   8276 			</tr>
   8277 
   8278 			<tr>
   8279 				<td>S <a name="Invariants" href="#Invariants">Invariants</a></td>
   8280 				<td><a href="tr35-keyboards.html#Invariants">Invariants</a></td>
   8281 			</tr>
   8282 
   8283 			<tr>
   8284 				<td>S <a name="Data_Sources" href="#Data_Sources">Data
   8285 						Sources</a></td>
   8286 				<td><a href="tr35-keyboards.html#Data_Sources">Data Sources</a></td>
   8287 			</tr>
   8288 
   8289 			<tr>
   8290 				<td>S <a name="Keyboard_IDs" href="#Keyboard_IDs">Keyboard
   8291 						IDs</a></td>
   8292 				<td><a href="tr35-keyboards.html#Keyboard_IDs">Keyboard IDs</a></td>
   8293 			</tr>
   8294 
   8295 			<tr>
   8296 				<td>S <a name="Platform_Behaviors_in_Edge_Cases"
   8297 					href="#Platform_Behaviors_in_Edge_Cases">Platform Behaviors in
   8298 						Edge Cases</a></td>
   8299 				<td><a
   8300 					href="tr35-keyboards.html#Platform_Behaviors_in_Edge_Cases">Platform
   8301 						Behaviors in Edge Cases</a></td>
   8302 			</tr>
   8303 
   8304 			<tr>
   8305 				<td>S <a name="Element_Keyboard" href="#Element_Keyboard">Element:
   8306 						keyboard</a></td>
   8307 				<td><a href="tr35-keyboards.html#Element_Keyboard">Element:
   8308 						keyboard</a></td>
   8309 			</tr>
   8310 
   8311 			<tr>
   8312 				<td>S <a name="Element_version" href="#Element_version">Element:
   8313 						version</a></td>
   8314 				<td><a href="tr35-keyboards.html#Element_version">Element:
   8315 						version</a></td>
   8316 			</tr>
   8317 
   8318 			<tr>
   8319 				<td>S <a name="Element_generation" href="#Element_generation">Element:
   8320 						generation</a></td>
   8321 				<td><a href="tr35-keyboards.html#Element_generation">Element:
   8322 						generation</a></td>
   8323 			</tr>
   8324 
   8325 			<tr>
   8326 				<td>S <a name="Element_names" href="#Element_names">Element:
   8327 						names</a></td>
   8328 				<td><a href="tr35-keyboards.html#Element_names">Element:
   8329 						names</a></td>
   8330 			</tr>
   8331 
   8332 			<tr>
   8333 				<td>S <a name="Element_name" href="#Element_name">Element:
   8334 						name</a></td>
   8335 				<td><a href="tr35-keyboards.html#Element_name">Element:
   8336 						name</a></td>
   8337 			</tr>
   8338 
   8339 			<tr>
   8340 				<td>S <a name="Element_settings" href="#Element_settings">Element:
   8341 						settings</a></td>
   8342 				<td><a href="tr35-keyboards.html#Element_settings">Element:
   8343 						settings</a></td>
   8344 			</tr>
   8345 
   8346 			<tr>
   8347 				<td>S <a name="Element_keyMap" href="#Element_keyMap">Element:
   8348 						keyMap</a></td>
   8349 				<td><a href="tr35-keyboards.html#Element_keyMap">Element:
   8350 						keyMap</a></td>
   8351 			</tr>
   8352 
   8353 			<tr>
   8354 				<td>S <a name="Element_map" href="#Element_map">Element:
   8355 						map</a></td>
   8356 				<td><a href="tr35-keyboards.html#Element_map">Element: map</a></td>
   8357 			</tr>
   8358 
   8359 			<tr>
   8360 				<td>S <a name="Element_transforms" href="#Element_transforms">Element:
   8361 						transforms</a></td>
   8362 				<td><a href="tr35-keyboards.html#Element_transforms">Element:
   8363 						transforms</a></td>
   8364 			</tr>
   8365 
   8366 			<tr>
   8367 				<td>S <a name="Element_transform" href="#Element_transform">Element:
   8368 						transform</a></td>
   8369 				<td><a href="tr35-keyboards.html#Element_transform">Element:
   8370 						transform</a></td>
   8371 			</tr>
   8372 
   8373 			<tr>
   8374 				<td>S <a name="Element_platform" href="#Element_platform">Element:
   8375 						platform</a></td>
   8376 				<td><a href="tr35-keyboards.html#Element_platform">Element:
   8377 						platform</a></td>
   8378 			</tr>
   8379 
   8380 			<tr>
   8381 				<td>S <a name="Element_hardwareMap" href="#Element_hardwareMap">Element:
   8382 						hardwareMap</a></td>
   8383 				<td><a href="tr35-keyboards.html#Element_hardwareMap">Element:
   8384 						hardwareMap</a></td>
   8385 			</tr>
   8386 
   8387 			<tr>
   8388 				<td>S <a name="Principles_for_Keyboard_Ids"
   8389 					href="#Principles_for_Keyboard_Ids">Principles for Keyboard Ids</a></td>
   8390 				<td><a href="tr35-keyboards.html#Principles_for_Keyboard_Ids">Principles
   8391 						for Keyboard Ids</a></td>
   8392 			</tr>
   8393 
   8394 		</table>
   8395 		<hr>
   8396 		<h2>
   8397 			<a name="References" href="#References">References</a>
   8398 		</h2>
   8399 		<table cellpadding="4" cellspacing="0" class="noborder" border="0">
   8400 			<tr>
   8401 				<th class="noborder" width="148">Ancillary Information</th>
   8402 				<td class="noborder" width="730"><i>To properly localize,
   8403 						parse, and format data requires ancillary information, which is
   8404 						not expressed in Locale Data Markup Language. Some of the formats
   8405 						for values used in Locale Data Markup Language are constructed
   8406 						according to external specifications. The sources for this data
   8407 						and/or formats include the following:<br> &nbsp;
   8408 				</i></td>
   8409 			</tr>
   8410 			<tr>
   8411 				<td class="noborder" width="148">[<a name="Bugs" href="#Bugs">Bugs</a>]
   8412 				</td>
   8413 				<td class="noborder" width="730">CLDR Bug Reporting form<br>
   8414 					<a href="http://cldr.unicode.org/index/bug-reports">
   8415 						http://cldr.unicode.org/index/bug-reports</a></td>
   8416 			</tr>
   8417 			<tr>
   8418 				<td class="noborder" width="148">[<a name="Charts"
   8419 					href="#Charts">Charts</a>]
   8420 				</td>
   8421 				<td class="noborder" width="730">The online code charts can be
   8422 					found at <a href="http://unicode.org/charts/">http://unicode.org/charts/</a>.
   8423 					An index to character names with links to the corresponding chart
   8424 					is found at <a href="http://unicode.org/charts/charindex.html">http://unicode.org/charts/charindex.html</a>.
   8425 				</td>
   8426 			</tr>
   8427 			<tr>
   8428 				<td class="noborder" width="148">[<a name="DUCET" href="#DUCET">DUCET</a>]
   8429 				</td>
   8430 				<td class="noborder" width="730">The Default Unicode Collation
   8431 					Element Table (DUCET)<br> For the base-level collation, of
   8432 					which all the collation tables in this document are tailorings.<br>
   8433 					<a
   8434 					href="http://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table">http://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table</a>
   8435 				</td>
   8436 			</tr>
   8437 			<tr>
   8438 				<td class="noborder" width="148">[<a name="FAQ" href="#FAQ">FAQ</a>]
   8439 				</td>
   8440 				<td class="noborder" valign="top" width="730">Unicode
   8441 					Frequently Asked Questions<br> <a
   8442 					href="http://unicode.org/faq/">http://unicode.org/faq/</a><br>
   8443 				<i>For answers to common questions on technical issues.</i>
   8444 				</td>
   8445 			</tr>
   8446 			<tr>
   8447 				<td class="noborder" width="148">[<a name="FCD" href="#FCD">FCD</a>]
   8448 				</td>
   8449 				<td class="noborder" width="730">As defined in UTN #5 Canonical
   8450 					Equivalences in Applications<br> <a
   8451 					href="http://unicode.org/notes/tn5/">http://unicode.org/notes/tn5/</a>
   8452 				</td>
   8453 			</tr>
   8454 			<tr>
   8455 				<td class="noborder" width="148">[<a name="Glossary"
   8456 					href="#Glossary">Glossary</a>]
   8457 				</td>
   8458 				<td class="noborder" width="730">Unicode Glossary<br> <a
   8459 					href="http://unicode.org/glossary/">http://unicode.org/glossary/</a><br>
   8460 						<i>For explanations of
   8461 						terminology used in this and other documents.</i></td>
   8462 			</tr>
   8463 			<tr>
   8464 				<td class="noborder" width="148">[<a name="JavaChoice"
   8465 					href="#JavaChoice">JavaChoice</a>]
   8466 				</td>
   8467 				<td class="noborder" width="730">Java ChoiceFormat<br> <a
   8468 					href="http://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html">
   8469 						http://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html</a></td>
   8470 			</tr>
   8471 			<tr>
   8472 				<td class="noborder" width="148">[<a name="Olson" href="#Olson">Olson</a>]
   8473 				</td>
   8474 				<td class="noborder" width="730">The <i>TZ</i>ID Database (aka
   8475 					Olson time zone database)<br> Time zone and daylight saving time
   8476 					information.<br> <a href="http://www.iana.org/time-zones">http://www.iana.org/time-zones</a><br>
   8477 					For archived data, see<br> <a
   8478 					href="ftp://ftp.iana.org/tz/releases/">ftp://ftp.iana.org/tz/releases/</a></td>
   8479 			</tr>
   8480 			<tr>
   8481 				<td class="noborder" width="148">[<a name="Reports"
   8482 					href="#Reports">Reports</a>]
   8483 				</td>
   8484 				<td class="noborder" width="730">Unicode Technical Reports<br>
   8485 					<a href="http://unicode.org/reports/">http://unicode.org/reports/</a><br>
   8486 				<i>For information on the status and development process for
   8487 						technical reports, and for a list of technical reports.</i></td>
   8488 			</tr>
   8489 			<tr>
   8490 				<td class="noborder" width="148">[<a name="Unicode"
   8491 					href="#Unicode">Unicode</a>]
   8492 				</td>
   8493 				<td class="noborder" width="730">The Unicode Consortium. <em>The
   8494 						Unicode Standard, Version 7.0.0</em>,&nbsp;(Mountain View, CA: The
   8495 					Unicode Consortium, 2014. ISBN 978-1-936213-09-2)<br> <a
   8496 					href="http://www.unicode.org/versions/Unicode7.0.0/">
   8497 						http://www.unicode.org/versions/Unicode7.0.0/</a>
   8498 				</td>
   8499 			</tr>
   8500 			<tr>
   8501 				<td class="noborder" width="148">[<a name="Versions"
   8502 					href="#Versions">Versions</a>]
   8503 				</td>
   8504 				<td class="noborder" width="730">Versions of the Unicode
   8505 					Standard<br> <a href="http://www.unicode.org/versions/">
   8506 						http://www.unicode.org/versions/</a><br> <i>For information
   8507 						on version numbering, and citing and referencing the Unicode
   8508 						Standard, the Unicode Character Database, and Unicode Technical
   8509 						Reports.</i>
   8510 				</td>
   8511 			</tr>
   8512 			<tr>
   8513 				<td class="noborder" width="148">[<a name="XPath" href="#XPath">XPath</a>]
   8514 				</td>
   8515 				<td class="noborder" width="730"><a
   8516 					href="http://www.w3.org/TR/xpath/"> http://www.w3.org/TR/xpath/</a></td>
   8517 			</tr>
   8518 			<tr>
   8519 				<th class="noborder" width="148">Other Standards</th>
   8520 				<td class="noborder" width="730"><i>Various standards
   8521 						define codes that are used as keys or values in Locale Data Markup
   8522 						Language. These include:</i></td>
   8523 			</tr>
   8524 			<tr>
   8525 				<td class="noborder">[<a name="BCP47" href="#BCP47">BCP47</a>]
   8526 				</td>
   8527 				<td class="noborder"><a
   8528 					href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">
   8529 						http://www.rfc-editor.org/rfc/bcp/bcp47.txt</a>
   8530 					<p>
   8531 						The Registry<br> <a
   8532 							href="http://www.iana.org/assignments/language-subtag-registry">http://www.iana.org/assignments/language-subtag-registry</a>
   8533 					</p></td>
   8534 			</tr>
   8535 			<tr>
   8536 				<td class="noborder" width="148">[<a name="ISO639"
   8537 					href="#ISO639">ISO639</a>]
   8538 				</td>
   8539 				<td class="noborder" width="730">ISO Language Codes<br> <a
   8540 					href="http://www.loc.gov/standards/iso639-2/">http://www.loc.gov/standards/iso639-2/</a><br>
   8541 					Actual List<br> <a
   8542 					href="http://www.loc.gov/standards/iso639-2/langcodes.html">http://www.loc.gov/standards/iso639-2/langcodes.html</a></td>
   8543 			</tr>
   8544 			<tr>
   8545 				<td class="noborder" width="148">[<a name="ISO1000"
   8546 					href="#ISO1000">ISO1000</a>]
   8547 				</td>
   8548 				<td class="noborder" width="730">ISO 1000: SI units and
   8549 					recommendations for the use of their multiples and of certain other
   8550 					units, International Organization for Standardization, 1992.<br>
   8551 					<a href="http://www.iso.org/iso/catalogue_detail?csnumber=5448">http://www.iso.org/iso/catalogue_detail?csnumber=5448</a>
   8552 				</td>
   8553 			</tr>
   8554 			<tr>
   8555 				<td class="noborder" width="148">[<a name="ISO3166"
   8556 					href="#ISO3166">ISO3166</a>]
   8557 				</td>
   8558 				<td class="noborder" width="730">ISO Region Codes<br> <a
   8559 					href="http://www.iso.org/iso/country_codes">http://www.iso.org/iso/country_codes</a><br>
   8560 					Actual List<br> <a
   8561 					href="http://www.iso.org/iso/country_names_and_code_elements">http://www.iso.org/iso/country_names_and_code_elements</a></td>
   8562 			</tr>
   8563 			<tr>
   8564 				<td class="noborder" width="148">[<a name="ISO4217"
   8565 					href="#ISO4217">ISO4217</a>]
   8566 				</td>
   8567 				<td class="noborder" width="730">ISO Currency Codes<br> <a
   8568 					href="http://www.iso.org/iso/home/standards/currency_codes.htm">http://www.iso.org/iso/home/standards/currency_codes.htm</a>
   8569 					<p>
   8570 						<i>(Note that as of this point, there are significant problems
   8571 							with this list. The supplemental data file contains the best
   8572 							compendium of currency information available.)</i>
   8573 					</p>
   8574 				</td>
   8575 			</tr>
   8576 			<tr>
   8577 				<td class="noborder" width="148">[<a name="ISO8601"
   8578 					href="#ISO8601">ISO8601</a>]
   8579 				</td>
   8580 				<td class="noborder" width="730">ISO Date and Time Format<br>
   8581 					<a href="http://www.iso.org/iso/iso8601">http://www.iso.org/iso/iso8601</a>
   8582 				</td>
   8583 			</tr>
   8584 			<tr>
   8585 				<td class="noborder" width="148">[<a name="ISO15924"
   8586 					href="#ISO15924">ISO15924</a>]
   8587 				</td>
   8588 				<td class="noborder" width="730">ISO Script Codes<br> <a
   8589 					href="http://www.unicode.org/iso15924/standard/index.html">http://www.unicode.org/iso15924/standard/index.html</a><br>
   8590 					Actual List<br> <a
   8591 					href="http://www.unicode.org/iso15924/codelists.html">http://www.unicode.org/iso15924/codelists.html</a></td>
   8592 			</tr>
   8593 			<tr>
   8594 				<td class="noborder" width="148">[<a name="LOCODE"
   8595 					href="#LOCODE">LOCODE</a>]
   8596 				</td>
   8597 				<td class="noborder" width="730">United Nations Code for Trade
   8598 					and Transport Locations, commonly known as "UN/LOCODE"<br> <a
   8599 					href="http://www.unece.org/cefact/locode/welcome.html">
   8600 						http://www.unece.org/cefact/locode/welcome.html</a><br> Download
   8601 					at: <a
   8602 					href="http://www.unece.org/cefact/codesfortrade/codes_index.htm">http://www.unece.org/cefact/codesfortrade/codes_index.htm</a>
   8603 				</td>
   8604 			</tr>
   8605 			<tr>
   8606 				<td class="noborder" width="148">[<a name="RFC6067"
   8607 					href="#RFC6067">RFC6067</a>]
   8608 				</td>
   8609 				<td class="noborder" width="730">BCP 47 Extension U<br> <a
   8610 					href="http://www.ietf.org/rfc/rfc6067.txt">http://www.ietf.org/rfc/rfc6067.txt</a></td>
   8611 			</tr>
   8612 			<tr>
   8613 				<td class="noborder" width="148">[<a name="RFC6497"
   8614 					href="#RFC6497">RFC6497</a>]
   8615 				</td>
   8616 				<td class="noborder" width="730">BCP 47 Extension T -
   8617 					Transformed Content<br> <a
   8618 					href="http://www.ietf.org/rfc/rfc6497.txt">http://www.ietf.org/rfc/rfc6497.txt</a>
   8619 				</td>
   8620 			</tr>
   8621 			<tr>
   8622 				<td class="noborder" width="148">[<a name="UNM49" href="#UNM49">UNM49</a>]
   8623 				</td>
   8624 				<td class="noborder" width="730">UN M.49: UN Statistics
   8625 					Division
   8626 					<p>
   8627 						Country or area &amp; region codes<br> <a
   8628 							href="http://unstats.un.org/unsd/methods/m49/m49.htm">http://unstats.un.org/unsd/methods/m49/m49.htm</a>
   8629 					</p>
   8630 					<p>
   8631 						Composition of macro geographical (continental) regions,
   8632 						geographical sub-regions, and selected economic and other
   8633 						groupings<br> <a
   8634 							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">http://unstats.un.org/unsd/methods/m49/m49regin.htm</a>
   8635 					</p>
   8636 				</td>
   8637 			</tr>
   8638 			<tr>
   8639 				<td class="noborder" width="148">[<a name="XMLSchema"
   8640 					href="#XMLSchema">XML Schema</a>]
   8641 				</td>
   8642 				<td class="noborder" width="730">W3C XML Schema<br> <a
   8643 					href="http://www.w3.org/XML/Schema">http://www.w3.org/XML/Schema</a></td>
   8644 			</tr>
   8645 			<tr>
   8646 				<th class="noborder" width="148">General</th>
   8647 				<td class="noborder" width="730"><i>The following are
   8648 						general references from the text:</i></td>
   8649 			</tr>
   8650 			<tr>
   8651 				<td class="noborder" width="148">[<a name="ByType"
   8652 					href="#ByType">ByType</a>]
   8653 				</td>
   8654 				<td class="noborder" width="730">CLDR Comparison Charts<br>
   8655 					<a href="http://www.unicode.org/cldr/comparison_charts.html">http://www.unicode.org/cldr/comparison_charts.html</a></td>
   8656 			</tr>
   8657 			<tr>
   8658 				<td class="noborder" width="148">[<a name="Calendars"
   8659 					href="#Calendars">Calendars</a>]
   8660 				</td>
   8661 				<td class="noborder" width="730">Calendrical Calculations: The
   8662 					Millennium Edition by Edward M. Reingold, Nachum Dershowitz;
   8663 					Cambridge University Press; Book and CD-ROM edition (July 1, 2001);
   8664 					ISBN: 0521777526. Note that the algorithms given in this book are
   8665 					copyrighted.</td>
   8666 			</tr>
   8667 			<tr>
   8668 				<td class="noborder" width="148">[<a name="Comparisons"
   8669 					href="#Comparisons">Comparisons</a>]
   8670 				</td>
   8671 				<td class="noborder" width="730">Comparisons between locale
   8672 					data from different sources<br> <a
   8673 					href="http://unicode.org/cldr/data/diff/">http://unicode.org/cldr/data/diff/</a>
   8674 				</td>
   8675 			</tr>
   8676 			<tr>
   8677 				<td class="noborder" width="148">[<a name="CurrencyInfo"
   8678 					href="#CurrencyInfo">CurrencyInfo</a>]
   8679 				</td>
   8680 				<td class="noborder" width="730">UNECE Currency Data<br> <a
   8681 					href="http://www.currency-iso.org/en/home/tables.html">http://www.currency-iso.org/en/home/tables.html</a></td>
   8682 			</tr>
   8683 			<tr>
   8684 				<td class="noborder" width="148">[<a name="DataFormats"
   8685 					href="#DataFormats">DataFormats</a>]
   8686 				</td>
   8687 				<td class="noborder" width="730">CLDR Translation Guidelines<br>
   8688 					<a href="http://cldr.unicode.org/translation">http://cldr.unicode.org/translation</a></td>
   8689 			</tr>
   8690 			<tr>
   8691 				<td class="noborder" width="148">[<a name="LDML" href="#LDML">Example</a>]
   8692 				</td>
   8693 				<td class="noborder" width="730">A sample in Locale Data Markup
   8694 					Language<br> <a
   8695 					href="http://unicode.org/cldr/dtd/1.1/ldml-example.xml">http://unicode.org/cldr/dtd/1.1/ldml-example.xml</a>
   8696 				</td>
   8697 			</tr>
   8698 			<tr>
   8699 				<td class="noborder" width="148">[<a name="ICUCollation"
   8700 					href="#ICUCollation">ICUCollation</a>]
   8701 				</td>
   8702 				<td class="noborder" width="730">ICU rule syntax<br> <a
   8703 					href="http://www.icu-project.org/userguide/Collate_Customization.html">http://www.icu-project.org/userguide/Collate_Customization.html</a></td>
   8704 			</tr>
   8705 			<tr>
   8706 				<td class="noborder" width="148">[<a name="ICUTransforms"
   8707 					href="#ICUTransforms">ICUTransforms</a>]
   8708 				</td>
   8709 				<td class="noborder" width="730">Transforms<br> <a
   8710 					href="http://www.icu-project.org/userguide/Transformations.html">http://www.icu-project.org/userguide/Transformations.html</a><br>
   8711 					Transforms Demo<br> <a
   8712 					href="http://demo.icu-project.org/icu-bin/translit/">http://demo.icu-project.org/icu-bin/translit/</a></td>
   8713 			</tr>
   8714 			<tr>
   8715 				<td class="noborder" width="148">[<a name="ICUUnicodeSet"
   8716 					href="#ICUUnicodeSet">ICUUnicodeSet</a>]
   8717 				</td>
   8718 				<td class="noborder" width="730">ICU UnicodeSet<br> <a
   8719 					href="http://www.icu-project.org/userguide/unicodeSet.html">http://www.icu-project.org/userguide/unicodeSet.html<br>
   8720 				</a>API<br> <a
   8721 					href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html">http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html</a></td>
   8722 			</tr>
   8723 			<tr>
   8724 				<td class="noborder" width="148">[<a name="ITUE164"
   8725 					href="#ITUE164">ITUE164</a>]
   8726 				</td>
   8727 				<td class="noborder" width="730">International
   8728 					Telecommunication Union: List Of ITU Recommendation E.164 Assigned
   8729 					Country Codes<br> available at <a
   8730 					href="http://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2">http://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2</a>
   8731 				</td>
   8732 			</tr>
   8733 			<tr>
   8734 				<td class="noborder" width="148">[<a name="LocaleExplorer"
   8735 					href="#LocaleExplorer">LocaleExplorer</a>]
   8736 				</td>
   8737 				<td class="noborder" width="730">ICU Locale Explorer<br> <a
   8738 					href="http://demo.icu-project.org/icu-bin/locexp">http://demo.icu-project.org/icu-bin/locexp</a></td>
   8739 			</tr>
   8740 			<tr>
   8741 				<td class="noborder" width="148">[<a name="localeProject"
   8742 					href="#localeProject">LocaleProject</a>]
   8743 				</td>
   8744 				<td class="noborder" width="730">Common Locale Data Repository
   8745 					Project<br> <a href="http://unicode.org/cldr/">http://unicode.org/cldr/</a>
   8746 				</td>
   8747 			</tr>
   8748 			<tr>
   8749 				<td class="noborder" width="148">[<a name="NamingGuideline"
   8750 					href="#NamingGuideline">NamingGuideline</a>]
   8751 				</td>
   8752 				<td class="noborder" width="730">OpenI18N Locale Naming
   8753 					Guideline<br> formerly at
   8754 					http://www.openi18n.org/docs/text/LocNameGuide-V10.txt
   8755 				</td>
   8756 			</tr>
   8757 			<tr>
   8758 				<td class="noborder" width="148">[<a name="RBNF" href="#RBNF">RBNF</a>]
   8759 				</td>
   8760 				<td class="noborder" width="730">Rule-Based Number Format<br>
   8761 					<a
   8762 					href="http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html">http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html#_details</a></td>
   8763 			</tr>
   8764 			<tr>
   8765 				<td class="noborder" width="148">[<a name="RBBI" href="#RBBI">RBBI</a>]
   8766 				</td>
   8767 				<td class="noborder" width="730">Rule-Based Break Iterator<br>
   8768 					<a
   8769 					href="http://www.icu-project.org/userguide/boundaryAnalysis.html">http://www.icu-project.org/userguide/boundaryAnalysis.html</a></td>
   8770 			</tr>
   8771 			<tr>
   8772 				<td class="noborder" width="148">[<a name="RFC5234"
   8773 					href="#RFC5234">RFC5234</a>]
   8774 				</td>
   8775 				<td class="noborder" width="730">RFC5234 Augmented BNF for
   8776 					Syntax Specifications: ABNF<br> <a
   8777 					href="http://www.ietf.org/rfc/rfc5234.txt">http://www.ietf.org/rfc/rfc5234.txt</a>
   8778 				</td>
   8779 			</tr>
   8780 			<tr>
   8781 				<td class="noborder" width="148">[<a name="UCAChart"
   8782 					href="#UCAChart">UCAChart</a>]
   8783 				</td>
   8784 				<td class="noborder" width="730">Collation Chart<a
   8785 					href="http://unicode.org/charts/collation/"><br>
   8786 						http://unicode.org/charts/collation/</a></td>
   8787 			</tr>
   8788 			<tr>
   8789 				<td class="noborder" width="148">[<a name="UTCInfo"
   8790 					href="#UTCInfo">UTCInfo</a>]
   8791 				</td>
   8792 				<td class="noborder" width="730">NIST Time and Frequency
   8793 					Division Home Page<br> <a href="http://tf.nist.gov/">http://tf.nist.gov/<br>
   8794 				</a>U.S. Naval Observatory: What is Universal Time?<br> <a
   8795 					href="http://aa.usno.navy.mil/faq/docs/UT.php">http://aa.usno.navy.mil/faq/docs/UT.php</a>
   8796 				</td>
   8797 			</tr>
   8798 			<tr>
   8799 				<td class="noborder" width="148">[<a name="WindowsCulture"
   8800 					href="#WindowsCulture">WindowsCulture</a>]
   8801 				</td>
   8802 				<td class="noborder" width="730">Windows Culture Info
   8803 					(with mappings from [<a href="#BCP47">BCP47</a>]-style codes
   8804 					to LCIDs)<br> <a
   8805 					href="http://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx">http://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx</a>
   8806 				</td>
   8807 			</tr>
   8808 		</table>
   8809 		<h2>
   8810 			<a name="Acknowledgments" href="#Acknowledgments">Acknowledgments</a>
   8811 		</h2>
   8812 		<p>Special thanks to the following people for their continuing
   8813 			overall contributions to the CLDR project, and for their specific
   8814 			contributions in the following areas. These descriptions only touch
   8815 			on the many contributions that they have made.</p>
   8816 		<ul>
   8817 			<li><a
   8818 				href="https://plus.google.com/114199149796022210033?rel=author">Mark
   8819 					Davis</a> for creating the initial version of LDML, and adding to and
   8820 				maintaining this specification, and for his work on the LDML code
   8821 				and tests, much of the supplemental data and overall structure, and
   8822 				transforms and keyboards.</li>
   8823 			<li>John Emmons for the POSIX conversion tool and metazones.</li>
   8824 			<li>Deborah Goldsmith for her contributions to LDML architecture
   8825 				and this specification.</li>
   8826 			<li>Chris Hansten for coordinating and managing data submissions
   8827 				and vetting.</li>
   8828 			<li>Erkki Kolehmainen and his team for their work on Finnish.</li>
   8829 			<li>Steven R. Loomis for development of the survey tool and
   8830 				database management.</li>
   8831 			<li>Peter Nugent for his contributions to the POSIX tool and
   8832 				his contributions from Open Office, and for coordinating and managing data submissions
   8833 				and vetting.</li>
   8834 			<li>George Rhoten for his work on currencies.</li>
   8835 			<li>Roozbeh Pournader for his work on South
   8836 				Asian countries.</li>
   8837 			<li>Ram Viswanadha for all of his work on
   8838 				LDML code and data integration, and for coordinating and managing
   8839 				data submissions and vetting.</li>
   8840 			<li>Vladimir Weinstein for his work on
   8841 				collation.</li>
   8842 			<li>Yoshito Umaoka for his work on the timezone
   8843 				architecture.</li>
   8844 			<li>Rick McGowan for his work gathering language, script and
   8845 				region data.</li>
   8846 			<li>Xiaomei Ji for her work on time intervals and plural
   8847 				formatting.</li>
   8848 			<li>David Bertoni for his contributions to the conversion tools.</li>
   8849 			<li>Mike Tardif for reviewing this specification and for
   8850 				coordinating and vetting data submissions.</li>
   8851 			<li>Peter Edberg for work on this specification, telephone code
   8852 				data, monthPatterns, cyclicNameSets and contextTransforms.</li>
   8853 			<li>Raymond Wainman and Cibu Johny for their work on keyboards.</li>
   8854 			<li>Jennifer Chye for her contributions to the conversion tools.</li>
   8855 			<li><a
   8856 				href="https://plus.google.com/117587389715494866571?rel=author">Markus
   8857 					Scherer</a> for a major rewrite of Part 5, Collation.</li>
   8858 		</ul>
   8859 		<p>
   8860 			Other contributors to CLDR are listed on the <a
   8861 				href="http://www.unicode.org/cldr/">CLDR Project Page</a>.
   8862 		</p>
   8863 
   8864 		<h2>
   8865 			<a name="Modifications" href="#Modifications">Modifications</a>
   8866 		</h2>
   8867 
   8868 <p><b>Revision 53</b></p>
   8869 <p><strong>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
   8870 				locales, basic structure)
   8871 	</strong></p>
   8872 <ul>
   8873   <li><strong>Section 3.2 <a 
   8874 				href="#Unicode_locale_identifier">Unicode Locale Identifier</a></strong>
   8875 [<a href="http://unicode.org/cldr/trac/ticket/11435">#11435</a>]
   8876 [<a href="http://unicode.org/cldr/trac/ticket/11434">#11434</a>]
   8877 <ul>
   8878   <li>Fixed cases of "-" in the syntax that should have been <em>sep</em>, and noted that &quot;-&quot; is the canonical (preferred) form.</li>
   8879   <li>Fixed &quot;u&quot; and &quot;t&quot; in the syntax to [uU] and [tT], resp., to reflect that case is ignored when parsing.</li>
   8880   <li>Included specific syntax rather than just noting &quot;Although not shown in the syntax above, Unicode locale identifiers may also have [BCP47] extensions (other than &quot;u&quot; and &quot;t&quot;) and private use subtags.&quot;</li>
   8881   <li>Reformatted and fleshed out the canonical form description; listed where CLDR uses non-canonical forms.</li>
   8882   <li>Added missing details about how Unicode Locale Identifiers differ from BCP 47, and how to convert between them.</li>
   8883   </ul>
   8884   </li>
   8885   <li><strong>Section 3.3 <a href="#BCP_47_Conformance">BCP
   8886     47 Conformance</a> </strong>
   8887 <ul>
   8888   <li>Reorganized for clarity, introduced new terms <em>Unicode BCP 47 locale identifier</em> and <em>Unicode CLDR locale identifier</em>. [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
   8889   </ul>
   8890   </li>
   8891   <li><strong>Section 3.3.1 <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag Conversion</a>
   8892     [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</strong>
   8893     <ul>
   8894       <li>Now handles private-use extensions and grandfathered tags.</li>
   8895       <li>Added more examples.</li>
   8896       <li>Separated into three conversions.
   8897         <ul>
   8898           <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to Unicode BCP 47 Locale Identifier</a>          </li>
   8899           <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Unicode_Locale_Identifier_CLDR_to_BCP_47">Unicode Locale Identifier: CLDR to BCP 47</a>          </li>
   8900           <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Unicode_Locale_Identifier_BCP_47_to_CLDR">Unicode Locale Identifier: BCP 47 to CLDR</a>          </li>
   8901         </ul>
   8902       </li>
   8903       </ul>
   8904   </li>
   8905   <li><strong>Section 3.4
   8906     <a href="#Field_Definitions">Language Identifier Field Definitions </a> 
   8907     </strong>
   8908     <ul>
   8909       <li>Added another macrolanguage example, ku (used for kmr), and a link to the Aliases chart
   8910       	[<a href="http://unicode.org/cldr/trac/ticket/11470">#11470</a>]</li>
   8911       <li>Documented special language subtags mis, mul, zxx [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
   8912       <li>Added special script code Qaag [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
   8913       <li>Documented special region subtags XA and XB [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
   8914       </ul>
   8915   </li>
   8916   <li><strong>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></strong>
   8917     <ul>
   8918       <li>Adjusted table to move Qaag, XA, and XB into <em>defined</em>. XA and XB were already correct in the identity file (a change in a previous release), but had not been added to that table. [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
   8919       </ul>
   8920   </li>
   8921   <li><strong>Section 3.6.4 <a href="#Unicode_Locale_Extension_Data_Files" >U Extension Data Files</a>
   8922     </strong>
   8923     <ul>
   8924       <li>Qualified valueType, since a key's value may be empty (if &quot;true&quot;). [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
   8925   </ul>
   8926   </li>
   8927   <li><strong>Section 3.6.5.1 <a  href="#Validity">Validity</a></strong> 
   8928     <ul>
   8929       <li>Softened the requirement that there be a region code matching the first 2 letters of the subdivision code. That was needlessly strict, and introduced a dependency on <em>likely subtags</em> that should not be there. [<a href="http://unicode.org/cldr/trac/ticket/11397">#11397</a>]</li>
   8930       </ul>
   8931   </li>
   8932   <li><strong>Section 4.2.6 <a 
   8933 				href="#Inheritance_vs_Related">Inheritance vs Related Information</a>
   8934   </strong>
   8935     <ul>
   8936       <li>Added table to explain the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching.</li>
   8937   </ul>
   8938   </li>
   8939   <li><strong>Section 5.3.3
   8940     <a href="#Unicode_Sets">Unicode Sets</a> 
   8941     </strong>
   8942     <ul>
   8943       <li>Clarified the relation between UnicodeSet and <a
   8944 				href="http://www.unicode.org/reports/tr41/#UTS18">UTS #18</a> [<a href="http://unicode.org/cldr/trac/ticket/11232">#11232</a>]</li>
   8945       </ul>
   8946   </li>
   8947   </ul>
   8948 <p><strong>Part 2: <a href="tr35-general.html#Contents">General</a>
   8949 		(display names &amp; transforms, etc.)
   8950 	</strong></p>
   8951 <ul>
   8952   <li><strong>Section 6 <a href="tr35-general.html#Unit_Elements">Unit Elements</a> </strong>
   8953     <ul>
   8954       <li>Added &lt;displayName&gt; element for &lt;coordinateUnit&gt;.
   8955         [<a href="http://unicode.org/cldr/trac/ticket/9986">#9986</a>]</li>
   8956       <li>Noted that unitPatterns can use explicit count values 0 and 1.
   8957       	[<a href="http://unicode.org/cldr/trac/ticket/10922">#10922</a>]</li>
   8958       <li>Defined the syntax  of unit identifiers [<a href="http://unicode.org/cldr/trac/ticket/11271">#11271</a>]</li>
   8959       <li>Added several new units: percent and permille, petabyte, and atmosphere.
   8960         [<a href="http://unicode.org/cldr/trac/ticket/10632">#10632</a>]
   8961         [<a href="http://unicode.org/cldr/trac/ticket/10410">#10410</a>]
   8962         [<a href="http://unicode.org/cldr/trac/ticket/10600">#10600</a>]</li>
   8963       </ul>
   8964   </li>
   8965   <li><strong>Section 10.1.1 <a href="tr35-general.html#Pivots">Pivots</a></strong>
   8966     <ul>
   8967       <li>Described the use of private use characters in Interindic. [<a href="http://unicode.org/cldr/trac/ticket/10962">#10962</a>]</li>
   8968     </ul>
   8969   </li>
   8970   </ul>
   8971 <p><strong>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
   8972 		(number &amp; currency formatting)
   8973 	</strong></p>
   8974 <ul>
   8975   <li><strong>Section 2.5 <a href="tr35-numbers.html#Miscellaneous_Patterns">Miscellaneous Patterns</a></strong>
   8976     <ul>
   8977       <li>Documented <strong>approximately</strong> and <strong>atMost</strong>. [<a href="http://unicode.org/cldr/trac/ticket/11354">#11354</a>]</li>
   8978       </ul>
   8979   </li>
   8980   <li><strong>Section 3.2 <a 
   8981 				href="tr35-numbers.html#Special_Pattern_Characters">Special Pattern Characters</a></strong>
   8982 
   8983     <ul>
   8984       <li>Documented edge cases for negative subpatterns (and whitespace)  [<a href="http://unicode.org/cldr/trac/ticket/10703">#10703</a>]</li>
   8985       </ul>
   8986   </li>
   8987   <li><strong>Section 3.4 <a href="tr35-numbers.html#sci">Scientific Notation</a> </strong>
   8988     <ul>
   8989       <li>Specified the special formats used for the integer parts. [<a href="http://unicode.org/cldr/trac/ticket/10103">#10103</a>]</li>
   8990     </ul>
   8991   </li>
   8992   <li><strong>Section 5 <a href="tr35-numbers.html#Language_Plural_Rules">Language Plural Rules</a></strong>
   8993     <ul>
   8994       <li>Added a new section <a href="tr35-numbers.html#Explicit_0_1_rules">Explicit 0 and
   8995         1 rules</a> covering the language-independent explicit plural cases 0 and 1.
   8996         [<a href="http://unicode.org/cldr/trac/ticket/10922">#10922</a>]</li>
   8997       </ul>
   8998   </li>
   8999   </ul>
   9000 
   9001 <p><strong>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
   9002 				time, time zone formatting)
   9003 	</strong></p>
   9004 <ul>
   9005   <li><strong>Section 2.6.3 <a  href="tr35-dates.html#intervalFormats">Element intervalFormats</a></strong>
   9006     <ul>
   9007       <li>Described how to synthesize intervalFormatItems for skeletons that combine date and time fields.
   9008       	[<a href="http://unicode.org/cldr/trac/ticket/10133">#10133</a>]  </li>
   9009     </ul>
   9010   </li>
   9011   <li><strong>Section 4.4 <a  href="tr35-dates.html#Time_Data">Time Data</a></strong>
   9012     <ul>
   9013       <li>Documented the relation between @allowed and @preferred. [<a href="http://unicode.org/cldr/trac/ticket/9930">#9930</a>]</li>
   9014     </ul>
   9015   </li>
   9016 </ul>
   9017 <p><strong>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
   9018 		(sorting, searching, grouping)
   9019 	</strong></p>
   9020 <ul>
   9021   <li><em>no changes</em></li>
   9022 </ul>
   9023 <p><strong>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
   9024 		(supplemental data)
   9025 	</strong></p>
   9026 <ul>
   9027   <li> <strong>Section 4 <a href="tr35-info.html#Supplemental_Code_Mapping">Supplemental
   9028   		Code Mapping</a></strong>
   9029     <ul>
   9030       <li>For the element &lt;territoryCodes&gt;, deprecated the internet attribute.
   9031       	[<a href="http://unicode.org/cldr/trac/ticket/11072">#11072</a>]</li>
   9032     </ul>
   9033   </li>
   9034 
   9035   <li> <strong>Section 5 <a href="tr35-info.html#Telephone_Code_Data">Telephone
   9036 				Code Data</a></strong>
   9037     <ul>
   9038       <li>Now deprecated, and data removed. [<a href="http://unicode.org/cldr/trac/ticket/10383">#10383</a>]</li>
   9039     </ul>
   9040   </li>
   9041 
   9042   <li> <strong>Section 9.3 <a href="tr35-info.html#Default_Content">Default
   9043 				Content</a></strong>
   9044     <ul>
   9045       <li>Added pointer to <strong>Section 4.2.6 <a 
   9046 				href="#Inheritance_vs_Related">Inheritance vs Related Information</a> </strong></li>
   9047   </ul>
   9048   </li>
   9049 </ul>
   9050 <p><strong>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
   9051 		(keyboard mappings)
   9052 	</strong>	  </p>
   9053 	<ul>
   9054   <li><em>no changes</em></li>
   9055 </ul>
   9056 
   9057 
   9058 <p>&nbsp;</p>
   9059 
   9060 
   9061 	  <p>Modifications in previous versions are listed in those respective versions. Click on <strong>Previous Version</strong> in the header until you get to the desired version.</p>
   9062 		
   9063 		<hr>
   9064 		<p class="copyright">
   9065 			Copyright &copy; 2001-2018 Unicode, Inc. All
   9066 			Rights Reserved. The Unicode Consortium makes no expressed or implied
   9067 			warranty of any kind, and assumes no liability for errors or
   9068 			omissions. No liability is assumed for incidental and consequential
   9069 			damages in connection with or arising out of the use of the
   9070 			information or programs contained or accompanying this technical
   9071 			report. The Unicode <a href="http://unicode.org/copyright.html">Terms
   9072 				of Use</a> apply.
   9073 		</p>
   9074 		<p class="copyright">Unicode and the Unicode logo are trademarks
   9075 			of Unicode, Inc., and are registered in some jurisdictions.</p>
   9076 	</div>
   9077 
   9078 </body>
   9079 
   9080 </html>
   9081