<style type="text/css">
/* default css */
table { font-size: 1em; line-height: inherit; }
tr { text-align: left; }
div, address, ol, ul, li, option, select { margin-top: 0px; margin-bottom: 0px; }
p { margin: 0px; }
body { margin: 6px; padding: 0px; font-family: Verdana, sans-serif; font-size: 10pt; background-color: #ffffff; }
img { -moz-force-broken-image-icon: 1; }
@media screen {
html.pageview { background-color: #f3f3f3 !important; }
body { min-height: 1100px; counter-reset: __goog_page__; }
* html body { height: 1100px; }
.pageview body { border-top: 1px solid #ccc; border-left: 1px solid #ccc; border-right: 2px solid #bbb; border-bottom: 2px solid #bbb; width: 648px !important; margin: 15px auto 25px; padding: 40px 50px; }
/* IE6 */
* html { overflow-y: scroll; }
* html.pageview body { overflow-x: auto; }
/* Prevent repaint errors when scrolling in Safari. This "Star-7" css hack
targets Safari 3.1, but not WebKit nightlies and presumably Safari 4.
That's OK because this bug is fixed in WebKit nightlies/Safari 4 :-).
*/
html*#wys_frame::before { content: '\A0'; position: fixed; overflow: hidden; width: 0; height: 0; top: 0; left: 0; }
.writely-callout-data { display: none; *display: inline-block; *width: 0; *height: 0; *overflow: hidden; }
.writely-footnote-marker { background-image: url('images/footnote_doc_icon.gif'); background-color: transparent; background-repeat: no-repeat; width: 7px; overflow: hidden; height: 16px; vertical-align: top; -moz-user-select: none; }
.editor .writely-footnote-marker { cursor: move; }
.writely-footnote-marker-highlight { background-position: -15px 0; -moz-user-select: text; }
.writely-footnote-hide-selection ::-moz-selection, .writely-footnote-hide-selection::-moz-selection { background: transparent; }
.writely-footnote-hide-selection ::selection, .writely-footnote-hide-selection::selection { background: transparent; }
.writely-footnote-hide-selection { cursor: move; }
.editor .writely-comment-yellow { background-color: #FF9; background-position: -240px 0; }
.editor .writely-comment-yellow-hover { background-color: #FF0; background-position: -224px 0; }
.editor .writely-comment-blue { background-color: #C0D3FF; background-position: -16px 0; }
.editor .writely-comment-blue-hover { background-color: #6292FE; background-position: 0 0; }
.editor .writely-comment-orange { background-color: #FFDEAD; background-position: -80px 0; }
.editor .writely-comment-orange-hover { background-color: #F90; background-position: -64px 0; }
.editor .writely-comment-green { background-color: #99FBB3; background-position: -48px 0; }
.editor .writely-comment-green-hover { background-color: #00F442; background-position: -32px 0; }
.editor .writely-comment-cyan { background-color: #CFF; background-position:
-208px 0; }
.editor .writely-comment-cyan-hover { background-color: #0FF; background-position: -192px 0; }
.editor .writely-comment-purple { background-color: #EBCCFF; background-position: -144px 0; }
.editor .writely-comment-purple-hover { background-color: #90F; background-position: -128px 0; }
.editor .writely-comment-magenta { background-color: #FCF; background-position: -112px 0; }
.editor .writely-comment-magenta-hover { background-color: #F0F; background-position: -96px 0; }
.editor .writely-comment-red { background-color: #FFCACA; background-position: -176px 0; }
.editor .writely-comment-red-hover { background-color: #FF7A7A; background-position: -160px 0; }
.editor .writely-comment-marker { background-image: url('images/markericons_horiz.gif'); background-color: transparent; padding-right: 11px; background-repeat: no-repeat; width: 16px; height: 16px; -moz-user-select: none; }
.editor .writely-comment-hidden { padding: 0; background: none; }
.editor .writely-comment-marker-hidden { background: none; padding: 0; width: 0; }
.editor .writely-comment-none { opacity: .2; filter:progid:DXImageTransform.Microsoft.Alpha(opacity=20); -moz-opacity: .2; }
.editor .writely-comment-none-hover { opacity: .2; filter:progid:DXImageTransform.Microsoft.Alpha(opacity=20); -moz-opacity: .2; }
.br_fix br:not(:-moz-last-node):not(:-moz-first-node) { position:relative; left: -1ex }
.br_fix br+br { position: static !important }
}
h6 { font-size: 8pt }
h5 { font-size: 8pt }
h4 { font-size: 10pt }
h3 { font-size: 12pt }
h2 { font-size: 14pt }
h1 { font-size: 18pt }
blockquote {padding: 10px; border: 1px #DDD dashed }
a img {border: 0}
.pb { border-width: 0; page-break-after: always;
/* We don't want this to be resizeable, so enforce a width and height
using !important */
height: 1px !important; width: 100% !important; }
.editor .pb { border-top: 1px dashed #C0C0C0; border-bottom: 1px dashed #C0C0C0; }
div.google_header, div.google_footer { position: relative; margin-top: 1em; margin-bottom: 1em; }
/* Table of contents */
.editor div.writely-toc { background-color: #f3f3f3; border: 1px solid #ccc; }
.writely-toc > ol { padding-left: 3em; font-weight: bold; }
ol.writely-toc-subheading { padding-left: 1em; font-weight: normal; }
/* IE6 only */
* html writely-toc ol { list-style-position: inside; }
.writely-toc-none { list-style-type: none; }
.writely-toc-decimal { list-style-type: decimal; }
.writely-toc-upper-alpha { list-style-type: upper-alpha; }
.writely-toc-lower-alpha { list-style-type: lower-alpha; }
.writely-toc-upper-roman { list-style-type: upper-roman; }
.writely-toc-lower-roman { list-style-type: lower-roman; }
.writely-toc-disc { list-style-type: disc; }
/* Ordered lists converted to numbered lists can preserve ordered types, and
vice versa.
This is confusing, so disallow it */
ul[type="i"], ul[type="I"], ul[type="1"], ul[type="a"], ul[type="A"] { list-style-type: disc; }
ol[type="disc"], ol[type="circle"], ol[type="square"] { list-style-type: decimal; }
/* end default css */
/* custom css */
/* end custom css */
/* ui edited css */
body { font-family: Verdana; font-size: 10.0pt; line-height: normal; background-color: #ffffff; }
/* end ui edited css */
/* editor CSS */
.editor a:visited {color: #551A8B}
.editor table.zeroBorder {border: 1px dotted gray}
.editor table.zeroBorder td {border: 1px dotted gray}
.editor table.zeroBorder th {border: 1px dotted gray}
.editor div.google_header, .editor div.google_footer { border: 2px #DDDDDD dashed; position: static; width: 100%; min-height: 2em; }
.editor .misspell {background-color: yellow}
.editor .writely-comment { font-size: 9pt; line-height: 1.4; padding: 1px; border: 1px dashed #C0C0C0 }
/* end editor CSS */
</style>
<style>
body { margin: 0px; }
#doc-contents { margin: 6px; }
#google-view-footer { clear: both; border-top: thin solid; padding-top: 0.3em; padding-bottom: 0.3em; }
a.google-small-link:link, a.google-small-link:visited { color:#112ABB; font-family:Arial,Sans-serif; font-size:11px !important; }
body, p, div, td { direction: inherit; }
@media print {
#google-view-footer { display: none; }
}
</style>
<script>
function viewOnLoad() {
  if (document.location.href.indexOf('spi=1') != -1) {
    if (navigator.userAgent.toLowerCase().indexOf('msie') != -1) {
      window.print();
    } else {
      window.setTimeout(window.print, 10);
    }
  }
  if (document.location.href.indexOf('hgd=1') != -1) {
    var footer = document.getElementById("google-view-footer");
    if (footer) {
      footer.style.display = 'none';
    }
  }
}
</script>
</head>
<body>
<div id="doc-contents">
<div>

<h1 style="text-align: center;">
Google XML Document Format Style Guide</h1><div style="text-align: center;">Version 1.0<br>Copyright Google 2008<br><br></div><h2>Introduction</h2>This document provides a set of guidelines for general use when designing new XML document formats (and to some extent XML documents as well; see Section 11). Document formats usually include both formal parts (DTDs, schemas) and parts expressed in normative English prose.<br><br>These guidelines apply to new designs, and are not intended to force retroactive changes in existing designs. When participating in the creation of public or private document format designs, the guidelines may be helpful but should not control the group consensus.<br><br>This guide is meant for the design of XML that is to be generated and consumed by machines rather than human beings. Its rules are <i>not applicable</i> to formats such as XHTML (which should be formatted as much like HTML as possible) or ODF, which are meant to express rich text. A document that includes embedded content in XHTML or some other rich-text format, but also contains purely machine-interpretable portions, SHOULD follow this style guide for the machine-interpretable portions. This guide also does not apply to XML document formats that are generated by translation from protocol buffers or from some other format.<br><br>Brief rationales have been added to most of the guidelines. They are maintained in the same document in hopes that they won't get out of date, but they are not considered normative.<br><br>The terms MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are used in this document in the sense of <a title="RFC 2119" href="https://www.ietf.org/rfc/rfc2119.txt" id="iecm">RFC 2119.</a><br> <br><h2>1. 
To design or not to design, that is the question<br></h2><ol><li>Attempt to reuse existing XML formats whenever possible, especially those which allow extensions. Creating an entirely new format should be done only with care and consideration; read <a title="Tim Bray's warnings" href="https://www.tbray.org/ongoing/When/200x/2006/01/08/No-New-XML-Languages" id="d3cy">Tim Bray's warnings</a> first. Try to get wide review of your format, from outside your organization as well, if possible. [<i>Rationale:</i> New document formats have a cost: they must be reviewed, documented, and learned by users.]<br><br></li><li>If you are reusing or extending an existing format, make <i>sensible</i>
use of the prescribed elements and attributes, especially any that are
required. Don't completely repurpose them, but do try to see how they
might be used in creative ways if the vanilla semantics aren't
suitable. As a last resort when an element or attribute is required by the format but is not appropriate for your use case, use some
fixed string as its value. [<i>Rationale:</i> Markup reuse is good, markup abuse is bad.]<br><br></li><li>When extending formats, use the implicit style of the existing format, even if it contradicts this guide. [<i>Rationale: </i>Consistency.]<br></li></ol><br><h2>2. Schemas</h2><ol><li>Document formats SHOULD be expressed using a schema language. [<i>Rationale: </i>Clarity and machine-checkability.]<br><br></li><li>The schema language SHOULD be <a title="RELAX NG" href="http://www.relaxng.org/" id="p1s7">RELAX NG</a> <a title="compact syntax" href="http://www.relaxng.org/compact-tutorial-20030326.html" id="ulci">compact syntax</a>. Embedded <a title="Schematron" href="http://www.schematron.com/" id="ymh-">Schematron</a> rules MAY be added to the schema for additional fine control. [<i>Rationale:</i>
RELAX NG is the most flexible schema language, with very few arbitrary
restrictions on designs. 
The compact syntax is quite easy to read and
learn, and can be converted one-to-one to and from the XML syntax when
necessary. Schematron handles arbitrary cross-element and
cross-attribute constraints nicely.]<br><br></li><li>Schemas SHOULD use the <a title="&quot;Salami Slice&quot; style" href="http://www.xfront.com/GlobalVersusLocal.html#SecondDesign" id="r:fj">"Salami Slice" style</a> (one rule per element). Schemas MAY use the <a title="&quot;Russian Doll&quot; style" href="http://www.xfront.com/GlobalVersusLocal.html#FirstDesign" id="h14y">"Russian Doll" style</a> (schema resembles document) if they are short and simple. The <a title="&quot;Venetian Blind&quot; style" href="http://www.xfront.com/GlobalVersusLocal.html#ThirdDesign" id="dr_g">"Venetian Blind" style</a> (one rule per element type) is unsuited to RELAX NG and SHOULD NOT be used.<br><br></li><li>Regular expressions SHOULD be provided to assist in validating complex values.<br><br></li><li>DTDs and/or W3C XML Schemas MAY be provided for compatibility with existing products, tools, or users. [<i>Rationale:</i> We can't change the world all at once.]<br></li></ol></div><div><br><h2>3. Namespaces</h2><ol><li>Element names MUST be in a namespace, except
when extending pre-existing document types that do not use namespaces.
A default namespace SHOULD be used. [<i>Rationale:</i> Namespace-free
documents are obsolete; every set of names should be in some
namespace. Using a default namespace improves readability.]<br><br></li><li>Attribute
names SHOULD NOT be in a namespace unless they are drawn from a foreign
document type or are meant to be used in foreign document types. [<i>Rationale:</i> Attribute names in a namespace must always have a prefix, which is annoying to type and hard to read.]<br><br>
</li><li>Namespace names are HTTP URIs. 
Namespace names SHOULD take the form <span style="font-family: Courier New;">https://example.com/</span><i style="font-family: Courier New;">whatever</i><span style="font-family: Courier New;">/</span><i><span style="font-family: Courier New;">year</span>, </i>where <i>whatever</i> is a unique value based on the name of the document type, and <i>year</i>
is the year the namespace was created. There may be additional URI-path parts
before the <i>year.</i> [<i>Rationale:</i> Existing convention. Providing the year allows for the possible recycling of code names.]<br><br></li><li>Namespaces MUST NOT be changed unless the semantics of particular elements or attributes have changed in drastically incompatible ways. [<i>Rationale:</i> Changing the namespace requires changing all client code.]<br><br></li><li>Namespace prefixes SHOULD be short (but not so short that they are likely to conflict with another project). Single-letter prefixes MUST NOT be used. Prefixes SHOULD contain only lower-case ASCII letters. [<i>Rationale:</i> Ease of typing and absence of encoding compatibility problems.]</li></ol><br>

<h2>4. Names and enumerated values</h2><b>Note: </b>"Names" refers to the names of elements, attributes, and enumerated values.<br><br><ol><li>All names MUST use lowerCamelCase. That is, they start with an initial lower-case letter, then each new word within the name starts with an initial capital letter. [<i>Rationale:</i> Adopting a single style provides consistency, which helps when referring to names since the capitalization is known and so does not have to be remembered. It matches Java style, and other languages can be dealt with using automated name conversion.]<br><br></li><li>Names MUST contain only ASCII letters and digits.
[<i>Rationale:</i> Ease of typing and absence of encoding compatibility problems.]<br> <br></li><li>Names SHOULD NOT exceed 25 characters. 
Longer names SHOULD be
avoided by devising concise and informative names. If a name can only remain within this limit by becoming obscure, the limit SHOULD be ignored. [<i>Rationale: </i>Longer names are awkward to use and require additional bandwidth.]<br><br></li><li>Published standard abbreviations, if sufficiently well-known, MAY be employed in constructing names. Ad hoc abbreviations MUST NOT be used. Acronyms MUST be treated as words for camel-casing purposes: informationUri, not informationURI. [<i>Rationale:</i> An abbreviation that is well known
to one community is often incomprehensible to others who need to use
the same document format (and who do understand the full name); treating an acronym as a word makes it easier to see where the word boundaries are.] <br></li></ol><p><br></p><p>
</p><h2>
5. Elements</h2><ol><li>All elements MUST contain either nothing, character content, or child elements. Mixed content MUST NOT be used. [<i>Rationale:</i> Many XML data models don't handle mixed content properly, and its use makes the element order-dependent. As always, textual formats are not covered by this rule.]<br><br></li><li>XML elements that merely wrap repeating child elements SHOULD NOT be used. [<i>Rationale:</i> They are not used in Atom and add nothing.]</li></ol>

<p><br></p><h2>6. Attributes</h2><ol><li>Document formats MUST NOT depend on the order of attributes in a start-tag. [<i>Rationale:</i> Few XML parsers report the order, and it is not part of the XML Infoset.]<br><br></li><li>Elements SHOULD NOT be overloaded with too many attributes (no more
than 10 as a rule of thumb). Instead, use child elements to
encapsulate closely related attributes. 
[<i>Rationale:</i> This
approach maintains the built-in extensibility that XML provides with
elements, and is useful for providing forward compatibility as a
specification evolves.]<br><br></li><li>Attributes MUST NOT be used to hold values in which line breaks are significant. [<i>Rationale:</i> Such line breaks are converted to spaces by conformant XML parsers.]<br><br></li><li>Document formats MUST allow either single or double quotation marks around attribute values. [<i>Rationale:</i> XML parsers don't report the difference.]<br></li></ol>

<p><br></p>
<p>
</p><h2>
7. Values</h2><ol><li>Numeric values SHOULD be 32-bit signed integers, 64-bit signed integers, or 64-bit IEEE doubles, all expressed in base 10. These correspond to the XML Schema types <span style="font-family: Courier New;">xsd:int</span>, <span style="font-family: Courier New;">xsd:long</span>, and <span style="font-family: Courier New;">xsd:double</span> respectively. If required in particular cases, <span style="font-family: Courier New;">xsd:integer</span> (unlimited-precision integer) values MAY also be used. [<i>Rationale:</i> There are far too many numeric types in XML Schema: these provide a reasonable subset.] <br><br></li><li>
Boolean values SHOULD NOT be used (use enumerations instead). If they must be used, they MUST be expressed as <span style="font-family: Courier New;">true</span> or <span style="font-family: Courier New;">false</span>, corresponding to a subset of the XML Schema type <span style="font-family: Courier New;">xsd:boolean</span>. The alternative <span style="font-family: Courier New;">xsd:boolean</span> values <span style="font-family: Courier New;">1</span> and <span style="font-family: Courier New;">0</span> MUST NOT be used. [<i>Rationale:</i> Boolean arguments are not extensible. 
The additional flexibility of allowing numeric values is not abstracted away by any parser.]<br><br></li><li>Dates should be represented using <a title="RFC 3339" href="https://www.ietf.org/rfc/rfc3339.txt" id="sk98">RFC 3339</a> format, a subset of both
ISO 8601 format and XML Schema <span style="font-family: Courier New;">xsd:dateTime</span> format. UTC times SHOULD be used rather than local times.
[<i>Rationale:</i> There are far too many date formats and time zones, although it is recognized that sometimes local time preserves important information.]<br><br></li><li>Embedded syntax in character content and attribute values SHOULD NOT be
used. Syntax in values means XML tools are largely useless. Syntaxes such as dates, URIs, and
XPath expressions are exceptions. [<i>Rationale:</i>
Users should be able to process XML documents using only an XML parser
without requiring additional special-purpose parsers, which are easy to
get wrong.]<br><br></li><li>Be careful with whitespace in values. XML parsers don't strip whitespace in elements, but do convert newlines to spaces in attributes. However, application frameworks may do more aggressive whitespace stripping. Your document format SHOULD give rules for whitespace stripping.<br></li></ol>

<p><br>
</p>
<p>
</p><h2>8. Key-value pairs<br></h2><ol><li>
Simple key-value pairs SHOULD be represented with an empty element whose name represents the key, with the <span style="font-family: Courier New;">value</span> attribute containing the value. Elements that have a <span style="font-family: Courier New;">value</span> attribute MAY also have a <span style="font-family: Courier New;">unit</span> attribute to specify the unit of a measured value. For physical measurements, the <a title="SI system" href="https://en.wikipedia.org/wiki/International_System_of_Units" id="rhxg">SI system</a> SHOULD be used. [<i>Rationale:</i>
Simplicity and design consistency. 
Keeping the value in an attribute
hides it from the user, since displaying just the value without the key is not useful.]<br><br></li><li>If the number of possible keys is very large or unbounded, key-value pairs MAY be represented by a single generic element with <span style="font-family: Courier New;">key</span>, <span style="font-family: Courier New;">value</span>, and optional <span style="font-family: Courier New;">unit</span> and <span style="font-family: Courier New;">scheme</span>
attributes (which serve to discriminate keys from different domains).
In that case, also provide (not necessarily in the same document) a
list of keys with human-readable explanations.</li></ol><br><h2>9. Binary data</h2><p><b>Note: </b>There are no hard and fast rules about whether binary data should be included as part of an XML document or not. If it's too large, it's probably better to link to it.</p><p><br></p><ol><li>Binary data MUST NOT be included directly as-is in XML documents, but MUST be encoded using Base64 encoding. [<i>Rationale:</i> XML does not allow arbitrary binary bytes.]<br><br></li><li>
The line breaks required by Base64 MAY be omitted. [<i>Rationale:</i> The line breaks are meant to keep plain text lines short, but XML is not really plain text.]<br><br></li><li>An attribute named <span style="font-family: Courier New;">xsi:type</span> with value <span style="font-family: Courier New;">xs:base64Binary</span> MAY be attached to the element that contains the encoded data to signal that the Base64 format is in use. [<i>Rationale:</i> Opaque blobs should have decoding instructions attached.]<br><br></li></ol>
<h2>10. Processing instructions</h2><ol><li>New processing instructions MUST NOT be created except in order to specify purely local processing conventions, and SHOULD be avoided altogether. Existing standardized processing instructions MAY be used. 
[<i>Rationale:</i> Processing instructions fit awkwardly into XML data models and can always be replaced by elements; they exist primarily to avoid breaking backward compatibility.]</li></ol><p> </p>

<p>
</p><p> </p><h2>11. Representation of XML document instances<br></h2><p><b>Note:</b> These points are only guidelines, as the format of program-created instances will often be outside the programmer's control (for example, when an XML serialization library is being used). <i>In no case</i> should XML parsers rely on these guidelines being followed. Use standard XML parsers, not hand-rolled hacks.<br></p><p><br></p><ol><li>The character encoding used SHOULD be UTF-8. Exceptions should require extremely compelling circumstances. [<i>Rationale:</i> UTF-8 is universal and in common use.]<br><br></li><li>Namespaces SHOULD be declared in the root element of a document wherever possible. [<i>Rationale: </i>Clarity and consistency.]<br><br></li><li>The mapping of namespace URIs to prefixes SHOULD remain constant throughout the document, and SHOULD also be used in documentation of the design. [<i>Rationale: </i>Clarity and consistency.]<br><br></li><li>Well-known prefixes such as html: (for XHTML), dc: (for Dublin Core metadata), and xs: (for XML Schema) should be used for standard namespaces. [<i>Rationale:</i> Human readability.]<br><br></li><li>Redundant whitespace in a tag SHOULD NOT be
used. Use one space before each attribute in a start-tag; if the
start-tag is too long, the space MAY be replaced by a newline. [<i>Rationale:</i> Consistency and conciseness.]<br><br></li><li>Empty elements MAY be expressed as empty-element tags or as a start-tag
immediately followed by an end-tag. No distinction should be made
between these two forms by any application. [<i>Rationale:</i> They are not distinguished by XML parsers.]<br><br></li><li>Documents MAY be pretty-printed using 2-space indentation for child
elements. 
Elements that contain character content SHOULD NOT be
wrapped. Long start-tags MAY be broken using newlines (possibly with extra indentation) after any attribute value except the last. [<i>Rationale:</i> General compatibility with our style. Wrapping character content affects its value.]<br><br></li><li>Attribute values MAY be surrounded with either quotation marks or apostrophes.
Specifications MUST NOT require or forbid the use of either form. <span style="font-family: Courier New;">&amp;apos;</span> and <span style="font-family: Courier New;">&amp;quot;</span> may be freely used to escape each type of quote. [<i>Rationale:</i> No XML parsers report the distinction.]<br><br>

</li><li>Comments MUST NOT be used to carry real data. Comments MAY be used to contain TODOs in hand-written XML. Comments SHOULD NOT be used at all in publicly transmitted documents. [<i>Rationale: </i>Comments are often discarded by parsers.]<br><br></li><li>If comments are nevertheless used, they SHOULD appear only in the document prolog or in elements that
contain child elements. If pretty-printing is required, pretty-print
comments like elements, but with line wrapping. Comments SHOULD NOT
appear in elements that contain character content. [<i>Rationale: </i>Whitespace in and around comments improves readability, but embedding a
comment in character content can lead to confusion about what
whitespace is or is not in the content.]<br><br></li><li>Comments SHOULD have whitespace following <span style="font-family: Courier New;">&lt;!--</span> and preceding <span style="font-family: Courier New;">--&gt;</span>. [<i>Rationale:</i> Readability.]<br><br></li><li>CDATA sections MAY be used; they are equivalent to the use of <span style="font-family: Courier New;">&amp;amp;</span> and <span style="font-family: Courier New;">&amp;lt;</span>. Specifications MUST NOT require or forbid the use of CDATA sections. 
[<i>Rationale:</i> Few XML parsers report the distinction, and combinations of CDATA and text are often reported as single objects anyway.]<br><br></li><li>Entity references other than the XML standard entity references <span style="font-family: Courier New;">&amp;amp;</span>, <span style="font-family: Courier New;">&amp;lt;</span>, <span style="font-family: Courier New;">&amp;gt;</span>, <span style="font-family: Courier New;">&amp;quot;</span>, and <span style="font-family: Courier New;">&amp;apos;</span> MUST NOT be used. Character references MAY be used, but actual characters are preferred, unless the character encoding is not UTF-8. As usual, textual formats are exempt from this rule.<br></li></ol>

<br><p> </p><p>
</p>
<p>
</p><br><br><h2>
12. Elements vs. Attributes
</h2>
<p>
<b>Note:</b> There are no hard and fast rules for deciding when to use attributes and when to use elements. Here are some of the considerations that designers should take into account; no rationales are given.
</p>
<h3>
12.1 General points<br>
</h3>

<ol>
<li>
<p>
Attributes are more restrictive than elements, and all designs have some elements, so an all-element design is simplest -- which is not the same as best.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
In a tree-style data model, elements are typically represented internally as nodes, which use more memory than the strings used to represent attributes. Sometimes the nodes are of different application-specific classes, which in many languages also takes up memory to represent the classes.
</p>
<p>
<br>

</p>
</li>
<li>
<p>
When streaming, elements are processed one at a time (possibly even piece by piece, depending on the XML parser you are using), whereas all the attributes of an element and their values are reported at once, which costs memory, particularly if some attribute values are very long. 
</p>
<p>
<br>
</p>
</li>
<li>
<p>
Both element content and attribute values need to be escaped appropriately, so escaping should not be a consideration in the design.
</p>
<p>
<br>
</p>

</li>
<li>
<p>
In some programming languages and libraries, processing elements is easier; in others, processing attributes is easier. Beware of using ease of processing as a criterion. In particular, XSLT can handle either with equal facility.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If a piece of data should usually be shown to the user, consider using an element; if not, consider using an attribute. (This rule is often violated for one reason or another.)

</p>
<p>
<br>
</p>
</li>
<li>
<p>
If you are extending an existing schema, do things by analogy to how things are done in that schema.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
Sensible schema languages, meaning RELAX NG and Schematron, treat elements and attributes symmetrically. Older and cruder schema languages, such as DTDs and <a href="https://www.w3.org/TR/2004/REC-xmlschema-0-20041028/" id="h2c3" title="XML Schema">XML Schema</a>, tend to have better support for elements.

</p>
</li>
</ol>
<p>
</p>
<h3>
12.2 Using elements<br>
</h3>
<ol>
<li>
<p>
If something might appear more than once in a data model, use an element rather than introducing attributes with names like <span style="font-family: Courier New;">foo1, foo2, foo3</span> ....
</p>

<p>
<br>
</p>
</li>
<li>
<p>
Use elements to represent a piece of information that can be considered an independent object and when the information is related via a parent/child relationship to another piece of information. 
</p>
<p>
<br>
</p>
</li>
<li>
<p>
Use elements when data incorporates strict typing or relationship rules.
</p>
<p>

<br>
</p>
</li>
<li>
<p>
If order matters between two pieces of data, use elements for them: attributes are inherently unordered.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If a piece of data has, or might have, its own substructure, put it in an element: getting substructure into an attribute is always messy. Similarly, if the data is a constituent part of some larger piece of data, put it in an element.
</p>

<p>
<br>
</p>
</li>
<li>
<p>
An exception to the previous rule: multiple whitespace-separated tokens can safely be put in an attribute. In principle, the separator can be anything, but schema-language validators are currently only able to handle whitespace, so it's best to stick with that.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If a piece of data extends across multiple lines, use an element: XML parsers will change newlines in attribute values into spaces.

<br><br></p></li><li>If a piece of data is very large, use an element so that its content can be streamed.<br><br></li>
<li>
<p>
If a piece of data is in a natural language, put it in an element so you can use the <span style="font-family: Courier New;">xml:lang</span> attribute to label the language being used. Some kinds of natural-language text, like Japanese, often make use of <a href="https://www.w3.org/TR/2001/REC-ruby-20010531" id="pa2f" title="annotations">annotations</a> that are conventionally represented using child elements; right-to-left languages like Hebrew and Arabic may similarly require child elements to manage <a href="https://www.w3.org/TR/2001/REC-ruby-20010531" id="ehyv" title="bidirectionality">bidirectionality</a> properly. 
</p>
<p>
</p>
</li>
</ol>
<h3>
12.3 Using attributes<br>
</h3>
<ol>
<li>
<p>
If the data is a code from an enumeration, code list, or controlled vocabulary, put it in an attribute if possible. For example, language tags, currency codes, medical diagnostic codes, etc. are best handled as attributes.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If a piece of data is really metadata on some other piece of data (for example, representing a class or role that the main data serves, or specifying a method of processing it), put it in an attribute if possible.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
In particular, if a piece of data is an ID for some other piece of data, or a reference to such an ID, put the identifying piece in an attribute. When it's an ID, use the name <span style="font-family: Courier New;">xml:id</span> for the attribute.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
Hypertext references are conventionally put in <span style="font-family: Courier New;">href</span> attributes.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If a piece of data is applicable to an element and any descendant elements unless it is overridden in some of them, it is conventional to put it in an attribute. Well-known examples are <span style="font-family: Courier New;">xml:lang</span>, <span style="font-family: Courier New;">xml:space</span>, <span style="font-family: Courier New;">xml:base</span>, and namespace declarations.
</p>
<p>
<br>
</p>
</li>
<li>
<p>
If terseness is really the <i>most</i> important thing, use attributes, but consider <span style="font-family: Courier New;">gzip</span> compression instead -- it works very well on documents with highly repetitive structures.</p></li>
</ol></div><br><div><div><div><div><div>
<br><h2>13. 
Parting words
</h2>
<p>
</p><p>
Use common sense and <i>BE CONSISTENT</i>. Design for extensibility. You <i>are</i> gonna need it. [<i>Rationale:</i> Long and painful experience.]<br></p><p><br> </p>
<p>
When designing XML formats, take a few minutes to look at other formats and determine their style. The point of having style guidelines is so that people can concentrate on what you are saying, rather than on how you are saying it. <br></p><p>
<br>
Break <i>ANY OR ALL</i> of these rules (yes, even the ones that say MUST) rather than create a crude, arbitrary, disgusting mess of a design if that's what following them slavishly would give you. In particular, random mixtures of attributes and child elements are hard to follow and hard to use, though it often makes good sense to use both when the data clearly fall into two different groups such as simple/complex or metadata/data.
</p>
<div><p>
<br>
Newbies always ask:
</p>
<p>
"Elements or attributes?
</p>
<p>
Which will serve me best?"
</p>
<p>
Those who know roar like lions;
</p>
<p>
Wise hackers smile like tigers.
</p>
<p>
--a <a href="https://en.wikipedia.org/wiki/Waka_%28poetry%29#Forms_of_waka" id="s3k3" title="tanka">tanka</a>, or extended haiku
</p>
</div>
<p>
<br>
</p>
<br>[TODO: if a registry of schemas is set up, add a link to it]<br><br></div><br></div><br></div></div></div><br>
<br clear="all"/>
</div>