<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Precompiled Headers (PCH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css">
  <link type="text/css" rel="stylesheet" href="../content.css">
  <style type="text/css">
    td {
      vertical-align: top;
    }
  </style>
</head>

<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Precompiled Headers</h1>

<p>This document describes the design and implementation of Clang's
precompiled headers (PCH). If you are interested in the end-user
view, please see the <a
 href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>

<p><b>Table of Contents</b></p>
<ul>
  <li><a href="#usage">Using Precompiled Headers with
      <tt>clang</tt></a></li>
  <li><a href="#philosophy">Design Philosophy</a></li>
  <li><a href="#contents">Precompiled Header Contents</a>
  <ul>
    <li><a href="#metadata">Metadata Block</a></li>
    <li><a href="#sourcemgr">Source Manager Block</a></li>
    <li><a href="#preprocessor">Preprocessor Block</a></li>
    <li><a href="#types">Types Block</a></li>
    <li><a href="#decls">Declarations Block</a></li>
    <li><a href="#stmt">Statements and Expressions</a></li>
    <li><a href="#idtable">Identifier Table Block</a></li>
    <li><a href="#method-pool">Method Pool Block</a></li>
  </ul>
  </li>
  <li><a href="#tendrils">Precompiled Header Integration
      Points</a></li>
</ul>

<h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command line
options for generating and using PCH files.</p>

<p>To generate PCH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pch</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre>

<p>This option is transparently used by <tt>clang</tt> when generating
PCH files.
The resulting PCH file contains the serialized form of the
compiler's internal representation after it has completed parsing and
semantic analysis. The PCH file can then be used as a prefix header
with the <b><tt>-include-pch</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
</pre>

<h2 id="philosophy">Design Philosophy</h2>

<p>Precompiled headers are meant to improve overall compile times for
projects, so the design of precompiled headers is entirely driven by
performance concerns. The use case for precompiled headers is
relatively simple: when there is a common set of headers that is
included in nearly every source file in the project, we
<i>precompile</i> that bundle of headers into a single precompiled
header (PCH file). Then, when compiling the source files in the
project, we load the PCH file first (as a prefix header), which acts
as a stand-in for that bundle of headers.</p>

<p>A precompiled header implementation improves performance when:</p>
<ul>
  <li>Loading the PCH file is significantly faster than re-parsing the
  bundle of headers stored within the PCH file. Thus, a precompiled
  header design attempts to minimize the cost of reading the PCH
  file. Ideally, this cost should not vary with the size of the
  precompiled header file.</li>

  <li>The cost of generating the PCH file initially is not so large
  that it counters the per-source-file performance improvement due to
  eliminating the need to parse the bundled headers in the first
  place. This is particularly important on multi-core systems, because
  PCH file generation serializes the build when all compilations
  require the PCH file to be up-to-date.</li>
</ul>

<p>Clang's precompiled headers are designed with a compact on-disk
representation, which minimizes both PCH creation time and the time
required to initially load the PCH file.
The PCH file itself contains
a serialized representation of Clang's abstract syntax trees and
supporting data structures, stored using the same compressed bitstream
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
file format</a>.</p>

<p>Clang's precompiled headers are loaded "lazily" from disk. When a
PCH file is initially loaded, Clang reads only a small amount of data
from the PCH file to establish where certain important data structures
are stored. The amount of data read in this initial load is
independent of the size of the PCH file, such that a larger PCH file
does not lead to longer PCH load times. The actual header data in the
PCH file--macros, functions, variables, types, etc.--is loaded only
when it is referenced from the user's code, at which point only that
entity (and those entities it depends on) is deserialized from the
PCH file. With this approach, the cost of using a precompiled header
for a translation unit is proportional to the amount of code actually
used from the header, rather than being proportional to the size of
the header itself.</p>

<p>When given the <code>-print-stats</code> option, Clang produces
statistics describing how much of the precompiled header was actually
loaded from disk. For a simple "Hello, World!"
program that includes
the Apple <code>Cocoa.h</code> header (which is built as a precompiled
header), this option illustrates how little of the actual precompiled
header is required:</p>

<pre>
*** PCH Statistics:
  933 stat cache hits
    4 stat cache misses
  895/39981 source location entries read (2.238563%)
  19/15315 types read (0.124061%)
  20/82685 declarations read (0.024188%)
  154/58070 identifiers read (0.265197%)
  0/7260 selectors read (0.000000%)
  0/30842 statements read (0.000000%)
  4/8400 macros read (0.047619%)
  1/4995 lexical declcontexts read (0.020020%)
  0/4413 visible declcontexts read (0.000000%)
  0/7230 method pool entries read (0.000000%)
  0 method pool misses
</pre>

<p>For this small program, only a tiny fraction of the source
locations, types, declarations, identifiers, and macros were actually
deserialized from the precompiled header. These statistics can be
useful to determine whether the precompiled header implementation can
be improved by making more of the implementation lazy.</p>

<p>Precompiled headers can be chained. When you create a PCH while
including an existing PCH, Clang can create the new PCH by referencing
the original file and only writing the new data to the new file.
For
example, you could create a PCH out of all the headers that are very
commonly used throughout your project, and then create a PCH for every
single source file in the project that includes the code that is
specific to that file, so that recompiling the file itself is very fast,
without duplicating the data from the common headers for every file.</p>

<h2 id="contents">Precompiled Header Contents</h2>

<img src="PCHLayout.png" style="float:right" alt="Precompiled header layout">

<p>Clang's precompiled headers are organized into several different
blocks, each of which contains the serialized representation of a part
of Clang's internal representation. Each of the blocks corresponds to
either a block or a record within <a
 href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
format</a>. The contents of each of these logical blocks are described
below.</p>

<p>For a given precompiled header, the <a
href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a>
utility can be used to examine the actual structure of the bitstream
for the precompiled header. This information can be used both to help
understand the structure of the precompiled header and to isolate
areas where precompiled headers can still be optimized, e.g., through
the introduction of abbreviations.</p>

<h3 id="metadata">Metadata Block</h3>

<p>The metadata block contains several records that provide
information about how the precompiled header was built. This metadata
is primarily used to validate the use of a precompiled header. For
example, a precompiled header built for a 32-bit x86 target cannot be used
when compiling for a 64-bit x86 target.
The metadata block contains
information about:</p>

<dl>
  <dt>Language options</dt>
  <dd>Describes the particular language dialect used to compile the
PCH file, including major options (e.g., Objective-C support) and more
minor options (e.g., support for "//" comments). The contents of this
record correspond to the <code>LangOptions</code> class.</dd>

  <dt>Target architecture</dt>
  <dd>The target triple that describes the architecture, platform, and
ABI for which the PCH file was generated, e.g.,
<code>i386-apple-darwin9</code>.</dd>

  <dt>PCH version</dt>
  <dd>The major and minor version numbers of the precompiled header
format. Changes in the minor version number should not affect backward
compatibility, while changes in the major version number imply that a
newer compiler cannot read an older precompiled header (and
vice-versa).</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate the
precompiled header.</dd>

  <dt>Predefines buffer</dt>
  <dd>Although not explicitly stored as part of the metadata, the
predefines buffer is used in the validation of the precompiled header.
The predefines buffer itself contains code generated by the compiler
to initialize the preprocessor state according to the current target,
platform, and command-line options. For example, the predefines buffer
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
without Microsoft extensions.
The predefines buffer itself is stored
within the <a href="#sourcemgr">source manager block</a>, but its
contents are verified along with the rest of the metadata.</dd>

</dl>

<p>A chained PCH file (that is, one that references another PCH) has
a slightly different metadata block, which contains the following
information:</p>

<dl>
  <dt>Referenced file</dt>
  <dd>The name of the referenced PCH file. It is looked up like a file
specified using <tt>-include-pch</tt>.</dd>

  <dt>PCH version</dt>
  <dd>This is the same as in normal PCH files.</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate this
precompiled header.</dd>

</dl>

<p>The language options, target architecture, and predefines buffer data
are taken from the end of the chain, since they have to match anyway.</p>

<h3 id="sourcemgr">Source Manager Block</h3>

<p>The source manager block contains the serialized representation of
Clang's <a
 href="InternalsManual.html#SourceLocation">SourceManager</a> class,
which handles the mapping from source locations (as represented in
Clang's abstract syntax tree) into actual column/line positions within
a source file or macro instantiation. The precompiled header's
representation of the source manager also includes information about
all of the headers that were (transitively) included when building the
precompiled header.</p>

<p>The bulk of the source manager block is dedicated to information
about the various files, buffers, and macro instantiations to which
a source location can refer. Each of these is referenced by a numeric
"file ID", which is a unique number (allocated starting at 1) stored
in the source location.
Clang serializes the information for each kind
of file ID, along with an index that maps file IDs to the position
within the PCH file where the information about that file ID is
stored. The data associated with a file ID is loaded only when
required by the front end, e.g., to emit a diagnostic that includes a
macro instantiation history inside the header itself.</p>

<p>The source manager block also contains information about all of the
headers that were included when building the precompiled header. This
includes information about the controlling macro for the header (e.g.,
when the preprocessor identified that the contents of the header
depend on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
along with a cached version of the results of the <code>stat()</code>
system calls performed when building the precompiled header. The
latter is particularly useful in reducing system time when searching
for include files.</p>

<h3 id="preprocessor">Preprocessor Block</h3>

<p>The preprocessor block contains the serialized representation of
the preprocessor. Specifically, it contains all of the macros that
have been defined by the end of the header used to build the
precompiled header, along with the token sequences that comprise each
macro. The macro definitions are only read from the PCH file when the
name of the macro first occurs in the program. This lazy loading of
macro definitions is triggered by lookups into the <a
 href="#idtable">identifier table</a>.</p>

<h3 id="types">Types Block</h3>

<p>The types block contains the serialized representation of all of
the types referenced in the translation unit. Each Clang type node
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
corresponding record type in the PCH file.
When types are deserialized
from the precompiled header, the data within the record is used to
reconstruct the appropriate type node using the AST context.</p>

<p>Each type has a unique type ID, which is an integer that uniquely
identifies that type. Type ID 0 represents the NULL type, type IDs
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
(<code>void</code>, <code>float</code>, etc.), while other
"user-defined" type IDs are assigned consecutively from
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
The PCH file has an associated mapping from user-defined type IDs
to the location within the types block where the serialized
representation of that type resides, enabling lazy deserialization of
types. When a type is referenced from within the PCH file, that
reference is encoded using the type ID shifted left by 3 bits. The
lower three bits are used to represent the <code>const</code>,
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
Clang's <a
 href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
class.</p>

<h3 id="decls">Declarations Block</h3>

<p>The declarations block contains the serialized representation of
all of the declarations referenced in the translation unit. Each Clang
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
etc.) has a corresponding record type in the PCH file. When
declarations are deserialized from the precompiled header, the data
within the record is used to build and populate a new instance of the
corresponding <code>Decl</code> node. As with types, each declaration
node has a numeric ID that is used to refer to that declaration within
the PCH file.
In addition, a lookup table provides a mapping from that
numeric ID to the offset within the precompiled header where that
declaration is described.</p>

<p>Declarations in Clang's abstract syntax trees are stored
hierarchically. At the top of the hierarchy is the translation unit
(<code>TranslationUnitDecl</code>), which contains all of the
declarations in the translation unit. These declarations (such as
functions or struct types) may also contain other declarations inside
them, and so on. Within Clang, each declaration is stored within a <a
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
context</a>, as represented by the <code>DeclContext</code> class.
Declaration contexts provide the mechanism to perform name lookup
within a given declaration (e.g., find the member named <code>x</code>
in a structure) and iterate over the declarations stored within a
context (e.g., iterate over all of the fields of a structure for
structure layout).</p>

<p>In Clang's precompiled header format, deserializing a declaration
that is a <code>DeclContext</code> is a separate operation from
deserializing all of the declarations stored within that declaration
context. Therefore, Clang will deserialize the translation unit
declaration without deserializing the declarations within that
translation unit. When required, the declarations stored within a
declaration context will be deserialized.
There are two representations
of the declarations within a declaration context, which correspond to
the name-lookup and iteration behavior described above:</p>

<ul>
  <li>When the front end performs name lookup to find a name
  <code>x</code> within a given declaration context (for example,
  during semantic analysis of the expression <code>p->x</code>,
  where <code>p</code>'s type is defined in the precompiled header),
  Clang deserializes a hash table mapping from the names within that
  declaration context to the declaration IDs that represent each
  visible declaration with that name. The entire hash table is
  deserialized at this point (into the <code>llvm::DenseMap</code>
  stored within each <code>DeclContext</code> object), but the actual
  declarations are not yet deserialized. In a second step, those
  declarations with the name <code>x</code> will be deserialized and
  will be used as the result of name lookup.</li>

  <li>When the front end performs iteration over all of the
  declarations within a declaration context, all of those declarations
  are immediately deserialized. For large declaration contexts (e.g.,
  the translation unit), this operation is expensive; however, large
  declaration contexts are not traversed in normal compilation, since
  such a traversal is unnecessary. The code generator and semantic
  analysis do commonly traverse declaration contexts for
  structs, classes, unions, and enumerations, but those contexts
  contain relatively few declarations in the common case.</li>
</ul>

<h3 id="stmt">Statements and Expressions</h3>

<p>Statements and expressions are stored in the precompiled header in
both the <a href="#types">types</a> and the <a
 href="#decls">declarations</a> blocks, because every statement or
expression will be associated with either a type or declaration.
The
actual statement and expression records are stored immediately
following the declaration or type that owns the statement or
expression. For example, the statement representing the body of a
function will be stored directly following the declaration of the
function.</p>

<p>As with types and declarations, each statement and expression kind
in Clang's abstract syntax tree (<code>ForStmt</code>,
<code>CallExpr</code>, etc.) has a corresponding record type in the
precompiled header, which contains the serialized representation of
that statement or expression. Each substatement or subexpression
within an expression is stored as a separate record (which keeps most
records to a fixed size). Within the precompiled header, the
subexpressions of an expression are stored, in reverse order, prior to
the expression that owns those subexpressions, using a form of <a
href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse
Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code>
would be represented as follows:</p>

<table border="1">
  <tr><td><code>IntegerLiteral(5)</code></td></tr>
  <tr><td><code>IntegerLiteral(4)</code></td></tr>
  <tr><td><code>IntegerLiteral(3)</code></td></tr>
  <tr><td><code>BinaryOperator(-)</code></td></tr>
  <tr><td><code>BinaryOperator(+)</code></td></tr>
  <tr><td>STOP</td></tr>
</table>

<p>When reading this representation, Clang evaluates each expression
record it encounters, builds the appropriate abstract syntax tree node,
and then pushes that expression onto a stack. When a record contains <i>N</i>
subexpressions--<code>BinaryOperator</code> has two of them--those
expressions are popped from the top of the stack.
The special STOP
code indicates that we have reached the end of a serialized expression
or statement; other expression or statement records may follow, but
they are part of a different expression.</p>

<h3 id="idtable">Identifier Table Block</h3>

<p>The identifier table block contains an on-disk hash table that maps
each identifier mentioned within the precompiled header to the
serialized representation of the identifier's information (e.g., the
<code>IdentifierInfo</code> structure). The serialized representation
contains:</p>

<ul>
  <li>The actual identifier string.</li>
  <li>Flags that describe whether this identifier is the name of a
  built-in, a poisoned identifier, an extension token, or a
  macro.</li>
  <li>If the identifier names a macro, the offset of the macro
  definition within the <a href="#preprocessor">preprocessor
  block</a>.</li>
  <li>If the identifier names one or more declarations visible from
  translation unit scope, the <a href="#decls">declaration IDs</a> of these
  declarations.</li>
</ul>

<p>When a precompiled header is loaded, the precompiled header
mechanism introduces itself into the identifier table as an external
lookup source. Thus, when the user program refers to an identifier
that has not yet been seen, Clang will perform a lookup into the
identifier table. If an identifier is found, its contents (macro
definitions, flags, top-level declarations, etc.) will be
deserialized, at which point the corresponding
<code>IdentifierInfo</code> structure will have the same contents it
would have after parsing the headers in the precompiled header.</p>

<p>Within the PCH file, the identifiers used to name declarations are
represented with an integral value. A separate table provides a
mapping from this integral value (the identifier ID) to the location
within the on-disk
hash table where that identifier is stored.
This mapping is used when
deserializing the name of a declaration, the identifier of a token, or
any other construct in the PCH file that refers to a name.</p>

<h3 id="method-pool">Method Pool Block</h3>

<p>The method pool block is represented as an on-disk hash table that
serves two purposes: it provides a mapping from the names of
Objective-C selectors to the set of Objective-C instance and class
methods that have that particular selector (which is required for
semantic analysis in Objective-C) and also stores all of the selectors
used by entities within the precompiled header. The design of the
method pool is similar to that of the <a href="#idtable">identifier
table</a>: the first time a particular selector is formed during the
compilation of the program, Clang will search in the on-disk hash
table of selectors; if found, Clang will read the Objective-C methods
associated with that selector into the appropriate front-end data
structure (<code>Sema::InstanceMethodPool</code> and
<code>Sema::FactoryMethodPool</code> for instance and class methods,
respectively).</p>

<p>As with identifiers, selectors are represented by numeric values
within the PCH file. A separate index maps these numeric selector
values to the offset of the selector within the on-disk hash table,
and will be used when deserializing an Objective-C method declaration
(or other Objective-C construct) that refers to the selector.</p>

<h2 id="tendrils">Precompiled Header Integration Points</h2>

<p>The "lazy" deserialization behavior of precompiled headers requires
their integration into several completely different submodules of
Clang.
For example, lazily deserializing the declarations during name
lookup requires that the name-lookup routines be able to query the
precompiled header to find entities within the PCH file.</p>

<p>For each Clang data structure that requires direct interaction with
the precompiled header logic, there is an abstract class that provides
the interface between the two modules. The <code>PCHReader</code>
class, which handles the loading of a precompiled header, inherits
from all of these abstract classes to provide lazy deserialization of
Clang's data structures. <code>PCHReader</code> implements the
following abstract classes:</p>

<dl>
  <dt><code>StatSysCallCache</code></dt>
  <dd>This abstract interface is associated with the
    <code>FileManager</code> class, and is used whenever the file
    manager is going to perform a <code>stat()</code> system call.</dd>

  <dt><code>ExternalSLocEntrySource</code></dt>
  <dd>This abstract interface is associated with the
    <code>SourceManager</code> class, and is used whenever the
    <a href="#sourcemgr">source manager</a> needs to load the details
    of a file, buffer, or macro instantiation.</dd>

  <dt><code>IdentifierInfoLookup</code></dt>
  <dd>This abstract interface is associated with the
    <code>IdentifierTable</code> class, and is used whenever the
    program source refers to an identifier that has not yet been seen.
    In this case, the precompiled header implementation searches for
    this identifier within its <a href="#idtable">identifier table</a>
    to load any top-level declarations or macros associated with that
    identifier.</dd>

  <dt><code>ExternalASTSource</code></dt>
  <dd>This abstract interface is associated with the
    <code>ASTContext</code> class, and is used whenever the abstract
    syntax tree nodes need to be loaded from the precompiled header.
    It
    provides the ability to deserialize declarations and types
    identified by their numeric values, read the bodies of functions
    when required, and read the declarations stored within a
    declaration context (either for iteration or for name lookup).</dd>

  <dt><code>ExternalSemaSource</code></dt>
  <dd>This abstract interface is associated with the <code>Sema</code>
    class, and is used whenever semantic analysis needs to read
    information from the <a href="#method-pool">global method
    pool</a>.</dd>
</dl>

</div>

</body>
</html>