<html>
<head>
<title>Precompiled Headers (PCH)</title>
<link type="text/css" rel="stylesheet" href="../menu.css" />
<link type="text/css" rel="stylesheet" href="../content.css" />
<style type="text/css">
  td {
    vertical-align: top;
  }
</style>
</head>

<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Precompiled Headers</h1>

<p>This document describes the design and implementation of Clang's
precompiled headers (PCH). If you are interested in the end-user
view, please see the <a
href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>

<p><b>Table of Contents</b></p>
<ul>
  <li><a href="#usage">Using Precompiled Headers with
      <tt>clang</tt></a></li>
  <li><a href="#philosophy">Design Philosophy</a></li>
  <li><a href="#contents">Precompiled Header Contents</a>
  <ul>
    <li><a href="#metadata">Metadata Block</a></li>
    <li><a href="#sourcemgr">Source Manager Block</a></li>
    <li><a href="#preprocessor">Preprocessor Block</a></li>
    <li><a href="#types">Types Block</a></li>
    <li><a href="#decls">Declarations Block</a></li>
    <li><a href="#stmt">Statements and Expressions</a></li>
    <li><a href="#idtable">Identifier Table Block</a></li>
    <li><a href="#method-pool">Method Pool Block</a></li>
  </ul>
  </li>
  <li><a href="#tendrils">Precompiled Header Integration
      Points</a></li>
</ul>

<h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command-line
options for generating and using PCH files.</p>

<p>To generate PCH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pch</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre>

<p>This option is transparently used by <tt>clang</tt> when generating
PCH files.
The resulting PCH file contains the serialized form of the
compiler's internal representation after it has completed parsing and
semantic analysis. The PCH file can then be used as a prefix header
with the <b><tt>-include-pch</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
</pre>

<h2 id="philosophy">Design Philosophy</h2>

<p>Precompiled headers are meant to improve overall compile times for
projects, so the design of precompiled headers is entirely driven by
performance concerns. The use case for precompiled headers is
relatively simple: when there is a common set of headers that is
included in nearly every source file in the project, we
<i>precompile</i> that bundle of headers into a single precompiled
header (PCH file). Then, when compiling the source files in the
project, we load the PCH file first (as a prefix header), which acts
as a stand-in for that bundle of headers.</p>

<p>A precompiled header implementation improves performance when:</p>
<ul>
  <li>Loading the PCH file is significantly faster than re-parsing the
  bundle of headers stored within the PCH file. Thus, a precompiled
  header design attempts to minimize the cost of reading the PCH
  file. Ideally, this cost should not vary with the size of the
  precompiled header file.</li>

  <li>The cost of generating the PCH file initially is not so large
  that it counters the per-source-file performance improvement due to
  eliminating the need to parse the bundled headers in the first
  place. This is particularly important on multi-core systems, because
  PCH file generation serializes the build when all compilations
  require the PCH file to be up-to-date.</li>
</ul>

<p>Clang's precompiled headers are designed with a compact on-disk
representation, which minimizes both PCH creation time and the time
required to initially load the PCH file.
The PCH file itself contains
a serialized representation of Clang's abstract syntax trees and
supporting data structures, stored using the same compressed bitstream
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
file format</a>.</p>

<p>Clang's precompiled headers are loaded "lazily" from disk. When a
PCH file is initially loaded, Clang reads only a small amount of data
from the PCH file to establish where certain important data structures
are stored. The amount of data read in this initial load is
independent of the size of the PCH file, such that a larger PCH file
does not lead to longer PCH load times. The actual header data in the
PCH file--macros, functions, variables, types, etc.--is loaded only
when it is referenced from the user's code, at which point only that
entity (and those entities it depends on) is deserialized from the
PCH file. With this approach, the cost of using a precompiled header
for a translation unit is proportional to the amount of code actually
used from the header, rather than being proportional to the size of
the header itself.</p>

<p>When given the <code>-print-stats</code> option, Clang produces
statistics describing how much of the precompiled header was actually
loaded from disk. For a simple "Hello, World!"
program that includes
the Apple <code>Cocoa.h</code> header (which is built as a precompiled
header), this option illustrates how little of the actual precompiled
header is required:</p>

<pre>
*** PCH Statistics:
  933 stat cache hits
  4 stat cache misses
  895/39981 source location entries read (2.238563%)
  19/15315 types read (0.124061%)
  20/82685 declarations read (0.024188%)
  154/58070 identifiers read (0.265197%)
  0/7260 selectors read (0.000000%)
  0/30842 statements read (0.000000%)
  4/8400 macros read (0.047619%)
  1/4995 lexical declcontexts read (0.020020%)
  0/4413 visible declcontexts read (0.000000%)
  0/7230 method pool entries read (0.000000%)
  0 method pool misses
</pre>

<p>For this small program, only a tiny fraction of the source
locations, types, declarations, identifiers, and macros were actually
deserialized from the precompiled header. These statistics can be
useful to determine whether the precompiled header implementation can
be improved by making more of the implementation lazy.</p>

<p>Precompiled headers can be chained. When you create a PCH while
including an existing PCH, Clang can create the new PCH by referencing
the original file and only writing the new data to the new file.
For
example, you could create a PCH out of all the headers that are very
commonly used throughout your project, and then create a PCH for every
single source file in the project that includes the code that is
specific to that file, so that recompiling the file itself is very fast,
without duplicating the data from the common headers for every file.</p>

<h2 id="contents">Precompiled Header Contents</h2>

<img src="PCHLayout.png" align="right" alt="Precompiled header layout">

<p>Clang's precompiled headers are organized into several different
blocks, each of which contains the serialized representation of a part
of Clang's internal representation. Each of the blocks corresponds to
either a block or a record within <a
href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
format</a>. The contents of each of these logical blocks are described
below.</p>

<p>For a given precompiled header, the <a
href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a>
utility can be used to examine the actual structure of the bitstream
for the precompiled header. This information can be used both to help
understand the structure of the precompiled header and to isolate
areas where precompiled headers can still be optimized, e.g., through
the introduction of abbreviations.</p>

<h3 id="metadata">Metadata Block</h3>

<p>The metadata block contains several records that provide
information about how the precompiled header was built. This metadata
is primarily used to validate the use of a precompiled header. For
example, a precompiled header built for a 32-bit x86 target cannot be used
when compiling for a 64-bit x86 target.
The metadata block contains
information about:</p>

<dl>
  <dt>Language options</dt>
  <dd>Describes the particular language dialect used to compile the
PCH file, including major options (e.g., Objective-C support) and more
minor options (e.g., support for "//" comments). The contents of this
record correspond to the <code>LangOptions</code> class.</dd>

  <dt>Target architecture</dt>
  <dd>The target triple that describes the architecture, platform, and
ABI for which the PCH file was generated, e.g.,
<code>i386-apple-darwin9</code>.</dd>

  <dt>PCH version</dt>
  <dd>The major and minor version numbers of the precompiled header
format. Changes in the minor version number should not affect backward
compatibility, while changes in the major version number imply that a
newer compiler cannot read an older precompiled header (and
vice-versa).</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate the
precompiled header.</dd>

  <dt>Predefines buffer</dt>
  <dd>Although not explicitly stored as part of the metadata, the
predefines buffer is used in the validation of the precompiled header.
The predefines buffer itself contains code generated by the compiler
to initialize the preprocessor state according to the current target,
platform, and command-line options. For example, the predefines buffer
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
without Microsoft extensions.
The predefines buffer itself is stored
within the <a href="#sourcemgr">source manager block</a>, but its
contents are verified along with the rest of the metadata.</dd>

</dl>

<p>A chained PCH file (that is, one that references another PCH) has
a slightly different metadata block, which contains the following
information:</p>

<dl>
  <dt>Referenced file</dt>
  <dd>The name of the referenced PCH file. It is looked up like a file
specified using <tt>-include-pch</tt>.</dd>

  <dt>PCH version</dt>
  <dd>This is the same as in normal PCH files.</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate this
precompiled header.</dd>

</dl>

<p>The language options, target architecture, and predefines buffer data
are taken from the end of the chain, since they have to match anyway.</p>

<h3 id="sourcemgr">Source Manager Block</h3>

<p>The source manager block contains the serialized representation of
Clang's <a
href="InternalsManual.html#SourceLocation">SourceManager</a> class,
which handles the mapping from source locations (as represented in
Clang's abstract syntax tree) into actual column/line positions within
a source file or macro instantiation. The precompiled header's
representation of the source manager also includes information about
all of the headers that were (transitively) included when building the
precompiled header.</p>

<p>The bulk of the source manager block is dedicated to information
about the various files, buffers, and macro instantiations to which
a source location can refer. Each of these is referenced by a numeric
"file ID", which is a unique number (allocated starting at 1) stored
in the source location.
Clang serializes the information for each kind
of file ID, along with an index that maps file IDs to the position
within the PCH file where the information about that file ID is
stored. The data associated with a file ID is loaded only when
required by the front end, e.g., to emit a diagnostic that includes a
macro instantiation history inside the header itself.</p>

<p>The source manager block also contains information about all of the
headers that were included when building the precompiled header. This
includes information about the controlling macro for the header (e.g.,
when the preprocessor identified that the contents of the header
depend on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
along with a cached version of the results of the <code>stat()</code>
system calls performed when building the precompiled header. The
latter is particularly useful in reducing system time when searching
for include files.</p>

<h3 id="preprocessor">Preprocessor Block</h3>

<p>The preprocessor block contains the serialized representation of
the preprocessor. Specifically, it contains all of the macros that
have been defined by the end of the header used to build the
precompiled header, along with the token sequences that comprise each
macro. The macro definitions are only read from the PCH file when the
name of the macro first occurs in the program. This lazy loading of
macro definitions is triggered by lookups into the <a
href="#idtable">identifier table</a>.</p>

<h3 id="types">Types Block</h3>

<p>The types block contains the serialized representation of all of
the types referenced in the translation unit. Each Clang type node
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
corresponding record type in the PCH file.
When types are deserialized
from the precompiled header, the data within the record is used to
reconstruct the appropriate type node using the AST context.</p>

<p>Each type has a unique type ID, which is an integer that uniquely
identifies that type. Type ID 0 represents the NULL type, type IDs
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
(<code>void</code>, <code>float</code>, etc.), while other
"user-defined" type IDs are assigned consecutively from
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
The PCH file has an associated mapping from the user-defined type IDs
to the location within the types block where the serialized
representation of that type resides, enabling lazy deserialization of
types. When a type is referenced from within the PCH file, that
reference is encoded using the type ID shifted left by 3 bits. The
lower three bits are used to represent the <code>const</code>,
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
Clang's <a
href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
class.</p>

<h3 id="decls">Declarations Block</h3>

<p>The declarations block contains the serialized representation of
all of the declarations referenced in the translation unit. Each Clang
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
etc.) has a corresponding record type in the PCH file. When
declarations are deserialized from the precompiled header, the data
within the record is used to build and populate a new instance of the
corresponding <code>Decl</code> node. As with types, each declaration
node has a numeric ID that is used to refer to that declaration within
the PCH file.
In addition, a lookup table provides a mapping from that
numeric ID to the offset within the precompiled header where that
declaration is described.</p>

<p>Declarations in Clang's abstract syntax trees are stored
hierarchically. At the top of the hierarchy is the translation unit
(<code>TranslationUnitDecl</code>), which contains all of the
declarations in the translation unit. These declarations (such as
functions or struct types) may also contain other declarations inside
them, and so on. Within Clang, each declaration is stored within a <a
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
context</a>, as represented by the <code>DeclContext</code> class.
Declaration contexts provide the mechanism to perform name lookup
within a given declaration (e.g., find the member named <code>x</code>
in a structure) and iterate over the declarations stored within a
context (e.g., iterate over all of the fields of a structure for
structure layout).</p>

<p>In Clang's precompiled header format, deserializing a declaration
that is a <code>DeclContext</code> is a separate operation from
deserializing all of the declarations stored within that declaration
context. Therefore, Clang will deserialize the translation unit
declaration without deserializing the declarations within that
translation unit. When required, the declarations stored within a
declaration context will be deserialized.
There are two representations
of the declarations within a declaration context, which correspond to
the name-lookup and iteration behavior described above:</p>

<ul>
  <li>When the front end performs name lookup to find a name
  <code>x</code> within a given declaration context (for example,
  during semantic analysis of the expression <code>p-&gt;x</code>,
  where <code>p</code>'s type is defined in the precompiled header),
  Clang deserializes a hash table mapping from the names within that
  declaration context to the declaration IDs that represent each
  visible declaration with that name. The entire hash table is
  deserialized at this point (into the <code>llvm::DenseMap</code>
  stored within each <code>DeclContext</code> object), but the actual
  declarations are not yet deserialized. In a second step, those
  declarations with the name <code>x</code> will be deserialized and
  will be used as the result of name lookup.</li>

  <li>When the front end performs iteration over all of the
  declarations within a declaration context, all of those declarations
  are immediately deserialized. For large declaration contexts (e.g.,
  the translation unit), this operation is expensive; however, large
  declaration contexts are not traversed in normal compilation, since
  such a traversal is unnecessary. By contrast, it is common for the code
  generator and semantic analysis to traverse declaration contexts for
  structs, classes, unions, and enumerations, but those contexts
  contain relatively few declarations in the common case.</li>
</ul>

<h3 id="stmt">Statements and Expressions</h3>

<p>Statements and expressions are stored in the precompiled header in
both the <a href="#types">types</a> and the <a
href="#decls">declarations</a> blocks, because every statement or
expression will be associated with either a type or declaration.
The
actual statement and expression records are stored immediately
following the declaration or type that owns the statement or
expression. For example, the statement representing the body of a
function will be stored directly following the declaration of the
function.</p>

<p>As with types and declarations, each statement and expression kind
in Clang's abstract syntax tree (<code>ForStmt</code>,
<code>CallExpr</code>, etc.) has a corresponding record type in the
precompiled header, which contains the serialized representation of
that statement or expression. Each substatement or subexpression
within an expression is stored as a separate record (which keeps most
records to a fixed size). Within the precompiled header, the
subexpressions of an expression are stored, in reverse order, prior to the expression
that owns those expressions, using a form of <a
href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse
Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code>
would be represented as follows:</p>

<table border="1">
  <tr><td><code>IntegerLiteral(5)</code></td></tr>
  <tr><td><code>IntegerLiteral(4)</code></td></tr>
  <tr><td><code>IntegerLiteral(3)</code></td></tr>
  <tr><td><code>BinaryOperator(-)</code></td></tr>
  <tr><td><code>BinaryOperator(+)</code></td></tr>
  <tr><td>STOP</td></tr>
</table>

<p>When reading this representation, Clang evaluates each expression
record it encounters, builds the appropriate abstract syntax tree node,
and then pushes that expression onto a stack. When a record contains <i>N</i>
subexpressions--<code>BinaryOperator</code> has two of them--those
expressions are popped from the top of the stack.
The special STOP
code indicates that we have reached the end of a serialized expression
or statement; other expression or statement records may follow, but
they are part of a different expression.</p>

<h3 id="idtable">Identifier Table Block</h3>

<p>The identifier table block contains an on-disk hash table that maps
each identifier mentioned within the precompiled header to the
serialized representation of the identifier's information (e.g., the
<code>IdentifierInfo</code> structure). The serialized representation
contains:</p>

<ul>
  <li>The actual identifier string.</li>
  <li>Flags that describe whether this identifier is the name of a
  built-in, a poisoned identifier, an extension token, or a
  macro.</li>
  <li>If the identifier names a macro, the offset of the macro
  definition within the <a href="#preprocessor">preprocessor
  block</a>.</li>
  <li>If the identifier names one or more declarations visible from
  translation unit scope, the <a href="#decls">declaration IDs</a> of these
  declarations.</li>
</ul>

<p>When a precompiled header is loaded, the precompiled header
mechanism introduces itself into the identifier table as an external
lookup source. Thus, when the user program refers to an identifier
that has not yet been seen, Clang will perform a lookup into the
identifier table. If an identifier is found, its contents (macro
definitions, flags, top-level declarations, etc.) will be deserialized,
at which point the corresponding <code>IdentifierInfo</code> structure
will have the same contents it would have after parsing the headers in
the precompiled header.</p>

<p>Within the PCH file, the identifiers used to name declarations are
represented with an integral value. A separate table provides a mapping
from this integral value (the identifier ID) to the location within the
on-disk hash table where that identifier is stored.
This mapping is used when
deserializing the name of a declaration, the identifier of a token, or
any other construct in the PCH file that refers to a name.</p>

<h3 id="method-pool">Method Pool Block</h3>

<p>The method pool block is represented as an on-disk hash table that
serves two purposes: it provides a mapping from the names of
Objective-C selectors to the set of Objective-C instance and class
methods that have that particular selector (which is required for
semantic analysis in Objective-C) and also stores all of the selectors
used by entities within the precompiled header. The design of the
method pool is similar to that of the <a href="#idtable">identifier
table</a>: the first time a particular selector is formed during the
compilation of the program, Clang will search in the on-disk hash
table of selectors; if found, Clang will read the Objective-C methods
associated with that selector into the appropriate front-end data
structure (<code>Sema::InstanceMethodPool</code> and
<code>Sema::FactoryMethodPool</code> for instance and class methods,
respectively).</p>

<p>As with identifiers, selectors are represented by numeric values
within the PCH file. A separate index maps these numeric selector
values to the offset of the selector within the on-disk hash table,
and will be used when deserializing an Objective-C method declaration
(or other Objective-C construct) that refers to the selector.</p>

<h2 id="tendrils">Precompiled Header Integration Points</h2>

<p>The "lazy" deserialization behavior of precompiled headers requires
their integration into several completely different submodules of
Clang.
For example, lazily deserializing the declarations during name
lookup requires that the name-lookup routines be able to query the
precompiled header to find entities within the PCH file.</p>

<p>For each Clang data structure that requires direct interaction with
the precompiled header logic, there is an abstract class that provides
the interface between the two modules. The <code>PCHReader</code>
class, which handles the loading of a precompiled header, inherits
from all of these abstract classes to provide lazy deserialization of
Clang's data structures. <code>PCHReader</code> implements the
following abstract classes:</p>

<dl>
  <dt><code>StatSysCallCache</code></dt>
  <dd>This abstract interface is associated with the
    <code>FileManager</code> class, and is used whenever the file
    manager is going to perform a <code>stat()</code> system call.</dd>

  <dt><code>ExternalSLocEntrySource</code></dt>
  <dd>This abstract interface is associated with the
    <code>SourceManager</code> class, and is used whenever the
    <a href="#sourcemgr">source manager</a> needs to load the details
    of a file, buffer, or macro instantiation.</dd>

  <dt><code>IdentifierInfoLookup</code></dt>
  <dd>This abstract interface is associated with the
    <code>IdentifierTable</code> class, and is used whenever the
    program source refers to an identifier that has not yet been seen.
    In this case, the precompiled header implementation searches for
    this identifier within its <a href="#idtable">identifier table</a>
    to load any top-level declarations or macros associated with that
    identifier.</dd>

  <dt><code>ExternalASTSource</code></dt>
  <dd>This abstract interface is associated with the
    <code>ASTContext</code> class, and is used whenever the abstract
    syntax tree nodes need to be loaded from the precompiled header.
    It
    provides the ability to deserialize declarations and types
    identified by their numeric values, read the bodies of functions
    when required, and read the declarations stored within a
    declaration context (either for iteration or for name lookup).</dd>

  <dt><code>ExternalSemaSource</code></dt>
  <dd>This abstract interface is associated with the <code>Sema</code>
    class, and is used whenever semantic analysis needs to read
    information from the <a href="#method-pool">global method
    pool</a>.</dd>
</dl>

</div>

</body>
</html>