==========================
Pretokenized Headers (PTH)
==========================

This document first describes the low-level interface for using PTH and
then briefly elaborates on its design and implementation. If you are
interested in the end-user view, please see the :ref:`User's Manual
<usersmanual-precompiled-headers>`.

Using Pretokenized Headers with ``clang`` (Low-level Interface)
===============================================================

The Clang compiler frontend, ``clang -cc1``, supports three command-line
options for generating and using PTH files.

To generate PTH files using ``clang -cc1``, use the option ``-emit-pth``:

.. code-block:: console

  $ clang -cc1 test.h -emit-pth -o test.h.pth

This option is transparently used by ``clang`` when generating PTH
files. PTH files can then be used as prefix headers with the
``-include-pth`` option:

.. code-block:: console

  $ clang -cc1 -include-pth test.h.pth test.c -o test.s

Alternatively, Clang's PTH files can be used as a raw "token-cache" (or
"content" cache) of the source included by the original header file.
This means that the contents of the PTH file are searched as substitutes
for *any* source files that are used by ``clang -cc1`` to process a
source file. This is done by specifying the ``-token-cache`` option:

.. code-block:: console

  $ cat test.h
  #include <stdio.h>
  $ clang -cc1 -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang -cc1 test.c -o test -token-cache test.h.pth

In this example the contents of ``stdio.h`` (and the files it includes)
will be retrieved from ``test.h.pth``, as the PTH file is being used in
this case as a raw cache of the contents of ``test.h``. This is a
low-level interface used both to implement the high-level PTH interface
and to provide alternative means to use PTH-style caching.

PTH Design and Implementation
=============================

Unlike GCC's precompiled headers, which cache the full ASTs and
preprocessor state of a header file, Clang's pretokenized header files
mainly cache the raw lexer *tokens* that are needed to segment the
stream of characters in a source file into keywords, identifiers, and
operators. Consequently, PTH mainly serves to speed up the lexing and
preprocessing of a source file directly, while parsing and
type-checking must be completely redone every time a PTH file is used.
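
The idea of caching lexer output can be illustrated with a minimal
sketch. Everything in it is hypothetical: ``Token``, ``lex``, and
``TokenCache`` are toy names invented for this example and do not
reflect Clang's actual classes or the on-disk PTH format. The toy lexer
also treats keywords as plain identifiers. The point is only that a
cache hit returns stored tokens without touching the character stream
again, while whatever consumes the tokens still has to run in full.

.. code-block:: c++

  #include <cctype>
  #include <cstddef>
  #include <string>
  #include <unordered_map>
  #include <vector>

  // Hypothetical token kinds; a real lexer distinguishes many more.
  enum class TokKind { Identifier, Number, Punct };

  struct Token {
    TokKind kind;
    std::string text;
  };

  // Toy lexer: splits a character stream into identifiers, numbers, and
  // single-character punctuators, skipping whitespace.
  std::vector<Token> lex(const std::string &src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
      unsigned char c = static_cast<unsigned char>(src[i]);
      if (std::isspace(c)) {
        ++i;
        continue;
      }
      std::size_t start = i;
      if (std::isalpha(c) || c == '_') {
        while (i < src.size() &&
               (std::isalnum(static_cast<unsigned char>(src[i])) ||
                src[i] == '_'))
          ++i;
        out.push_back({TokKind::Identifier, src.substr(start, i - start)});
      } else if (std::isdigit(c)) {
        while (i < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[i])))
          ++i;
        out.push_back({TokKind::Number, src.substr(start, i - start)});
      } else {
        ++i;
        out.push_back({TokKind::Punct, src.substr(start, 1)});
      }
    }
    return out;
  }

  // Hypothetical in-memory cache keyed by file name. On a hit the source
  // text is not lexed again; the stored tokens are returned directly.
  class TokenCache {
  public:
    const std::vector<Token> &get(const std::string &file,
                                  const std::string &src) {
      auto it = cache_.find(file);
      if (it == cache_.end()) {
        ++lexCount_;
        it = cache_.emplace(file, lex(src)).first;
      }
      return it->second;
    }

    // Number of times the lexer actually ran.
    int lexCount() const { return lexCount_; }

  private:
    std::unordered_map<std::string, std::vector<Token>> cache_;
    int lexCount_ = 0;
  };

Requesting the same file twice from one ``TokenCache`` runs ``lex``
only once, which is the kind of saving a token cache aims for.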

Basic Design Tradeoffs
----------------------

In the long term there are plans to provide an alternate PCH
implementation for Clang that also caches the work for parsing and type
checking the contents of header files. The current implementation of PCH
in Clang as pretokenized header files was motivated by the following
factors:

**Language independence**
   PTH files work with any language that
   Clang's lexer can handle, including C, Objective-C, and (in the early
   stages) C++. This means development on language features at the
   parsing level or above (which covers almost all of the interesting
   pieces) does not require PTH to be modified.

**Simple design**
   Relatively speaking, PTH has a simple design and
   implementation, making it easy to test. Further, because the
   machinery for PTH resides at the lower levels of the Clang library
   stack, it is fairly straightforward to profile and optimize.

Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly
compared against), the PTH design in Clang yields several attractive
features:

**Architecture independence**
   In contrast to GCC's PCH files (and
   those of several other compilers), Clang's PTH files are architecture
   independent, requiring only a single PTH file when building a
   program for multiple architectures.

   For example, on Mac OS X one may wish to compile a "universal binary"
   that runs on PowerPC, 32-bit Intel (i386), and 64-bit Intel
   architectures. GCC requires a separate PCH file for each of these
   architectures, as the definitions of types in the AST are
   architecture-specific. Since a Clang PTH file essentially represents
   a lexical cache of header files, a single PTH file can be safely used
   when compiling for multiple architectures. This can also reduce
   compile times because only a single PTH file needs to be generated
   during a build instead of several.

**Reduced memory pressure**
   Similar to GCC, Clang reads PTH files
   via the use of memory mapping (i.e., ``mmap``). Clang, however,
   memory maps PTH files as read-only, meaning that multiple invocations
   of ``clang -cc1`` can share the same pages in memory from a
   memory-mapped PTH file. GCC also memory maps its PCH files, but it
   modifies those pages in memory, incurring copy-on-write costs. The
   read-only nature of PTH can greatly reduce memory pressure for
   builds involving multiple cores, thus improving overall scalability.

**Fast generation**
   PTH files can be generated in a small fraction
   of the time needed to generate GCC's PCH files. Since PTH/PCH
   generation is a serial operation that typically blocks progress
   during a build, faster generation time leads to improved processor
   utilization with parallel builds on multicore machines.
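
The read-only mapping described under "Reduced memory pressure" can be
sketched with the POSIX ``mmap`` call. This is an illustration under
stated assumptions, not Clang's code: ``ReadOnlyMapping``,
``mapReadOnly``, and ``unmap`` are hypothetical names, and the sketch
assumes a POSIX system. Because the mapping is created with
``PROT_READ`` and never written, no copy-on-write duplication of pages
occurs, so processes mapping the same file can be served from the same
cached pages.

.. code-block:: c++

  #include <fcntl.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>

  #include <cstddef>

  // Hypothetical handle for a read-only mapping; not a Clang type.
  struct ReadOnlyMapping {
    const char *data = nullptr;
    std::size_t size = 0;
  };

  // Maps the file at `path` read-only and returns true on success.
  // Empty files are rejected because a zero-length mapping is invalid.
  bool mapReadOnly(const char *path, ReadOnlyMapping &out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
      return false;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
      close(fd);
      return false;
    }

    std::size_t size = static_cast<std::size_t>(st.st_size);
    void *p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // The mapping remains valid after the descriptor is closed.
    if (p == MAP_FAILED)
      return false;

    out.data = static_cast<const char *>(p);
    out.size = size;
    return true;
  }

  // Releases the mapping and resets the handle.
  void unmap(ReadOnlyMapping &m) {
    if (m.data)
      munmap(const_cast<char *>(m.data), m.size);
    m = ReadOnlyMapping{};
  }

A consumer reads through ``data`` as if the file were an in-memory
array. Any attempt to write through the mapping would fault, which is
what guarantees the pages stay shareable.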

Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC.
While PTH can greatly speed up the processing time of a header file, the
amount of work required to process a header file is still roughly linear
in the size of the header file. In contrast, the amount of work done by
GCC to process a precompiled header is (theoretically) constant (the
ASTs for the header are literally memory mapped into the compiler). This
means that the compiler needs to process only those pieces of the header
file that are referenced by the source file including the header. While
GCC's particular implementation of PCH gives back some of this
algorithmic advantage through its use of copy-on-write pages, the
approach itself can fundamentally dominate at an algorithmic level,
especially when one considers header files of arbitrary size.

There is also a PCH implementation for Clang based on the lazy
deserialization of ASTs. This approach theoretically has the same
constant-time algorithmic advantages just mentioned but also retains some
of the strengths of PTH such as reduced memory pressure (ideal for
multi-core builds).

Internal PTH Optimizations
--------------------------

While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:

-  ``stat`` caching: PTH files cache information obtained via calls to
   ``stat`` that ``clang -cc1`` uses to resolve which files are included
   by ``#include`` directives. This greatly reduces the overhead
   involved in context-switching to the kernel to resolve included
   files.

-  Fast skipping of ``#ifdef`` ... ``#endif`` chains: PTH files
   record the basic structure of nested preprocessor blocks. When the
   condition of the preprocessor block is false, all of its tokens are
   immediately skipped instead of requiring them to be handled by
   Clang's preprocessor.
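
The block-skipping optimization can be sketched as a precomputed jump
table. The code below is a toy model under stated assumptions:
``TokenStream``, ``buildSkipTable``, and ``preprocess`` are hypothetical
names, tokens are plain strings, and only ``#ifdef`` and ``#endif`` are
modeled. It is not the layout PTH files actually use. It shows how
recording the position of each matching ``#endif`` ahead of time lets a
false block be skipped in one step rather than token by token.

.. code-block:: c++

  #include <cstddef>
  #include <string>
  #include <unordered_map>
  #include <unordered_set>
  #include <vector>

  // Hypothetical cached token stream: each entry is a token spelling.
  using TokenStream = std::vector<std::string>;
  using SkipTable = std::unordered_map<std::size_t, std::size_t>;

  // Records, for every "#ifdef" token, the index of its matching "#endif".
  // Computed once when the cache is built. Returns an empty table if the
  // conditionals are unbalanced.
  SkipTable buildSkipTable(const TokenStream &toks) {
    SkipTable table;
    std::vector<std::size_t> open; // Stack of unmatched #ifdef indices.
    for (std::size_t i = 0; i < toks.size(); ++i) {
      if (toks[i] == "#ifdef") {
        open.push_back(i);
      } else if (toks[i] == "#endif") {
        if (open.empty())
          return {};
        table[open.back()] = i;
        open.pop_back();
      }
    }
    if (!open.empty())
      return {};
    return table;
  }

  // Replays the stream and returns the tokens that survive preprocessing.
  // Assumes `skip` was built from `toks` and that `toks` is balanced.
  TokenStream preprocess(const TokenStream &toks, const SkipTable &skip,
                         const std::unordered_set<std::string> &defined) {
    TokenStream out;
    for (std::size_t i = 0; i < toks.size(); ++i) {
      if (toks[i] == "#ifdef") {
        // The macro name is the token that follows the directive.
        bool taken = i + 1 < toks.size() && defined.count(toks[i + 1]) > 0;
        if (!taken) {
          i = skip.at(i); // Jump straight to the matching #endif.
          continue;
        }
        ++i; // Condition is true: consume the macro name and keep going.
      } else if (toks[i] != "#endif") {
        out.push_back(toks[i]);
      }
    }
    return out;
  }

With nested blocks, skipping the outer block also skips every inner one
without inspecting it, because the jump lands on the outer ``#endif``.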