==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
   :local:
   :backlinks: none
   :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described---full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** in short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.
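
As an illustration, a module that follows this rule could look like the sketch
below. The name ``helper``, both function bodies and the ``i32`` parameter
given to ``_start`` are assumptions made for the example; this document does
not specify them.

.. naclcode::
  :prettyprint: 0

    ; Hypothetical helper: every function other than _start is internal.
    define internal i32 @helper(i32 %x) {
    entry:
      %doubled = add i32 %x, %x
      ret i32 %doubled
    }

    ; The single external symbol. The i32 parameter is an assumed signature.
    define void @_start(i32 %info) {
    entry:
      %unused = call i32 @helper(i32 %info)
      ret void
    }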

Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc`` - the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

.. naclcode::
  :prettyprint: 0

     [SIZE x i8] c"DATA"
     [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

.. naclcode::
  :prettyprint: 0

     ptrtoint (TYPE* @GLOBAL to i32)
     add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.
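
The definitions above can be made concrete with a few complete global
definitions. This is only a sketch: the global names and their contents are
hypothetical and were chosen to show one instance of each initializer form.

.. naclcode::
  :prettyprint: 0

    ; SimpleElement, form 1: i8 array literal and zeroinitializer.
    @msg = internal global [4 x i8] c"pexe"
    @scratch = internal global [16 x i8] zeroinitializer

    ; SimpleElement, form 2: address of a global, without and with an addend.
    @msg_addr = internal global i32 ptrtoint ([4 x i8]* @msg to i32)
    @msg_tail = internal global i32 add (i32 ptrtoint ([4 x i8]* @msg to i32), i32 2)

    ; CompoundElement: an unnamed packed struct of two SimpleElements.
    @record = internal global <{ [2 x i8], i32 }> <{ [2 x i8] c"ok", i32 ptrtoint ([16 x i8]* @scratch to i32) }>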

Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions. In addition, the following
are not supported for functions:

* Function attributes (for the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See the
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by the PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.
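
One consequence is that narrower integers have to be widened at function
boundaries. The sketch below shows one way this might look; the function
``is_upper`` is hypothetical. It receives a character as an i32, works on it
as an i8 internally, and widens the i1 comparison result back to i32 before
returning.

.. naclcode::
  :prettyprint: 0

    define internal i32 @is_upper(i32 %ch.arg) {
    entry:
      %ch = trunc i32 %ch.arg to i8       ; narrow inside the function body
      %ge = icmp uge i8 %ch, 65           ; 'A'
      %le = icmp ule i8 %ch, 90           ; 'Z'
      %both = and i1 %ge, %le
      %res = zext i1 %both to i32         ; widen before returning
      ret i32 %res
    }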

Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integer elements of size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).
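
Read together, these rules permit ``<16 x i8>``, ``<8 x i16>``, ``<4 x i32>``
and ``<4 x float>``, plus the matching ``<16 x i1>``, ``<8 x i1>`` and
``<4 x i1>`` types. The sketch below shows where an i1 vector typically
arises. The function ``vmax`` is hypothetical, and the example assumes that
vector types may be used as function arguments and return values.

.. naclcode::
  :prettyprint: 0

    define internal <4 x i32> @vmax(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; A lane-wise comparison yields a <4 x i1> mask.
      %gt = icmp sgt <4 x i32> %a, %b
      ; The mask selects the larger element in each lane.
      %max = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %max
    }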

Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above, except for
  ``i1``.
* Pointers to valid PNaCl bitcode vector types, as specified above, except for
  ``<? x i1>``.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.
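
The two definitions can be illustrated with a global variable. In this sketch
the names ``@table`` and ``@first_word`` are hypothetical, and the ``load``
uses the LLVM 3.3 textual syntax.

.. naclcode::
  :prettyprint: 0

    @table = internal global [8 x i8] zeroinitializer

    define internal i32 @first_word() {
    entry:
      ; @table is the address of a global value, so it is inherent.
      ; The result of the bitcast instruction is normalized.
      %p = bitcast [8 x i8]* @table to i32*
      %v = load i32* %p, align 1
      ret i32 %v
    }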

Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have similar semantics to
libc math functions, but differ in the treatment of ``errno``. While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``,  ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, the
  denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the type ``i1`` and ``<? x
  i1>`` are not supported.

  These instructions must obey the following alignment restrictions:

  * On integer memory accesses: ``align 1``.
  * On ``float`` memory accesses: ``align 1`` or ``align 4``.
  * On ``double`` memory accesses: ``align 1`` or ``align 8``.
  * On vector memory accesses: alignment at the vector's element width, for
    example ``<4 x i32>`` must be ``align 4``.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the integer
  result must be an i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
  (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``
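
Several of the restrictions above can be seen together in a short function.
The following is a sketch only: the function ``scale`` and its parameters are
hypothetical, and the ``load`` is written in the LLVM 3.3 textual syntax.

.. naclcode::
  :prettyprint: 0

    define internal double @scale(i32 %addr, i32 %divisor) {
    entry:
      ; inttoptr yields a normalized pointer, as load and store require.
      %p.i32 = inttoptr i32 %addr to i32*
      %n = load i32* %p.i32, align 1            ; integer access: align 1 only
      %q = sdiv i32 %n, %divisor                ; traps if %divisor is zero
      %f = sitofp i32 %q to double
      ; Address arithmetic is done on i32 values, then converted back.
      %addr.next = add i32 %addr, 8
      %p.f64 = inttoptr i32 %addr.next to double*
      store double %f, double* %p.f64, align 8  ; double access: align 1 or 8
      ret double %f
    }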

.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
  :prettyprint: 0

    %buf = alloca i8, i32 8, align 4
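
Because only i8 can be allocated, typed stack slots are obtained by casting
the resulting pointer. The sketch below shows one way the 8-byte buffer from
the example above might hold two i32 values; the function ``sum_pair`` is
hypothetical.

.. naclcode::
  :prettyprint: 0

    define internal i32 @sum_pair(i32 %a, i32 %b) {
    entry:
      ; Reserve 8 bytes: room for two i32 values.
      %buf = alloca i8, i32 8, align 4
      ; %buf is inherent, so it may be bitcast to the access type.
      %slot0 = bitcast i8* %buf to i32*
      store i32 %a, i32* %slot0, align 1
      ; The second slot is reached through integer address arithmetic.
      %base = ptrtoint i8* %buf to i32
      %addr1 = add i32 %base, 4
      %slot1 = inttoptr i32 %addr1 to i32*
      store i32 %b, i32* %slot1, align 1
      %x = load i32* %slot0, align 1
      %y = load i32* %slot1, align 1
      %sum = add i32 %x, %y
      ret i32 %sum
    }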

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
  are only supported with the i32 and i64 argument types (the types supported
  by C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  sqrt function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.
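
The sketch below shows how a few of the standard LLVM intrinsics might be
declared and called under these restrictions. The function ``demo`` and its
parameters are hypothetical. The declarations follow the LLVM 3.3 signatures
(``llvm.memcpy`` takes destination, source, length, alignment and a volatile
flag), written without parameter attributes since those are not supported.

.. naclcode::
  :prettyprint: 0

    declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
    declare i32 @llvm.bswap.i32(i32)
    declare i64 @llvm.ctpop.i64(i64)
    declare float @llvm.sqrt.f32(float)

    define internal float @demo(i32 %dst.addr, i32 %src.addr, i32 %word, i64 %bits) {
    entry:
      %dst = inttoptr i32 %dst.addr to i8*
      %src = inttoptr i32 %src.addr to i8*
      ; Copy 16 bytes; the third operand is the i32 len argument.
      call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src, i32 16, i32 1, i1 false)
      %swapped = call i32 @llvm.bswap.i32(i32 %word)
      %ones = call i64 @llvm.ctpop.i64(i64 %bits)
      %ones.f = uitofp i64 %ones to float
      ; Does not set errno, even when the result is NaN.
      %root = call float @llvm.sqrt.f32(float %ones.f)
      ret float %root
    }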

.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
  :prettyprint: 0

    declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
  :prettyprint: 0

    declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
    declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.
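
A minimal sketch of a call sequence that meets the buffer requirements
follows. The function ``try_once`` and its control flow are hypothetical; the
buffer is allocated on the stack with 1024 bytes and 8-byte (64-bit)
alignment.

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.setjmp(i8*)
    declare void @llvm.nacl.longjmp(i8*, i32)

    define internal i32 @try_once() {
    entry:
      %jmpbuf = alloca i8, i32 1024, align 8
      ; As with C setjmp, the direct call returns 0.
      %rc = call i32 @llvm.nacl.setjmp(i8* %jmpbuf)
      %first = icmp eq i32 %rc, 0
      br i1 %first, label %do_jump, label %resumed

    do_jump:
      ; Control resumes at the setjmp call, which now returns 1.
      call void @llvm.nacl.longjmp(i8* %jmpbuf, i32 1)
      unreachable

    resumed:
      ret i32 %rc
    }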

.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
  :prettyprint: 0

    declare iN @llvm.nacl.atomic.load.<size>(
            iN* <source>, i32 <memory_order>)
    declare void @llvm.nacl.atomic.store.<size>(
            iN <operand>, iN* <destination>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.rmw.<size>(
            i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.cmpxchg.<size>(
            iN* <object>, iN <expected>, iN <desired>,
            i32 <memory_order_success>, i32 <memory_order_failure>)
    declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
    declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integral types of
8, 16, 32 and 64-bit width are supported for these arguments.

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

 - ``add``
 - ``sub``
 - ``or``
 - ``and``
 - ``xor``
 - ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.
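
A hedged sketch of how these intrinsics might be called follows. The global
``@counter`` and the function names are hypothetical. The literal values
(``1`` for the ``add`` computation and ``6`` for sequentially consistent
ordering) are assumed readings of ``NaClAtomicIntrinsics.h`` and should be
checked against that header.

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)
    declare i32 @llvm.nacl.atomic.load.i32(i32*, i32)

    ; Hypothetical 4-byte counter, expressed as an i8 array initializer.
    @counter = internal global [4 x i8] zeroinitializer

    define internal i32 @counter_increment() {
    entry:
      %ctr = bitcast [4 x i8]* @counter to i32*
      ; Assumed encoding: computation 1 = add, memory order 6 = seq_cst.
      %old = call i32 @llvm.nacl.atomic.rmw.i32(i32 1, i32* %ctr, i32 1, i32 6)
      ret i32 %old   ; value of the counter before the addition
    }

    define internal i32 @counter_read() {
    entry:
      %ctr = bitcast [4 x i8]* @counter to i32*
      %now = call i32 @llvm.nacl.atomic.load.i32(i32* %ctr, i32 6)
      ret i32 %now
    }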

.. Note::
  :class: note

    These intrinsics allow PNaCl to support C11/C++11 style atomic
    operations as well as some legacy GCC-style ``__sync_*`` builtins
    while remaining stable as the LLVM codebase changes. The user isn't
    expected to use these intrinsics directly.

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.
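
As a sketch, a query about 4-byte accesses might be written as follows. The
function ``word_is_lock_free`` is hypothetical; the i1 result is widened to
i32 because i1 is not an allowed function return type.

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

    define internal i32 @word_is_lock_free(i32 %addr) {
    entry:
      %p = inttoptr i32 %addr to i8*
      ; byte_size must be a compile-time constant; 4 asks about 32-bit accesses.
      %free = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %p)
      %res = zext i1 %free to i32
      ret i32 %res
    }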