==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
  :local:
  :backlinks: none
  :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described; full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** for short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.
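
As a minimal sketch of the linkage rule, a module might contain one
``internal`` helper and the single ``external`` entry point. The helper name
``@helper`` and the ``i32`` parameter of ``_start`` are assumptions made for
illustration only; this document does not specify the signature of ``_start``.
Because ``external`` is LLVM's default linkage, it is not spelled out:

.. naclcode::
  :prettyprint: 0

  ; Hypothetical helper: every function other than _start is internal.
  define internal i32 @helper(i32 %x) {
    %r = add i32 %x, 1
    ret i32 %r
  }

  ; The only external symbol in the pexe (parameter list is assumed).
  define void @_start(i32 %info) {
    %unused = call i32 @helper(i32 %info)
    ret void
  }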

Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc``, the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

   .. naclcode::
     :prettyprint: 0

     [SIZE x i8] c"DATA"
     [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

   .. naclcode::
     :prettyprint: 0

     ptrtoint (TYPE* @GLOBAL to i32)
     add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.
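
The initializer forms above can be sketched with concrete values. All global
names here (``@msg``, ``@scratch``, ``@msg_addr``, ``@msg_tail``, ``@record``)
are hypothetical and chosen only for illustration:

.. naclcode::
  :prettyprint: 0

  ; SimpleElement, form 1: i8 array literal and zeroinitializer.
  @msg = internal global [5 x i8] c"hello"
  @scratch = internal global [16 x i8] zeroinitializer

  ; SimpleElement, form 2: address of a global, without and with an addend.
  @msg_addr = internal global i32 ptrtoint ([5 x i8]* @msg to i32)
  @msg_tail = internal global i32 add (i32 ptrtoint ([5 x i8]* @msg to i32), i32 3)

  ; CompoundElement: unnamed packed struct holding two SimpleElements.
  @record = internal global <{ [5 x i8], i32 }> <{ [5 x i8] c"world", i32 ptrtoint ([16 x i8]* @scratch to i32) }>

Note that the pointer-sized members are stored as ``i32``, which matches the
ILP32 data model.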

Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions. In addition, the following
are not supported for functions:

* Function attributes (either for the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by the PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.

Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integers with element size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).

Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above, except for
  ``i1``.
* Pointers to valid PNaCl bitcode vector types, as specified above, except for
  ``<? x i1>``.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.

Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have similar semantics to
libc math functions, but differ in the treatment of ``errno``.
While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``, ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, or the
  denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the type ``i1`` and
  ``<? x i1>`` are not supported.

  These instructions must obey the following alignment restrictions:

  * On integer memory accesses: ``align 1``.
  * On ``float`` memory accesses: ``align 1`` or ``align 4``.
  * On ``double`` memory accesses: ``align 1`` or ``align 8``.
  * On vector memory accesses: alignment at the vector's element width, for
    example ``<4 x i32>`` must be ``align 4``.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the resulting
  integer type must be i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
  (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``

.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
  :prettyprint: 0

  %buf = alloca i8, i32 8, align 4

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
  are only supported with the i32 and i64 argument types (the types supported
  by C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  ``sqrt`` function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable-sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target-dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.

.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
  :prettyprint: 0

  declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
  :prettyprint: 0

  declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
  declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.

.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
  :prettyprint: 0

  declare iN @llvm.nacl.atomic.load.<size>(
          iN* <source>, i32 <memory_order>)
  declare void @llvm.nacl.atomic.store.<size>(
          iN <operand>, iN* <destination>, i32 <memory_order>)
  declare iN @llvm.nacl.atomic.rmw.<size>(
          i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
  declare iN @llvm.nacl.atomic.cmpxchg.<size>(
          iN* <object>, iN <expected>, iN <desired>,
          i32 <memory_order_success>, i32 <memory_order_failure>)
  declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
  declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integral types of
8, 16, 32 and 64-bit width are supported for these arguments.
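
To make the overloading concrete, the following sketch instantiates the
templates above for a few widths. It assumes that ``<size>`` is spelled as the
LLVM integer type name (``i8``, ``i32``, ``i64``); the particular mix of widths
is arbitrary:

.. naclcode::
  :prettyprint: 0

  ; 8-bit atomic load: iN = i8, <size> = i8 (spelling assumed).
  declare i8 @llvm.nacl.atomic.load.i8(i8*, i32)

  ; 32-bit atomic store.
  declare void @llvm.nacl.atomic.store.i32(i32, i32*, i32)

  ; 64-bit read-modify-write and compare-exchange.
  declare i64 @llvm.nacl.atomic.rmw.i64(i32, i64*, i64, i32)
  declare i64 @llvm.nacl.atomic.cmpxchg.i64(i64*, i64, i64, i32, i32)

Only the ``iN`` parameters change with the overload; the ``computation`` and
memory order parameters stay ``i32`` in every variant.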

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

- ``add``
- ``sub``
- ``or``
- ``and``
- ``xor``
- ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.

.. Note::
  :class: note

  These intrinsics allow PNaCl to support C11/C++11 style atomic
  operations as well as some legacy GCC-style ``__sync_*`` builtins
  while remaining stable as the LLVM codebase changes. The user isn't
  expected to use these intrinsics directly.

.. naclcode::
  :prettyprint: 0

  declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.
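
A call to this intrinsic might look like the following sketch. The function
name ``@is_word_lock_free`` is hypothetical. The address is passed as an
``i32`` and the ``i1`` result is widened, because only i32 and i64 are allowed
as integer argument and return types:

.. naclcode::
  :prettyprint: 0

  declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

  ; Hypothetical helper: returns 1 if 4-byte atomics at %addr are lock-free.
  define internal i32 @is_word_lock_free(i32 %addr) {
    ; inttoptr yields a normalized pointer.
    %ptr = inttoptr i32 %addr to i8*
    ; The byte size (4) is a compile-time constant, as required.
    %free = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %ptr)
    %result = zext i1 %free to i32
    ret i32 %result
  }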