<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>LLVM Atomic Instructions and Concurrency Guide</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <link rel="stylesheet" href="llvm.css" type="text/css">
</head>
<body>

<h1>
  LLVM Atomic Instructions and Concurrency Guide
</h1>

<ol>
  <li><a href="#introduction">Introduction</a></li>
  <li><a href="#outsideatomic">Optimization outside atomic</a></li>
  <li><a href="#atomicinst">Atomic instructions</a></li>
  <li><a href="#ordering">Atomic orderings</a></li>
  <li><a href="#iropt">Atomics and IR optimization</a></li>
  <li><a href="#codegen">Atomics and Codegen</a></li>
</ol>

<div class="doc_author">
  <p>Written by Eli Friedman</p>
</div>

<!-- *********************************************************************** -->
<h2>
  <a name="introduction">Introduction</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>Historically, LLVM has not had very strong support for concurrency; some
minimal intrinsics were provided, and <code>volatile</code> was used in some
cases to achieve rough semantics in the presence of concurrency.  However, this
is changing; there are now new instructions which are well-defined in the
presence of threads and asynchronous signals, and the model for existing
instructions has been clarified in the IR.</p>

<p>The atomic instructions are designed specifically to provide readable IR and
   optimized code generation for the following:</p>
<ul>
  <li>The new C++0x <code>&lt;atomic&gt;</code> header.
      (<a href="http://www.open-std.org/jtc1/sc22/wg21/">C++0x draft available here</a>.)
      (<a href="http://www.open-std.org/jtc1/sc22/wg14/">C1x draft available here</a>.)</li>
  <li>Proper semantics for Java-style memory, for both <code>volatile</code> and
      regular shared variables.
      (<a href="http://java.sun.com/docs/books/jls/third_edition/html/memory.html">Java Specification</a>)</li>
  <li>gcc-compatible <code>__sync_*</code> builtins.
      (<a href="http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html">Description</a>)</li>
  <li>Other scenarios with atomic semantics, including <code>static</code>
      variables with non-trivial constructors in C++.</li>
</ul>

<p>Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++
   volatile, which ensures that every volatile load and store happens and is
   performed in the stated order.  A couple of examples: if a
   SequentiallyConsistent store is immediately followed by another
   SequentiallyConsistent store to the same address, the first store can
   be erased. This transformation is not allowed for a pair of volatile
   stores. On the other hand, a non-volatile non-atomic load can be moved
   across a volatile load freely, but not an Acquire load.</p>
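
<p>For example, consider the following IR sketch (assuming <code>%p</code>
   and <code>%v</code> are an <code>i32*</code> and an <code>i32</code>
   defined elsewhere):</p>

<pre>
store atomic i32 0, i32* %p seq_cst, align 4
store atomic i32 %v, i32* %p seq_cst, align 4  ; the first store may be erased

store volatile i32 0, i32* %p
store volatile i32 %v, i32* %p                 ; both stores must remain
</pre>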

<p>This document is intended as a guide for anyone writing a frontend for LLVM
   or working on optimization passes for LLVM, explaining how to deal with
   instructions with special semantics in the presence of concurrency.  It is
   not intended to be a precise guide to the semantics; the details can get
   extremely complicated and unreadable, and are not usually necessary.</p>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="outsideatomic">Optimization outside atomic</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>The basic <code>'load'</code> and <code>'store'</code> allow a variety of
   optimizations, but can lead to undefined results in a concurrent environment;
   see <a href="#o_notatomic">NotAtomic</a>. This section specifically covers
   the one optimizer restriction which applies in concurrent environments; it
   gets an extended description because any optimization dealing with stores
   needs to be aware of it.</p>

<p>From the optimizer's point of view, the rule is that if no instructions
   with atomic ordering are involved, concurrency does not matter, with one
   exception: if a variable might be visible to another thread or signal
   handler, a store cannot be inserted along a path where it would not
   execute otherwise.  Take the following example:</p>

<pre>
/* C code, for readability; run through clang -O2 -S -emit-llvm to get
   equivalent IR */
int x;
void f(int* a) {
  for (int i = 0; i &lt; 100; i++) {
    if (a[i])
      x += 1;
  }
}
</pre>

<p>The following is equivalent in non-concurrent situations:</p>

<pre>
int x;
void f(int* a) {
  int xtemp = x;
  for (int i = 0; i &lt; 100; i++) {
    if (a[i])
      xtemp += 1;
  }
  x = xtemp;
}
</pre>

<p>However, LLVM is not allowed to transform the former to the latter: it could
   indirectly introduce undefined behavior if another thread can access
   <code>x</code> at the same time. (This example is of particular interest
   because before the concurrency model was implemented, LLVM would perform
   this transformation.)</p>

<p>Note that speculative loads are allowed; a load which
   is part of a race returns <code>undef</code>, but does not have undefined
   behavior.</p>
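
<p>As an illustrative sketch (the names here are hypothetical), the following
   hoisting is legal even if <code>%p</code> is shared with other threads:</p>

<pre>
; Before: the load executes only if %cond is true.
  br i1 %cond, label %yes, label %done
yes:
  %v = load i32* %p
  br label %done

; After: hoisting the load is allowed; if it races, it merely yields undef.
  %v = load i32* %p
  br i1 %cond, label %yes, label %done
</pre>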


</div>

<!-- *********************************************************************** -->
<h2>
  <a name="atomicinst">Atomic instructions</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>For cases where simple loads and stores are not sufficient, LLVM provides
   various atomic instructions. The exact guarantees provided depend on the
   ordering; see <a href="#ordering">Atomic orderings</a>.</p>

<p><code>load atomic</code> and <code>store atomic</code> provide the same
   basic functionality as non-atomic loads and stores, but provide additional
   guarantees in situations where threads and signals are involved.</p>

<p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an
   atomic load followed by an atomic store (where the store is conditional for
   <code>cmpxchg</code>), but no other memory operation can happen on any thread
   between the load and store.  Note that LLVM's cmpxchg does not provide quite
   as many options as the C++0x version.</p>

<p>A <code>fence</code> provides Acquire and/or Release ordering which is not
   part of another operation; it is normally used along with Monotonic memory
   operations.  A Monotonic load followed by an Acquire fence is roughly
   equivalent to an Acquire load.</p>
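
<p>A brief sketch of the syntax of each of these instructions (the pointers
   and values are assumed to be defined elsewhere; see
   <a href="LangRef.html">LangRef</a> for the precise forms):</p>

<pre>
%v   = load atomic i32* %p unordered, align 4            ; atomic load
store atomic i32 %v, i32* %p monotonic, align 4          ; atomic store
%old = cmpxchg i32* %p, i32 %expected, i32 %new seq_cst  ; compare-and-swap
%sum = atomicrmw add i32* %p, i32 1 seq_cst              ; read-modify-write
fence acquire                                            ; standalone fence
</pre>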

<p>Frontends generating atomic instructions generally need to be aware of the
   target to some degree; atomic instructions are guaranteed to be lock-free,
   and therefore an operation wider than what the target natively supports
   can be impossible to generate.</p>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="ordering">Atomic orderings</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>In order to achieve a balance between performance and necessary guarantees,
   there are six levels of atomicity. They are listed in order of strength;
   each level includes all the guarantees of the previous level except for
   Acquire/Release. (See also <a href="LangRef.html#ordering">LangRef</a>.)</p>

<!-- ======================================================================= -->
<h3>
     <a name="o_notatomic">NotAtomic</a>
</h3>

<div>

<p>NotAtomic is the obvious, a load or store which is not atomic. (This isn't
   really a level of atomicity, but is listed here for comparison.) This is
   essentially a regular load or store. If there is a race on a given memory
   location, loads from that location return undef.</p>

<dl>
  <dt>Relevant standard</dt>
  <dd>This is intended to match shared variables in C/C++, and to be used
      in any other context where memory access is necessary and
      a race is impossible. (The precise definition is in
      <a href="LangRef.html#memmodel">LangRef</a>.)</dd>
  <dt>Notes for frontends</dt>
  <dd>The rule is essentially that all memory accessed with basic loads and
      stores by multiple threads should be protected by a lock or other
      synchronization; otherwise, you are likely to run into undefined
      behavior. If your frontend is for a "safe" language like Java,
      use Unordered to load and store any shared variable.  Note that NotAtomic
      volatile loads and stores are not properly atomic; do not try to use
      them as a substitute. (Per the C/C++ standards, volatile does provide
      some limited guarantees around asynchronous signals, but atomics are
      generally a better solution.)</dd>
  <dt>Notes for optimizers</dt>
  <dd>Introducing loads to shared variables along a codepath where they would
      not otherwise exist is allowed; introducing stores to shared variables
      is not. See <a href="#outsideatomic">Optimization outside
      atomic</a>.</dd>
  <dt>Notes for code generation</dt>
  <dd>The one interesting restriction here is that it is not allowed to write
      to bytes outside of the bytes relevant to a store.  This is mostly
      relevant to unaligned stores: it is not allowed in general to convert
      an unaligned store into two aligned stores of the same width as the
      unaligned store. Backends are also expected to generate an i8 store
      as an i8 store, and not an instruction which writes to surrounding
      bytes.  (If you are writing a backend for an architecture which cannot
      satisfy these restrictions and cares about concurrency, please send an
      email to llvmdev.)</dd>
</dl>

</div>


<!-- ======================================================================= -->
<h3>
     <a name="o_unordered">Unordered</a>
</h3>

<div>

<p>Unordered is the lowest level of atomicity. It essentially guarantees that
   races produce somewhat sane results instead of having undefined behavior.
   It also guarantees that the operation is lock-free, so it does not depend
   on the data being part of a special atomic structure or on a separate
   per-process global lock.  Note that code generation will fail for
   unsupported atomic operations; if you need such an operation, use explicit
   locking.</p>

<dl>
  <dt>Relevant standard</dt>
  <dd>This is intended to match the Java memory model for shared
      variables.</dd>
  <dt>Notes for frontends</dt>
  <dd>This cannot be used for synchronization, but is useful for Java and
      other "safe" languages which need to guarantee that the generated
      code never exhibits undefined behavior. Note that this guarantee
      is cheap on common platforms for loads and stores of a native width,
      but can be expensive or unavailable for wider operations, like a
      64-bit store on ARM. (A frontend for Java or other "safe" languages
      would normally split a 64-bit store on ARM into two 32-bit unordered
      stores; see the sketch after this list.)</dd>
  <dt>Notes for optimizers</dt>
  <dd>In terms of the optimizer, this prohibits any transformation that
      transforms a single load into multiple loads, transforms a store
      into multiple stores, narrows a store, or stores a value which
      would not be stored otherwise.  Some examples of unsafe optimizations
      are narrowing an assignment into a bitfield, rematerializing
      a load, and turning loads and stores into a memcpy call. Reordering
      unordered operations is safe, though, and optimizers should take
      advantage of that because unordered operations are common in
      languages that need them.</dd>
  <dt>Notes for code generation</dt>
  <dd>These operations are required to be atomic in the sense that if you
      use unordered loads and unordered stores, a load cannot see a value
      which was never stored.  A normal load or store instruction is usually
      sufficient, but note that an unordered load or store cannot
      be split into multiple instructions (or an instruction which
      does multiple memory operations, like <code>LDRD</code> on ARM).</dd>
</dl>
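
<p>As a sketch of the lowering mentioned above, a hypothetical Java frontend
   might split a 64-bit store on ARM as follows (<code>%lo</code>,
   <code>%hi</code>, <code>%p.lo</code> and <code>%p.hi</code> are assumed
   to be computed separately):</p>

<pre>
; Instead of one 64-bit unordered store, which ARM cannot do lock-free:
;   store atomic i64 %v, i64* %p unordered, align 8
; emit two 32-bit unordered stores to the two halves:
store atomic i32 %lo, i32* %p.lo unordered, align 4
store atomic i32 %hi, i32* %p.hi unordered, align 4
</pre>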

</div>

<!-- ======================================================================= -->
<h3>
     <a name="o_monotonic">Monotonic</a>
</h3>

<div>

<p>Monotonic is the weakest level of atomicity that can be used in
   synchronization primitives, although it does not provide any general
   synchronization. It essentially guarantees that if you take all the
   operations affecting a specific address, a consistent ordering exists.</p>
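
<p>For example, an event counter that protects no other memory only needs
   Monotonic (a sketch; <code>@counter</code> is an assumed global):</p>

<pre>
%old = atomicrmw add i32* @counter, i32 1 monotonic
</pre>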

<dl>
  <dt>Relevant standard</dt>
  <dd>This corresponds to the C++0x/C1x <code>memory_order_relaxed</code>;
     see those standards for the exact definition.</dd>
  <dt>Notes for frontends</dt>
  <dd>If you are writing a frontend which uses this directly, use with caution.
      The guarantees in terms of synchronization are very weak, so make
      sure these are only used in a pattern which you know is correct.
      Generally, these would either be used for atomic operations which
      do not protect other memory (like an atomic counter), or along with
      a <code>fence</code>.</dd>
  <dt>Notes for optimizers</dt>
  <dd>In terms of the optimizer, this can be treated as a read+write on the
      relevant memory location (and alias analysis will take advantage of
      that). In addition, it is legal to reorder non-atomic and Unordered
      loads around Monotonic loads. CSE/DSE and a few other optimizations
      are allowed, but Monotonic operations are unlikely to be used in ways
      which would make those optimizations useful.</dd>
  <dt>Notes for code generation</dt>
  <dd>Code generation is essentially the same as for unordered loads
     and stores.  No fences are required.  <code>cmpxchg</code> and
     <code>atomicrmw</code> are required to appear as a single operation.</dd>
</dl>

</div>

<!-- ======================================================================= -->
<h3>
     <a name="o_acquire">Acquire</a>
</h3>

<div>

<p>Acquire provides a barrier of the sort necessary to acquire a lock to access
   other memory with normal loads and stores.</p>
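
<p>As a sketch, a minimal spinlock acquisition might look like the following
   (assuming <code>@lock</code> is an i32 global which is 0 when free):</p>

<pre>
loop:
  %old = atomicrmw xchg i32* @lock, i32 1 acquire
  %was.free = icmp eq i32 %old, 0
  br i1 %was.free, label %locked, label %loop
locked:
  ; ordinary loads and stores here cannot be reordered before the xchg
</pre>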

<dl>
  <dt>Relevant standard</dt>
  <dd>This corresponds to the C++0x/C1x <code>memory_order_acquire</code>. It
      should also be used for C++0x/C1x <code>memory_order_consume</code>.</dd>
  <dt>Notes for frontends</dt>
  <dd>If you are writing a frontend which uses this directly, use with caution.
      Acquire only provides a semantic guarantee when paired with a Release
      operation.</dd>
  <dt>Notes for optimizers</dt>
  <dd>Optimizers not aware of atomics can treat this like a nothrow call.
      It is also possible to move stores from before an Acquire load
      or read-modify-write operation to after it, and move non-Acquire
      loads from before an Acquire operation to after it.</dd>
  <dt>Notes for code generation</dt>
  <dd>Architectures with weak memory ordering (essentially everything relevant
      today except x86 and SPARC) require some sort of fence to maintain
      the Acquire semantics.  The precise fences required vary widely by
      architecture, but for a simple implementation, most architectures provide
      a barrier which is strong enough for everything (<code>dmb</code> on ARM,
      <code>sync</code> on PowerPC, etc.).  Putting such a fence after the
      equivalent Monotonic operation is sufficient to maintain Acquire
      semantics for a memory operation.</dd>
</dl>

</div>

<!-- ======================================================================= -->
<h3>
     <a name="o_release">Release</a>
</h3>

<div>

<p>Release is similar to Acquire, but with a barrier of the sort necessary to
   release a lock.</p>
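
<p>Continuing the spinlock sketch from the Acquire section, releasing the
   lock is a single Release store:</p>

<pre>
; ordinary loads and stores in the critical section cannot be reordered
; after this store
store atomic i32 0, i32* @lock release, align 4
</pre>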

<dl>
  <dt>Relevant standard</dt>
  <dd>This corresponds to the C++0x/C1x <code>memory_order_release</code>.</dd>
  <dt>Notes for frontends</dt>
  <dd>If you are writing a frontend which uses this directly, use with caution.
      Release only provides a semantic guarantee when paired with an Acquire
      operation.</dd>
  <dt>Notes for optimizers</dt>
  <dd>Optimizers not aware of atomics can treat this like a nothrow call.
      It is also possible to move loads from after a Release store
      or read-modify-write operation to before it, and move non-Release
      stores from after a Release operation to before it.</dd>
  <dt>Notes for code generation</dt>
  <dd>See the section on Acquire; a fence before the relevant operation is
      usually sufficient for Release. Note that a store-store fence is not
      sufficient to implement Release semantics; store-store fences are
      generally not exposed to IR because they are extremely difficult to
      use correctly.</dd>
</dl>

</div>

<!-- ======================================================================= -->
<h3>
     <a name="o_acqrel">AcquireRelease</a>
</h3>

<div>

<p>AcquireRelease (<code>acq_rel</code> in IR) provides both an Acquire and a
   Release barrier (for fences and operations which both read and write
   memory).</p>

<dl>
  <dt>Relevant standard</dt>
  <dd>This corresponds to the C++0x/C1x <code>memory_order_acq_rel</code>.</dd>
  <dt>Notes for frontends</dt>
  <dd>If you are writing a frontend which uses this directly, use with caution.
      Acquire only provides a semantic guarantee when paired with a Release
      operation, and vice versa.</dd>
  <dt>Notes for optimizers</dt>
  <dd>In general, optimizers should treat this like a nothrow call; the
      possible optimizations are usually not interesting.</dd>
  <dt>Notes for code generation</dt>
  <dd>This operation has Acquire and Release semantics; see the sections on
      Acquire and Release.</dd>
</dl>

</div>

<!-- ======================================================================= -->
<h3>
     <a name="o_seqcst">SequentiallyConsistent</a>
</h3>

<div>

<p>SequentiallyConsistent (<code>seq_cst</code> in IR) provides
   Acquire semantics for loads and Release semantics for
   stores. Additionally, it guarantees that a total ordering exists
   between all SequentiallyConsistent operations.</p>
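
<p>For example, with two flags (a sketch; <code>@x</code> and <code>@y</code>
   are assumed globals initialized to 0), the total ordering guarantees that
   at least one thread observes the other's store:</p>

<pre>
; Thread 1:
store atomic i32 1, i32* @x seq_cst, align 4
%y.val = load atomic i32* @y seq_cst, align 4

; Thread 2:
store atomic i32 1, i32* @y seq_cst, align 4
%x.val = load atomic i32* @x seq_cst, align 4

; %y.val and %x.val cannot both be 0; with any weaker ordering, they could be.
</pre>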

<dl>
  <dt>Relevant standard</dt>
  <dd>This corresponds to the C++0x/C1x <code>memory_order_seq_cst</code>,
      Java volatile, and the gcc-compatible <code>__sync_*</code> builtins
      which do not specify otherwise.</dd>
  <dt>Notes for frontends</dt>
  <dd>If a frontend is exposing atomic operations, these are much easier to
      reason about for the programmer than other kinds of operations, and using
      them is generally a practical performance tradeoff.</dd>
  <dt>Notes for optimizers</dt>
  <dd>Optimizers not aware of atomics can treat this like a nothrow call.
      For SequentiallyConsistent loads and stores, the same reorderings are
      allowed as for Acquire loads and Release stores, except that
      SequentiallyConsistent operations may not be reordered.</dd>
  <dt>Notes for code generation</dt>
  <dd>SequentiallyConsistent loads minimally require the same barriers
     as Acquire operations and SequentiallyConsistent stores require
     Release barriers. Additionally, the code generator must enforce
     ordering between SequentiallyConsistent stores followed by
     SequentiallyConsistent loads. This is usually done by emitting
     either a full fence before the loads or a full fence after the
     stores; which is preferred varies by architecture.</dd>
</dl>

</div>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="iropt">Atomics and IR optimization</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>Predicates for optimizer writers to query (see the sketch after this
   list):</p>
<ul>
  <li>isSimple(): A load or store which is not volatile or atomic.  This is
      what, for example, memcpyopt would check for operations it might
      transform.</li>
  <li>isUnordered(): A load or store which is not volatile and at most
      Unordered. This would be checked, for example, by LICM before hoisting
      an operation.</li>
  <li>mayReadFromMemory()/mayWriteToMemory(): Existing predicates, but note
      that they return true for any operation which is volatile or at least
      Monotonic.</li>
  <li>Alias analysis: Note that AA will return ModRef for anything Acquire or
      Release, and for the address accessed by any Monotonic operation.</li>
</ul>
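
<p>A sketch of how these predicates classify various loads (<code>%p</code>
   is an assumed <code>i32*</code>):</p>

<pre>
%v1 = load i32* %p                            ; isSimple() and isUnordered()
%v2 = load atomic i32* %p unordered, align 4  ; isUnordered(), not isSimple()
%v3 = load atomic i32* %p monotonic, align 4  ; neither
%v4 = load volatile i32* %p                   ; neither
</pre>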

<p>To support optimizing around atomic operations, make sure you are using
   the right predicates; everything should work if that is done.  If your
   pass should optimize some atomic operations (Unordered operations in
   particular), make sure it doesn't replace an atomic load or store with
   a non-atomic operation.</p>

<p>Some examples of how optimizations interact with various kinds of atomic
   operations:</p>
<ul>
  <li>memcpyopt: An atomic operation cannot be optimized into part of a
      memcpy/memset, including unordered loads/stores.  It can pull operations
      across some atomic operations.</li>
  <li>LICM: Unordered loads/stores can be moved out of a loop.  It just treats
      monotonic operations like a read+write to a memory location, and anything
      stricter than that like a nothrow call.</li>
  <li>DSE: Unordered stores can be DSE'ed like normal stores.  Monotonic stores
      can be DSE'ed in some cases, but it's tricky to reason about, and not
      especially important.</li>
  <li>Folding a load: Any atomic load from a constant global can be
      constant-folded, because it cannot be observed.  Similar reasoning allows
      scalarrepl with atomic loads and stores.</li>
</ul>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="codegen">Atomics and Codegen</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>Atomic operations are represented in the SelectionDAG with
   <code>ATOMIC_*</code> opcodes.  On architectures which use barrier
   instructions for all atomic ordering (like ARM), appropriate fences are
   split out as the DAG is built.</p>

<p>The MachineMemOperand for all atomic operations is currently marked as
   volatile; this is not correct in the IR sense of volatile, but CodeGen
   handles anything marked volatile very conservatively.  This should get
   fixed at some point.</p>

<p>Common architectures have some way of representing at least a pointer-sized
   lock-free <code>cmpxchg</code>; such an operation can be used to implement
   all the other atomic operations which can be represented in IR up to that
   size.  Backends are expected to implement all those operations, but not
   operations which cannot be implemented in a lock-free manner.  It is
   expected that backends will give an error when given an operation which
   cannot be implemented.  (The LLVM code generator is not very helpful here
   at the moment, but hopefully that will change.)</p>

<p>The implementation of atomics on LL/SC architectures (like ARM) is currently
   a bit of a mess; there is a lot of copy-pasted code across targets, and
   the representation is relatively unsuited to optimization (it would be nice
   to be able to optimize loops involving cmpxchg etc.).</p>

<p>On x86, all atomic loads generate a <code>MOV</code>.
   SequentiallyConsistent stores generate an <code>XCHG</code>, other stores
   generate a <code>MOV</code>. SequentiallyConsistent fences generate an
   <code>MFENCE</code>, other fences do not cause any code to be generated.
   <code>cmpxchg</code> uses the <code>LOCK CMPXCHG</code> instruction.
   <code>atomicrmw xchg</code> uses <code>XCHG</code>,
   <code>atomicrmw add</code> and <code>atomicrmw sub</code> use
   <code>XADD</code>, and all other <code>atomicrmw</code> operations generate
   a loop with <code>LOCK CMPXCHG</code>.  Depending on the users of the
   result, some <code>atomicrmw</code> operations can be translated into
   operations like <code>LOCK AND</code>, but that does not work in
   general.</p>
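
<p>A sketch of some typical x86 lowerings (instruction selection details
   vary; this just illustrates the mapping described above):</p>

<pre>
%v = load atomic i32* %p seq_cst, align 4      ; MOV
store atomic i32 %v, i32* %q seq_cst, align 4  ; XCHG
store atomic i32 %v, i32* %q release, align 4  ; MOV
fence seq_cst                                  ; MFENCE
%old = atomicrmw add i32* %p, i32 1 seq_cst    ; LOCK XADD
%old2 = atomicrmw and i32* %p, i32 7 seq_cst   ; loop with LOCK CMPXCHG
                                               ; (LOCK AND if %old2 is unused)
</pre>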

<p>On ARM, MIPS, and many other RISC architectures, Acquire, Release, and
   SequentiallyConsistent semantics require barrier instructions
   for every such operation. Loads and stores generate normal instructions.
   <code>cmpxchg</code> and <code>atomicrmw</code> can be represented using
   a loop with LL/SC-style instructions which take some sort of exclusive
   lock on a cache line (<code>LDREX</code> and <code>STREX</code> on
   ARM, etc.). At the moment, the IR does not provide any way to represent a
   weak <code>cmpxchg</code> which would not require a loop.</p>
</div>

<!-- *********************************************************************** -->

<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>

  <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
  Last modified: $Date: 2011-08-09 02:07:00 -0700 (Tue, 09 Aug 2011) $
</address>

</body>
</html>