Home | History | Annotate | Download | only in tutorial
      1 =======================================================
      2 Kaleidoscope: Extending the Language: Mutable Variables
      3 =======================================================
      4 
      5 .. contents::
      6    :local:
      7 
      8 Chapter 7 Introduction
      9 ======================
     10 
     11 Welcome to Chapter 7 of the "`Implementing a language with
     12 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
     13 very respectable, albeit simple, `functional programming
     14 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
     15 journey, we learned some parsing techniques, how to build and represent
     16 an AST, how to build LLVM IR, and how to optimize the resultant code as
     17 well as JIT compile it.
     18 
     19 While Kaleidoscope is interesting as a functional language, the fact
     20 that it is functional makes it "too easy" to generate LLVM IR for it. In
     21 particular, a functional language makes it very easy to build LLVM IR
     22 directly in `SSA
     23 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     24 Since LLVM requires that the input code be in SSA form, this is a very
     25 nice property and it is often unclear to newcomers how to generate code
     26 for an imperative language with mutable variables.
     27 
     28 The short (and happy) summary of this chapter is that there is no need
     29 for your front-end to build SSA form: LLVM provides highly tuned and
     30 well tested support for this, though the way it works is a bit
     31 unexpected for some.
     32 
     33 Why is this a hard problem?
     34 ===========================
     35 
     36 To understand why mutable variables cause complexities in SSA
     37 construction, consider this extremely simple C example:
     38 
     39 .. code-block:: c
     40 
     41     int G, H;
     42     int test(_Bool Condition) {
     43       int X;
     44       if (Condition)
     45         X = G;
     46       else
     47         X = H;
     48       return X;
     49     }
     50 
     51 In this case, we have the variable "X", whose value depends on the path
     52 executed in the program. Because there are two different possible values
     53 for X before the return instruction, a PHI node is inserted to merge the
     54 two values. The LLVM IR that we want for this example looks like this:
     55 
     56 .. code-block:: llvm
     57 
     58     @G = weak global i32 0   ; type of @G is i32*
     59     @H = weak global i32 0   ; type of @H is i32*
     60 
     61     define i32 @test(i1 %Condition) {
     62     entry:
     63       br i1 %Condition, label %cond_true, label %cond_false
     64 
     65     cond_true:
     66       %X.0 = load i32* @G
     67       br label %cond_next
     68 
     69     cond_false:
     70       %X.1 = load i32* @H
     71       br label %cond_next
     72 
     73     cond_next:
     74       %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
     75       ret i32 %X.2
     76     }
     77 
     78 In this example, the loads from the G and H global variables are
     79 explicit in the LLVM IR, and they live in the then/else branches of the
     80 if statement (cond\_true/cond\_false). In order to merge the incoming
     81 values, the X.2 phi node in the cond\_next block selects the right value
     82 to use based on where control flow is coming from: if control flow comes
     83 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
     84 control flow comes from cond\_true, it gets the value of X.0. The intent
     85 of this chapter is not to explain the details of SSA form. For more
     86 information, see one of the many `online
     87 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     88 
     89 The question for this article is "who places the phi nodes when lowering
     90 assignments to mutable variables?". The issue here is that LLVM
     91 *requires* that its IR be in SSA form: there is no "non-ssa" mode for
     92 it. However, SSA construction requires non-trivial algorithms and data
     93 structures, so it is inconvenient and wasteful for every front-end to
     94 have to reproduce this logic.
     95 
     96 Memory in LLVM
     97 ==============
     98 
     99 The 'trick' here is that while LLVM does require all register values to
    100 be in SSA form, it does not require (or permit) memory objects to be in
    101 SSA form. In the example above, note that the loads from G and H are
    102 direct accesses to G and H: they are not renamed or versioned. This
    103 differs from some other compiler systems, which do try to version memory
    104 objects. In LLVM, instead of encoding dataflow analysis of memory into
    105 the LLVM IR, it is handled with `Analysis
    106 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
    107 
    108 With this in mind, the high-level idea is that we want to make a stack
    109 variable (which lives in memory, because it is on the stack) for each
    110 mutable object in a function. To take advantage of this trick, we need
    111 to talk about how LLVM represents stack variables.
    112 
    113 In LLVM, all memory accesses are explicit with load/store instructions,
    114 and it is carefully designed not to have (or need) an "address-of"
    115 operator. Notice how the type of the @G/@H global variables is actually
    116 "i32\*" even though the variable is defined as "i32". What this means is
    117 that @G defines *space* for an i32 in the global data area, but its
    118 *name* actually refers to the address for that space. Stack variables
    119 work the same way, except that instead of being declared with global
    120 variable definitions, they are declared with the `LLVM alloca
    121 instruction <../LangRef.html#alloca-instruction>`_:
    122 
    123 .. code-block:: llvm
    124 
    125     define i32 @example() {
    126     entry:
    127       %X = alloca i32           ; type of %X is i32*.
    128       ...
    129       %tmp = load i32* %X       ; load the stack value %X from the stack.
    130       %tmp2 = add i32 %tmp, 1   ; increment it
    131       store i32 %tmp2, i32* %X  ; store it back
    132       ...
    133 
    134 This code shows an example of how you can declare and manipulate a stack
    135 variable in the LLVM IR. Stack memory allocated with the alloca
    136 instruction is fully general: you can pass the address of the stack slot
    137 to functions, you can store it in other variables, etc. In our example
    138 above, we could rewrite the example to use the alloca technique to avoid
    139 using a PHI node:
    140 
    141 .. code-block:: llvm
    142 
    143     @G = weak global i32 0   ; type of @G is i32*
    144     @H = weak global i32 0   ; type of @H is i32*
    145 
    146     define i32 @test(i1 %Condition) {
    147     entry:
    148       %X = alloca i32           ; type of %X is i32*.
    149       br i1 %Condition, label %cond_true, label %cond_false
    150 
    151     cond_true:
    152       %X.0 = load i32* @G
    153             store i32 %X.0, i32* %X   ; Update X
    154       br label %cond_next
    155 
    156     cond_false:
    157       %X.1 = load i32* @H
    158             store i32 %X.1, i32* %X   ; Update X
    159       br label %cond_next
    160 
    161     cond_next:
    162       %X.2 = load i32* %X       ; Read X
    163       ret i32 %X.2
    164     }
    165 
    166 With this, we have discovered a way to handle arbitrary mutable
    167 variables without the need to create Phi nodes at all:
    168 
    169 #. Each mutable variable becomes a stack allocation.
    170 #. Each read of the variable becomes a load from the stack.
    171 #. Each update of the variable becomes a store to the stack.
    172 #. Taking the address of a variable just uses the stack address
    173    directly.
    174 
    175 While this solution has solved our immediate problem, it introduced
    176 another one: we have now apparently introduced a lot of stack traffic
    177 for very simple and common operations, a major performance problem.
    178 Fortunately for us, the LLVM optimizer has a highly-tuned optimization
    179 pass named "mem2reg" that handles this case, promoting allocas like this
    180 into SSA registers, inserting Phi nodes as appropriate. If you run this
    181 example through the pass, for example, you'll get:
    182 
    183 .. code-block:: bash
    184 
    185     $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    186     @G = weak global i32 0
    187     @H = weak global i32 0
    188 
    189     define i32 @test(i1 %Condition) {
    190     entry:
    191       br i1 %Condition, label %cond_true, label %cond_false
    192 
    193     cond_true:
    194       %X.0 = load i32* @G
    195       br label %cond_next
    196 
    197     cond_false:
    198       %X.1 = load i32* @H
    199       br label %cond_next
    200 
    201     cond_next:
    202       %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    203       ret i32 %X.01
    204     }
    205 
    206 The mem2reg pass implements the standard "iterated dominance frontier"
    207 algorithm for constructing SSA form and has a number of optimizations
    208 that speed up (very common) degenerate cases. The mem2reg optimization
    209 pass is the answer to dealing with mutable variables, and we highly
    210 recommend that you depend on it. Note that mem2reg only works on
    211 variables in certain circumstances:
    212 
    213 #. mem2reg is alloca-driven: it looks for allocas and if it can handle
    214    them, it promotes them. It does not apply to global variables or heap
    215    allocations.
    216 #. mem2reg only looks for alloca instructions in the entry block of the
    217    function. Being in the entry block guarantees that the alloca is only
    218    executed once, which makes analysis simpler.
    219 #. mem2reg only promotes allocas whose uses are direct loads and stores.
    220    If the address of the stack object is passed to a function, or if any
    221    funny pointer arithmetic is involved, the alloca will not be
    222    promoted.
    223 #. mem2reg only works on allocas of `first
    224    class <../LangRef.html#first-class-types>`_ values (such as pointers,
    225    scalars and vectors), and only if the array size of the allocation is
    226    1 (or missing in the .ll file). mem2reg is not capable of promoting
    227    structs or arrays to registers. Note that the "sroa" pass is
    228    more powerful and can promote structs, "unions", and arrays in many
    229    cases.
    230 
    231 All of these properties are easy to satisfy for most imperative
    232 languages, and we'll illustrate it below with Kaleidoscope. The final
    233 question you may be asking is: should I bother with this nonsense for my
    234 front-end? Wouldn't it be better if I just did SSA construction
    235 directly, avoiding use of the mem2reg optimization pass? In short, we
    236 strongly recommend that you use this technique for building SSA form,
    237 unless there is an extremely good reason not to. Using this technique
    238 is:
    239 
    240 -  Proven and well tested: clang uses this technique
    241    for local mutable variables. As such, the most common clients of LLVM
    242    are using this to handle a bulk of their variables. You can be sure
    243    that bugs are found fast and fixed early.
    244 -  Extremely Fast: mem2reg has a number of special cases that make it
    245    fast in common cases as well as fully general. For example, it has
    246    fast-paths for variables that are only used in a single block,
    247    variables that only have one assignment point, good heuristics to
    248    avoid insertion of unneeded phi nodes, etc.
    249 -  Needed for debug info generation: `Debug information in
    250    LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
    251    the variable exposed so that debug info can be attached to it. This
    252    technique dovetails very naturally with this style of debug info.
    253 
    254 If nothing else, this makes it much easier to get your front-end up and
    255 running, and is very simple to implement. Lets extend Kaleidoscope with
    256 mutable variables now!
    257 
    258 Mutable Variables in Kaleidoscope
    259 =================================
    260 
    261 Now that we know the sort of problem we want to tackle, lets see what
    262 this looks like in the context of our little Kaleidoscope language.
    263 We're going to add two features:
    264 
    265 #. The ability to mutate variables with the '=' operator.
    266 #. The ability to define new variables.
    267 
    268 While the first item is really what this is about, we only have
    269 variables for incoming arguments as well as for induction variables, and
    270 redefining those only goes so far :). Also, the ability to define new
    271 variables is a useful thing regardless of whether you will be mutating
    272 them. Here's a motivating example that shows how we could use these:
    273 
    274 ::
    275 
    276     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    277     # and just returns the RHS.
    278     def binary : 1 (x y) y;
    279 
    280     # Recursive fib, we could do this before.
    281     def fib(x)
    282       if (x < 3) then
    283         1
    284       else
    285         fib(x-1)+fib(x-2);
    286 
    287     # Iterative fib.
    288     def fibi(x)
    289       var a = 1, b = 1, c in
    290       (for i = 3, i < x in
    291          c = a + b :
    292          a = b :
    293          b = c) :
    294       b;
    295 
    296     # Call it.
    297     fibi(10);
    298 
    299 In order to mutate variables, we have to change our existing variables
    300 to use the "alloca trick". Once we have that, we'll add our new
    301 operator, then extend Kaleidoscope to support new variable definitions.
    302 
    303 Adjusting Existing Variables for Mutation
    304 =========================================
    305 
    306 The symbol table in Kaleidoscope is managed at code generation time by
    307 the '``named_values``' map. This map currently keeps track of the LLVM
    308 "Value\*" that holds the double value for the named variable. In order
    309 to support mutation, we need to change this slightly, so that it
    310 ``named_values`` holds the *memory location* of the variable in
    311 question. Note that this change is a refactoring: it changes the
    312 structure of the code, but does not (by itself) change the behavior of
    313 the compiler. All of these changes are isolated in the Kaleidoscope code
    314 generator.
    315 
    316 At this point in Kaleidoscope's development, it only supports variables
    317 for two things: incoming arguments to functions and the induction
    318 variable of 'for' loops. For consistency, we'll allow mutation of these
    319 variables in addition to other user-defined variables. This means that
    320 these will both need memory locations.
    321 
    322 To start our transformation of Kaleidoscope, we'll change the
    323 ``named_values`` map so that it maps to AllocaInst\* instead of Value\*.
    324 Once we do this, the C++ compiler will tell us what parts of the code we
    325 need to update:
    326 
    327 **Note:** the ocaml bindings currently model both ``Value*``'s and
    328 ``AllocInst*``'s as ``Llvm.llvalue``'s, but this may change in the future
    329 to be more type safe.
    330 
    331 .. code-block:: ocaml
    332 
    333     let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    334 
    335 Also, since we will need to create these alloca's, we'll use a helper
    336 function that ensures that the allocas are created in the entry block of
    337 the function:
    338 
    339 .. code-block:: ocaml
    340 
    341     (* Create an alloca instruction in the entry block of the function. This
    342      * is used for mutable variables etc. *)
    343     let create_entry_block_alloca the_function var_name =
    344       let builder = builder_at (instr_begin (entry_block the_function)) in
    345       build_alloca double_type var_name builder
    346 
    347 This funny looking code creates an ``Llvm.llbuilder`` object that is
    348 pointing at the first instruction of the entry block. It then creates an
    349 alloca with the expected name and returns it. Because all values in
    350 Kaleidoscope are doubles, there is no need to pass in a type to use.
    351 
    352 With this in place, the first functionality change we want to make is to
    353 variable references. In our new scheme, variables live on the stack, so
    354 code generating a reference to them actually needs to produce a load
    355 from the stack slot:
    356 
    357 .. code-block:: ocaml
    358 
    359     let rec codegen_expr = function
    360       ...
    361       | Ast.Variable name ->
    362           let v = try Hashtbl.find named_values name with
    363             | Not_found -> raise (Error "unknown variable name")
    364           in
    365           (* Load the value. *)
    366           build_load v name builder
    367 
    368 As you can see, this is pretty straightforward. Now we need to update
    369 the things that define the variables to set up the alloca. We'll start
    370 with ``codegen_expr Ast.For ...`` (see the `full code listing <#id1>`_
    371 for the unabridged code):
    372 
    373 .. code-block:: ocaml
    374 
    375       | Ast.For (var_name, start, end_, step, body) ->
    376           let the_function = block_parent (insertion_block builder) in
    377 
    378           (* Create an alloca for the variable in the entry block. *)
    379           let alloca = create_entry_block_alloca the_function var_name in
    380 
    381           (* Emit the start code first, without 'variable' in scope. *)
    382           let start_val = codegen_expr start in
    383 
    384           (* Store the value into the alloca. *)
    385           ignore(build_store start_val alloca builder);
    386 
    387           ...
    388 
    389           (* Within the loop, the variable is defined equal to the PHI node. If it
    390            * shadows an existing variable, we have to restore it, so save it
    391            * now. *)
    392           let old_val =
    393             try Some (Hashtbl.find named_values var_name) with Not_found -> None
    394           in
    395           Hashtbl.add named_values var_name alloca;
    396 
    397           ...
    398 
    399           (* Compute the end condition. *)
    400           let end_cond = codegen_expr end_ in
    401 
    402           (* Reload, increment, and restore the alloca. This handles the case where
    403            * the body of the loop mutates the variable. *)
    404           let cur_var = build_load alloca var_name builder in
    405           let next_var = build_add cur_var step_val "nextvar" builder in
    406           ignore(build_store next_var alloca builder);
    407           ...
    408 
    409 This code is virtually identical to the code `before we allowed mutable
    410 variables <OCamlLangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that
    411 we no longer have to construct a PHI node, and we use load/store to
    412 access the variable as needed.
    413 
    414 To support mutable argument variables, we need to also make allocas for
    415 them. The code for this is also pretty simple:
    416 
    417 .. code-block:: ocaml
    418 
    419     (* Create an alloca for each argument and register the argument in the symbol
    420      * table so that references to it will succeed. *)
    421     let create_argument_allocas the_function proto =
    422       let args = match proto with
    423         | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
    424       in
    425       Array.iteri (fun i ai ->
    426         let var_name = args.(i) in
    427         (* Create an alloca for this variable. *)
    428         let alloca = create_entry_block_alloca the_function var_name in
    429 
    430         (* Store the initial value into the alloca. *)
    431         ignore(build_store ai alloca builder);
    432 
    433         (* Add arguments to variable symbol table. *)
    434         Hashtbl.add named_values var_name alloca;
    435       ) (params the_function)
    436 
    437 For each argument, we make an alloca, store the input value to the
    438 function into the alloca, and register the alloca as the memory location
    439 for the argument. This method gets invoked by ``Codegen.codegen_func``
    440 right after it sets up the entry block for the function.
    441 
    442 The final missing piece is adding the mem2reg pass, which allows us to
    443 get good codegen once again:
    444 
    445 .. code-block:: ocaml
    446 
    447     let main () =
    448       ...
    449       let the_fpm = PassManager.create_function Codegen.the_module in
    450 
    451       (* Set up the optimizer pipeline.  Start with registering info about how the
    452        * target lays out data structures. *)
    453       DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    454 
    455       (* Promote allocas to registers. *)
    456       add_memory_to_register_promotion the_fpm;
    457 
    458       (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    459       add_instruction_combining the_fpm;
    460 
    461       (* reassociate expressions. *)
    462       add_reassociation the_fpm;
    463 
    464 It is interesting to see what the code looks like before and after the
    465 mem2reg optimization runs. For example, this is the before/after code
    466 for our recursive fib function. Before the optimization:
    467 
    468 .. code-block:: llvm
    469 
    470     define double @fib(double %x) {
    471     entry:
    472       %x1 = alloca double
    473       store double %x, double* %x1
    474       %x2 = load double* %x1
    475       %cmptmp = fcmp ult double %x2, 3.000000e+00
    476       %booltmp = uitofp i1 %cmptmp to double
    477       %ifcond = fcmp one double %booltmp, 0.000000e+00
    478       br i1 %ifcond, label %then, label %else
    479 
    480     then:    ; preds = %entry
    481       br label %ifcont
    482 
    483     else:    ; preds = %entry
    484       %x3 = load double* %x1
    485       %subtmp = fsub double %x3, 1.000000e+00
    486       %calltmp = call double @fib(double %subtmp)
    487       %x4 = load double* %x1
    488       %subtmp5 = fsub double %x4, 2.000000e+00
    489       %calltmp6 = call double @fib(double %subtmp5)
    490       %addtmp = fadd double %calltmp, %calltmp6
    491       br label %ifcont
    492 
    493     ifcont:    ; preds = %else, %then
    494       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    495       ret double %iftmp
    496     }
    497 
    498 Here there is only one variable (x, the input argument) but you can
    499 still see the extremely simple-minded code generation strategy we are
    500 using. In the entry block, an alloca is created, and the initial input
    501 value is stored into it. Each reference to the variable does a reload
    502 from the stack. Also, note that we didn't modify the if/then/else
    503 expression, so it still inserts a PHI node. While we could make an
    504 alloca for it, it is actually easier to create a PHI node for it, so we
    505 still just make the PHI.
    506 
    507 Here is the code after the mem2reg pass runs:
    508 
    509 .. code-block:: llvm
    510 
    511     define double @fib(double %x) {
    512     entry:
    513       %cmptmp = fcmp ult double %x, 3.000000e+00
    514       %booltmp = uitofp i1 %cmptmp to double
    515       %ifcond = fcmp one double %booltmp, 0.000000e+00
    516       br i1 %ifcond, label %then, label %else
    517 
    518     then:
    519       br label %ifcont
    520 
    521     else:
    522       %subtmp = fsub double %x, 1.000000e+00
    523       %calltmp = call double @fib(double %subtmp)
    524       %subtmp5 = fsub double %x, 2.000000e+00
    525       %calltmp6 = call double @fib(double %subtmp5)
    526       %addtmp = fadd double %calltmp, %calltmp6
    527       br label %ifcont
    528 
    529     ifcont:    ; preds = %else, %then
    530       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    531       ret double %iftmp
    532     }
    533 
    534 This is a trivial case for mem2reg, since there are no redefinitions of
    535 the variable. The point of showing this is to calm your tension about
    536 inserting such blatent inefficiencies :).
    537 
    538 After the rest of the optimizers run, we get:
    539 
    540 .. code-block:: llvm
    541 
    542     define double @fib(double %x) {
    543     entry:
    544       %cmptmp = fcmp ult double %x, 3.000000e+00
    545       %booltmp = uitofp i1 %cmptmp to double
    546       %ifcond = fcmp ueq double %booltmp, 0.000000e+00
    547       br i1 %ifcond, label %else, label %ifcont
    548 
    549     else:
    550       %subtmp = fsub double %x, 1.000000e+00
    551       %calltmp = call double @fib(double %subtmp)
    552       %subtmp5 = fsub double %x, 2.000000e+00
    553       %calltmp6 = call double @fib(double %subtmp5)
    554       %addtmp = fadd double %calltmp, %calltmp6
    555       ret double %addtmp
    556 
    557     ifcont:
    558       ret double 1.000000e+00
    559     }
    560 
    561 Here we see that the simplifycfg pass decided to clone the return
    562 instruction into the end of the 'else' block. This allowed it to
    563 eliminate some branches and the PHI node.
    564 
    565 Now that all symbol table references are updated to use stack variables,
    566 we'll add the assignment operator.
    567 
    568 New Assignment Operator
    569 =======================
    570 
    571 With our current framework, adding a new assignment operator is really
    572 simple. We will parse it just like any other binary operator, but handle
    573 it internally (instead of allowing the user to define it). The first
    574 step is to set a precedence:
    575 
    576 .. code-block:: ocaml
    577 
    578     let main () =
    579       (* Install standard binary operators.
    580        * 1 is the lowest precedence. *)
    581       Hashtbl.add Parser.binop_precedence '=' 2;
    582       Hashtbl.add Parser.binop_precedence '<' 10;
    583       Hashtbl.add Parser.binop_precedence '+' 20;
    584       Hashtbl.add Parser.binop_precedence '-' 20;
    585       ...
    586 
    587 Now that the parser knows the precedence of the binary operator, it
    588 takes care of all the parsing and AST generation. We just need to
    589 implement codegen for the assignment operator. This looks like:
    590 
    591 .. code-block:: ocaml
    592 
    593     let rec codegen_expr = function
    594           begin match op with
    595           | '=' ->
    596               (* Special case '=' because we don't want to emit the LHS as an
    597                * expression. *)
    598               let name =
    599                 match lhs with
    600                 | Ast.Variable name -> name
    601                 | _ -> raise (Error "destination of '=' must be a variable")
    602               in
    603 
    604 Unlike the rest of the binary operators, our assignment operator doesn't
    605 follow the "emit LHS, emit RHS, do computation" model. As such, it is
    606 handled as a special case before the other binary operators are handled.
    607 The other strange thing is that it requires the LHS to be a variable. It
    608 is invalid to have "(x+1) = expr" - only things like "x = expr" are
    609 allowed.
    610 
    611 .. code-block:: ocaml
    612 
    613               (* Codegen the rhs. *)
    614               let val_ = codegen_expr rhs in
    615 
    616               (* Lookup the name. *)
    617               let variable = try Hashtbl.find named_values name with
    618               | Not_found -> raise (Error "unknown variable name")
    619               in
    620               ignore(build_store val_ variable builder);
    621               val_
    622           | _ ->
    623                 ...
    624 
    625 Once we have the variable, codegen'ing the assignment is
    626 straightforward: we emit the RHS of the assignment, create a store, and
    627 return the computed value. Returning a value allows for chained
    628 assignments like "X = (Y = Z)".
    629 
    630 Now that we have an assignment operator, we can mutate loop variables
    631 and arguments. For example, we can now run code like this:
    632 
    633 ::
    634 
    635     # Function to print a double.
    636     extern printd(x);
    637 
    638     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    639     # and just returns the RHS.
    640     def binary : 1 (x y) y;
    641 
    642     def test(x)
    643       printd(x) :
    644       x = 4 :
    645       printd(x);
    646 
    647     test(123);
    648 
    649 When run, this example prints "123" and then "4", showing that we did
    650 actually mutate the value! Okay, we have now officially implemented our
    651 goal: getting this to work requires SSA construction in the general
    652 case. However, to be really useful, we want the ability to define our
    653 own local variables, lets add this next!
    654 
    655 User-defined Local Variables
    656 ============================
    657 
    658 Adding var/in is just like any other other extensions we made to
    659 Kaleidoscope: we extend the lexer, the parser, the AST and the code
    660 generator. The first step for adding our new 'var/in' construct is to
    661 extend the lexer. As before, this is pretty trivial, the code looks like
    662 this:
    663 
    664 .. code-block:: ocaml
    665 
    666     type token =
    667       ...
    668       (* var definition *)
    669       | Var
    670 
    671     ...
    672 
    673     and lex_ident buffer = parser
    674           ...
    675           | "in" -> [< 'Token.In; stream >]
    676           | "binary" -> [< 'Token.Binary; stream >]
    677           | "unary" -> [< 'Token.Unary; stream >]
    678           | "var" -> [< 'Token.Var; stream >]
    679           ...
    680 
    681 The next step is to define the AST node that we will construct. For
    682 var/in, it looks like this:
    683 
    684 .. code-block:: ocaml
    685 
    686     type expr =
    687       ...
    688       (* variant for var/in. *)
    689       | Var of (string * expr option) array * expr
    690       ...
    691 
    692 var/in allows a list of names to be defined all at once, and each name
    693 can optionally have an initializer value. As such, we capture this
    694 information in the VarNames vector. Also, var/in has a body, this body
    695 is allowed to access the variables defined by the var/in.
    696 
    697 With this in place, we can define the parser pieces. The first thing we
    698 do is add it as a primary expression:
    699 
    700 .. code-block:: ocaml
    701 
    702     (* primary
    703      *   ::= identifier
    704      *   ::= numberexpr
    705      *   ::= parenexpr
    706      *   ::= ifexpr
    707      *   ::= forexpr
    708      *   ::= varexpr *)
    709     let rec parse_primary = parser
    710       ...
    711       (* varexpr
    712        *   ::= 'var' identifier ('=' expression?
    713        *             (',' identifier ('=' expression)?)* 'in' expression *)
    714       | [< 'Token.Var;
    715            (* At least one variable name is required. *)
    716            'Token.Ident id ?? "expected identifier after var";
    717            init=parse_var_init;
    718            var_names=parse_var_names [(id, init)];
    719            (* At this point, we have to have 'in'. *)
    720            'Token.In ?? "expected 'in' keyword after 'var'";
    721            body=parse_expr >] ->
    722           Ast.Var (Array.of_list (List.rev var_names), body)
    723 
    724     ...
    725 
    726     and parse_var_init = parser
    727       (* read in the optional initializer. *)
    728       | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
    729       | [< >] -> None
    730 
    731     and parse_var_names accumulator = parser
    732       | [< 'Token.Kwd ',';
    733            'Token.Ident id ?? "expected identifier list after var";
    734            init=parse_var_init;
    735            e=parse_var_names ((id, init) :: accumulator) >] -> e
    736       | [< >] -> accumulator
    737 
    738 Now that we can parse and represent the code, we need to support
    739 emission of LLVM IR for it. This code starts out with:
    740 
    741 .. code-block:: ocaml
    742 
    743     let rec codegen_expr = function
    744       ...
    745       | Ast.Var (var_names, body)
    746           let old_bindings = ref [] in
    747 
    748           let the_function = block_parent (insertion_block builder) in
    749 
    750           (* Register all variables and emit their initializer. *)
    751           Array.iter (fun (var_name, init) ->
    752 
    753 Basically it loops over all the variables, installing them one at a
    754 time. For each variable we put into the symbol table, we remember the
    755 previous value that we replace in OldBindings.
    756 
    757 .. code-block:: ocaml
    758 
    759             (* Emit the initializer before adding the variable to scope, this
    760              * prevents the initializer from referencing the variable itself, and
    761              * permits stuff like this:
    762              *   var a = 1 in
    763              *     var a = a in ...   # refers to outer 'a'. *)
    764             let init_val =
    765               match init with
    766               | Some init -> codegen_expr init
    767               (* If not specified, use 0.0. *)
    768               | None -> const_float double_type 0.0
    769             in
    770 
    771             let alloca = create_entry_block_alloca the_function var_name in
    772             ignore(build_store init_val alloca builder);
    773 
    774             (* Remember the old variable binding so that we can restore the binding
    775              * when we unrecurse. *)
    776 
    777             begin
    778               try
    779                 let old_value = Hashtbl.find named_values var_name in
    780                 old_bindings := (var_name, old_value) :: !old_bindings;
    781               with Not_found > ()
    782             end;
    783 
    784             (* Remember this binding. *)
    785             Hashtbl.add named_values var_name alloca;
    786           ) var_names;
    787 
    788 There are more comments here than code. The basic idea is that we emit
    789 the initializer, create the alloca, then update the symbol table to
    790 point to it. Once all the variables are installed in the symbol table,
    791 we evaluate the body of the var/in expression:
    792 
    793 .. code-block:: ocaml
    794 
    795           (* Codegen the body, now that all vars are in scope. *)
    796           let body_val = codegen_expr body in
    797 
    798 Finally, before returning, we restore the previous variable bindings:
    799 
    800 .. code-block:: ocaml
    801 
    802           (* Pop all our variables from scope. *)
    803           List.iter (fun (var_name, old_value) ->
    804             Hashtbl.add named_values var_name old_value
    805           ) !old_bindings;
    806 
    807           (* Return the body computation. *)
    808           body_val
    809 
    810 The end result of all of this is that we get properly scoped variable
    811 definitions, and we even (trivially) allow mutation of them :).
    812 
    813 With this, we completed what we set out to do. Our nice iterative fib
    814 example from the intro compiles and runs just fine. The mem2reg pass
    815 optimizes all of our stack variables into SSA registers, inserting PHI
    816 nodes where needed, and our front-end remains simple: no "iterated
    817 dominance frontier" computation anywhere in sight.
    818 
    819 Full Code Listing
    820 =================
    821 
    822 Here is the complete code listing for our running example, enhanced with
    823 mutable variables and var/in support. To build this example, use:
    824 
    825 .. code-block:: bash
    826 
    827     # Compile
    828     ocamlbuild toy.byte
    829     # Run
    830     ./toy.byte
    831 
    832 Here is the code:
    833 
    834 \_tags:
    835     ::
    836 
    837         <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    838         <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    839         <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    840         <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    841 
    842 myocamlbuild.ml:
    843     .. code-block:: ocaml
    844 
    845         open Ocamlbuild_plugin;;
    846 
    847         ocaml_lib ~extern:true "llvm";;
    848         ocaml_lib ~extern:true "llvm_analysis";;
    849         ocaml_lib ~extern:true "llvm_executionengine";;
    850         ocaml_lib ~extern:true "llvm_target";;
    851         ocaml_lib ~extern:true "llvm_scalar_opts";;
    852 
    853         flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
    854         dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    855 
    856 token.ml:
    857     .. code-block:: ocaml
    858 
    859         (*===----------------------------------------------------------------------===
    860          * Lexer Tokens
    861          *===----------------------------------------------------------------------===*)
    862 
    863         (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    864          * these others for known things. *)
    865         type token =
    866           (* commands *)
    867           | Def | Extern
    868 
    869           (* primary *)
    870           | Ident of string | Number of float
    871 
    872           (* unknown *)
    873           | Kwd of char
    874 
    875           (* control *)
    876           | If | Then | Else
    877           | For | In
    878 
    879           (* operators *)
    880           | Binary | Unary
    881 
    882           (* var definition *)
    883           | Var
    884 
    885 lexer.ml:
    886     .. code-block:: ocaml
    887 
    888         (*===----------------------------------------------------------------------===
    889          * Lexer
    890          *===----------------------------------------------------------------------===*)
    891 
    892         let rec lex = parser
    893           (* Skip any whitespace. *)
    894           | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    895 
    896           (* identifier: [a-zA-Z][a-zA-Z0-9] *)
    897           | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    898               let buffer = Buffer.create 1 in
    899               Buffer.add_char buffer c;
    900               lex_ident buffer stream
    901 
    902           (* number: [0-9.]+ *)
    903           | [< ' ('0' .. '9' as c); stream >] ->
    904               let buffer = Buffer.create 1 in
    905               Buffer.add_char buffer c;
    906               lex_number buffer stream
    907 
    908           (* Comment until end of line. *)
    909           | [< ' ('#'); stream >] ->
    910               lex_comment stream
    911 
    912           (* Otherwise, just return the character as its ascii value. *)
    913           | [< 'c; stream >] ->
    914               [< 'Token.Kwd c; lex stream >]
    915 
    916           (* end of stream. *)
    917           | [< >] -> [< >]
    918 
    919         and lex_number buffer = parser
    920           | [< ' ('0' .. '9' | '.' as c); stream >] ->
    921               Buffer.add_char buffer c;
    922               lex_number buffer stream
    923           | [< stream=lex >] ->
    924               [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    925 
    926         and lex_ident buffer = parser
    927           | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    928               Buffer.add_char buffer c;
    929               lex_ident buffer stream
    930           | [< stream=lex >] ->
    931               match Buffer.contents buffer with
    932               | "def" -> [< 'Token.Def; stream >]
    933               | "extern" -> [< 'Token.Extern; stream >]
    934               | "if" -> [< 'Token.If; stream >]
    935               | "then" -> [< 'Token.Then; stream >]
    936               | "else" -> [< 'Token.Else; stream >]
    937               | "for" -> [< 'Token.For; stream >]
    938               | "in" -> [< 'Token.In; stream >]
    939               | "binary" -> [< 'Token.Binary; stream >]
    940               | "unary" -> [< 'Token.Unary; stream >]
    941               | "var" -> [< 'Token.Var; stream >]
    942               | id -> [< 'Token.Ident id; stream >]
    943 
    944         and lex_comment = parser
    945           | [< ' ('\n'); stream=lex >] -> stream
    946           | [< 'c; e=lex_comment >] -> e
    947           | [< >] -> [< >]
    948 
    949 ast.ml:
    950     .. code-block:: ocaml
    951 
    952         (*===----------------------------------------------------------------------===
    953          * Abstract Syntax Tree (aka Parse Tree)
    954          *===----------------------------------------------------------------------===*)
    955 
    956         (* expr - Base type for all expression nodes. *)
    957         type expr =
    958           (* variant for numeric literals like "1.0". *)
    959           | Number of float
    960 
    961           (* variant for referencing a variable, like "a". *)
    962           | Variable of string
    963 
    964           (* variant for a unary operator. *)
    965           | Unary of char * expr
    966 
    967           (* variant for a binary operator. *)
    968           | Binary of char * expr * expr
    969 
    970           (* variant for function calls. *)
    971           | Call of string * expr array
    972 
    973           (* variant for if/then/else. *)
    974           | If of expr * expr * expr
    975 
    976           (* variant for for/in. *)
    977           | For of string * expr * expr * expr option * expr
    978 
    979           (* variant for var/in. *)
    980           | Var of (string * expr option) array * expr
    981 
    982         (* proto - This type represents the "prototype" for a function, which captures
    983          * its name, and its argument names (thus implicitly the number of arguments the
    984          * function takes). *)
    985         type proto =
    986           | Prototype of string * string array
    987           | BinOpPrototype of string * string array * int
    988 
    989         (* func - This type represents a function definition itself. *)
    990         type func = Function of proto * expr
    991 
    992 parser.ml:
    993     .. code-block:: ocaml
    994 
    995         (*===---------------------------------------------------------------------===
    996          * Parser
    997          *===---------------------------------------------------------------------===*)
    998 
    999         (* binop_precedence - This holds the precedence for each binary operator that is
   1000          * defined *)
   1001         let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
   1002 
   1003         (* precedence - Get the precedence of the pending binary operator token. *)
   1004         let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
   1005 
   1006         (* primary
   1007          *   ::= identifier
   1008          *   ::= numberexpr
   1009          *   ::= parenexpr
   1010          *   ::= ifexpr
   1011          *   ::= forexpr
   1012          *   ::= varexpr *)
   1013         let rec parse_primary = parser
   1014           (* numberexpr ::= number *)
   1015           | [< 'Token.Number n >] -> Ast.Number n
   1016 
   1017           (* parenexpr ::= '(' expression ')' *)
   1018           | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
   1019 
   1020           (* identifierexpr
   1021            *   ::= identifier
   1022            *   ::= identifier '(' argumentexpr ')' *)
   1023           | [< 'Token.Ident id; stream >] ->
   1024               let rec parse_args accumulator = parser
   1025                 | [< e=parse_expr; stream >] ->
   1026                     begin parser
   1027                       | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
   1028                       | [< >] -> e :: accumulator
   1029                     end stream
   1030                 | [< >] -> accumulator
   1031               in
   1032               let rec parse_ident id = parser
   1033                 (* Call. *)
   1034                 | [< 'Token.Kwd '(';
   1035                      args=parse_args [];
   1036                      'Token.Kwd ')' ?? "expected ')'">] ->
   1037                     Ast.Call (id, Array.of_list (List.rev args))
   1038 
   1039                 (* Simple variable ref. *)
   1040                 | [< >] -> Ast.Variable id
   1041               in
   1042               parse_ident id stream
   1043 
   1044           (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
   1045           | [< 'Token.If; c=parse_expr;
   1046                'Token.Then ?? "expected 'then'"; t=parse_expr;
   1047                'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
   1048               Ast.If (c, t, e)
   1049 
   1050           (* forexpr
   1051                 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
   1052           | [< 'Token.For;
   1053                'Token.Ident id ?? "expected identifier after for";
   1054                'Token.Kwd '=' ?? "expected '=' after for";
   1055                stream >] ->
   1056               begin parser
   1057                 | [<
   1058                      start=parse_expr;
   1059                      'Token.Kwd ',' ?? "expected ',' after for";
   1060                      end_=parse_expr;
   1061                      stream >] ->
   1062                     let step =
   1063                       begin parser
   1064                       | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
   1065                       | [< >] -> None
   1066                       end stream
   1067                     in
   1068                     begin parser
   1069                     | [< 'Token.In; body=parse_expr >] ->
   1070                         Ast.For (id, start, end_, step, body)
   1071                     | [< >] ->
   1072                         raise (Stream.Error "expected 'in' after for")
   1073                     end stream
   1074                 | [< >] ->
   1075                     raise (Stream.Error "expected '=' after for")
   1076               end stream
   1077 
   1078           (* varexpr
   1079            *   ::= 'var' identifier ('=' expression?
   1080            *             (',' identifier ('=' expression)?)* 'in' expression *)
   1081           | [< 'Token.Var;
   1082                (* At least one variable name is required. *)
   1083                'Token.Ident id ?? "expected identifier after var";
   1084                init=parse_var_init;
   1085                var_names=parse_var_names [(id, init)];
   1086                (* At this point, we have to have 'in'. *)
   1087                'Token.In ?? "expected 'in' keyword after 'var'";
   1088                body=parse_expr >] ->
   1089               Ast.Var (Array.of_list (List.rev var_names), body)
   1090 
   1091           | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
   1092 
   1093         (* unary
   1094          *   ::= primary
   1095          *   ::= '!' unary *)
   1096         and parse_unary = parser
   1097           (* If this is a unary operator, read it. *)
   1098           | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
   1099               Ast.Unary (op, operand)
   1100 
   1101           (* If the current token is not an operator, it must be a primary expr. *)
   1102           | [< stream >] -> parse_primary stream
   1103 
   1104         (* binoprhs
   1105          *   ::= ('+' primary)* *)
   1106         and parse_bin_rhs expr_prec lhs stream =
   1107           match Stream.peek stream with
   1108           (* If this is a binop, find its precedence. *)
   1109           | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
   1110               let token_prec = precedence c in
   1111 
   1112               (* If this is a binop that binds at least as tightly as the current binop,
   1113                * consume it, otherwise we are done. *)
   1114               if token_prec < expr_prec then lhs else begin
   1115                 (* Eat the binop. *)
   1116                 Stream.junk stream;
   1117 
   1118                 (* Parse the primary expression after the binary operator. *)
   1119                 let rhs = parse_unary stream in
   1120 
   1121                 (* Okay, we know this is a binop. *)
   1122                 let rhs =
   1123                   match Stream.peek stream with
   1124                   | Some (Token.Kwd c2) ->
   1125                       (* If BinOp binds less tightly with rhs than the operator after
   1126                        * rhs, let the pending operator take rhs as its lhs. *)
   1127                       let next_prec = precedence c2 in
   1128                       if token_prec < next_prec
   1129                       then parse_bin_rhs (token_prec + 1) rhs stream
   1130                       else rhs
   1131                   | _ -> rhs
   1132                 in
   1133 
   1134                 (* Merge lhs/rhs. *)
   1135                 let lhs = Ast.Binary (c, lhs, rhs) in
   1136                 parse_bin_rhs expr_prec lhs stream
   1137               end
   1138           | _ -> lhs
   1139 
   1140         and parse_var_init = parser
   1141           (* read in the optional initializer. *)
   1142           | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
   1143           | [< >] -> None
   1144 
   1145         and parse_var_names accumulator = parser
   1146           | [< 'Token.Kwd ',';
   1147                'Token.Ident id ?? "expected identifier list after var";
   1148                init=parse_var_init;
   1149                e=parse_var_names ((id, init) :: accumulator) >] -> e
   1150           | [< >] -> accumulator
   1151 
   1152         (* expression
   1153          *   ::= primary binoprhs *)
   1154         and parse_expr = parser
   1155           | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
   1156 
   1157         (* prototype
   1158          *   ::= id '(' id* ')'
   1159          *   ::= binary LETTER number? (id, id)
   1160          *   ::= unary LETTER number? (id) *)
   1161         let parse_prototype =
   1162           let rec parse_args accumulator = parser
   1163             | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
   1164             | [< >] -> accumulator
   1165           in
   1166           let parse_operator = parser
   1167             | [< 'Token.Unary >] -> "unary", 1
   1168             | [< 'Token.Binary >] -> "binary", 2
   1169           in
   1170           let parse_binary_precedence = parser
   1171             | [< 'Token.Number n >] -> int_of_float n
   1172             | [< >] -> 30
   1173           in
   1174           parser
   1175           | [< 'Token.Ident id;
   1176                'Token.Kwd '(' ?? "expected '(' in prototype";
   1177                args=parse_args [];
   1178                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
   1179               (* success. *)
   1180               Ast.Prototype (id, Array.of_list (List.rev args))
   1181           | [< (prefix, kind)=parse_operator;
   1182                'Token.Kwd op ?? "expected an operator";
   1183                (* Read the precedence if present. *)
   1184                binary_precedence=parse_binary_precedence;
   1185                'Token.Kwd '(' ?? "expected '(' in prototype";
   1186                 args=parse_args [];
   1187                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
   1188               let name = prefix ^ (String.make 1 op) in
   1189               let args = Array.of_list (List.rev args) in
   1190 
   1191               (* Verify right number of arguments for operator. *)
   1192               if Array.length args != kind
   1193               then raise (Stream.Error "invalid number of operands for operator")
   1194               else
   1195                 if kind == 1 then
   1196                   Ast.Prototype (name, args)
   1197                 else
   1198                   Ast.BinOpPrototype (name, args, binary_precedence)
   1199           | [< >] ->
   1200               raise (Stream.Error "expected function name in prototype")
   1201 
   1202         (* definition ::= 'def' prototype expression *)
   1203         let parse_definition = parser
   1204           | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
   1205               Ast.Function (p, e)
   1206 
   1207         (* toplevelexpr ::= expression *)
   1208         let parse_toplevel = parser
   1209           | [< e=parse_expr >] ->
   1210               (* Make an anonymous proto. *)
   1211               Ast.Function (Ast.Prototype ("", [||]), e)
   1212 
   1213         (*  external ::= 'extern' prototype *)
   1214         let parse_extern = parser
   1215           | [< 'Token.Extern; e=parse_prototype >] -> e
   1216 
   1217 codegen.ml:
   1218     .. code-block:: ocaml
   1219 
   1220         (*===----------------------------------------------------------------------===
   1221          * Code Generation
   1222          *===----------------------------------------------------------------------===*)
   1223 
   1224         open Llvm
   1225 
   1226         exception Error of string
   1227 
   1228         let context = global_context ()
   1229         let the_module = create_module context "my cool jit"
   1230         let builder = builder context
   1231         let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
   1232         let double_type = double_type context
   1233 
   1234         (* Create an alloca instruction in the entry block of the function. This
   1235          * is used for mutable variables etc. *)
   1236         let create_entry_block_alloca the_function var_name =
   1237           let builder = builder_at context (instr_begin (entry_block the_function)) in
   1238           build_alloca double_type var_name builder
   1239 
   1240         let rec codegen_expr = function
   1241           | Ast.Number n -> const_float double_type n
   1242           | Ast.Variable name ->
   1243               let v = try Hashtbl.find named_values name with
   1244                 | Not_found -> raise (Error "unknown variable name")
   1245               in
   1246               (* Load the value. *)
   1247               build_load v name builder
   1248           | Ast.Unary (op, operand) ->
   1249               let operand = codegen_expr operand in
   1250               let callee = "unary" ^ (String.make 1 op) in
   1251               let callee =
   1252                 match lookup_function callee the_module with
   1253                 | Some callee -> callee
   1254                 | None -> raise (Error "unknown unary operator")
   1255               in
   1256               build_call callee [|operand|] "unop" builder
   1257           | Ast.Binary (op, lhs, rhs) ->
   1258               begin match op with
   1259               | '=' ->
   1260                   (* Special case '=' because we don't want to emit the LHS as an
   1261                    * expression. *)
   1262                   let name =
   1263                     match lhs with
   1264                     | Ast.Variable name -> name
   1265                     | _ -> raise (Error "destination of '=' must be a variable")
   1266                   in
   1267 
   1268                   (* Codegen the rhs. *)
   1269                   let val_ = codegen_expr rhs in
   1270 
   1271                   (* Lookup the name. *)
   1272                   let variable = try Hashtbl.find named_values name with
   1273                   | Not_found -> raise (Error "unknown variable name")
   1274                   in
   1275                   ignore(build_store val_ variable builder);
   1276                   val_
   1277               | _ ->
   1278                   let lhs_val = codegen_expr lhs in
   1279                   let rhs_val = codegen_expr rhs in
   1280                   begin
   1281                     match op with
   1282                     | '+' -> build_add lhs_val rhs_val "addtmp" builder
   1283                     | '-' -> build_sub lhs_val rhs_val "subtmp" builder
   1284                     | '*' -> build_mul lhs_val rhs_val "multmp" builder
   1285                     | '<' ->
   1286                         (* Convert bool 0/1 to double 0.0 or 1.0 *)
   1287                         let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
   1288                         build_uitofp i double_type "booltmp" builder
   1289                     | _ ->
   1290                         (* If it wasn't a builtin binary operator, it must be a user defined
   1291                          * one. Emit a call to it. *)
   1292                         let callee = "binary" ^ (String.make 1 op) in
   1293                         let callee =
   1294                           match lookup_function callee the_module with
   1295                           | Some callee -> callee
   1296                           | None -> raise (Error "binary operator not found!")
   1297                         in
   1298                         build_call callee [|lhs_val; rhs_val|] "binop" builder
   1299                   end
   1300               end
   1301           | Ast.Call (callee, args) ->
   1302               (* Look up the name in the module table. *)
   1303               let callee =
   1304                 match lookup_function callee the_module with
   1305                 | Some callee -> callee
   1306                 | None -> raise (Error "unknown function referenced")
   1307               in
   1308               let params = params callee in
   1309 
   1310               (* If argument mismatch error. *)
   1311               if Array.length params == Array.length args then () else
   1312                 raise (Error "incorrect # arguments passed");
   1313               let args = Array.map codegen_expr args in
   1314               build_call callee args "calltmp" builder
   1315           | Ast.If (cond, then_, else_) ->
   1316               let cond = codegen_expr cond in
   1317 
   1318               (* Convert condition to a bool by comparing equal to 0.0 *)
   1319               let zero = const_float double_type 0.0 in
   1320               let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
   1321 
   1322               (* Grab the first block so that we might later add the conditional branch
   1323                * to it at the end of the function. *)
   1324               let start_bb = insertion_block builder in
   1325               let the_function = block_parent start_bb in
   1326 
   1327               let then_bb = append_block context "then" the_function in
   1328 
   1329               (* Emit 'then' value. *)
   1330               position_at_end then_bb builder;
   1331               let then_val = codegen_expr then_ in
   1332 
   1333               (* Codegen of 'then' can change the current block, update then_bb for the
   1334                * phi. We create a new name because one is used for the phi node, and the
   1335                * other is used for the conditional branch. *)
   1336               let new_then_bb = insertion_block builder in
   1337 
   1338               (* Emit 'else' value. *)
   1339               let else_bb = append_block context "else" the_function in
   1340               position_at_end else_bb builder;
   1341               let else_val = codegen_expr else_ in
   1342 
   1343               (* Codegen of 'else' can change the current block, update else_bb for the
   1344                * phi. *)
   1345               let new_else_bb = insertion_block builder in
   1346 
   1347               (* Emit merge block. *)
   1348               let merge_bb = append_block context "ifcont" the_function in
   1349               position_at_end merge_bb builder;
   1350               let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
   1351               let phi = build_phi incoming "iftmp" builder in
   1352 
   1353               (* Return to the start block to add the conditional branch. *)
   1354               position_at_end start_bb builder;
   1355               ignore (build_cond_br cond_val then_bb else_bb builder);
   1356 
   1357               (* Set a unconditional branch at the end of the 'then' block and the
   1358                * 'else' block to the 'merge' block. *)
   1359               position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
   1360               position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
   1361 
   1362               (* Finally, set the builder to the end of the merge block. *)
   1363               position_at_end merge_bb builder;
   1364 
   1365               phi
   1366           | Ast.For (var_name, start, end_, step, body) ->
   1367               (* Output this as:
   1368                *   var = alloca double
   1369                *   ...
   1370                *   start = startexpr
   1371                *   store start -> var
   1372                *   goto loop
   1373                * loop:
   1374                *   ...
   1375                *   bodyexpr
   1376                *   ...
   1377                * loopend:
   1378                *   step = stepexpr
   1379                *   endcond = endexpr
   1380                *
   1381                *   curvar = load var
   1382                *   nextvar = curvar + step
   1383                *   store nextvar -> var
   1384                *   br endcond, loop, endloop
   1385                * outloop: *)
   1386 
   1387               let the_function = block_parent (insertion_block builder) in
   1388 
   1389               (* Create an alloca for the variable in the entry block. *)
   1390               let alloca = create_entry_block_alloca the_function var_name in
   1391 
   1392               (* Emit the start code first, without 'variable' in scope. *)
   1393               let start_val = codegen_expr start in
   1394 
   1395               (* Store the value into the alloca. *)
   1396               ignore(build_store start_val alloca builder);
   1397 
   1398               (* Make the new basic block for the loop header, inserting after current
   1399                * block. *)
   1400               let loop_bb = append_block context "loop" the_function in
   1401 
   1402               (* Insert an explicit fall through from the current block to the
   1403                * loop_bb. *)
   1404               ignore (build_br loop_bb builder);
   1405 
   1406               (* Start insertion in loop_bb. *)
   1407               position_at_end loop_bb builder;
   1408 
   1409               (* Within the loop, the variable is defined equal to the PHI node. If it
   1410                * shadows an existing variable, we have to restore it, so save it
   1411                * now. *)
   1412               let old_val =
   1413                 try Some (Hashtbl.find named_values var_name) with Not_found -> None
   1414               in
   1415               Hashtbl.add named_values var_name alloca;
   1416 
   1417               (* Emit the body of the loop.  This, like any other expr, can change the
   1418                * current BB.  Note that we ignore the value computed by the body, but
   1419                * don't allow an error *)
   1420               ignore (codegen_expr body);
   1421 
   1422               (* Emit the step value. *)
   1423               let step_val =
   1424                 match step with
   1425                 | Some step -> codegen_expr step
   1426                 (* If not specified, use 1.0. *)
   1427                 | None -> const_float double_type 1.0
   1428               in
   1429 
   1430               (* Compute the end condition. *)
   1431               let end_cond = codegen_expr end_ in
   1432 
   1433               (* Reload, increment, and restore the alloca. This handles the case where
   1434                * the body of the loop mutates the variable. *)
   1435               let cur_var = build_load alloca var_name builder in
   1436               let next_var = build_add cur_var step_val "nextvar" builder in
   1437               ignore(build_store next_var alloca builder);
   1438 
   1439               (* Convert condition to a bool by comparing equal to 0.0. *)
   1440               let zero = const_float double_type 0.0 in
   1441               let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
   1442 
   1443               (* Create the "after loop" block and insert it. *)
   1444               let after_bb = append_block context "afterloop" the_function in
   1445 
   1446               (* Insert the conditional branch into the end of loop_end_bb. *)
   1447               ignore (build_cond_br end_cond loop_bb after_bb builder);
   1448 
   1449               (* Any new code will be inserted in after_bb. *)
   1450               position_at_end after_bb builder;
   1451 
   1452               (* Restore the unshadowed variable. *)
   1453               begin match old_val with
   1454               | Some old_val -> Hashtbl.add named_values var_name old_val
   1455               | None -> ()
   1456               end;
   1457 
   1458               (* for expr always returns 0.0. *)
   1459               const_null double_type
   1460           | Ast.Var (var_names, body) ->
   1461               let old_bindings = ref [] in
   1462 
   1463               let the_function = block_parent (insertion_block builder) in
   1464 
   1465               (* Register all variables and emit their initializer. *)
   1466               Array.iter (fun (var_name, init) ->
   1467                 (* Emit the initializer before adding the variable to scope, this
   1468                  * prevents the initializer from referencing the variable itself, and
   1469                  * permits stuff like this:
   1470                  *   var a = 1 in
   1471                  *     var a = a in ...   # refers to outer 'a'. *)
   1472                 let init_val =
   1473                   match init with
   1474                   | Some init -> codegen_expr init
   1475                   (* If not specified, use 0.0. *)
   1476                   | None -> const_float double_type 0.0
   1477                 in
   1478 
   1479                 let alloca = create_entry_block_alloca the_function var_name in
   1480                 ignore(build_store init_val alloca builder);
   1481 
   1482                 (* Remember the old variable binding so that we can restore the binding
   1483                  * when we unrecurse. *)
   1484                 begin
   1485                   try
   1486                     let old_value = Hashtbl.find named_values var_name in
   1487                     old_bindings := (var_name, old_value) :: !old_bindings;
   1488                   with Not_found -> ()
   1489                 end;
   1490 
   1491                 (* Remember this binding. *)
   1492                 Hashtbl.add named_values var_name alloca;
   1493               ) var_names;
   1494 
   1495               (* Codegen the body, now that all vars are in scope. *)
   1496               let body_val = codegen_expr body in
   1497 
   1498               (* Pop all our variables from scope. *)
   1499               List.iter (fun (var_name, old_value) ->
   1500                 Hashtbl.add named_values var_name old_value
   1501               ) !old_bindings;
   1502 
   1503               (* Return the body computation. *)
   1504               body_val
   1505 
   1506         let codegen_proto = function
   1507           | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
   1508               (* Make the function type: double(double,double) etc. *)
   1509               let doubles = Array.make (Array.length args) double_type in
   1510               let ft = function_type double_type doubles in
   1511               let f =
   1512                 match lookup_function name the_module with
   1513                 | None -> declare_function name ft the_module
   1514 
   1515                 (* If 'f' conflicted, there was already something named 'name'. If it
   1516                  * has a body, don't allow redefinition or reextern. *)
   1517                 | Some f ->
   1518                     (* If 'f' already has a body, reject this. *)
   1519                     if block_begin f <> At_end f then
   1520                       raise (Error "redefinition of function");
   1521 
   1522                     (* If 'f' took a different number of arguments, reject. *)
   1523                     if element_type (type_of f) <> ft then
   1524                       raise (Error "redefinition of function with different # args");
   1525                     f
   1526               in
   1527 
   1528               (* Set names for all arguments. *)
   1529               Array.iteri (fun i a ->
   1530                 let n = args.(i) in
   1531                 set_value_name n a;
   1532                 Hashtbl.add named_values n a;
   1533               ) (params f);
   1534               f
   1535 
   1536         (* Create an alloca for each argument and register the argument in the symbol
   1537          * table so that references to it will succeed. *)
   1538         let create_argument_allocas the_function proto =
   1539           let args = match proto with
   1540             | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
   1541           in
   1542           Array.iteri (fun i ai ->
   1543             let var_name = args.(i) in
   1544             (* Create an alloca for this variable. *)
   1545             let alloca = create_entry_block_alloca the_function var_name in
   1546 
   1547             (* Store the initial value into the alloca. *)
   1548             ignore(build_store ai alloca builder);
   1549 
   1550             (* Add arguments to variable symbol table. *)
   1551             Hashtbl.add named_values var_name alloca;
   1552           ) (params the_function)
   1553 
   1554         let codegen_func the_fpm = function
   1555           | Ast.Function (proto, body) ->
   1556               Hashtbl.clear named_values;
   1557               let the_function = codegen_proto proto in
   1558 
   1559               (* If this is an operator, install it. *)
   1560               begin match proto with
   1561               | Ast.BinOpPrototype (name, args, prec) ->
   1562                   let op = name.[String.length name - 1] in
   1563                   Hashtbl.add Parser.binop_precedence op prec;
   1564               | _ -> ()
   1565               end;
   1566 
   1567               (* Create a new basic block to start insertion into. *)
   1568               let bb = append_block context "entry" the_function in
   1569               position_at_end bb builder;
   1570 
   1571               try
   1572                 (* Add all arguments to the symbol table and create their allocas. *)
   1573                 create_argument_allocas the_function proto;
   1574 
   1575                 let ret_val = codegen_expr body in
   1576 
   1577                 (* Finish off the function. *)
   1578                 let _ = build_ret ret_val builder in
   1579 
   1580                 (* Validate the generated code, checking for consistency. *)
   1581                 Llvm_analysis.assert_valid_function the_function;
   1582 
   1583                 (* Optimize the function. *)
   1584                 let _ = PassManager.run_function the_function the_fpm in
   1585 
   1586                 the_function
   1587               with e ->
   1588                 delete_function the_function;
   1589                 raise e
   1590 
   1591 toplevel.ml:
   1592     .. code-block:: ocaml
   1593 
   1594         (*===----------------------------------------------------------------------===
   1595          * Top-Level parsing and JIT Driver
   1596          *===----------------------------------------------------------------------===*)
   1597 
   1598         open Llvm
   1599         open Llvm_executionengine
   1600 
   1601         (* top ::= definition | external | expression | ';' *)
   1602         let rec main_loop the_fpm the_execution_engine stream =
   1603           match Stream.peek stream with
   1604           | None -> ()
   1605 
   1606           (* ignore top-level semicolons. *)
   1607           | Some (Token.Kwd ';') ->
   1608               Stream.junk stream;
   1609               main_loop the_fpm the_execution_engine stream
   1610 
   1611           | Some token ->
   1612               begin
   1613                 try match token with
   1614                 | Token.Def ->
   1615                     let e = Parser.parse_definition stream in
   1616                     print_endline "parsed a function definition.";
   1617                     dump_value (Codegen.codegen_func the_fpm e);
   1618                 | Token.Extern ->
   1619                     let e = Parser.parse_extern stream in
   1620                     print_endline "parsed an extern.";
   1621                     dump_value (Codegen.codegen_proto e);
   1622                 | _ ->
   1623                     (* Evaluate a top-level expression into an anonymous function. *)
   1624                     let e = Parser.parse_toplevel stream in
   1625                     print_endline "parsed a top-level expr";
   1626                     let the_function = Codegen.codegen_func the_fpm e in
   1627                     dump_value the_function;
   1628 
   1629                     (* JIT the function, returning a function pointer. *)
   1630                     let result = ExecutionEngine.run_function the_function [||]
   1631                       the_execution_engine in
   1632 
   1633                     print_string "Evaluated to ";
   1634                     print_float (GenericValue.as_float Codegen.double_type result);
   1635                     print_newline ();
   1636                 with Stream.Error s | Codegen.Error s ->
   1637                   (* Skip token for error recovery. *)
   1638                   Stream.junk stream;
   1639                   print_endline s;
   1640               end;
   1641               print_string "ready> "; flush stdout;
   1642               main_loop the_fpm the_execution_engine stream
   1643 
   1644 toy.ml:
   1645     .. code-block:: ocaml
   1646 
   1647         (*===----------------------------------------------------------------------===
   1648          * Main driver code.
   1649          *===----------------------------------------------------------------------===*)
   1650 
   1651         open Llvm
   1652         open Llvm_executionengine
   1653         open Llvm_target
   1654         open Llvm_scalar_opts
   1655 
   1656         let main () =
   1657           ignore (initialize_native_target ());
   1658 
   1659           (* Install standard binary operators.
   1660            * 1 is the lowest precedence. *)
   1661           Hashtbl.add Parser.binop_precedence '=' 2;
   1662           Hashtbl.add Parser.binop_precedence '<' 10;
   1663           Hashtbl.add Parser.binop_precedence '+' 20;
   1664           Hashtbl.add Parser.binop_precedence '-' 20;
   1665           Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
   1666 
   1667           (* Prime the first token. *)
   1668           print_string "ready> "; flush stdout;
   1669           let stream = Lexer.lex (Stream.of_channel stdin) in
   1670 
   1671           (* Create the JIT. *)
   1672           let the_execution_engine = ExecutionEngine.create Codegen.the_module in
   1673           let the_fpm = PassManager.create_function Codegen.the_module in
   1674 
   1675           (* Set up the optimizer pipeline.  Start with registering info about how the
   1676            * target lays out data structures. *)
   1677           DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
   1678 
   1679           (* Promote allocas to registers. *)
   1680           add_memory_to_register_promotion the_fpm;
   1681 
   1682           (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
   1683           add_instruction_combination the_fpm;
   1684 
   1685           (* reassociate expressions. *)
   1686           add_reassociation the_fpm;
   1687 
   1688           (* Eliminate Common SubExpressions. *)
   1689           add_gvn the_fpm;
   1690 
   1691           (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
   1692           add_cfg_simplification the_fpm;
   1693 
   1694           ignore (PassManager.initialize the_fpm);
   1695 
   1696           (* Run the main "interpreter loop" now. *)
   1697           Toplevel.main_loop the_fpm the_execution_engine stream;
   1698 
   1699           (* Print out all the generated code. *)
   1700           dump_module Codegen.the_module
   1701         ;;
   1702 
   1703         main ()
   1704 
   1705 bindings.c
   1706     .. code-block:: c
   1707 
   1708         #include <stdio.h>
   1709 
   1710         /* putchard - putchar that takes a double and returns 0. */
   1711         extern double putchard(double X) {
   1712           putchar((char)X);
   1713           return 0;
   1714         }
   1715 
   1716         /* printd - printf that takes a double prints it as "%f\n", returning 0. */
   1717         extern double printd(double X) {
   1718           printf("%f\n", X);
   1719           return 0;
   1720         }
   1721 
   1722 `Next: Conclusion and other useful LLVM tidbits <OCamlLangImpl8.html>`_
   1723 
   1724