Home | History | Annotate | Download | only in tutorial
      1 =======================================================
      2 Kaleidoscope: Extending the Language: Mutable Variables
      3 =======================================================
      4 
      5 .. contents::
      6    :local:
      7 
      8 Chapter 7 Introduction
      9 ======================
     10 
     11 Welcome to Chapter 7 of the "`Implementing a language with
     12 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
     13 very respectable, albeit simple, `functional programming
     14 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
     15 journey, we learned some parsing techniques, how to build and represent
     16 an AST, how to build LLVM IR, and how to optimize the resultant code as
     17 well as JIT compile it.
     18 
     19 While Kaleidoscope is interesting as a functional language, the fact
     20 that it is functional makes it "too easy" to generate LLVM IR for it. In
     21 particular, a functional language makes it very easy to build LLVM IR
     22 directly in `SSA
     23 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     24 Since LLVM requires that the input code be in SSA form, this is a very
     25 nice property and it is often unclear to newcomers how to generate code
     26 for an imperative language with mutable variables.
     27 
     28 The short (and happy) summary of this chapter is that there is no need
     29 for your front-end to build SSA form: LLVM provides highly tuned and
     30 well tested support for this, though the way it works is a bit
     31 unexpected for some.
     32 
     33 Why is this a hard problem?
     34 ===========================
     35 
     36 To understand why mutable variables cause complexities in SSA
     37 construction, consider this extremely simple C example:
     38 
     39 .. code-block:: c
     40 
     41     int G, H;
     42     int test(_Bool Condition) {
     43       int X;
     44       if (Condition)
     45         X = G;
     46       else
     47         X = H;
     48       return X;
     49     }
     50 
     51 In this case, we have the variable "X", whose value depends on the path
     52 executed in the program. Because there are two different possible values
     53 for X before the return instruction, a PHI node is inserted to merge the
     54 two values. The LLVM IR that we want for this example looks like this:
     55 
     56 .. code-block:: llvm
     57 
     58     @G = weak global i32 0   ; type of @G is i32*
     59     @H = weak global i32 0   ; type of @H is i32*
     60 
     61     define i32 @test(i1 %Condition) {
     62     entry:
     63       br i1 %Condition, label %cond_true, label %cond_false
     64 
     65     cond_true:
     66       %X.0 = load i32* @G
     67       br label %cond_next
     68 
     69     cond_false:
     70       %X.1 = load i32* @H
     71       br label %cond_next
     72 
     73     cond_next:
     74       %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
     75       ret i32 %X.2
     76     }
     77 
     78 In this example, the loads from the G and H global variables are
     79 explicit in the LLVM IR, and they live in the then/else branches of the
     80 if statement (cond\_true/cond\_false). In order to merge the incoming
     81 values, the X.2 phi node in the cond\_next block selects the right value
     82 to use based on where control flow is coming from: if control flow comes
     83 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
     84 control flow comes from cond\_true, it gets the value of X.0. The intent
     85 of this chapter is not to explain the details of SSA form. For more
     86 information, see one of the many `online
     87 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     88 
     89 The question for this article is "who places the phi nodes when lowering
     90 assignments to mutable variables?". The issue here is that LLVM
     91 *requires* that its IR be in SSA form: there is no "non-ssa" mode for
     92 it. However, SSA construction requires non-trivial algorithms and data
     93 structures, so it is inconvenient and wasteful for every front-end to
     94 have to reproduce this logic.
     95 
     96 Memory in LLVM
     97 ==============
     98 
     99 The 'trick' here is that while LLVM does require all register values to
    100 be in SSA form, it does not require (or permit) memory objects to be in
    101 SSA form. In the example above, note that the loads from G and H are
    102 direct accesses to G and H: they are not renamed or versioned. This
    103 differs from some other compiler systems, which do try to version memory
    104 objects. In LLVM, instead of encoding dataflow analysis of memory into
    105 the LLVM IR, it is handled with `Analysis
    106 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
    107 
    108 With this in mind, the high-level idea is that we want to make a stack
    109 variable (which lives in memory, because it is on the stack) for each
    110 mutable object in a function. To take advantage of this trick, we need
    111 to talk about how LLVM represents stack variables.
    112 
    113 In LLVM, all memory accesses are explicit with load/store instructions,
    114 and it is carefully designed not to have (or need) an "address-of"
    115 operator. Notice how the type of the @G/@H global variables is actually
    116 "i32\*" even though the variable is defined as "i32". What this means is
    117 that @G defines *space* for an i32 in the global data area, but its
    118 *name* actually refers to the address for that space. Stack variables
    119 work the same way, except that instead of being declared with global
    120 variable definitions, they are declared with the `LLVM alloca
    121 instruction <../LangRef.html#i_alloca>`_:
    122 
    123 .. code-block:: llvm
    124 
    125     define i32 @example() {
    126     entry:
    127       %X = alloca i32           ; type of %X is i32*.
    128       ...
    129       %tmp = load i32* %X       ; load the stack value %X from the stack.
    130       %tmp2 = add i32 %tmp, 1   ; increment it
    131       store i32 %tmp2, i32* %X  ; store it back
    132       ...
    133 
    134 This code shows an example of how you can declare and manipulate a stack
    135 variable in the LLVM IR. Stack memory allocated with the alloca
    136 instruction is fully general: you can pass the address of the stack slot
    137 to functions, you can store it in other variables, etc. In our example
    138 above, we could rewrite the example to use the alloca technique to avoid
    139 using a PHI node:
    140 
    141 .. code-block:: llvm
    142 
    143     @G = weak global i32 0   ; type of @G is i32*
    144     @H = weak global i32 0   ; type of @H is i32*
    145 
    146     define i32 @test(i1 %Condition) {
    147     entry:
    148       %X = alloca i32           ; type of %X is i32*.
    149       br i1 %Condition, label %cond_true, label %cond_false
    150 
    151     cond_true:
    152       %X.0 = load i32* @G
    153       store i32 %X.0, i32* %X   ; Update X
    154       br label %cond_next
    155 
    156     cond_false:
    157       %X.1 = load i32* @H
    158       store i32 %X.1, i32* %X   ; Update X
    159       br label %cond_next
    160 
    161     cond_next:
    162       %X.2 = load i32* %X       ; Read X
    163       ret i32 %X.2
    164     }
    165 
    166 With this, we have discovered a way to handle arbitrary mutable
    167 variables without the need to create Phi nodes at all:
    168 
    169 #. Each mutable variable becomes a stack allocation.
    170 #. Each read of the variable becomes a load from the stack.
    171 #. Each update of the variable becomes a store to the stack.
    172 #. Taking the address of a variable just uses the stack address
    173    directly.
    174 
    175 While this solution has solved our immediate problem, it introduced
    176 another one: we have now apparently introduced a lot of stack traffic
    177 for very simple and common operations, a major performance problem.
    178 Fortunately for us, the LLVM optimizer has a highly-tuned optimization
    179 pass named "mem2reg" that handles this case, promoting allocas like this
    180 into SSA registers, inserting Phi nodes as appropriate. If you run this
    181 example through the pass, for example, you'll get:
    182 
    183 .. code-block:: bash
    184 
    185     $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    186     @G = weak global i32 0
    187     @H = weak global i32 0
    188 
    189     define i32 @test(i1 %Condition) {
    190     entry:
    191       br i1 %Condition, label %cond_true, label %cond_false
    192 
    193     cond_true:
    194       %X.0 = load i32* @G
    195       br label %cond_next
    196 
    197     cond_false:
    198       %X.1 = load i32* @H
    199       br label %cond_next
    200 
    201     cond_next:
    202       %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    203       ret i32 %X.01
    204     }
    205 
    206 The mem2reg pass implements the standard "iterated dominance frontier"
    207 algorithm for constructing SSA form and has a number of optimizations
    208 that speed up (very common) degenerate cases. The mem2reg optimization
    209 pass is the answer to dealing with mutable variables, and we highly
    210 recommend that you depend on it. Note that mem2reg only works on
    211 variables in certain circumstances:
    212 
    213 #. mem2reg is alloca-driven: it looks for allocas and if it can handle
    214    them, it promotes them. It does not apply to global variables or heap
    215    allocations.
    216 #. mem2reg only looks for alloca instructions in the entry block of the
    217    function. Being in the entry block guarantees that the alloca is only
    218    executed once, which makes analysis simpler.
    219 #. mem2reg only promotes allocas whose uses are direct loads and stores.
    220    If the address of the stack object is passed to a function, or if any
    221    funny pointer arithmetic is involved, the alloca will not be
    222    promoted.
    223 #. mem2reg only works on allocas of `first
    224    class <../LangRef.html#t_classifications>`_ values (such as pointers,
    225    scalars and vectors), and only if the array size of the allocation is
    226    1 (or missing in the .ll file). mem2reg is not capable of promoting
    227    structs or arrays to registers. Note that the "scalarrepl" pass is
    228    more powerful and can promote structs, "unions", and arrays in many
    229    cases.
    230 
    231 All of these properties are easy to satisfy for most imperative
    232 languages, and we'll illustrate it below with Kaleidoscope. The final
    233 question you may be asking is: should I bother with this nonsense for my
    234 front-end? Wouldn't it be better if I just did SSA construction
    235 directly, avoiding use of the mem2reg optimization pass? In short, we
    236 strongly recommend that you use this technique for building SSA form,
    237 unless there is an extremely good reason not to. Using this technique
    238 is:
    239 
    240 -  Proven and well tested: llvm-gcc and clang both use this technique
    241    for local mutable variables. As such, the most common clients of LLVM
    242    are using this to handle a bulk of their variables. You can be sure
    243    that bugs are found fast and fixed early.
    244 -  Extremely Fast: mem2reg has a number of special cases that make it
    245    fast in common cases as well as fully general. For example, it has
    246    fast-paths for variables that are only used in a single block,
    247    variables that only have one assignment point, good heuristics to
    248    avoid insertion of unneeded phi nodes, etc.
    249 -  Needed for debug info generation: `Debug information in
    250    LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
    251    the variable exposed so that debug info can be attached to it. This
    252    technique dovetails very naturally with this style of debug info.
    253 
    254 If nothing else, this makes it much easier to get your front-end up and
    255 running, and is very simple to implement. Lets extend Kaleidoscope with
    256 mutable variables now!
    257 
    258 Mutable Variables in Kaleidoscope
    259 =================================
    260 
    261 Now that we know the sort of problem we want to tackle, lets see what
    262 this looks like in the context of our little Kaleidoscope language.
    263 We're going to add two features:
    264 
    265 #. The ability to mutate variables with the '=' operator.
    266 #. The ability to define new variables.
    267 
    268 While the first item is really what this is about, we only have
    269 variables for incoming arguments as well as for induction variables, and
    270 redefining those only goes so far :). Also, the ability to define new
    271 variables is a useful thing regardless of whether you will be mutating
    272 them. Here's a motivating example that shows how we could use these:
    273 
    274 ::
    275 
    276     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    277     # and just returns the RHS.
    278     def binary : 1 (x y) y;
    279 
    280     # Recursive fib, we could do this before.
    281     def fib(x)
    282       if (x < 3) then
    283         1
    284       else
    285         fib(x-1)+fib(x-2);
    286 
    287     # Iterative fib.
    288     def fibi(x)
    289       var a = 1, b = 1, c in
    290       (for i = 3, i < x in
    291          c = a + b :
    292          a = b :
    293          b = c) :
    294       b;
    295 
    296     # Call it.
    297     fibi(10);
    298 
    299 In order to mutate variables, we have to change our existing variables
    300 to use the "alloca trick". Once we have that, we'll add our new
    301 operator, then extend Kaleidoscope to support new variable definitions.
    302 
    303 Adjusting Existing Variables for Mutation
    304 =========================================
    305 
    306 The symbol table in Kaleidoscope is managed at code generation time by
    307 the '``NamedValues``' map. This map currently keeps track of the LLVM
    308 "Value\*" that holds the double value for the named variable. In order
    309 to support mutation, we need to change this slightly, so that it
    310 ``NamedValues`` holds the *memory location* of the variable in question.
    311 Note that this change is a refactoring: it changes the structure of the
    312 code, but does not (by itself) change the behavior of the compiler. All
    313 of these changes are isolated in the Kaleidoscope code generator.
    314 
    315 At this point in Kaleidoscope's development, it only supports variables
    316 for two things: incoming arguments to functions and the induction
    317 variable of 'for' loops. For consistency, we'll allow mutation of these
    318 variables in addition to other user-defined variables. This means that
    319 these will both need memory locations.
    320 
    321 To start our transformation of Kaleidoscope, we'll change the
    322 NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
    323 we do this, the C++ compiler will tell us what parts of the code we need
    324 to update:
    325 
    326 .. code-block:: c++
    327 
    328     static std::map<std::string, AllocaInst*> NamedValues;
    329 
    330 Also, since we will need to create these alloca's, we'll use a helper
    331 function that ensures that the allocas are created in the entry block of
    332 the function:
    333 
    334 .. code-block:: c++
    335 
    336     /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    337     /// the function.  This is used for mutable variables etc.
    338     static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
    339                                               const std::string &VarName) {
    340       IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
    341                      TheFunction->getEntryBlock().begin());
    342       return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
    343                                VarName.c_str());
    344     }
    345 
    346 This funny looking code creates an IRBuilder object that is pointing at
    347 the first instruction (.begin()) of the entry block. It then creates an
    348 alloca with the expected name and returns it. Because all values in
    349 Kaleidoscope are doubles, there is no need to pass in a type to use.
    350 
    351 With this in place, the first functionality change we want to make is to
    352 variable references. In our new scheme, variables live on the stack, so
    353 code generating a reference to them actually needs to produce a load
    354 from the stack slot:
    355 
    356 .. code-block:: c++
    357 
    358     Value *VariableExprAST::Codegen() {
    359       // Look this variable up in the function.
    360       Value *V = NamedValues[Name];
    361       if (V == 0) return ErrorV("Unknown variable name");
    362 
    363       // Load the value.
    364       return Builder.CreateLoad(V, Name.c_str());
    365     }
    366 
    367 As you can see, this is pretty straightforward. Now we need to update
    368 the things that define the variables to set up the alloca. We'll start
    369 with ``ForExprAST::Codegen`` (see the `full code listing <#code>`_ for
    370 the unabridged code):
    371 
    372 .. code-block:: c++
    373 
    374       Function *TheFunction = Builder.GetInsertBlock()->getParent();
    375 
    376       // Create an alloca for the variable in the entry block.
    377       AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    378 
    379         // Emit the start code first, without 'variable' in scope.
    380       Value *StartVal = Start->Codegen();
    381       if (StartVal == 0) return 0;
    382 
    383       // Store the value into the alloca.
    384       Builder.CreateStore(StartVal, Alloca);
    385       ...
    386 
    387       // Compute the end condition.
    388       Value *EndCond = End->Codegen();
    389       if (EndCond == 0) return EndCond;
    390 
    391       // Reload, increment, and restore the alloca.  This handles the case where
    392       // the body of the loop mutates the variable.
    393       Value *CurVar = Builder.CreateLoad(Alloca);
    394       Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
    395       Builder.CreateStore(NextVar, Alloca);
    396       ...
    397 
    398 This code is virtually identical to the code `before we allowed mutable
    399 variables <LangImpl5.html#forcodegen>`_. The big difference is that we
    400 no longer have to construct a PHI node, and we use load/store to access
    401 the variable as needed.
    402 
    403 To support mutable argument variables, we need to also make allocas for
    404 them. The code for this is also pretty simple:
    405 
    406 .. code-block:: c++
    407 
    408     /// CreateArgumentAllocas - Create an alloca for each argument and register the
    409     /// argument in the symbol table so that references to it will succeed.
    410     void PrototypeAST::CreateArgumentAllocas(Function *F) {
    411       Function::arg_iterator AI = F->arg_begin();
    412       for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    413         // Create an alloca for this variable.
    414         AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
    415 
    416         // Store the initial value into the alloca.
    417         Builder.CreateStore(AI, Alloca);
    418 
    419         // Add arguments to variable symbol table.
    420         NamedValues[Args[Idx]] = Alloca;
    421       }
    422     }
    423 
    424 For each argument, we make an alloca, store the input value to the
    425 function into the alloca, and register the alloca as the memory location
    426 for the argument. This method gets invoked by ``FunctionAST::Codegen``
    427 right after it sets up the entry block for the function.
    428 
    429 The final missing piece is adding the mem2reg pass, which allows us to
    430 get good codegen once again:
    431 
    432 .. code-block:: c++
    433 
    434         // Set up the optimizer pipeline.  Start with registering info about how the
    435         // target lays out data structures.
    436         OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
    437         // Promote allocas to registers.
    438         OurFPM.add(createPromoteMemoryToRegisterPass());
    439         // Do simple "peephole" optimizations and bit-twiddling optzns.
    440         OurFPM.add(createInstructionCombiningPass());
    441         // Reassociate expressions.
    442         OurFPM.add(createReassociatePass());
    443 
    444 It is interesting to see what the code looks like before and after the
    445 mem2reg optimization runs. For example, this is the before/after code
    446 for our recursive fib function. Before the optimization:
    447 
    448 .. code-block:: llvm
    449 
    450     define double @fib(double %x) {
    451     entry:
    452       %x1 = alloca double
    453       store double %x, double* %x1
    454       %x2 = load double* %x1
    455       %cmptmp = fcmp ult double %x2, 3.000000e+00
    456       %booltmp = uitofp i1 %cmptmp to double
    457       %ifcond = fcmp one double %booltmp, 0.000000e+00
    458       br i1 %ifcond, label %then, label %else
    459 
    460     then:       ; preds = %entry
    461       br label %ifcont
    462 
    463     else:       ; preds = %entry
    464       %x3 = load double* %x1
    465       %subtmp = fsub double %x3, 1.000000e+00
    466       %calltmp = call double @fib(double %subtmp)
    467       %x4 = load double* %x1
    468       %subtmp5 = fsub double %x4, 2.000000e+00
    469       %calltmp6 = call double @fib(double %subtmp5)
    470       %addtmp = fadd double %calltmp, %calltmp6
    471       br label %ifcont
    472 
    473     ifcont:     ; preds = %else, %then
    474       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    475       ret double %iftmp
    476     }
    477 
    478 Here there is only one variable (x, the input argument) but you can
    479 still see the extremely simple-minded code generation strategy we are
    480 using. In the entry block, an alloca is created, and the initial input
    481 value is stored into it. Each reference to the variable does a reload
    482 from the stack. Also, note that we didn't modify the if/then/else
    483 expression, so it still inserts a PHI node. While we could make an
    484 alloca for it, it is actually easier to create a PHI node for it, so we
    485 still just make the PHI.
    486 
    487 Here is the code after the mem2reg pass runs:
    488 
    489 .. code-block:: llvm
    490 
    491     define double @fib(double %x) {
    492     entry:
    493       %cmptmp = fcmp ult double %x, 3.000000e+00
    494       %booltmp = uitofp i1 %cmptmp to double
    495       %ifcond = fcmp one double %booltmp, 0.000000e+00
    496       br i1 %ifcond, label %then, label %else
    497 
    498     then:
    499       br label %ifcont
    500 
    501     else:
    502       %subtmp = fsub double %x, 1.000000e+00
    503       %calltmp = call double @fib(double %subtmp)
    504       %subtmp5 = fsub double %x, 2.000000e+00
    505       %calltmp6 = call double @fib(double %subtmp5)
    506       %addtmp = fadd double %calltmp, %calltmp6
    507       br label %ifcont
    508 
    509     ifcont:     ; preds = %else, %then
    510       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    511       ret double %iftmp
    512     }
    513 
    514 This is a trivial case for mem2reg, since there are no redefinitions of
    515 the variable. The point of showing this is to calm your tension about
    516 inserting such blatent inefficiencies :).
    517 
    518 After the rest of the optimizers run, we get:
    519 
    520 .. code-block:: llvm
    521 
    522     define double @fib(double %x) {
    523     entry:
    524       %cmptmp = fcmp ult double %x, 3.000000e+00
    525       %booltmp = uitofp i1 %cmptmp to double
    526       %ifcond = fcmp ueq double %booltmp, 0.000000e+00
    527       br i1 %ifcond, label %else, label %ifcont
    528 
    529     else:
    530       %subtmp = fsub double %x, 1.000000e+00
    531       %calltmp = call double @fib(double %subtmp)
    532       %subtmp5 = fsub double %x, 2.000000e+00
    533       %calltmp6 = call double @fib(double %subtmp5)
    534       %addtmp = fadd double %calltmp, %calltmp6
    535       ret double %addtmp
    536 
    537     ifcont:
    538       ret double 1.000000e+00
    539     }
    540 
    541 Here we see that the simplifycfg pass decided to clone the return
    542 instruction into the end of the 'else' block. This allowed it to
    543 eliminate some branches and the PHI node.
    544 
    545 Now that all symbol table references are updated to use stack variables,
    546 we'll add the assignment operator.
    547 
    548 New Assignment Operator
    549 =======================
    550 
    551 With our current framework, adding a new assignment operator is really
    552 simple. We will parse it just like any other binary operator, but handle
    553 it internally (instead of allowing the user to define it). The first
    554 step is to set a precedence:
    555 
    556 .. code-block:: c++
    557 
    558      int main() {
    559        // Install standard binary operators.
    560        // 1 is lowest precedence.
    561        BinopPrecedence['='] = 2;
    562        BinopPrecedence['<'] = 10;
    563        BinopPrecedence['+'] = 20;
    564        BinopPrecedence['-'] = 20;
    565 
    566 Now that the parser knows the precedence of the binary operator, it
    567 takes care of all the parsing and AST generation. We just need to
    568 implement codegen for the assignment operator. This looks like:
    569 
    570 .. code-block:: c++
    571 
    572     Value *BinaryExprAST::Codegen() {
    573       // Special case '=' because we don't want to emit the LHS as an expression.
    574       if (Op == '=') {
    575         // Assignment requires the LHS to be an identifier.
    576         VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
    577         if (!LHSE)
    578           return ErrorV("destination of '=' must be a variable");
    579 
    580 Unlike the rest of the binary operators, our assignment operator doesn't
    581 follow the "emit LHS, emit RHS, do computation" model. As such, it is
    582 handled as a special case before the other binary operators are handled.
    583 The other strange thing is that it requires the LHS to be a variable. It
    584 is invalid to have "(x+1) = expr" - only things like "x = expr" are
    585 allowed.
    586 
    587 .. code-block:: c++
    588 
    589         // Codegen the RHS.
    590         Value *Val = RHS->Codegen();
    591         if (Val == 0) return 0;
    592 
    593         // Look up the name.
    594         Value *Variable = NamedValues[LHSE->getName()];
    595         if (Variable == 0) return ErrorV("Unknown variable name");
    596 
    597         Builder.CreateStore(Val, Variable);
    598         return Val;
    599       }
    600       ...
    601 
    602 Once we have the variable, codegen'ing the assignment is
    603 straightforward: we emit the RHS of the assignment, create a store, and
    604 return the computed value. Returning a value allows for chained
    605 assignments like "X = (Y = Z)".
    606 
    607 Now that we have an assignment operator, we can mutate loop variables
    608 and arguments. For example, we can now run code like this:
    609 
    610 ::
    611 
    612     # Function to print a double.
    613     extern printd(x);
    614 
    615     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    616     # and just returns the RHS.
    617     def binary : 1 (x y) y;
    618 
    619     def test(x)
    620       printd(x) :
    621       x = 4 :
    622       printd(x);
    623 
    624     test(123);
    625 
    626 When run, this example prints "123" and then "4", showing that we did
    627 actually mutate the value! Okay, we have now officially implemented our
    628 goal: getting this to work requires SSA construction in the general
    629 case. However, to be really useful, we want the ability to define our
    630 own local variables, lets add this next!
    631 
    632 User-defined Local Variables
    633 ============================
    634 
    635 Adding var/in is just like any other other extensions we made to
    636 Kaleidoscope: we extend the lexer, the parser, the AST and the code
    637 generator. The first step for adding our new 'var/in' construct is to
    638 extend the lexer. As before, this is pretty trivial, the code looks like
    639 this:
    640 
    641 .. code-block:: c++
    642 
    643     enum Token {
    644       ...
    645       // var definition
    646       tok_var = -13
    647     ...
    648     }
    649     ...
    650     static int gettok() {
    651     ...
    652         if (IdentifierStr == "in") return tok_in;
    653         if (IdentifierStr == "binary") return tok_binary;
    654         if (IdentifierStr == "unary") return tok_unary;
    655         if (IdentifierStr == "var") return tok_var;
    656         return tok_identifier;
    657     ...
    658 
    659 The next step is to define the AST node that we will construct. For
    660 var/in, it looks like this:
    661 
    662 .. code-block:: c++
    663 
    664     /// VarExprAST - Expression class for var/in
    665     class VarExprAST : public ExprAST {
    666       std::vector<std::pair<std::string, ExprAST*> > VarNames;
    667       ExprAST *Body;
    668     public:
    669       VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
    670                  ExprAST *body)
    671       : VarNames(varnames), Body(body) {}
    672 
    673       virtual Value *Codegen();
    674     };
    675 
    676 var/in allows a list of names to be defined all at once, and each name
    677 can optionally have an initializer value. As such, we capture this
    678 information in the VarNames vector. Also, var/in has a body, this body
    679 is allowed to access the variables defined by the var/in.
    680 
    681 With this in place, we can define the parser pieces. The first thing we
    682 do is add it as a primary expression:
    683 
    684 .. code-block:: c++
    685 
    686     /// primary
    687     ///   ::= identifierexpr
    688     ///   ::= numberexpr
    689     ///   ::= parenexpr
    690     ///   ::= ifexpr
    691     ///   ::= forexpr
    692     ///   ::= varexpr
    693     static ExprAST *ParsePrimary() {
    694       switch (CurTok) {
    695       default: return Error("unknown token when expecting an expression");
    696       case tok_identifier: return ParseIdentifierExpr();
    697       case tok_number:     return ParseNumberExpr();
    698       case '(':            return ParseParenExpr();
    699       case tok_if:         return ParseIfExpr();
    700       case tok_for:        return ParseForExpr();
    701       case tok_var:        return ParseVarExpr();
    702       }
    703     }
    704 
    705 Next we define ParseVarExpr:
    706 
    707 .. code-block:: c++
    708 
    709     /// varexpr ::= 'var' identifier ('=' expression)?
    710     //                    (',' identifier ('=' expression)?)* 'in' expression
    711     static ExprAST *ParseVarExpr() {
    712       getNextToken();  // eat the var.
    713 
    714       std::vector<std::pair<std::string, ExprAST*> > VarNames;
    715 
    716       // At least one variable name is required.
    717       if (CurTok != tok_identifier)
    718         return Error("expected identifier after var");
    719 
    720 The first part of this code parses the list of identifier/expr pairs
    721 into the local ``VarNames`` vector.
    722 
    723 .. code-block:: c++
    724 
    725       while (1) {
    726         std::string Name = IdentifierStr;
    727         getNextToken();  // eat identifier.
    728 
    729         // Read the optional initializer.
    730         ExprAST *Init = 0;
    731         if (CurTok == '=') {
    732           getNextToken(); // eat the '='.
    733 
    734           Init = ParseExpression();
    735           if (Init == 0) return 0;
    736         }
    737 
    738         VarNames.push_back(std::make_pair(Name, Init));
    739 
    740         // End of var list, exit loop.
    741         if (CurTok != ',') break;
    742         getNextToken(); // eat the ','.
    743 
    744         if (CurTok != tok_identifier)
    745           return Error("expected identifier list after var");
    746       }
    747 
    748 Once all the variables are parsed, we then parse the body and create the
    749 AST node:
    750 
    751 .. code-block:: c++
    752 
    753       // At this point, we have to have 'in'.
    754       if (CurTok != tok_in)
    755         return Error("expected 'in' keyword after 'var'");
    756       getNextToken();  // eat 'in'.
    757 
    758       ExprAST *Body = ParseExpression();
    759       if (Body == 0) return 0;
    760 
    761       return new VarExprAST(VarNames, Body);
    762     }
    763 
    764 Now that we can parse and represent the code, we need to support
    765 emission of LLVM IR for it. This code starts out with:
    766 
    767 .. code-block:: c++
    768 
    769     Value *VarExprAST::Codegen() {
    770       std::vector<AllocaInst *> OldBindings;
    771 
    772       Function *TheFunction = Builder.GetInsertBlock()->getParent();
    773 
    774       // Register all variables and emit their initializer.
    775       for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    776         const std::string &VarName = VarNames[i].first;
    777         ExprAST *Init = VarNames[i].second;
    778 
    779 Basically it loops over all the variables, installing them one at a
    780 time. For each variable we put into the symbol table, we remember the
    781 previous value that we replace in OldBindings.
    782 
    783 .. code-block:: c++
    784 
    785         // Emit the initializer before adding the variable to scope, this prevents
    786         // the initializer from referencing the variable itself, and permits stuff
    787         // like this:
    788         //  var a = 1 in
    789         //    var a = a in ...   # refers to outer 'a'.
    790         Value *InitVal;
    791         if (Init) {
    792           InitVal = Init->Codegen();
    793           if (InitVal == 0) return 0;
    794         } else { // If not specified, use 0.0.
    795           InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
    796         }
    797 
    798         AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    799         Builder.CreateStore(InitVal, Alloca);
    800 
    801         // Remember the old variable binding so that we can restore the binding when
    802         // we unrecurse.
    803         OldBindings.push_back(NamedValues[VarName]);
    804 
    805         // Remember this binding.
    806         NamedValues[VarName] = Alloca;
    807       }
    808 
    809 There are more comments here than code. The basic idea is that we emit
    810 the initializer, create the alloca, then update the symbol table to
    811 point to it. Once all the variables are installed in the symbol table,
    812 we evaluate the body of the var/in expression:
    813 
    814 .. code-block:: c++
    815 
    816       // Codegen the body, now that all vars are in scope.
    817       Value *BodyVal = Body->Codegen();
    818       if (BodyVal == 0) return 0;
    819 
    820 Finally, before returning, we restore the previous variable bindings:
    821 
    822 .. code-block:: c++
    823 
    824       // Pop all our variables from scope.
    825       for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    826         NamedValues[VarNames[i].first] = OldBindings[i];
    827 
    828       // Return the body computation.
    829       return BodyVal;
    830     }
    831 
    832 The end result of all of this is that we get properly scoped variable
    833 definitions, and we even (trivially) allow mutation of them :).
    834 
    835 With this, we completed what we set out to do. Our nice iterative fib
    836 example from the intro compiles and runs just fine. The mem2reg pass
    837 optimizes all of our stack variables into SSA registers, inserting PHI
    838 nodes where needed, and our front-end remains simple: no "iterated
    839 dominance frontier" computation anywhere in sight.
    840 
    841 Full Code Listing
    842 =================
    843 
    844 Here is the complete code listing for our running example, enhanced with
    845 mutable variables and var/in support. To build this example, use:
    846 
    847 .. code-block:: bash
    848 
    849     # Compile
    850     clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    851     # Run
    852     ./toy
    853 
    854 Here is the code:
    855 
    856 .. code-block:: c++
    857 
    858     #include "llvm/DerivedTypes.h"
    859     #include "llvm/ExecutionEngine/ExecutionEngine.h"
    860     #include "llvm/ExecutionEngine/JIT.h"
    861     #include "llvm/IRBuilder.h"
    862     #include "llvm/LLVMContext.h"
    863     #include "llvm/Module.h"
    864     #include "llvm/PassManager.h"
    865     #include "llvm/Analysis/Verifier.h"
    866     #include "llvm/Analysis/Passes.h"
    867     #include "llvm/DataLayout.h"
    868     #include "llvm/Transforms/Scalar.h"
    869     #include "llvm/Support/TargetSelect.h"
    870     #include <cstdio>
    871     #include <string>
    872     #include <map>
    873     #include <vector>
    874     using namespace llvm;
    875 
    876     //===----------------------------------------------------------------------===//
    877     // Lexer
    878     //===----------------------------------------------------------------------===//
    879 
    880     // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    881     // of these for known things.
    882     enum Token {
    883       tok_eof = -1,
    884 
    885       // commands
    886       tok_def = -2, tok_extern = -3,
    887 
    888       // primary
    889       tok_identifier = -4, tok_number = -5,
    890 
    891       // control
    892       tok_if = -6, tok_then = -7, tok_else = -8,
    893       tok_for = -9, tok_in = -10,
    894 
    895       // operators
    896       tok_binary = -11, tok_unary = -12,
    897 
    898       // var definition
    899       tok_var = -13
    900     };
    901 
    902     static std::string IdentifierStr;  // Filled in if tok_identifier
    903     static double NumVal;              // Filled in if tok_number
    904 
    905     /// gettok - Return the next token from standard input.
    906     static int gettok() {
    907       static int LastChar = ' ';
    908 
    909       // Skip any whitespace.
    910       while (isspace(LastChar))
    911         LastChar = getchar();
    912 
    913       if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    914         IdentifierStr = LastChar;
    915         while (isalnum((LastChar = getchar())))
    916           IdentifierStr += LastChar;
    917 
    918         if (IdentifierStr == "def") return tok_def;
    919         if (IdentifierStr == "extern") return tok_extern;
    920         if (IdentifierStr == "if") return tok_if;
    921         if (IdentifierStr == "then") return tok_then;
    922         if (IdentifierStr == "else") return tok_else;
    923         if (IdentifierStr == "for") return tok_for;
    924         if (IdentifierStr == "in") return tok_in;
    925         if (IdentifierStr == "binary") return tok_binary;
    926         if (IdentifierStr == "unary") return tok_unary;
    927         if (IdentifierStr == "var") return tok_var;
    928         return tok_identifier;
    929       }
    930 
    931       if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    932         std::string NumStr;
    933         do {
    934           NumStr += LastChar;
    935           LastChar = getchar();
    936         } while (isdigit(LastChar) || LastChar == '.');
    937 
    938         NumVal = strtod(NumStr.c_str(), 0);
    939         return tok_number;
    940       }
    941 
    942       if (LastChar == '#') {
    943         // Comment until end of line.
    944         do LastChar = getchar();
    945         while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    946 
    947         if (LastChar != EOF)
    948           return gettok();
    949       }
    950 
    951       // Check for end of file.  Don't eat the EOF.
    952       if (LastChar == EOF)
    953         return tok_eof;
    954 
    955       // Otherwise, just return the character as its ascii value.
    956       int ThisChar = LastChar;
    957       LastChar = getchar();
    958       return ThisChar;
    959     }
    960 
    961     //===----------------------------------------------------------------------===//
    962     // Abstract Syntax Tree (aka Parse Tree)
    963     //===----------------------------------------------------------------------===//
    964 
    965     /// ExprAST - Base class for all expression nodes.
    966     class ExprAST {
    967     public:
    968       virtual ~ExprAST() {}
    969       virtual Value *Codegen() = 0;
    970     };
    971 
    972     /// NumberExprAST - Expression class for numeric literals like "1.0".
    973     class NumberExprAST : public ExprAST {
    974       double Val;
    975     public:
    976       NumberExprAST(double val) : Val(val) {}
    977       virtual Value *Codegen();
    978     };
    979 
    980     /// VariableExprAST - Expression class for referencing a variable, like "a".
    981     class VariableExprAST : public ExprAST {
    982       std::string Name;
    983     public:
    984       VariableExprAST(const std::string &name) : Name(name) {}
    985       const std::string &getName() const { return Name; }
    986       virtual Value *Codegen();
    987     };
    988 
    989     /// UnaryExprAST - Expression class for a unary operator.
    990     class UnaryExprAST : public ExprAST {
    991       char Opcode;
    992       ExprAST *Operand;
    993     public:
    994       UnaryExprAST(char opcode, ExprAST *operand)
    995         : Opcode(opcode), Operand(operand) {}
    996       virtual Value *Codegen();
    997     };
    998 
    999     /// BinaryExprAST - Expression class for a binary operator.
   1000     class BinaryExprAST : public ExprAST {
   1001       char Op;
   1002       ExprAST *LHS, *RHS;
   1003     public:
   1004       BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
   1005         : Op(op), LHS(lhs), RHS(rhs) {}
   1006       virtual Value *Codegen();
   1007     };
   1008 
   1009     /// CallExprAST - Expression class for function calls.
   1010     class CallExprAST : public ExprAST {
   1011       std::string Callee;
   1012       std::vector<ExprAST*> Args;
   1013     public:
   1014       CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
   1015         : Callee(callee), Args(args) {}
   1016       virtual Value *Codegen();
   1017     };
   1018 
   1019     /// IfExprAST - Expression class for if/then/else.
   1020     class IfExprAST : public ExprAST {
   1021       ExprAST *Cond, *Then, *Else;
   1022     public:
   1023       IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
   1024       : Cond(cond), Then(then), Else(_else) {}
   1025       virtual Value *Codegen();
   1026     };
   1027 
   1028     /// ForExprAST - Expression class for for/in.
   1029     class ForExprAST : public ExprAST {
   1030       std::string VarName;
   1031       ExprAST *Start, *End, *Step, *Body;
   1032     public:
   1033       ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
   1034                  ExprAST *step, ExprAST *body)
   1035         : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
   1036       virtual Value *Codegen();
   1037     };
   1038 
   1039     /// VarExprAST - Expression class for var/in
   1040     class VarExprAST : public ExprAST {
   1041       std::vector<std::pair<std::string, ExprAST*> > VarNames;
   1042       ExprAST *Body;
   1043     public:
   1044       VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
   1045                  ExprAST *body)
   1046       : VarNames(varnames), Body(body) {}
   1047 
   1048       virtual Value *Codegen();
   1049     };
   1050 
   1051     /// PrototypeAST - This class represents the "prototype" for a function,
   1052     /// which captures its name, and its argument names (thus implicitly the number
   1053     /// of arguments the function takes), as well as if it is an operator.
   1054     class PrototypeAST {
   1055       std::string Name;
   1056       std::vector<std::string> Args;
   1057       bool isOperator;
   1058       unsigned Precedence;  // Precedence if a binary op.
   1059     public:
   1060       PrototypeAST(const std::string &name, const std::vector<std::string> &args,
   1061                    bool isoperator = false, unsigned prec = 0)
   1062       : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
   1063 
   1064       bool isUnaryOp() const { return isOperator && Args.size() == 1; }
   1065       bool isBinaryOp() const { return isOperator && Args.size() == 2; }
   1066 
   1067       char getOperatorName() const {
   1068         assert(isUnaryOp() || isBinaryOp());
   1069         return Name[Name.size()-1];
   1070       }
   1071 
   1072       unsigned getBinaryPrecedence() const { return Precedence; }
   1073 
   1074       Function *Codegen();
   1075 
   1076       void CreateArgumentAllocas(Function *F);
   1077     };
   1078 
   1079     /// FunctionAST - This class represents a function definition itself.
   1080     class FunctionAST {
   1081       PrototypeAST *Proto;
   1082       ExprAST *Body;
   1083     public:
   1084       FunctionAST(PrototypeAST *proto, ExprAST *body)
   1085         : Proto(proto), Body(body) {}
   1086 
   1087       Function *Codegen();
   1088     };
   1089 
   1090     //===----------------------------------------------------------------------===//
   1091     // Parser
   1092     //===----------------------------------------------------------------------===//
   1093 
   1094     /// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
   1095     /// token the parser is looking at.  getNextToken reads another token from the
   1096     /// lexer and updates CurTok with its results.
   1097     static int CurTok;
   1098     static int getNextToken() {
   1099       return CurTok = gettok();
   1100     }
   1101 
   1102     /// BinopPrecedence - This holds the precedence for each binary operator that is
   1103     /// defined.
   1104     static std::map<char, int> BinopPrecedence;
   1105 
   1106     /// GetTokPrecedence - Get the precedence of the pending binary operator token.
   1107     static int GetTokPrecedence() {
   1108       if (!isascii(CurTok))
   1109         return -1;
   1110 
   1111       // Make sure it's a declared binop.
   1112       int TokPrec = BinopPrecedence[CurTok];
   1113       if (TokPrec <= 0) return -1;
   1114       return TokPrec;
   1115     }
   1116 
   1117     /// Error* - These are little helper functions for error handling.
   1118     ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
   1119     PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
   1120     FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
   1121 
   1122     static ExprAST *ParseExpression();
   1123 
   1124     /// identifierexpr
   1125     ///   ::= identifier
   1126     ///   ::= identifier '(' expression* ')'
   1127     static ExprAST *ParseIdentifierExpr() {
   1128       std::string IdName = IdentifierStr;
   1129 
   1130       getNextToken();  // eat identifier.
   1131 
   1132       if (CurTok != '(') // Simple variable ref.
   1133         return new VariableExprAST(IdName);
   1134 
   1135       // Call.
   1136       getNextToken();  // eat (
   1137       std::vector<ExprAST*> Args;
   1138       if (CurTok != ')') {
   1139         while (1) {
   1140           ExprAST *Arg = ParseExpression();
   1141           if (!Arg) return 0;
   1142           Args.push_back(Arg);
   1143 
   1144           if (CurTok == ')') break;
   1145 
   1146           if (CurTok != ',')
   1147             return Error("Expected ')' or ',' in argument list");
   1148           getNextToken();
   1149         }
   1150       }
   1151 
   1152       // Eat the ')'.
   1153       getNextToken();
   1154 
   1155       return new CallExprAST(IdName, Args);
   1156     }
   1157 
   1158     /// numberexpr ::= number
   1159     static ExprAST *ParseNumberExpr() {
   1160       ExprAST *Result = new NumberExprAST(NumVal);
   1161       getNextToken(); // consume the number
   1162       return Result;
   1163     }
   1164 
   1165     /// parenexpr ::= '(' expression ')'
   1166     static ExprAST *ParseParenExpr() {
   1167       getNextToken();  // eat (.
   1168       ExprAST *V = ParseExpression();
   1169       if (!V) return 0;
   1170 
   1171       if (CurTok != ')')
   1172         return Error("expected ')'");
   1173       getNextToken();  // eat ).
   1174       return V;
   1175     }
   1176 
   1177     /// ifexpr ::= 'if' expression 'then' expression 'else' expression
   1178     static ExprAST *ParseIfExpr() {
   1179       getNextToken();  // eat the if.
   1180 
   1181       // condition.
   1182       ExprAST *Cond = ParseExpression();
   1183       if (!Cond) return 0;
   1184 
   1185       if (CurTok != tok_then)
   1186         return Error("expected then");
   1187       getNextToken();  // eat the then
   1188 
   1189       ExprAST *Then = ParseExpression();
   1190       if (Then == 0) return 0;
   1191 
   1192       if (CurTok != tok_else)
   1193         return Error("expected else");
   1194 
   1195       getNextToken();
   1196 
   1197       ExprAST *Else = ParseExpression();
   1198       if (!Else) return 0;
   1199 
   1200       return new IfExprAST(Cond, Then, Else);
   1201     }
   1202 
   1203     /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
   1204     static ExprAST *ParseForExpr() {
   1205       getNextToken();  // eat the for.
   1206 
   1207       if (CurTok != tok_identifier)
   1208         return Error("expected identifier after for");
   1209 
   1210       std::string IdName = IdentifierStr;
   1211       getNextToken();  // eat identifier.
   1212 
   1213       if (CurTok != '=')
   1214         return Error("expected '=' after for");
   1215       getNextToken();  // eat '='.
   1216 
   1217 
   1218       ExprAST *Start = ParseExpression();
   1219       if (Start == 0) return 0;
   1220       if (CurTok != ',')
   1221         return Error("expected ',' after for start value");
   1222       getNextToken();
   1223 
   1224       ExprAST *End = ParseExpression();
   1225       if (End == 0) return 0;
   1226 
   1227       // The step value is optional.
   1228       ExprAST *Step = 0;
   1229       if (CurTok == ',') {
   1230         getNextToken();
   1231         Step = ParseExpression();
   1232         if (Step == 0) return 0;
   1233       }
   1234 
   1235       if (CurTok != tok_in)
   1236         return Error("expected 'in' after for");
   1237       getNextToken();  // eat 'in'.
   1238 
   1239       ExprAST *Body = ParseExpression();
   1240       if (Body == 0) return 0;
   1241 
   1242       return new ForExprAST(IdName, Start, End, Step, Body);
   1243     }
   1244 
   1245     /// varexpr ::= 'var' identifier ('=' expression)?
   1246     //                    (',' identifier ('=' expression)?)* 'in' expression
   1247     static ExprAST *ParseVarExpr() {
   1248       getNextToken();  // eat the var.
   1249 
   1250       std::vector<std::pair<std::string, ExprAST*> > VarNames;
   1251 
   1252       // At least one variable name is required.
   1253       if (CurTok != tok_identifier)
   1254         return Error("expected identifier after var");
   1255 
   1256       while (1) {
   1257         std::string Name = IdentifierStr;
   1258         getNextToken();  // eat identifier.
   1259 
   1260         // Read the optional initializer.
   1261         ExprAST *Init = 0;
   1262         if (CurTok == '=') {
   1263           getNextToken(); // eat the '='.
   1264 
   1265           Init = ParseExpression();
   1266           if (Init == 0) return 0;
   1267         }
   1268 
   1269         VarNames.push_back(std::make_pair(Name, Init));
   1270 
   1271         // End of var list, exit loop.
   1272         if (CurTok != ',') break;
   1273         getNextToken(); // eat the ','.
   1274 
   1275         if (CurTok != tok_identifier)
   1276           return Error("expected identifier list after var");
   1277       }
   1278 
   1279       // At this point, we have to have 'in'.
   1280       if (CurTok != tok_in)
   1281         return Error("expected 'in' keyword after 'var'");
   1282       getNextToken();  // eat 'in'.
   1283 
   1284       ExprAST *Body = ParseExpression();
   1285       if (Body == 0) return 0;
   1286 
   1287       return new VarExprAST(VarNames, Body);
   1288     }
   1289 
   1290     /// primary
   1291     ///   ::= identifierexpr
   1292     ///   ::= numberexpr
   1293     ///   ::= parenexpr
   1294     ///   ::= ifexpr
   1295     ///   ::= forexpr
   1296     ///   ::= varexpr
   1297     static ExprAST *ParsePrimary() {
   1298       switch (CurTok) {
   1299       default: return Error("unknown token when expecting an expression");
   1300       case tok_identifier: return ParseIdentifierExpr();
   1301       case tok_number:     return ParseNumberExpr();
   1302       case '(':            return ParseParenExpr();
   1303       case tok_if:         return ParseIfExpr();
   1304       case tok_for:        return ParseForExpr();
   1305       case tok_var:        return ParseVarExpr();
   1306       }
   1307     }
   1308 
   1309     /// unary
   1310     ///   ::= primary
   1311     ///   ::= '!' unary
   1312     static ExprAST *ParseUnary() {
   1313       // If the current token is not an operator, it must be a primary expr.
   1314       if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
   1315         return ParsePrimary();
   1316 
   1317       // If this is a unary operator, read it.
   1318       int Opc = CurTok;
   1319       getNextToken();
   1320       if (ExprAST *Operand = ParseUnary())
   1321         return new UnaryExprAST(Opc, Operand);
   1322       return 0;
   1323     }
   1324 
   1325     /// binoprhs
   1326     ///   ::= ('+' unary)*
   1327     static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
   1328       // If this is a binop, find its precedence.
   1329       while (1) {
   1330         int TokPrec = GetTokPrecedence();
   1331 
   1332         // If this is a binop that binds at least as tightly as the current binop,
   1333         // consume it, otherwise we are done.
   1334         if (TokPrec < ExprPrec)
   1335           return LHS;
   1336 
   1337         // Okay, we know this is a binop.
   1338         int BinOp = CurTok;
   1339         getNextToken();  // eat binop
   1340 
   1341         // Parse the unary expression after the binary operator.
   1342         ExprAST *RHS = ParseUnary();
   1343         if (!RHS) return 0;
   1344 
   1345         // If BinOp binds less tightly with RHS than the operator after RHS, let
   1346         // the pending operator take RHS as its LHS.
   1347         int NextPrec = GetTokPrecedence();
   1348         if (TokPrec < NextPrec) {
   1349           RHS = ParseBinOpRHS(TokPrec+1, RHS);
   1350           if (RHS == 0) return 0;
   1351         }
   1352 
   1353         // Merge LHS/RHS.
   1354         LHS = new BinaryExprAST(BinOp, LHS, RHS);
   1355       }
   1356     }
   1357 
   1358     /// expression
   1359     ///   ::= unary binoprhs
   1360     ///
   1361     static ExprAST *ParseExpression() {
   1362       ExprAST *LHS = ParseUnary();
   1363       if (!LHS) return 0;
   1364 
   1365       return ParseBinOpRHS(0, LHS);
   1366     }
   1367 
   1368     /// prototype
   1369     ///   ::= id '(' id* ')'
   1370     ///   ::= binary LETTER number? (id, id)
   1371     ///   ::= unary LETTER (id)
   1372     static PrototypeAST *ParsePrototype() {
   1373       std::string FnName;
   1374 
   1375       unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
   1376       unsigned BinaryPrecedence = 30;
   1377 
   1378       switch (CurTok) {
   1379       default:
   1380         return ErrorP("Expected function name in prototype");
   1381       case tok_identifier:
   1382         FnName = IdentifierStr;
   1383         Kind = 0;
   1384         getNextToken();
   1385         break;
   1386       case tok_unary:
   1387         getNextToken();
   1388         if (!isascii(CurTok))
   1389           return ErrorP("Expected unary operator");
   1390         FnName = "unary";
   1391         FnName += (char)CurTok;
   1392         Kind = 1;
   1393         getNextToken();
   1394         break;
   1395       case tok_binary:
   1396         getNextToken();
   1397         if (!isascii(CurTok))
   1398           return ErrorP("Expected binary operator");
   1399         FnName = "binary";
   1400         FnName += (char)CurTok;
   1401         Kind = 2;
   1402         getNextToken();
   1403 
   1404         // Read the precedence if present.
   1405         if (CurTok == tok_number) {
   1406           if (NumVal < 1 || NumVal > 100)
   1407             return ErrorP("Invalid precedecnce: must be 1..100");
   1408           BinaryPrecedence = (unsigned)NumVal;
   1409           getNextToken();
   1410         }
   1411         break;
   1412       }
   1413 
   1414       if (CurTok != '(')
   1415         return ErrorP("Expected '(' in prototype");
   1416 
   1417       std::vector<std::string> ArgNames;
   1418       while (getNextToken() == tok_identifier)
   1419         ArgNames.push_back(IdentifierStr);
   1420       if (CurTok != ')')
   1421         return ErrorP("Expected ')' in prototype");
   1422 
   1423       // success.
   1424       getNextToken();  // eat ')'.
   1425 
   1426       // Verify right number of names for operator.
   1427       if (Kind && ArgNames.size() != Kind)
   1428         return ErrorP("Invalid number of operands for operator");
   1429 
   1430       return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
   1431     }
   1432 
   1433     /// definition ::= 'def' prototype expression
   1434     static FunctionAST *ParseDefinition() {
   1435       getNextToken();  // eat def.
   1436       PrototypeAST *Proto = ParsePrototype();
   1437       if (Proto == 0) return 0;
   1438 
   1439       if (ExprAST *E = ParseExpression())
   1440         return new FunctionAST(Proto, E);
   1441       return 0;
   1442     }
   1443 
   1444     /// toplevelexpr ::= expression
   1445     static FunctionAST *ParseTopLevelExpr() {
   1446       if (ExprAST *E = ParseExpression()) {
   1447         // Make an anonymous proto.
   1448         PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
   1449         return new FunctionAST(Proto, E);
   1450       }
   1451       return 0;
   1452     }
   1453 
   1454     /// external ::= 'extern' prototype
   1455     static PrototypeAST *ParseExtern() {
   1456       getNextToken();  // eat extern.
   1457       return ParsePrototype();
   1458     }
   1459 
   1460     //===----------------------------------------------------------------------===//
   1461     // Code Generation
   1462     //===----------------------------------------------------------------------===//
   1463 
   1464     static Module *TheModule;
   1465     static IRBuilder<> Builder(getGlobalContext());
   1466     static std::map<std::string, AllocaInst*> NamedValues;
   1467     static FunctionPassManager *TheFPM;
   1468 
   1469     Value *ErrorV(const char *Str) { Error(Str); return 0; }
   1470 
   1471     /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
   1472     /// the function.  This is used for mutable variables etc.
   1473     static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
   1474                                               const std::string &VarName) {
   1475       IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
   1476                      TheFunction->getEntryBlock().begin());
   1477       return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
   1478                                VarName.c_str());
   1479     }
   1480 
   1481     Value *NumberExprAST::Codegen() {
   1482       return ConstantFP::get(getGlobalContext(), APFloat(Val));
   1483     }
   1484 
   1485     Value *VariableExprAST::Codegen() {
   1486       // Look this variable up in the function.
   1487       Value *V = NamedValues[Name];
   1488       if (V == 0) return ErrorV("Unknown variable name");
   1489 
   1490       // Load the value.
   1491       return Builder.CreateLoad(V, Name.c_str());
   1492     }
   1493 
   1494     Value *UnaryExprAST::Codegen() {
   1495       Value *OperandV = Operand->Codegen();
   1496       if (OperandV == 0) return 0;
   1497 
   1498       Function *F = TheModule->getFunction(std::string("unary")+Opcode);
   1499       if (F == 0)
   1500         return ErrorV("Unknown unary operator");
   1501 
   1502       return Builder.CreateCall(F, OperandV, "unop");
   1503     }
   1504 
   1505     Value *BinaryExprAST::Codegen() {
   1506       // Special case '=' because we don't want to emit the LHS as an expression.
   1507       if (Op == '=') {
   1508         // Assignment requires the LHS to be an identifier.
   1509         VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
   1510         if (!LHSE)
   1511           return ErrorV("destination of '=' must be a variable");
   1512         // Codegen the RHS.
   1513         Value *Val = RHS->Codegen();
   1514         if (Val == 0) return 0;
   1515 
   1516         // Look up the name.
   1517         Value *Variable = NamedValues[LHSE->getName()];
   1518         if (Variable == 0) return ErrorV("Unknown variable name");
   1519 
   1520         Builder.CreateStore(Val, Variable);
   1521         return Val;
   1522       }
   1523 
   1524       Value *L = LHS->Codegen();
   1525       Value *R = RHS->Codegen();
   1526       if (L == 0 || R == 0) return 0;
   1527 
   1528       switch (Op) {
   1529       case '+': return Builder.CreateFAdd(L, R, "addtmp");
   1530       case '-': return Builder.CreateFSub(L, R, "subtmp");
   1531       case '*': return Builder.CreateFMul(L, R, "multmp");
   1532       case '<':
   1533         L = Builder.CreateFCmpULT(L, R, "cmptmp");
   1534         // Convert bool 0/1 to double 0.0 or 1.0
   1535         return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
   1536                                     "booltmp");
   1537       default: break;
   1538       }
   1539 
   1540       // If it wasn't a builtin binary operator, it must be a user defined one. Emit
   1541       // a call to it.
   1542       Function *F = TheModule->getFunction(std::string("binary")+Op);
   1543       assert(F && "binary operator not found!");
   1544 
   1545       Value *Ops[2] = { L, R };
   1546       return Builder.CreateCall(F, Ops, "binop");
   1547     }
   1548 
   1549     Value *CallExprAST::Codegen() {
   1550       // Look up the name in the global module table.
   1551       Function *CalleeF = TheModule->getFunction(Callee);
   1552       if (CalleeF == 0)
   1553         return ErrorV("Unknown function referenced");
   1554 
   1555       // If argument mismatch error.
   1556       if (CalleeF->arg_size() != Args.size())
   1557         return ErrorV("Incorrect # arguments passed");
   1558 
   1559       std::vector<Value*> ArgsV;
   1560       for (unsigned i = 0, e = Args.size(); i != e; ++i) {
   1561         ArgsV.push_back(Args[i]->Codegen());
   1562         if (ArgsV.back() == 0) return 0;
   1563       }
   1564 
   1565       return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
   1566     }
   1567 
   1568     Value *IfExprAST::Codegen() {
   1569       Value *CondV = Cond->Codegen();
   1570       if (CondV == 0) return 0;
   1571 
   1572       // Convert condition to a bool by comparing equal to 0.0.
   1573       CondV = Builder.CreateFCmpONE(CondV,
   1574                                   ConstantFP::get(getGlobalContext(), APFloat(0.0)),
   1575                                     "ifcond");
   1576 
   1577       Function *TheFunction = Builder.GetInsertBlock()->getParent();
   1578 
   1579       // Create blocks for the then and else cases.  Insert the 'then' block at the
   1580       // end of the function.
   1581       BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
   1582       BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
   1583       BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
   1584 
   1585       Builder.CreateCondBr(CondV, ThenBB, ElseBB);
   1586 
   1587       // Emit then value.
   1588       Builder.SetInsertPoint(ThenBB);
   1589 
   1590       Value *ThenV = Then->Codegen();
   1591       if (ThenV == 0) return 0;
   1592 
   1593       Builder.CreateBr(MergeBB);
   1594       // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
   1595       ThenBB = Builder.GetInsertBlock();
   1596 
   1597       // Emit else block.
   1598       TheFunction->getBasicBlockList().push_back(ElseBB);
   1599       Builder.SetInsertPoint(ElseBB);
   1600 
   1601       Value *ElseV = Else->Codegen();
   1602       if (ElseV == 0) return 0;
   1603 
   1604       Builder.CreateBr(MergeBB);
   1605       // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
   1606       ElseBB = Builder.GetInsertBlock();
   1607 
   1608       // Emit merge block.
   1609       TheFunction->getBasicBlockList().push_back(MergeBB);
   1610       Builder.SetInsertPoint(MergeBB);
   1611       PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
   1612                                       "iftmp");
   1613 
   1614       PN->addIncoming(ThenV, ThenBB);
   1615       PN->addIncoming(ElseV, ElseBB);
   1616       return PN;
   1617     }
   1618 
   1619     Value *ForExprAST::Codegen() {
   1620       // Output this as:
   1621       //   var = alloca double
   1622       //   ...
   1623       //   start = startexpr
   1624       //   store start -> var
   1625       //   goto loop
   1626       // loop:
   1627       //   ...
   1628       //   bodyexpr
   1629       //   ...
   1630       // loopend:
   1631       //   step = stepexpr
   1632       //   endcond = endexpr
   1633       //
   1634       //   curvar = load var
   1635       //   nextvar = curvar + step
   1636       //   store nextvar -> var
   1637       //   br endcond, loop, endloop
   1638       // outloop:
   1639 
   1640       Function *TheFunction = Builder.GetInsertBlock()->getParent();
   1641 
   1642       // Create an alloca for the variable in the entry block.
   1643       AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
   1644 
   1645       // Emit the start code first, without 'variable' in scope.
   1646       Value *StartVal = Start->Codegen();
   1647       if (StartVal == 0) return 0;
   1648 
   1649       // Store the value into the alloca.
   1650       Builder.CreateStore(StartVal, Alloca);
   1651 
   1652       // Make the new basic block for the loop header, inserting after current
   1653       // block.
   1654       BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
   1655 
   1656       // Insert an explicit fall through from the current block to the LoopBB.
   1657       Builder.CreateBr(LoopBB);
   1658 
   1659       // Start insertion in LoopBB.
   1660       Builder.SetInsertPoint(LoopBB);
   1661 
   1662       // Within the loop, the variable is defined equal to the PHI node.  If it
   1663       // shadows an existing variable, we have to restore it, so save it now.
   1664       AllocaInst *OldVal = NamedValues[VarName];
   1665       NamedValues[VarName] = Alloca;
   1666 
   1667       // Emit the body of the loop.  This, like any other expr, can change the
   1668       // current BB.  Note that we ignore the value computed by the body, but don't
   1669       // allow an error.
   1670       if (Body->Codegen() == 0)
   1671         return 0;
   1672 
   1673       // Emit the step value.
   1674       Value *StepVal;
   1675       if (Step) {
   1676         StepVal = Step->Codegen();
   1677         if (StepVal == 0) return 0;
   1678       } else {
   1679         // If not specified, use 1.0.
   1680         StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
   1681       }
   1682 
   1683       // Compute the end condition.
   1684       Value *EndCond = End->Codegen();
   1685       if (EndCond == 0) return EndCond;
   1686 
   1687       // Reload, increment, and restore the alloca.  This handles the case where
   1688       // the body of the loop mutates the variable.
   1689       Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
   1690       Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
   1691       Builder.CreateStore(NextVar, Alloca);
   1692 
   1693       // Convert condition to a bool by comparing equal to 0.0.
   1694       EndCond = Builder.CreateFCmpONE(EndCond,
   1695                                   ConstantFP::get(getGlobalContext(), APFloat(0.0)),
   1696                                       "loopcond");
   1697 
   1698       // Create the "after loop" block and insert it.
   1699       BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
   1700 
   1701       // Insert the conditional branch into the end of LoopEndBB.
   1702       Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
   1703 
   1704       // Any new code will be inserted in AfterBB.
   1705       Builder.SetInsertPoint(AfterBB);
   1706 
   1707       // Restore the unshadowed variable.
   1708       if (OldVal)
   1709         NamedValues[VarName] = OldVal;
   1710       else
   1711         NamedValues.erase(VarName);
   1712 
   1713 
   1714       // for expr always returns 0.0.
   1715       return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
   1716     }
   1717 
   1718     Value *VarExprAST::Codegen() {
   1719       std::vector<AllocaInst *> OldBindings;
   1720 
   1721       Function *TheFunction = Builder.GetInsertBlock()->getParent();
   1722 
   1723       // Register all variables and emit their initializer.
   1724       for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
   1725         const std::string &VarName = VarNames[i].first;
   1726         ExprAST *Init = VarNames[i].second;
   1727 
   1728         // Emit the initializer before adding the variable to scope, this prevents
   1729         // the initializer from referencing the variable itself, and permits stuff
   1730         // like this:
   1731         //  var a = 1 in
   1732         //    var a = a in ...   # refers to outer 'a'.
   1733         Value *InitVal;
   1734         if (Init) {
   1735           InitVal = Init->Codegen();
   1736           if (InitVal == 0) return 0;
   1737         } else { // If not specified, use 0.0.
   1738           InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
   1739         }
   1740 
   1741         AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
   1742         Builder.CreateStore(InitVal, Alloca);
   1743 
   1744         // Remember the old variable binding so that we can restore the binding when
   1745         // we unrecurse.
   1746         OldBindings.push_back(NamedValues[VarName]);
   1747 
   1748         // Remember this binding.
   1749         NamedValues[VarName] = Alloca;
   1750       }
   1751 
   1752       // Codegen the body, now that all vars are in scope.
   1753       Value *BodyVal = Body->Codegen();
   1754       if (BodyVal == 0) return 0;
   1755 
   1756       // Pop all our variables from scope.
   1757       for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
   1758         NamedValues[VarNames[i].first] = OldBindings[i];
   1759 
   1760       // Return the body computation.
   1761       return BodyVal;
   1762     }
   1763 
   1764     Function *PrototypeAST::Codegen() {
   1765       // Make the function type:  double(double,double) etc.
   1766       std::vector<Type*> Doubles(Args.size(),
   1767                                  Type::getDoubleTy(getGlobalContext()));
   1768       FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
   1769                                            Doubles, false);
   1770 
   1771       Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
   1772 
   1773       // If F conflicted, there was already something named 'Name'.  If it has a
   1774       // body, don't allow redefinition or reextern.
   1775       if (F->getName() != Name) {
   1776         // Delete the one we just made and get the existing one.
   1777         F->eraseFromParent();
   1778         F = TheModule->getFunction(Name);
   1779 
   1780         // If F already has a body, reject this.
   1781         if (!F->empty()) {
   1782           ErrorF("redefinition of function");
   1783           return 0;
   1784         }
   1785 
   1786         // If F took a different number of args, reject.
   1787         if (F->arg_size() != Args.size()) {
   1788           ErrorF("redefinition of function with different # args");
   1789           return 0;
   1790         }
   1791       }
   1792 
   1793       // Set names for all arguments.
   1794       unsigned Idx = 0;
   1795       for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
   1796            ++AI, ++Idx)
   1797         AI->setName(Args[Idx]);
   1798 
   1799       return F;
   1800     }
   1801 
   1802     /// CreateArgumentAllocas - Create an alloca for each argument and register the
   1803     /// argument in the symbol table so that references to it will succeed.
   1804     void PrototypeAST::CreateArgumentAllocas(Function *F) {
   1805       Function::arg_iterator AI = F->arg_begin();
   1806       for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
   1807         // Create an alloca for this variable.
   1808         AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
   1809 
   1810         // Store the initial value into the alloca.
   1811         Builder.CreateStore(AI, Alloca);
   1812 
   1813         // Add arguments to variable symbol table.
   1814         NamedValues[Args[Idx]] = Alloca;
   1815       }
   1816     }
   1817 
   1818     Function *FunctionAST::Codegen() {
   1819       NamedValues.clear();
   1820 
   1821       Function *TheFunction = Proto->Codegen();
   1822       if (TheFunction == 0)
   1823         return 0;
   1824 
   1825       // If this is an operator, install it.
   1826       if (Proto->isBinaryOp())
   1827         BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
   1828 
   1829       // Create a new basic block to start insertion into.
   1830       BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
   1831       Builder.SetInsertPoint(BB);
   1832 
   1833       // Add all arguments to the symbol table and create their allocas.
   1834       Proto->CreateArgumentAllocas(TheFunction);
   1835 
   1836       if (Value *RetVal = Body->Codegen()) {
   1837         // Finish off the function.
   1838         Builder.CreateRet(RetVal);
   1839 
   1840         // Validate the generated code, checking for consistency.
   1841         verifyFunction(*TheFunction);
   1842 
   1843         // Optimize the function.
   1844         TheFPM->run(*TheFunction);
   1845 
   1846         return TheFunction;
   1847       }
   1848 
   1849       // Error reading body, remove function.
   1850       TheFunction->eraseFromParent();
   1851 
   1852       if (Proto->isBinaryOp())
   1853         BinopPrecedence.erase(Proto->getOperatorName());
   1854       return 0;
   1855     }
   1856 
   1857     //===----------------------------------------------------------------------===//
   1858     // Top-Level parsing and JIT Driver
   1859     //===----------------------------------------------------------------------===//
   1860 
   1861     static ExecutionEngine *TheExecutionEngine;
   1862 
   1863     static void HandleDefinition() {
   1864       if (FunctionAST *F = ParseDefinition()) {
   1865         if (Function *LF = F->Codegen()) {
   1866           fprintf(stderr, "Read function definition:");
   1867           LF->dump();
   1868         }
   1869       } else {
   1870         // Skip token for error recovery.
   1871         getNextToken();
   1872       }
   1873     }
   1874 
   1875     static void HandleExtern() {
   1876       if (PrototypeAST *P = ParseExtern()) {
   1877         if (Function *F = P->Codegen()) {
   1878           fprintf(stderr, "Read extern: ");
   1879           F->dump();
   1880         }
   1881       } else {
   1882         // Skip token for error recovery.
   1883         getNextToken();
   1884       }
   1885     }
   1886 
   1887     static void HandleTopLevelExpression() {
   1888       // Evaluate a top-level expression into an anonymous function.
   1889       if (FunctionAST *F = ParseTopLevelExpr()) {
   1890         if (Function *LF = F->Codegen()) {
   1891           // JIT the function, returning a function pointer.
   1892           void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
   1893 
   1894           // Cast it to the right type (takes no arguments, returns a double) so we
   1895           // can call it as a native function.
   1896           double (*FP)() = (double (*)())(intptr_t)FPtr;
   1897           fprintf(stderr, "Evaluated to %f\n", FP());
   1898         }
   1899       } else {
   1900         // Skip token for error recovery.
   1901         getNextToken();
   1902       }
   1903     }
   1904 
   1905     /// top ::= definition | external | expression | ';'
   1906     static void MainLoop() {
   1907       while (1) {
   1908         fprintf(stderr, "ready> ");
   1909         switch (CurTok) {
   1910         case tok_eof:    return;
   1911         case ';':        getNextToken(); break;  // ignore top-level semicolons.
   1912         case tok_def:    HandleDefinition(); break;
   1913         case tok_extern: HandleExtern(); break;
   1914         default:         HandleTopLevelExpression(); break;
   1915         }
   1916       }
   1917     }
   1918 
   1919     //===----------------------------------------------------------------------===//
   1920     // "Library" functions that can be "extern'd" from user code.
   1921     //===----------------------------------------------------------------------===//
   1922 
   1923     /// putchard - putchar that takes a double and returns 0.
   1924     extern "C"
   1925     double putchard(double X) {
   1926       putchar((char)X);
   1927       return 0;
   1928     }
   1929 
   1930     /// printd - printf that takes a double prints it as "%f\n", returning 0.
   1931     extern "C"
   1932     double printd(double X) {
   1933       printf("%f\n", X);
   1934       return 0;
   1935     }
   1936 
   1937     //===----------------------------------------------------------------------===//
   1938     // Main driver code.
   1939     //===----------------------------------------------------------------------===//
   1940 
   1941     int main() {
   1942       InitializeNativeTarget();
   1943       LLVMContext &Context = getGlobalContext();
   1944 
   1945       // Install standard binary operators.
   1946       // 1 is lowest precedence.
   1947       BinopPrecedence['='] = 2;
   1948       BinopPrecedence['<'] = 10;
   1949       BinopPrecedence['+'] = 20;
   1950       BinopPrecedence['-'] = 20;
   1951       BinopPrecedence['*'] = 40;  // highest.
   1952 
   1953       // Prime the first token.
   1954       fprintf(stderr, "ready> ");
   1955       getNextToken();
   1956 
   1957       // Make the module, which holds all the code.
   1958       TheModule = new Module("my cool jit", Context);
   1959 
   1960       // Create the JIT.  This takes ownership of the module.
   1961       std::string ErrStr;
   1962       TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
   1963       if (!TheExecutionEngine) {
   1964         fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
   1965         exit(1);
   1966       }
   1967 
   1968       FunctionPassManager OurFPM(TheModule);
   1969 
   1970       // Set up the optimizer pipeline.  Start with registering info about how the
   1971       // target lays out data structures.
   1972       OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
   1973       // Provide basic AliasAnalysis support for GVN.
   1974       OurFPM.add(createBasicAliasAnalysisPass());
   1975       // Promote allocas to registers.
   1976       OurFPM.add(createPromoteMemoryToRegisterPass());
   1977       // Do simple "peephole" optimizations and bit-twiddling optzns.
   1978       OurFPM.add(createInstructionCombiningPass());
   1979       // Reassociate expressions.
   1980       OurFPM.add(createReassociatePass());
   1981       // Eliminate Common SubExpressions.
   1982       OurFPM.add(createGVNPass());
   1983       // Simplify the control flow graph (deleting unreachable blocks, etc).
   1984       OurFPM.add(createCFGSimplificationPass());
   1985 
   1986       OurFPM.doInitialization();
   1987 
   1988       // Set the global so the code gen can use this.
   1989       TheFPM = &OurFPM;
   1990 
   1991       // Run the main "interpreter loop" now.
   1992       MainLoop();
   1993 
   1994       TheFPM = 0;
   1995 
   1996       // Print out all of the generated code.
   1997       TheModule->dump();
   1998 
   1999       return 0;
   2000     }
   2001 
   2002 `Next: Conclusion and other useful LLVM tidbits <LangImpl8.html>`_
   2003 
   2004