Home | History | Annotate | Download | only in tutorial
      1 =======================================================
      2 Kaleidoscope: Extending the Language: Mutable Variables
      3 =======================================================
      4 
      5 .. contents::
      6    :local:
      7 
      8 Chapter 7 Introduction
      9 ======================
     10 
     11 Welcome to Chapter 7 of the "`Implementing a language with
     12 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
     13 very respectable, albeit simple, `functional programming
     14 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
     15 journey, we learned some parsing techniques, how to build and represent
     16 an AST, how to build LLVM IR, and how to optimize the resultant code as
     17 well as JIT compile it.
     18 
     19 While Kaleidoscope is interesting as a functional language, the fact
     20 that it is functional makes it "too easy" to generate LLVM IR for it. In
     21 particular, a functional language makes it very easy to build LLVM IR
     22 directly in `SSA
     23 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     24 Since LLVM requires that the input code be in SSA form, this is a very
     25 nice property and it is often unclear to newcomers how to generate code
     26 for an imperative language with mutable variables.
     27 
     28 The short (and happy) summary of this chapter is that there is no need
     29 for your front-end to build SSA form: LLVM provides highly tuned and
     30 well tested support for this, though the way it works is a bit
     31 unexpected for some.
     32 
     33 Why is this a hard problem?
     34 ===========================
     35 
     36 To understand why mutable variables cause complexities in SSA
     37 construction, consider this extremely simple C example:
     38 
     39 .. code-block:: c
     40 
     41     int G, H;
     42     int test(_Bool Condition) {
     43       int X;
     44       if (Condition)
     45         X = G;
     46       else
     47         X = H;
     48       return X;
     49     }
     50 
     51 In this case, we have the variable "X", whose value depends on the path
     52 executed in the program. Because there are two different possible values
     53 for X before the return instruction, a PHI node is inserted to merge the
     54 two values. The LLVM IR that we want for this example looks like this:
     55 
     56 .. code-block:: llvm
     57 
     58     @G = weak global i32 0   ; type of @G is i32*
     59     @H = weak global i32 0   ; type of @H is i32*
     60 
     61     define i32 @test(i1 %Condition) {
     62     entry:
     63       br i1 %Condition, label %cond_true, label %cond_false
     64 
     65     cond_true:
     66       %X.0 = load i32* @G
     67       br label %cond_next
     68 
     69     cond_false:
     70       %X.1 = load i32* @H
     71       br label %cond_next
     72 
     73     cond_next:
     74       %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
     75       ret i32 %X.2
     76     }
     77 
     78 In this example, the loads from the G and H global variables are
     79 explicit in the LLVM IR, and they live in the then/else branches of the
     80 if statement (cond\_true/cond\_false). In order to merge the incoming
     81 values, the X.2 phi node in the cond\_next block selects the right value
     82 to use based on where control flow is coming from: if control flow comes
     83 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
     84 control flow comes from cond\_true, it gets the value of X.0. The intent
     85 of this chapter is not to explain the details of SSA form. For more
     86 information, see one of the many `online
     87 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
     88 
     89 The question for this article is "who places the phi nodes when lowering
     90 assignments to mutable variables?". The issue here is that LLVM
     91 *requires* that its IR be in SSA form: there is no "non-ssa" mode for
     92 it. However, SSA construction requires non-trivial algorithms and data
     93 structures, so it is inconvenient and wasteful for every front-end to
     94 have to reproduce this logic.
     95 
     96 Memory in LLVM
     97 ==============
     98 
     99 The 'trick' here is that while LLVM does require all register values to
    100 be in SSA form, it does not require (or permit) memory objects to be in
    101 SSA form. In the example above, note that the loads from G and H are
    102 direct accesses to G and H: they are not renamed or versioned. This
    103 differs from some other compiler systems, which do try to version memory
    104 objects. In LLVM, instead of encoding dataflow analysis of memory into
    105 the LLVM IR, it is handled with `Analysis
    106 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
    107 
    108 With this in mind, the high-level idea is that we want to make a stack
    109 variable (which lives in memory, because it is on the stack) for each
    110 mutable object in a function. To take advantage of this trick, we need
    111 to talk about how LLVM represents stack variables.
    112 
    113 In LLVM, all memory accesses are explicit with load/store instructions,
    114 and it is carefully designed not to have (or need) an "address-of"
    115 operator. Notice how the type of the @G/@H global variables is actually
    116 "i32\*" even though the variable is defined as "i32". What this means is
    117 that @G defines *space* for an i32 in the global data area, but its
    118 *name* actually refers to the address for that space. Stack variables
    119 work the same way, except that instead of being declared with global
    120 variable definitions, they are declared with the `LLVM alloca
    121 instruction <../LangRef.html#alloca-instruction>`_:
    122 
    123 .. code-block:: llvm
    124 
    125     define i32 @example() {
    126     entry:
    127       %X = alloca i32           ; type of %X is i32*.
    128       ...
    129       %tmp = load i32* %X       ; load the stack value %X from the stack.
    130       %tmp2 = add i32 %tmp, 1   ; increment it
    131       store i32 %tmp2, i32* %X  ; store it back
    132       ...
    133 
    134 This code shows an example of how you can declare and manipulate a stack
    135 variable in the LLVM IR. Stack memory allocated with the alloca
    136 instruction is fully general: you can pass the address of the stack slot
    137 to functions, you can store it in other variables, etc. In our example
    138 above, we could rewrite the example to use the alloca technique to avoid
    139 using a PHI node:
    140 
    141 .. code-block:: llvm
    142 
    143     @G = weak global i32 0   ; type of @G is i32*
    144     @H = weak global i32 0   ; type of @H is i32*
    145 
    146     define i32 @test(i1 %Condition) {
    147     entry:
    148       %X = alloca i32           ; type of %X is i32*.
    149       br i1 %Condition, label %cond_true, label %cond_false
    150 
    151     cond_true:
    152       %X.0 = load i32* @G
    153       store i32 %X.0, i32* %X   ; Update X
    154       br label %cond_next
    155 
    156     cond_false:
    157       %X.1 = load i32* @H
    158       store i32 %X.1, i32* %X   ; Update X
    159       br label %cond_next
    160 
    161     cond_next:
    162       %X.2 = load i32* %X       ; Read X
    163       ret i32 %X.2
    164     }
    165 
    166 With this, we have discovered a way to handle arbitrary mutable
    167 variables without the need to create Phi nodes at all:
    168 
    169 #. Each mutable variable becomes a stack allocation.
    170 #. Each read of the variable becomes a load from the stack.
    171 #. Each update of the variable becomes a store to the stack.
    172 #. Taking the address of a variable just uses the stack address
    173    directly.
    174 
    175 While this solution has solved our immediate problem, it introduced
    176 another one: we have now apparently introduced a lot of stack traffic
    177 for very simple and common operations, a major performance problem.
    178 Fortunately for us, the LLVM optimizer has a highly-tuned optimization
    179 pass named "mem2reg" that handles this case, promoting allocas like this
    180 into SSA registers, inserting Phi nodes as appropriate. If you run this
    181 example through the pass, for example, you'll get:
    182 
    183 .. code-block:: bash
    184 
    185     $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    186     @G = weak global i32 0
    187     @H = weak global i32 0
    188 
    189     define i32 @test(i1 %Condition) {
    190     entry:
    191       br i1 %Condition, label %cond_true, label %cond_false
    192 
    193     cond_true:
    194       %X.0 = load i32* @G
    195       br label %cond_next
    196 
    197     cond_false:
    198       %X.1 = load i32* @H
    199       br label %cond_next
    200 
    201     cond_next:
    202       %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    203       ret i32 %X.01
    204     }
    205 
    206 The mem2reg pass implements the standard "iterated dominance frontier"
    207 algorithm for constructing SSA form and has a number of optimizations
    208 that speed up (very common) degenerate cases. The mem2reg optimization
    209 pass is the answer to dealing with mutable variables, and we highly
    210 recommend that you depend on it. Note that mem2reg only works on
    211 variables in certain circumstances:
    212 
    213 #. mem2reg is alloca-driven: it looks for allocas and if it can handle
    214    them, it promotes them. It does not apply to global variables or heap
    215    allocations.
    216 #. mem2reg only looks for alloca instructions in the entry block of the
    217    function. Being in the entry block guarantees that the alloca is only
    218    executed once, which makes analysis simpler.
    219 #. mem2reg only promotes allocas whose uses are direct loads and stores.
    220    If the address of the stack object is passed to a function, or if any
    221    funny pointer arithmetic is involved, the alloca will not be
    222    promoted.
    223 #. mem2reg only works on allocas of `first
    224    class <../LangRef.html#first-class-types>`_ values (such as pointers,
    225    scalars and vectors), and only if the array size of the allocation is
    226    1 (or missing in the .ll file). mem2reg is not capable of promoting
    227    structs or arrays to registers. Note that the "sroa" pass is
    228    more powerful and can promote structs, "unions", and arrays in many
    229    cases.
    230 
    231 All of these properties are easy to satisfy for most imperative
    232 languages, and we'll illustrate it below with Kaleidoscope. The final
    233 question you may be asking is: should I bother with this nonsense for my
    234 front-end? Wouldn't it be better if I just did SSA construction
    235 directly, avoiding use of the mem2reg optimization pass? In short, we
    236 strongly recommend that you use this technique for building SSA form,
    237 unless there is an extremely good reason not to. Using this technique
    238 is:
    239 
    240 -  Proven and well tested: clang uses this technique
    241    for local mutable variables. As such, the most common clients of LLVM
    242    are using this to handle a bulk of their variables. You can be sure
    243    that bugs are found fast and fixed early.
    244 -  Extremely Fast: mem2reg has a number of special cases that make it
    245    fast in common cases as well as fully general. For example, it has
    246    fast-paths for variables that are only used in a single block,
    247    variables that only have one assignment point, good heuristics to
    248    avoid insertion of unneeded phi nodes, etc.
    249 -  Needed for debug info generation: `Debug information in
    250    LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
    251    the variable exposed so that debug info can be attached to it. This
    252    technique dovetails very naturally with this style of debug info.
    253 
    254 If nothing else, this makes it much easier to get your front-end up and
    255 running, and is very simple to implement. Let's extend Kaleidoscope with
    256 mutable variables now!
    257 
    258 Mutable Variables in Kaleidoscope
    259 =================================
    260 
    261 Now that we know the sort of problem we want to tackle, let's see what
    262 this looks like in the context of our little Kaleidoscope language.
    263 We're going to add two features:
    264 
    265 #. The ability to mutate variables with the '=' operator.
    266 #. The ability to define new variables.
    267 
    268 While the first item is really what this is about, we only have
    269 variables for incoming arguments as well as for induction variables, and
    270 redefining those only goes so far :). Also, the ability to define new
    271 variables is a useful thing regardless of whether you will be mutating
    272 them. Here's a motivating example that shows how we could use these:
    273 
    274 ::
    275 
    276     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    277     # and just returns the RHS.
    278     def binary : 1 (x y) y;
    279 
    280     # Recursive fib, we could do this before.
    281     def fib(x)
    282       if (x < 3) then
    283         1
    284       else
    285         fib(x-1)+fib(x-2);
    286 
    287     # Iterative fib.
    288     def fibi(x)
    289       var a = 1, b = 1, c in
    290       (for i = 3, i < x in
    291          c = a + b :
    292          a = b :
    293          b = c) :
    294       b;
    295 
    296     # Call it.
    297     fibi(10);
    298 
    299 In order to mutate variables, we have to change our existing variables
    300 to use the "alloca trick". Once we have that, we'll add our new
    301 operator, then extend Kaleidoscope to support new variable definitions.
    302 
    303 Adjusting Existing Variables for Mutation
    304 =========================================
    305 
    306 The symbol table in Kaleidoscope is managed at code generation time by
    307 the '``NamedValues``' map. This map currently keeps track of the LLVM
    308 "Value\*" that holds the double value for the named variable. In order
    309 to support mutation, we need to change this slightly, so that
    310 ``NamedValues`` holds the *memory location* of the variable in question.
    311 Note that this change is a refactoring: it changes the structure of the
    312 code, but does not (by itself) change the behavior of the compiler. All
    313 of these changes are isolated in the Kaleidoscope code generator.
    314 
    315 At this point in Kaleidoscope's development, it only supports variables
    316 for two things: incoming arguments to functions and the induction
    317 variable of 'for' loops. For consistency, we'll allow mutation of these
    318 variables in addition to other user-defined variables. This means that
    319 these will both need memory locations.
    320 
    321 To start our transformation of Kaleidoscope, we'll change the
    322 NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
    323 we do this, the C++ compiler will tell us what parts of the code we need
    324 to update:
    325 
    326 .. code-block:: c++
    327 
    328     static std::map<std::string, AllocaInst*> NamedValues;
    329 
    330 Also, since we will need to create these alloca's, we'll use a helper
    331 function that ensures that the allocas are created in the entry block of
    332 the function:
    333 
    334 .. code-block:: c++
    335 
    336     /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    337     /// the function.  This is used for mutable variables etc.
    338     static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
    339                                               const std::string &VarName) {
    340       IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
    341                      TheFunction->getEntryBlock().begin());
    342       return TmpB.CreateAlloca(Type::getDoubleTy(LLVMContext), 0,
    343                                VarName.c_str());
    344     }
    345 
    346 This funny looking code creates an IRBuilder object that is pointing at
    347 the first instruction (.begin()) of the entry block. It then creates an
    348 alloca with the expected name and returns it. Because all values in
    349 Kaleidoscope are doubles, there is no need to pass in a type to use.
    350 
    351 With this in place, the first functionality change we want to make is to
    352 variable references. In our new scheme, variables live on the stack, so
    353 code generating a reference to them actually needs to produce a load
    354 from the stack slot:
    355 
    356 .. code-block:: c++
    357 
    358     Value *VariableExprAST::codegen() {
    359       // Look this variable up in the function.
    360       Value *V = NamedValues[Name];
    361       if (!V)
    362         return LogErrorV("Unknown variable name");
    363 
    364       // Load the value.
    365       return Builder.CreateLoad(V, Name.c_str());
    366     }
    367 
    368 As you can see, this is pretty straightforward. Now we need to update
    369 the things that define the variables to set up the alloca. We'll start
    370 with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
    371 the unabridged code):
    372 
    373 .. code-block:: c++
    374 
    375       Function *TheFunction = Builder.GetInsertBlock()->getParent();
    376 
    377       // Create an alloca for the variable in the entry block.
    378       AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    379 
    380         // Emit the start code first, without 'variable' in scope.
    381       Value *StartVal = Start->codegen();
    382       if (!StartVal)
    383         return nullptr;
    384 
    385       // Store the value into the alloca.
    386       Builder.CreateStore(StartVal, Alloca);
    387       ...
    388 
    389       // Compute the end condition.
    390       Value *EndCond = End->codegen();
    391       if (!EndCond)
    392         return nullptr;
    393 
    394       // Reload, increment, and restore the alloca.  This handles the case where
    395       // the body of the loop mutates the variable.
    396       Value *CurVar = Builder.CreateLoad(Alloca);
    397       Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
    398       Builder.CreateStore(NextVar, Alloca);
    399       ...
    400 
    401 This code is virtually identical to the code `before we allowed mutable
    402 variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
    403 no longer have to construct a PHI node, and we use load/store to access
    404 the variable as needed.
    405 
    406 To support mutable argument variables, we need to also make allocas for
    407 them. The code for this is also pretty simple:
    408 
    409 .. code-block:: c++
    410 
    411     /// CreateArgumentAllocas - Create an alloca for each argument and register the
    412     /// argument in the symbol table so that references to it will succeed.
    413     void PrototypeAST::CreateArgumentAllocas(Function *F) {
    414       Function::arg_iterator AI = F->arg_begin();
    415       for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    416         // Create an alloca for this variable.
    417         AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
    418 
    419         // Store the initial value into the alloca.
    420         Builder.CreateStore(AI, Alloca);
    421 
    422         // Add arguments to variable symbol table.
    423         NamedValues[Args[Idx]] = Alloca;
    424       }
    425     }
    426 
    427 For each argument, we make an alloca, store the input value to the
    428 function into the alloca, and register the alloca as the memory location
    429 for the argument. This method gets invoked by ``FunctionAST::codegen()``
    430 right after it sets up the entry block for the function.
    431 
    432 The final missing piece is adding the mem2reg pass, which allows us to
    433 get good codegen once again:
    434 
    435 .. code-block:: c++
    436 
    437         // Set up the optimizer pipeline.  Start with registering info about how the
    438         // target lays out data structures.
    439         OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
    440         // Promote allocas to registers.
    441         OurFPM.add(createPromoteMemoryToRegisterPass());
    442         // Do simple "peephole" optimizations and bit-twiddling optzns.
    443         OurFPM.add(createInstructionCombiningPass());
    444         // Reassociate expressions.
    445         OurFPM.add(createReassociatePass());
    446 
    447 It is interesting to see what the code looks like before and after the
    448 mem2reg optimization runs. For example, this is the before/after code
    449 for our recursive fib function. Before the optimization:
    450 
    451 .. code-block:: llvm
    452 
    453     define double @fib(double %x) {
    454     entry:
    455       %x1 = alloca double
    456       store double %x, double* %x1
    457       %x2 = load double* %x1
    458       %cmptmp = fcmp ult double %x2, 3.000000e+00
    459       %booltmp = uitofp i1 %cmptmp to double
    460       %ifcond = fcmp one double %booltmp, 0.000000e+00
    461       br i1 %ifcond, label %then, label %else
    462 
    463     then:       ; preds = %entry
    464       br label %ifcont
    465 
    466     else:       ; preds = %entry
    467       %x3 = load double* %x1
    468       %subtmp = fsub double %x3, 1.000000e+00
    469       %calltmp = call double @fib(double %subtmp)
    470       %x4 = load double* %x1
    471       %subtmp5 = fsub double %x4, 2.000000e+00
    472       %calltmp6 = call double @fib(double %subtmp5)
    473       %addtmp = fadd double %calltmp, %calltmp6
    474       br label %ifcont
    475 
    476     ifcont:     ; preds = %else, %then
    477       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    478       ret double %iftmp
    479     }
    480 
    481 Here there is only one variable (x, the input argument) but you can
    482 still see the extremely simple-minded code generation strategy we are
    483 using. In the entry block, an alloca is created, and the initial input
    484 value is stored into it. Each reference to the variable does a reload
    485 from the stack. Also, note that we didn't modify the if/then/else
    486 expression, so it still inserts a PHI node. While we could make an
    487 alloca for it, it is actually easier to create a PHI node for it, so we
    488 still just make the PHI.
    489 
    490 Here is the code after the mem2reg pass runs:
    491 
    492 .. code-block:: llvm
    493 
    494     define double @fib(double %x) {
    495     entry:
    496       %cmptmp = fcmp ult double %x, 3.000000e+00
    497       %booltmp = uitofp i1 %cmptmp to double
    498       %ifcond = fcmp one double %booltmp, 0.000000e+00
    499       br i1 %ifcond, label %then, label %else
    500 
    501     then:
    502       br label %ifcont
    503 
    504     else:
    505       %subtmp = fsub double %x, 1.000000e+00
    506       %calltmp = call double @fib(double %subtmp)
    507       %subtmp5 = fsub double %x, 2.000000e+00
    508       %calltmp6 = call double @fib(double %subtmp5)
    509       %addtmp = fadd double %calltmp, %calltmp6
    510       br label %ifcont
    511 
    512     ifcont:     ; preds = %else, %then
    513       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    514       ret double %iftmp
    515     }
    516 
    517 This is a trivial case for mem2reg, since there are no redefinitions of
    518 the variable. The point of showing this is to calm your tension about
    519 inserting such blatent inefficiencies :).
    520 
    521 After the rest of the optimizers run, we get:
    522 
    523 .. code-block:: llvm
    524 
    525     define double @fib(double %x) {
    526     entry:
    527       %cmptmp = fcmp ult double %x, 3.000000e+00
    528       %booltmp = uitofp i1 %cmptmp to double
    529       %ifcond = fcmp ueq double %booltmp, 0.000000e+00
    530       br i1 %ifcond, label %else, label %ifcont
    531 
    532     else:
    533       %subtmp = fsub double %x, 1.000000e+00
    534       %calltmp = call double @fib(double %subtmp)
    535       %subtmp5 = fsub double %x, 2.000000e+00
    536       %calltmp6 = call double @fib(double %subtmp5)
    537       %addtmp = fadd double %calltmp, %calltmp6
    538       ret double %addtmp
    539 
    540     ifcont:
    541       ret double 1.000000e+00
    542     }
    543 
    544 Here we see that the simplifycfg pass decided to clone the return
    545 instruction into the end of the 'else' block. This allowed it to
    546 eliminate some branches and the PHI node.
    547 
    548 Now that all symbol table references are updated to use stack variables,
    549 we'll add the assignment operator.
    550 
    551 New Assignment Operator
    552 =======================
    553 
    554 With our current framework, adding a new assignment operator is really
    555 simple. We will parse it just like any other binary operator, but handle
    556 it internally (instead of allowing the user to define it). The first
    557 step is to set a precedence:
    558 
    559 .. code-block:: c++
    560 
    561      int main() {
    562        // Install standard binary operators.
    563        // 1 is lowest precedence.
    564        BinopPrecedence['='] = 2;
    565        BinopPrecedence['<'] = 10;
    566        BinopPrecedence['+'] = 20;
    567        BinopPrecedence['-'] = 20;
    568 
    569 Now that the parser knows the precedence of the binary operator, it
    570 takes care of all the parsing and AST generation. We just need to
    571 implement codegen for the assignment operator. This looks like:
    572 
    573 .. code-block:: c++
    574 
    575     Value *BinaryExprAST::codegen() {
    576       // Special case '=' because we don't want to emit the LHS as an expression.
    577       if (Op == '=') {
    578         // Assignment requires the LHS to be an identifier.
    579         VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
    580         if (!LHSE)
    581           return LogErrorV("destination of '=' must be a variable");
    582 
    583 Unlike the rest of the binary operators, our assignment operator doesn't
    584 follow the "emit LHS, emit RHS, do computation" model. As such, it is
    585 handled as a special case before the other binary operators are handled.
    586 The other strange thing is that it requires the LHS to be a variable. It
    587 is invalid to have "(x+1) = expr" - only things like "x = expr" are
    588 allowed.
    589 
    590 .. code-block:: c++
    591 
    592         // Codegen the RHS.
    593         Value *Val = RHS->codegen();
    594         if (!Val)
    595           return nullptr;
    596 
    597         // Look up the name.
    598         Value *Variable = NamedValues[LHSE->getName()];
    599         if (!Variable)
    600           return LogErrorV("Unknown variable name");
    601 
    602         Builder.CreateStore(Val, Variable);
    603         return Val;
    604       }
    605       ...
    606 
    607 Once we have the variable, codegen'ing the assignment is
    608 straightforward: we emit the RHS of the assignment, create a store, and
    609 return the computed value. Returning a value allows for chained
    610 assignments like "X = (Y = Z)".
    611 
    612 Now that we have an assignment operator, we can mutate loop variables
    613 and arguments. For example, we can now run code like this:
    614 
    615 ::
    616 
    617     # Function to print a double.
    618     extern printd(x);
    619 
    620     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    621     # and just returns the RHS.
    622     def binary : 1 (x y) y;
    623 
    624     def test(x)
    625       printd(x) :
    626       x = 4 :
    627       printd(x);
    628 
    629     test(123);
    630 
    631 When run, this example prints "123" and then "4", showing that we did
    632 actually mutate the value! Okay, we have now officially implemented our
    633 goal: getting this to work requires SSA construction in the general
    634 case. However, to be really useful, we want the ability to define our
    635 own local variables, let's add this next!
    636 
    637 User-defined Local Variables
    638 ============================
    639 
    640 Adding var/in is just like any other extension we made to
    641 Kaleidoscope: we extend the lexer, the parser, the AST and the code
    642 generator. The first step for adding our new 'var/in' construct is to
    643 extend the lexer. As before, this is pretty trivial, the code looks like
    644 this:
    645 
    646 .. code-block:: c++
    647 
    648     enum Token {
    649       ...
    650       // var definition
    651       tok_var = -13
    652     ...
    653     }
    654     ...
    655     static int gettok() {
    656     ...
    657         if (IdentifierStr == "in")
    658           return tok_in;
    659         if (IdentifierStr == "binary")
    660           return tok_binary;
    661         if (IdentifierStr == "unary")
    662           return tok_unary;
    663         if (IdentifierStr == "var")
    664           return tok_var;
    665         return tok_identifier;
    666     ...
    667 
    668 The next step is to define the AST node that we will construct. For
    669 var/in, it looks like this:
    670 
    671 .. code-block:: c++
    672 
    673     /// VarExprAST - Expression class for var/in
    674     class VarExprAST : public ExprAST {
    675       std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
    676       std::unique_ptr<ExprAST> Body;
    677 
    678     public:
    679       VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
    680                  std::unique_ptr<ExprAST> body)
    681       : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
    682 
    683       virtual Value *codegen();
    684     };
    685 
    686 var/in allows a list of names to be defined all at once, and each name
    687 can optionally have an initializer value. As such, we capture this
    688 information in the VarNames vector. Also, var/in has a body, this body
    689 is allowed to access the variables defined by the var/in.
    690 
    691 With this in place, we can define the parser pieces. The first thing we
    692 do is add it as a primary expression:
    693 
    694 .. code-block:: c++
    695 
    696     /// primary
    697     ///   ::= identifierexpr
    698     ///   ::= numberexpr
    699     ///   ::= parenexpr
    700     ///   ::= ifexpr
    701     ///   ::= forexpr
    702     ///   ::= varexpr
    703     static std::unique_ptr<ExprAST> ParsePrimary() {
    704       switch (CurTok) {
    705       default:
    706         return LogError("unknown token when expecting an expression");
    707       case tok_identifier:
    708         return ParseIdentifierExpr();
    709       case tok_number:
    710         return ParseNumberExpr();
    711       case '(':
    712         return ParseParenExpr();
    713       case tok_if:
    714         return ParseIfExpr();
    715       case tok_for:
    716         return ParseForExpr();
    717       case tok_var:
    718         return ParseVarExpr();
    719       }
    720     }
    721 
    722 Next we define ParseVarExpr:
    723 
    724 .. code-block:: c++
    725 
    726     /// varexpr ::= 'var' identifier ('=' expression)?
    727     //                    (',' identifier ('=' expression)?)* 'in' expression
    728     static std::unique_ptr<ExprAST> ParseVarExpr() {
    729       getNextToken();  // eat the var.
    730 
    731       std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
    732 
    733       // At least one variable name is required.
    734       if (CurTok != tok_identifier)
    735         return LogError("expected identifier after var");
    736 
    737 The first part of this code parses the list of identifier/expr pairs
    738 into the local ``VarNames`` vector.
    739 
    740 .. code-block:: c++
    741 
    742       while (1) {
    743         std::string Name = IdentifierStr;
    744         getNextToken();  // eat identifier.
    745 
    746         // Read the optional initializer.
    747         std::unique_ptr<ExprAST> Init;
    748         if (CurTok == '=') {
    749           getNextToken(); // eat the '='.
    750 
    751           Init = ParseExpression();
    752           if (!Init) return nullptr;
    753         }
    754 
    755         VarNames.push_back(std::make_pair(Name, std::move(Init)));
    756 
    757         // End of var list, exit loop.
    758         if (CurTok != ',') break;
    759         getNextToken(); // eat the ','.
    760 
    761         if (CurTok != tok_identifier)
    762           return LogError("expected identifier list after var");
    763       }
    764 
    765 Once all the variables are parsed, we then parse the body and create the
    766 AST node:
    767 
    768 .. code-block:: c++
    769 
    770       // At this point, we have to have 'in'.
    771       if (CurTok != tok_in)
    772         return LogError("expected 'in' keyword after 'var'");
    773       getNextToken();  // eat 'in'.
    774 
    775       auto Body = ParseExpression();
    776       if (!Body)
    777         return nullptr;
    778 
    779       return llvm::make_unique<VarExprAST>(std::move(VarNames),
    780                                            std::move(Body));
    781     }
    782 
    783 Now that we can parse and represent the code, we need to support
    784 emission of LLVM IR for it. This code starts out with:
    785 
    786 .. code-block:: c++
    787 
    788     Value *VarExprAST::codegen() {
    789       std::vector<AllocaInst *> OldBindings;
    790 
    791       Function *TheFunction = Builder.GetInsertBlock()->getParent();
    792 
    793       // Register all variables and emit their initializer.
    794       for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    795         const std::string &VarName = VarNames[i].first;
    796         ExprAST *Init = VarNames[i].second.get();
    797 
    798 Basically it loops over all the variables, installing them one at a
    799 time. For each variable we put into the symbol table, we remember the
    800 previous value that we replace in OldBindings.
    801 
    802 .. code-block:: c++
    803 
    804         // Emit the initializer before adding the variable to scope, this prevents
    805         // the initializer from referencing the variable itself, and permits stuff
    806         // like this:
    807         //  var a = 1 in
    808         //    var a = a in ...   # refers to outer 'a'.
    809         Value *InitVal;
    810         if (Init) {
    811           InitVal = Init->codegen();
    812           if (!InitVal)
    813             return nullptr;
    814         } else { // If not specified, use 0.0.
    815           InitVal = ConstantFP::get(LLVMContext, APFloat(0.0));
    816         }
    817 
    818         AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    819         Builder.CreateStore(InitVal, Alloca);
    820 
    821         // Remember the old variable binding so that we can restore the binding when
    822         // we unrecurse.
    823         OldBindings.push_back(NamedValues[VarName]);
    824 
    825         // Remember this binding.
    826         NamedValues[VarName] = Alloca;
    827       }
    828 
    829 There are more comments here than code. The basic idea is that we emit
    830 the initializer, create the alloca, then update the symbol table to
    831 point to it. Once all the variables are installed in the symbol table,
    832 we evaluate the body of the var/in expression:
    833 
    834 .. code-block:: c++
    835 
    836       // Codegen the body, now that all vars are in scope.
    837       Value *BodyVal = Body->codegen();
    838       if (!BodyVal)
    839         return nullptr;
    840 
    841 Finally, before returning, we restore the previous variable bindings:
    842 
    843 .. code-block:: c++
    844 
    845       // Pop all our variables from scope.
    846       for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    847         NamedValues[VarNames[i].first] = OldBindings[i];
    848 
    849       // Return the body computation.
    850       return BodyVal;
    851     }
    852 
    853 The end result of all of this is that we get properly scoped variable
    854 definitions, and we even (trivially) allow mutation of them :).
    855 
    856 With this, we completed what we set out to do. Our nice iterative fib
    857 example from the intro compiles and runs just fine. The mem2reg pass
    858 optimizes all of our stack variables into SSA registers, inserting PHI
    859 nodes where needed, and our front-end remains simple: no "iterated
    860 dominance frontier" computation anywhere in sight.
    861 
    862 Full Code Listing
    863 =================
    864 
    865 Here is the complete code listing for our running example, enhanced with
    866 mutable variables and var/in support. To build this example, use:
    867 
    868 .. code-block:: bash
    869 
    870     # Compile
    871     clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
    872     # Run
    873     ./toy
    874 
    875 Here is the code:
    876 
    877 .. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
    878    :language: c++
    879 
    880 `Next: Compiling to Object Code <LangImpl08.html>`_
    881 
    882