1 ======================================================= 2 Kaleidoscope: Extending the Language: Mutable Variables 3 ======================================================= 4 5 .. contents:: 6 :local: 7 8 Chapter 7 Introduction 9 ====================== 10 11 Welcome to Chapter 7 of the "`Implementing a language with 12 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a 13 very respectable, albeit simple, `functional programming 14 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our 15 journey, we learned some parsing techniques, how to build and represent 16 an AST, how to build LLVM IR, and how to optimize the resultant code as 17 well as JIT compile it. 18 19 While Kaleidoscope is interesting as a functional language, the fact 20 that it is functional makes it "too easy" to generate LLVM IR for it. In 21 particular, a functional language makes it very easy to build LLVM IR 22 directly in `SSA 23 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 24 Since LLVM requires that the input code be in SSA form, this is a very 25 nice property and it is often unclear to newcomers how to generate code 26 for an imperative language with mutable variables. 27 28 The short (and happy) summary of this chapter is that there is no need 29 for your front-end to build SSA form: LLVM provides highly tuned and 30 well tested support for this, though the way it works is a bit 31 unexpected for some. 32 33 Why is this a hard problem? 34 =========================== 35 36 To understand why mutable variables cause complexities in SSA 37 construction, consider this extremely simple C example: 38 39 .. code-block:: c 40 41 int G, H; 42 int test(_Bool Condition) { 43 int X; 44 if (Condition) 45 X = G; 46 else 47 X = H; 48 return X; 49 } 50 51 In this case, we have the variable "X", whose value depends on the path 52 executed in the program. Because there are two different possible values 53 for X before the return instruction, a PHI node is inserted to merge the 54 two values. The LLVM IR that we want for this example looks like this: 55 56 .. code-block:: llvm 57 58 @G = weak global i32 0 ; type of @G is i32* 59 @H = weak global i32 0 ; type of @H is i32* 60 61 define i32 @test(i1 %Condition) { 62 entry: 63 br i1 %Condition, label %cond_true, label %cond_false 64 65 cond_true: 66 %X.0 = load i32* @G 67 br label %cond_next 68 69 cond_false: 70 %X.1 = load i32* @H 71 br label %cond_next 72 73 cond_next: 74 %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 75 ret i32 %X.2 76 } 77 78 In this example, the loads from the G and H global variables are 79 explicit in the LLVM IR, and they live in the then/else branches of the 80 if statement (cond\_true/cond\_false). In order to merge the incoming 81 values, the X.2 phi node in the cond\_next block selects the right value 82 to use based on where control flow is coming from: if control flow comes 83 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if 84 control flow comes from cond\_true, it gets the value of X.0. The intent 85 of this chapter is not to explain the details of SSA form. For more 86 information, see one of the many `online 87 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 88 89 The question for this article is "who places the phi nodes when lowering 90 assignments to mutable variables?". The issue here is that LLVM 91 *requires* that its IR be in SSA form: there is no "non-ssa" mode for 92 it. However, SSA construction requires non-trivial algorithms and data 93 structures, so it is inconvenient and wasteful for every front-end to 94 have to reproduce this logic. 95 96 Memory in LLVM 97 ============== 98 99 The 'trick' here is that while LLVM does require all register values to 100 be in SSA form, it does not require (or permit) memory objects to be in 101 SSA form. In the example above, note that the loads from G and H are 102 direct accesses to G and H: they are not renamed or versioned. This 103 differs from some other compiler systems, which do try to version memory 104 objects. In LLVM, instead of encoding dataflow analysis of memory into 105 the LLVM IR, it is handled with `Analysis 106 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand. 107 108 With this in mind, the high-level idea is that we want to make a stack 109 variable (which lives in memory, because it is on the stack) for each 110 mutable object in a function. To take advantage of this trick, we need 111 to talk about how LLVM represents stack variables. 112 113 In LLVM, all memory accesses are explicit with load/store instructions, 114 and it is carefully designed not to have (or need) an "address-of" 115 operator. Notice how the type of the @G/@H global variables is actually 116 "i32\*" even though the variable is defined as "i32". What this means is 117 that @G defines *space* for an i32 in the global data area, but its 118 *name* actually refers to the address for that space. Stack variables 119 work the same way, except that instead of being declared with global 120 variable definitions, they are declared with the `LLVM alloca 121 instruction <../LangRef.html#alloca-instruction>`_: 122 123 .. code-block:: llvm 124 125 define i32 @example() { 126 entry: 127 %X = alloca i32 ; type of %X is i32*. 128 ... 129 %tmp = load i32* %X ; load the stack value %X from the stack. 130 %tmp2 = add i32 %tmp, 1 ; increment it 131 store i32 %tmp2, i32* %X ; store it back 132 ... 133 134 This code shows an example of how you can declare and manipulate a stack 135 variable in the LLVM IR. Stack memory allocated with the alloca 136 instruction is fully general: you can pass the address of the stack slot 137 to functions, you can store it in other variables, etc. In our example 138 above, we could rewrite the example to use the alloca technique to avoid 139 using a PHI node: 140 141 .. code-block:: llvm 142 143 @G = weak global i32 0 ; type of @G is i32* 144 @H = weak global i32 0 ; type of @H is i32* 145 146 define i32 @test(i1 %Condition) { 147 entry: 148 %X = alloca i32 ; type of %X is i32*. 149 br i1 %Condition, label %cond_true, label %cond_false 150 151 cond_true: 152 %X.0 = load i32* @G 153 store i32 %X.0, i32* %X ; Update X 154 br label %cond_next 155 156 cond_false: 157 %X.1 = load i32* @H 158 store i32 %X.1, i32* %X ; Update X 159 br label %cond_next 160 161 cond_next: 162 %X.2 = load i32* %X ; Read X 163 ret i32 %X.2 164 } 165 166 With this, we have discovered a way to handle arbitrary mutable 167 variables without the need to create Phi nodes at all: 168 169 #. Each mutable variable becomes a stack allocation. 170 #. Each read of the variable becomes a load from the stack. 171 #. Each update of the variable becomes a store to the stack. 172 #. Taking the address of a variable just uses the stack address 173 directly. 174 175 While this solution has solved our immediate problem, it introduced 176 another one: we have now apparently introduced a lot of stack traffic 177 for very simple and common operations, a major performance problem. 178 Fortunately for us, the LLVM optimizer has a highly-tuned optimization 179 pass named "mem2reg" that handles this case, promoting allocas like this 180 into SSA registers, inserting Phi nodes as appropriate. If you run this 181 example through the pass, for example, you'll get: 182 183 .. code-block:: bash 184 185 $ llvm-as < example.ll | opt -mem2reg | llvm-dis 186 @G = weak global i32 0 187 @H = weak global i32 0 188 189 define i32 @test(i1 %Condition) { 190 entry: 191 br i1 %Condition, label %cond_true, label %cond_false 192 193 cond_true: 194 %X.0 = load i32* @G 195 br label %cond_next 196 197 cond_false: 198 %X.1 = load i32* @H 199 br label %cond_next 200 201 cond_next: 202 %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 203 ret i32 %X.01 204 } 205 206 The mem2reg pass implements the standard "iterated dominance frontier" 207 algorithm for constructing SSA form and has a number of optimizations 208 that speed up (very common) degenerate cases. The mem2reg optimization 209 pass is the answer to dealing with mutable variables, and we highly 210 recommend that you depend on it. Note that mem2reg only works on 211 variables in certain circumstances: 212 213 #. mem2reg is alloca-driven: it looks for allocas and if it can handle 214 them, it promotes them. It does not apply to global variables or heap 215 allocations. 216 #. mem2reg only looks for alloca instructions in the entry block of the 217 function. Being in the entry block guarantees that the alloca is only 218 executed once, which makes analysis simpler. 219 #. mem2reg only promotes allocas whose uses are direct loads and stores. 220 If the address of the stack object is passed to a function, or if any 221 funny pointer arithmetic is involved, the alloca will not be 222 promoted. 223 #. mem2reg only works on allocas of `first 224 class <../LangRef.html#first-class-types>`_ values (such as pointers, 225 scalars and vectors), and only if the array size of the allocation is 226 1 (or missing in the .ll file). mem2reg is not capable of promoting 227 structs or arrays to registers. Note that the "sroa" pass is 228 more powerful and can promote structs, "unions", and arrays in many 229 cases. 230 231 All of these properties are easy to satisfy for most imperative 232 languages, and we'll illustrate it below with Kaleidoscope. The final 233 question you may be asking is: should I bother with this nonsense for my 234 front-end? Wouldn't it be better if I just did SSA construction 235 directly, avoiding use of the mem2reg optimization pass? In short, we 236 strongly recommend that you use this technique for building SSA form, 237 unless there is an extremely good reason not to. Using this technique 238 is: 239 240 - Proven and well tested: clang uses this technique 241 for local mutable variables. As such, the most common clients of LLVM 242 are using this to handle a bulk of their variables. You can be sure 243 that bugs are found fast and fixed early. 244 - Extremely Fast: mem2reg has a number of special cases that make it 245 fast in common cases as well as fully general. For example, it has 246 fast-paths for variables that are only used in a single block, 247 variables that only have one assignment point, good heuristics to 248 avoid insertion of unneeded phi nodes, etc. 249 - Needed for debug info generation: `Debug information in 250 LLVM <../SourceLevelDebugging.html>`_ relies on having the address of 251 the variable exposed so that debug info can be attached to it. This 252 technique dovetails very naturally with this style of debug info. 253 254 If nothing else, this makes it much easier to get your front-end up and 255 running, and is very simple to implement. Let's extend Kaleidoscope with 256 mutable variables now! 257 258 Mutable Variables in Kaleidoscope 259 ================================= 260 261 Now that we know the sort of problem we want to tackle, let's see what 262 this looks like in the context of our little Kaleidoscope language. 263 We're going to add two features: 264 265 #. The ability to mutate variables with the '=' operator. 266 #. The ability to define new variables. 267 268 While the first item is really what this is about, we only have 269 variables for incoming arguments as well as for induction variables, and 270 redefining those only goes so far :). Also, the ability to define new 271 variables is a useful thing regardless of whether you will be mutating 272 them. Here's a motivating example that shows how we could use these: 273 274 :: 275 276 # Define ':' for sequencing: as a low-precedence operator that ignores operands 277 # and just returns the RHS. 278 def binary : 1 (x y) y; 279 280 # Recursive fib, we could do this before. 281 def fib(x) 282 if (x < 3) then 283 1 284 else 285 fib(x-1)+fib(x-2); 286 287 # Iterative fib. 288 def fibi(x) 289 var a = 1, b = 1, c in 290 (for i = 3, i < x in 291 c = a + b : 292 a = b : 293 b = c) : 294 b; 295 296 # Call it. 297 fibi(10); 298 299 In order to mutate variables, we have to change our existing variables 300 to use the "alloca trick". Once we have that, we'll add our new 301 operator, then extend Kaleidoscope to support new variable definitions. 302 303 Adjusting Existing Variables for Mutation 304 ========================================= 305 306 The symbol table in Kaleidoscope is managed at code generation time by 307 the '``NamedValues``' map. This map currently keeps track of the LLVM 308 "Value\*" that holds the double value for the named variable. In order 309 to support mutation, we need to change this slightly, so that 310 ``NamedValues`` holds the *memory location* of the variable in question. 311 Note that this change is a refactoring: it changes the structure of the 312 code, but does not (by itself) change the behavior of the compiler. All 313 of these changes are isolated in the Kaleidoscope code generator. 314 315 At this point in Kaleidoscope's development, it only supports variables 316 for two things: incoming arguments to functions and the induction 317 variable of 'for' loops. For consistency, we'll allow mutation of these 318 variables in addition to other user-defined variables. This means that 319 these will both need memory locations. 320 321 To start our transformation of Kaleidoscope, we'll change the 322 NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once 323 we do this, the C++ compiler will tell us what parts of the code we need 324 to update: 325 326 .. code-block:: c++ 327 328 static std::map<std::string, AllocaInst*> NamedValues; 329 330 Also, since we will need to create these alloca's, we'll use a helper 331 function that ensures that the allocas are created in the entry block of 332 the function: 333 334 .. code-block:: c++ 335 336 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of 337 /// the function. This is used for mutable variables etc. 338 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction, 339 const std::string &VarName) { 340 IRBuilder<> TmpB(&TheFunction->getEntryBlock(), 341 TheFunction->getEntryBlock().begin()); 342 return TmpB.CreateAlloca(Type::getDoubleTy(LLVMContext), 0, 343 VarName.c_str()); 344 } 345 346 This funny looking code creates an IRBuilder object that is pointing at 347 the first instruction (.begin()) of the entry block. It then creates an 348 alloca with the expected name and returns it. Because all values in 349 Kaleidoscope are doubles, there is no need to pass in a type to use. 350 351 With this in place, the first functionality change we want to make is to 352 variable references. In our new scheme, variables live on the stack, so 353 code generating a reference to them actually needs to produce a load 354 from the stack slot: 355 356 .. code-block:: c++ 357 358 Value *VariableExprAST::codegen() { 359 // Look this variable up in the function. 360 Value *V = NamedValues[Name]; 361 if (!V) 362 return LogErrorV("Unknown variable name"); 363 364 // Load the value. 365 return Builder.CreateLoad(V, Name.c_str()); 366 } 367 368 As you can see, this is pretty straightforward. Now we need to update 369 the things that define the variables to set up the alloca. We'll start 370 with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for 371 the unabridged code): 372 373 .. code-block:: c++ 374 375 Function *TheFunction = Builder.GetInsertBlock()->getParent(); 376 377 // Create an alloca for the variable in the entry block. 378 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName); 379 380 // Emit the start code first, without 'variable' in scope. 381 Value *StartVal = Start->codegen(); 382 if (!StartVal) 383 return nullptr; 384 385 // Store the value into the alloca. 386 Builder.CreateStore(StartVal, Alloca); 387 ... 388 389 // Compute the end condition. 390 Value *EndCond = End->codegen(); 391 if (!EndCond) 392 return nullptr; 393 394 // Reload, increment, and restore the alloca. This handles the case where 395 // the body of the loop mutates the variable. 396 Value *CurVar = Builder.CreateLoad(Alloca); 397 Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar"); 398 Builder.CreateStore(NextVar, Alloca); 399 ... 400 401 This code is virtually identical to the code `before we allowed mutable 402 variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we 403 no longer have to construct a PHI node, and we use load/store to access 404 the variable as needed. 405 406 To support mutable argument variables, we need to also make allocas for 407 them. The code for this is also pretty simple: 408 409 .. code-block:: c++ 410 411 /// CreateArgumentAllocas - Create an alloca for each argument and register the 412 /// argument in the symbol table so that references to it will succeed. 413 void PrototypeAST::CreateArgumentAllocas(Function *F) { 414 Function::arg_iterator AI = F->arg_begin(); 415 for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) { 416 // Create an alloca for this variable. 417 AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]); 418 419 // Store the initial value into the alloca. 420 Builder.CreateStore(AI, Alloca); 421 422 // Add arguments to variable symbol table. 423 NamedValues[Args[Idx]] = Alloca; 424 } 425 } 426 427 For each argument, we make an alloca, store the input value to the 428 function into the alloca, and register the alloca as the memory location 429 for the argument. This method gets invoked by ``FunctionAST::codegen()`` 430 right after it sets up the entry block for the function. 431 432 The final missing piece is adding the mem2reg pass, which allows us to 433 get good codegen once again: 434 435 .. code-block:: c++ 436 437 // Set up the optimizer pipeline. Start with registering info about how the 438 // target lays out data structures. 439 OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout())); 440 // Promote allocas to registers. 441 OurFPM.add(createPromoteMemoryToRegisterPass()); 442 // Do simple "peephole" optimizations and bit-twiddling optzns. 443 OurFPM.add(createInstructionCombiningPass()); 444 // Reassociate expressions. 445 OurFPM.add(createReassociatePass()); 446 447 It is interesting to see what the code looks like before and after the 448 mem2reg optimization runs. For example, this is the before/after code 449 for our recursive fib function. Before the optimization: 450 451 .. code-block:: llvm 452 453 define double @fib(double %x) { 454 entry: 455 %x1 = alloca double 456 store double %x, double* %x1 457 %x2 = load double* %x1 458 %cmptmp = fcmp ult double %x2, 3.000000e+00 459 %booltmp = uitofp i1 %cmptmp to double 460 %ifcond = fcmp one double %booltmp, 0.000000e+00 461 br i1 %ifcond, label %then, label %else 462 463 then: ; preds = %entry 464 br label %ifcont 465 466 else: ; preds = %entry 467 %x3 = load double* %x1 468 %subtmp = fsub double %x3, 1.000000e+00 469 %calltmp = call double @fib(double %subtmp) 470 %x4 = load double* %x1 471 %subtmp5 = fsub double %x4, 2.000000e+00 472 %calltmp6 = call double @fib(double %subtmp5) 473 %addtmp = fadd double %calltmp, %calltmp6 474 br label %ifcont 475 476 ifcont: ; preds = %else, %then 477 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 478 ret double %iftmp 479 } 480 481 Here there is only one variable (x, the input argument) but you can 482 still see the extremely simple-minded code generation strategy we are 483 using. In the entry block, an alloca is created, and the initial input 484 value is stored into it. Each reference to the variable does a reload 485 from the stack. Also, note that we didn't modify the if/then/else 486 expression, so it still inserts a PHI node. While we could make an 487 alloca for it, it is actually easier to create a PHI node for it, so we 488 still just make the PHI. 489 490 Here is the code after the mem2reg pass runs: 491 492 .. code-block:: llvm 493 494 define double @fib(double %x) { 495 entry: 496 %cmptmp = fcmp ult double %x, 3.000000e+00 497 %booltmp = uitofp i1 %cmptmp to double 498 %ifcond = fcmp one double %booltmp, 0.000000e+00 499 br i1 %ifcond, label %then, label %else 500 501 then: 502 br label %ifcont 503 504 else: 505 %subtmp = fsub double %x, 1.000000e+00 506 %calltmp = call double @fib(double %subtmp) 507 %subtmp5 = fsub double %x, 2.000000e+00 508 %calltmp6 = call double @fib(double %subtmp5) 509 %addtmp = fadd double %calltmp, %calltmp6 510 br label %ifcont 511 512 ifcont: ; preds = %else, %then 513 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 514 ret double %iftmp 515 } 516 517 This is a trivial case for mem2reg, since there are no redefinitions of 518 the variable. The point of showing this is to calm your tension about 519 inserting such blatent inefficiencies :). 520 521 After the rest of the optimizers run, we get: 522 523 .. code-block:: llvm 524 525 define double @fib(double %x) { 526 entry: 527 %cmptmp = fcmp ult double %x, 3.000000e+00 528 %booltmp = uitofp i1 %cmptmp to double 529 %ifcond = fcmp ueq double %booltmp, 0.000000e+00 530 br i1 %ifcond, label %else, label %ifcont 531 532 else: 533 %subtmp = fsub double %x, 1.000000e+00 534 %calltmp = call double @fib(double %subtmp) 535 %subtmp5 = fsub double %x, 2.000000e+00 536 %calltmp6 = call double @fib(double %subtmp5) 537 %addtmp = fadd double %calltmp, %calltmp6 538 ret double %addtmp 539 540 ifcont: 541 ret double 1.000000e+00 542 } 543 544 Here we see that the simplifycfg pass decided to clone the return 545 instruction into the end of the 'else' block. This allowed it to 546 eliminate some branches and the PHI node. 547 548 Now that all symbol table references are updated to use stack variables, 549 we'll add the assignment operator. 550 551 New Assignment Operator 552 ======================= 553 554 With our current framework, adding a new assignment operator is really 555 simple. We will parse it just like any other binary operator, but handle 556 it internally (instead of allowing the user to define it). The first 557 step is to set a precedence: 558 559 .. code-block:: c++ 560 561 int main() { 562 // Install standard binary operators. 563 // 1 is lowest precedence. 564 BinopPrecedence['='] = 2; 565 BinopPrecedence['<'] = 10; 566 BinopPrecedence['+'] = 20; 567 BinopPrecedence['-'] = 20; 568 569 Now that the parser knows the precedence of the binary operator, it 570 takes care of all the parsing and AST generation. We just need to 571 implement codegen for the assignment operator. This looks like: 572 573 .. code-block:: c++ 574 575 Value *BinaryExprAST::codegen() { 576 // Special case '=' because we don't want to emit the LHS as an expression. 577 if (Op == '=') { 578 // Assignment requires the LHS to be an identifier. 579 VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get()); 580 if (!LHSE) 581 return LogErrorV("destination of '=' must be a variable"); 582 583 Unlike the rest of the binary operators, our assignment operator doesn't 584 follow the "emit LHS, emit RHS, do computation" model. As such, it is 585 handled as a special case before the other binary operators are handled. 586 The other strange thing is that it requires the LHS to be a variable. It 587 is invalid to have "(x+1) = expr" - only things like "x = expr" are 588 allowed. 589 590 .. code-block:: c++ 591 592 // Codegen the RHS. 593 Value *Val = RHS->codegen(); 594 if (!Val) 595 return nullptr; 596 597 // Look up the name. 598 Value *Variable = NamedValues[LHSE->getName()]; 599 if (!Variable) 600 return LogErrorV("Unknown variable name"); 601 602 Builder.CreateStore(Val, Variable); 603 return Val; 604 } 605 ... 606 607 Once we have the variable, codegen'ing the assignment is 608 straightforward: we emit the RHS of the assignment, create a store, and 609 return the computed value. Returning a value allows for chained 610 assignments like "X = (Y = Z)". 611 612 Now that we have an assignment operator, we can mutate loop variables 613 and arguments. For example, we can now run code like this: 614 615 :: 616 617 # Function to print a double. 618 extern printd(x); 619 620 # Define ':' for sequencing: as a low-precedence operator that ignores operands 621 # and just returns the RHS. 622 def binary : 1 (x y) y; 623 624 def test(x) 625 printd(x) : 626 x = 4 : 627 printd(x); 628 629 test(123); 630 631 When run, this example prints "123" and then "4", showing that we did 632 actually mutate the value! Okay, we have now officially implemented our 633 goal: getting this to work requires SSA construction in the general 634 case. However, to be really useful, we want the ability to define our 635 own local variables, let's add this next! 636 637 User-defined Local Variables 638 ============================ 639 640 Adding var/in is just like any other extension we made to 641 Kaleidoscope: we extend the lexer, the parser, the AST and the code 642 generator. The first step for adding our new 'var/in' construct is to 643 extend the lexer. As before, this is pretty trivial, the code looks like 644 this: 645 646 .. code-block:: c++ 647 648 enum Token { 649 ... 650 // var definition 651 tok_var = -13 652 ... 653 } 654 ... 655 static int gettok() { 656 ... 657 if (IdentifierStr == "in") 658 return tok_in; 659 if (IdentifierStr == "binary") 660 return tok_binary; 661 if (IdentifierStr == "unary") 662 return tok_unary; 663 if (IdentifierStr == "var") 664 return tok_var; 665 return tok_identifier; 666 ... 667 668 The next step is to define the AST node that we will construct. For 669 var/in, it looks like this: 670 671 .. code-block:: c++ 672 673 /// VarExprAST - Expression class for var/in 674 class VarExprAST : public ExprAST { 675 std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames; 676 std::unique_ptr<ExprAST> Body; 677 678 public: 679 VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames, 680 std::unique_ptr<ExprAST> body) 681 : VarNames(std::move(VarNames)), Body(std::move(Body)) {} 682 683 virtual Value *codegen(); 684 }; 685 686 var/in allows a list of names to be defined all at once, and each name 687 can optionally have an initializer value. As such, we capture this 688 information in the VarNames vector. Also, var/in has a body, this body 689 is allowed to access the variables defined by the var/in. 690 691 With this in place, we can define the parser pieces. The first thing we 692 do is add it as a primary expression: 693 694 .. code-block:: c++ 695 696 /// primary 697 /// ::= identifierexpr 698 /// ::= numberexpr 699 /// ::= parenexpr 700 /// ::= ifexpr 701 /// ::= forexpr 702 /// ::= varexpr 703 static std::unique_ptr<ExprAST> ParsePrimary() { 704 switch (CurTok) { 705 default: 706 return LogError("unknown token when expecting an expression"); 707 case tok_identifier: 708 return ParseIdentifierExpr(); 709 case tok_number: 710 return ParseNumberExpr(); 711 case '(': 712 return ParseParenExpr(); 713 case tok_if: 714 return ParseIfExpr(); 715 case tok_for: 716 return ParseForExpr(); 717 case tok_var: 718 return ParseVarExpr(); 719 } 720 } 721 722 Next we define ParseVarExpr: 723 724 .. code-block:: c++ 725 726 /// varexpr ::= 'var' identifier ('=' expression)? 727 // (',' identifier ('=' expression)?)* 'in' expression 728 static std::unique_ptr<ExprAST> ParseVarExpr() { 729 getNextToken(); // eat the var. 730 731 std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames; 732 733 // At least one variable name is required. 734 if (CurTok != tok_identifier) 735 return LogError("expected identifier after var"); 736 737 The first part of this code parses the list of identifier/expr pairs 738 into the local ``VarNames`` vector. 739 740 .. code-block:: c++ 741 742 while (1) { 743 std::string Name = IdentifierStr; 744 getNextToken(); // eat identifier. 745 746 // Read the optional initializer. 747 std::unique_ptr<ExprAST> Init; 748 if (CurTok == '=') { 749 getNextToken(); // eat the '='. 750 751 Init = ParseExpression(); 752 if (!Init) return nullptr; 753 } 754 755 VarNames.push_back(std::make_pair(Name, std::move(Init))); 756 757 // End of var list, exit loop. 758 if (CurTok != ',') break; 759 getNextToken(); // eat the ','. 760 761 if (CurTok != tok_identifier) 762 return LogError("expected identifier list after var"); 763 } 764 765 Once all the variables are parsed, we then parse the body and create the 766 AST node: 767 768 .. code-block:: c++ 769 770 // At this point, we have to have 'in'. 771 if (CurTok != tok_in) 772 return LogError("expected 'in' keyword after 'var'"); 773 getNextToken(); // eat 'in'. 774 775 auto Body = ParseExpression(); 776 if (!Body) 777 return nullptr; 778 779 return llvm::make_unique<VarExprAST>(std::move(VarNames), 780 std::move(Body)); 781 } 782 783 Now that we can parse and represent the code, we need to support 784 emission of LLVM IR for it. This code starts out with: 785 786 .. code-block:: c++ 787 788 Value *VarExprAST::codegen() { 789 std::vector<AllocaInst *> OldBindings; 790 791 Function *TheFunction = Builder.GetInsertBlock()->getParent(); 792 793 // Register all variables and emit their initializer. 794 for (unsigned i = 0, e = VarNames.size(); i != e; ++i) { 795 const std::string &VarName = VarNames[i].first; 796 ExprAST *Init = VarNames[i].second.get(); 797 798 Basically it loops over all the variables, installing them one at a 799 time. For each variable we put into the symbol table, we remember the 800 previous value that we replace in OldBindings. 801 802 .. code-block:: c++ 803 804 // Emit the initializer before adding the variable to scope, this prevents 805 // the initializer from referencing the variable itself, and permits stuff 806 // like this: 807 // var a = 1 in 808 // var a = a in ... # refers to outer 'a'. 809 Value *InitVal; 810 if (Init) { 811 InitVal = Init->codegen(); 812 if (!InitVal) 813 return nullptr; 814 } else { // If not specified, use 0.0. 815 InitVal = ConstantFP::get(LLVMContext, APFloat(0.0)); 816 } 817 818 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName); 819 Builder.CreateStore(InitVal, Alloca); 820 821 // Remember the old variable binding so that we can restore the binding when 822 // we unrecurse. 823 OldBindings.push_back(NamedValues[VarName]); 824 825 // Remember this binding. 826 NamedValues[VarName] = Alloca; 827 } 828 829 There are more comments here than code. The basic idea is that we emit 830 the initializer, create the alloca, then update the symbol table to 831 point to it. Once all the variables are installed in the symbol table, 832 we evaluate the body of the var/in expression: 833 834 .. code-block:: c++ 835 836 // Codegen the body, now that all vars are in scope. 837 Value *BodyVal = Body->codegen(); 838 if (!BodyVal) 839 return nullptr; 840 841 Finally, before returning, we restore the previous variable bindings: 842 843 .. code-block:: c++ 844 845 // Pop all our variables from scope. 846 for (unsigned i = 0, e = VarNames.size(); i != e; ++i) 847 NamedValues[VarNames[i].first] = OldBindings[i]; 848 849 // Return the body computation. 850 return BodyVal; 851 } 852 853 The end result of all of this is that we get properly scoped variable 854 definitions, and we even (trivially) allow mutation of them :). 855 856 With this, we completed what we set out to do. Our nice iterative fib 857 example from the intro compiles and runs just fine. The mem2reg pass 858 optimizes all of our stack variables into SSA registers, inserting PHI 859 nodes where needed, and our front-end remains simple: no "iterated 860 dominance frontier" computation anywhere in sight. 861 862 Full Code Listing 863 ================= 864 865 Here is the complete code listing for our running example, enhanced with 866 mutable variables and var/in support. To build this example, use: 867 868 .. code-block:: bash 869 870 # Compile 871 clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy 872 # Run 873 ./toy 874 875 Here is the code: 876 877 .. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp 878 :language: c++ 879 880 `Next: Compiling to Object Code <LangImpl08.html>`_ 881 882