1 ======================================================= 2 Kaleidoscope: Extending the Language: Mutable Variables 3 ======================================================= 4 5 .. contents:: 6 :local: 7 8 Chapter 7 Introduction 9 ====================== 10 11 Welcome to Chapter 7 of the "`Implementing a language with 12 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a 13 very respectable, albeit simple, `functional programming 14 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our 15 journey, we learned some parsing techniques, how to build and represent 16 an AST, how to build LLVM IR, and how to optimize the resultant code as 17 well as JIT compile it. 18 19 While Kaleidoscope is interesting as a functional language, the fact 20 that it is functional makes it "too easy" to generate LLVM IR for it. In 21 particular, a functional language makes it very easy to build LLVM IR 22 directly in `SSA 23 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 24 Since LLVM requires that the input code be in SSA form, this is a very 25 nice property and it is often unclear to newcomers how to generate code 26 for an imperative language with mutable variables. 27 28 The short (and happy) summary of this chapter is that there is no need 29 for your front-end to build SSA form: LLVM provides highly tuned and 30 well tested support for this, though the way it works is a bit 31 unexpected for some. 32 33 Why is this a hard problem? 34 =========================== 35 36 To understand why mutable variables cause complexities in SSA 37 construction, consider this extremely simple C example: 38 39 .. code-block:: c 40 41 int G, H; 42 int test(_Bool Condition) { 43 int X; 44 if (Condition) 45 X = G; 46 else 47 X = H; 48 return X; 49 } 50 51 In this case, we have the variable "X", whose value depends on the path 52 executed in the program. Because there are two different possible values 53 for X before the return instruction, a PHI node is inserted to merge the 54 two values. The LLVM IR that we want for this example looks like this: 55 56 .. code-block:: llvm 57 58 @G = weak global i32 0 ; type of @G is i32* 59 @H = weak global i32 0 ; type of @H is i32* 60 61 define i32 @test(i1 %Condition) { 62 entry: 63 br i1 %Condition, label %cond_true, label %cond_false 64 65 cond_true: 66 %X.0 = load i32* @G 67 br label %cond_next 68 69 cond_false: 70 %X.1 = load i32* @H 71 br label %cond_next 72 73 cond_next: 74 %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 75 ret i32 %X.2 76 } 77 78 In this example, the loads from the G and H global variables are 79 explicit in the LLVM IR, and they live in the then/else branches of the 80 if statement (cond\_true/cond\_false). In order to merge the incoming 81 values, the X.2 phi node in the cond\_next block selects the right value 82 to use based on where control flow is coming from: if control flow comes 83 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if 84 control flow comes from cond\_true, it gets the value of X.0. The intent 85 of this chapter is not to explain the details of SSA form. For more 86 information, see one of the many `online 87 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 88 89 The question for this article is "who places the phi nodes when lowering 90 assignments to mutable variables?". The issue here is that LLVM 91 *requires* that its IR be in SSA form: there is no "non-ssa" mode for 92 it. However, SSA construction requires non-trivial algorithms and data 93 structures, so it is inconvenient and wasteful for every front-end to 94 have to reproduce this logic. 95 96 Memory in LLVM 97 ============== 98 99 The 'trick' here is that while LLVM does require all register values to 100 be in SSA form, it does not require (or permit) memory objects to be in 101 SSA form. In the example above, note that the loads from G and H are 102 direct accesses to G and H: they are not renamed or versioned. This 103 differs from some other compiler systems, which do try to version memory 104 objects. In LLVM, instead of encoding dataflow analysis of memory into 105 the LLVM IR, it is handled with `Analysis 106 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand. 107 108 With this in mind, the high-level idea is that we want to make a stack 109 variable (which lives in memory, because it is on the stack) for each 110 mutable object in a function. To take advantage of this trick, we need 111 to talk about how LLVM represents stack variables. 112 113 In LLVM, all memory accesses are explicit with load/store instructions, 114 and it is carefully designed not to have (or need) an "address-of" 115 operator. Notice how the type of the @G/@H global variables is actually 116 "i32\*" even though the variable is defined as "i32". What this means is 117 that @G defines *space* for an i32 in the global data area, but its 118 *name* actually refers to the address for that space. Stack variables 119 work the same way, except that instead of being declared with global 120 variable definitions, they are declared with the `LLVM alloca 121 instruction <../LangRef.html#i_alloca>`_: 122 123 .. code-block:: llvm 124 125 define i32 @example() { 126 entry: 127 %X = alloca i32 ; type of %X is i32*. 128 ... 129 %tmp = load i32* %X ; load the stack value %X from the stack. 130 %tmp2 = add i32 %tmp, 1 ; increment it 131 store i32 %tmp2, i32* %X ; store it back 132 ... 133 134 This code shows an example of how you can declare and manipulate a stack 135 variable in the LLVM IR. Stack memory allocated with the alloca 136 instruction is fully general: you can pass the address of the stack slot 137 to functions, you can store it in other variables, etc. In our example 138 above, we could rewrite the example to use the alloca technique to avoid 139 using a PHI node: 140 141 .. code-block:: llvm 142 143 @G = weak global i32 0 ; type of @G is i32* 144 @H = weak global i32 0 ; type of @H is i32* 145 146 define i32 @test(i1 %Condition) { 147 entry: 148 %X = alloca i32 ; type of %X is i32*. 149 br i1 %Condition, label %cond_true, label %cond_false 150 151 cond_true: 152 %X.0 = load i32* @G 153 store i32 %X.0, i32* %X ; Update X 154 br label %cond_next 155 156 cond_false: 157 %X.1 = load i32* @H 158 store i32 %X.1, i32* %X ; Update X 159 br label %cond_next 160 161 cond_next: 162 %X.2 = load i32* %X ; Read X 163 ret i32 %X.2 164 } 165 166 With this, we have discovered a way to handle arbitrary mutable 167 variables without the need to create Phi nodes at all: 168 169 #. Each mutable variable becomes a stack allocation. 170 #. Each read of the variable becomes a load from the stack. 171 #. Each update of the variable becomes a store to the stack. 172 #. Taking the address of a variable just uses the stack address 173 directly. 174 175 While this solution has solved our immediate problem, it introduced 176 another one: we have now apparently introduced a lot of stack traffic 177 for very simple and common operations, a major performance problem. 178 Fortunately for us, the LLVM optimizer has a highly-tuned optimization 179 pass named "mem2reg" that handles this case, promoting allocas like this 180 into SSA registers, inserting Phi nodes as appropriate. If you run this 181 example through the pass, for example, you'll get: 182 183 .. code-block:: bash 184 185 $ llvm-as < example.ll | opt -mem2reg | llvm-dis 186 @G = weak global i32 0 187 @H = weak global i32 0 188 189 define i32 @test(i1 %Condition) { 190 entry: 191 br i1 %Condition, label %cond_true, label %cond_false 192 193 cond_true: 194 %X.0 = load i32* @G 195 br label %cond_next 196 197 cond_false: 198 %X.1 = load i32* @H 199 br label %cond_next 200 201 cond_next: 202 %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 203 ret i32 %X.01 204 } 205 206 The mem2reg pass implements the standard "iterated dominance frontier" 207 algorithm for constructing SSA form and has a number of optimizations 208 that speed up (very common) degenerate cases. The mem2reg optimization 209 pass is the answer to dealing with mutable variables, and we highly 210 recommend that you depend on it. Note that mem2reg only works on 211 variables in certain circumstances: 212 213 #. mem2reg is alloca-driven: it looks for allocas and if it can handle 214 them, it promotes them. It does not apply to global variables or heap 215 allocations. 216 #. mem2reg only looks for alloca instructions in the entry block of the 217 function. Being in the entry block guarantees that the alloca is only 218 executed once, which makes analysis simpler. 219 #. mem2reg only promotes allocas whose uses are direct loads and stores. 220 If the address of the stack object is passed to a function, or if any 221 funny pointer arithmetic is involved, the alloca will not be 222 promoted. 223 #. mem2reg only works on allocas of `first 224 class <../LangRef.html#t_classifications>`_ values (such as pointers, 225 scalars and vectors), and only if the array size of the allocation is 226 1 (or missing in the .ll file). mem2reg is not capable of promoting 227 structs or arrays to registers. Note that the "scalarrepl" pass is 228 more powerful and can promote structs, "unions", and arrays in many 229 cases. 230 231 All of these properties are easy to satisfy for most imperative 232 languages, and we'll illustrate it below with Kaleidoscope. The final 233 question you may be asking is: should I bother with this nonsense for my 234 front-end? Wouldn't it be better if I just did SSA construction 235 directly, avoiding use of the mem2reg optimization pass? In short, we 236 strongly recommend that you use this technique for building SSA form, 237 unless there is an extremely good reason not to. Using this technique 238 is: 239 240 - Proven and well tested: llvm-gcc and clang both use this technique 241 for local mutable variables. As such, the most common clients of LLVM 242 are using this to handle a bulk of their variables. You can be sure 243 that bugs are found fast and fixed early. 244 - Extremely Fast: mem2reg has a number of special cases that make it 245 fast in common cases as well as fully general. For example, it has 246 fast-paths for variables that are only used in a single block, 247 variables that only have one assignment point, good heuristics to 248 avoid insertion of unneeded phi nodes, etc. 249 - Needed for debug info generation: `Debug information in 250 LLVM <../SourceLevelDebugging.html>`_ relies on having the address of 251 the variable exposed so that debug info can be attached to it. This 252 technique dovetails very naturally with this style of debug info. 253 254 If nothing else, this makes it much easier to get your front-end up and 255 running, and is very simple to implement. Lets extend Kaleidoscope with 256 mutable variables now! 257 258 Mutable Variables in Kaleidoscope 259 ================================= 260 261 Now that we know the sort of problem we want to tackle, lets see what 262 this looks like in the context of our little Kaleidoscope language. 263 We're going to add two features: 264 265 #. The ability to mutate variables with the '=' operator. 266 #. The ability to define new variables. 267 268 While the first item is really what this is about, we only have 269 variables for incoming arguments as well as for induction variables, and 270 redefining those only goes so far :). Also, the ability to define new 271 variables is a useful thing regardless of whether you will be mutating 272 them. Here's a motivating example that shows how we could use these: 273 274 :: 275 276 # Define ':' for sequencing: as a low-precedence operator that ignores operands 277 # and just returns the RHS. 278 def binary : 1 (x y) y; 279 280 # Recursive fib, we could do this before. 281 def fib(x) 282 if (x < 3) then 283 1 284 else 285 fib(x-1)+fib(x-2); 286 287 # Iterative fib. 288 def fibi(x) 289 var a = 1, b = 1, c in 290 (for i = 3, i < x in 291 c = a + b : 292 a = b : 293 b = c) : 294 b; 295 296 # Call it. 297 fibi(10); 298 299 In order to mutate variables, we have to change our existing variables 300 to use the "alloca trick". Once we have that, we'll add our new 301 operator, then extend Kaleidoscope to support new variable definitions. 302 303 Adjusting Existing Variables for Mutation 304 ========================================= 305 306 The symbol table in Kaleidoscope is managed at code generation time by 307 the '``named_values``' map. This map currently keeps track of the LLVM 308 "Value\*" that holds the double value for the named variable. In order 309 to support mutation, we need to change this slightly, so that it 310 ``named_values`` holds the *memory location* of the variable in 311 question. Note that this change is a refactoring: it changes the 312 structure of the code, but does not (by itself) change the behavior of 313 the compiler. All of these changes are isolated in the Kaleidoscope code 314 generator. 315 316 At this point in Kaleidoscope's development, it only supports variables 317 for two things: incoming arguments to functions and the induction 318 variable of 'for' loops. For consistency, we'll allow mutation of these 319 variables in addition to other user-defined variables. This means that 320 these will both need memory locations. 321 322 To start our transformation of Kaleidoscope, we'll change the 323 ``named_values`` map so that it maps to AllocaInst\* instead of Value\*. 324 Once we do this, the C++ compiler will tell us what parts of the code we 325 need to update: 326 327 **Note:** the ocaml bindings currently model both ``Value*``'s and 328 ``AllocInst*``'s as ``Llvm.llvalue``'s, but this may change in the future 329 to be more type safe. 330 331 .. code-block:: ocaml 332 333 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10 334 335 Also, since we will need to create these alloca's, we'll use a helper 336 function that ensures that the allocas are created in the entry block of 337 the function: 338 339 .. code-block:: ocaml 340 341 (* Create an alloca instruction in the entry block of the function. This 342 * is used for mutable variables etc. *) 343 let create_entry_block_alloca the_function var_name = 344 let builder = builder_at (instr_begin (entry_block the_function)) in 345 build_alloca double_type var_name builder 346 347 This funny looking code creates an ``Llvm.llbuilder`` object that is 348 pointing at the first instruction of the entry block. It then creates an 349 alloca with the expected name and returns it. Because all values in 350 Kaleidoscope are doubles, there is no need to pass in a type to use. 351 352 With this in place, the first functionality change we want to make is to 353 variable references. In our new scheme, variables live on the stack, so 354 code generating a reference to them actually needs to produce a load 355 from the stack slot: 356 357 .. code-block:: ocaml 358 359 let rec codegen_expr = function 360 ... 361 | Ast.Variable name -> 362 let v = try Hashtbl.find named_values name with 363 | Not_found -> raise (Error "unknown variable name") 364 in 365 (* Load the value. *) 366 build_load v name builder 367 368 As you can see, this is pretty straightforward. Now we need to update 369 the things that define the variables to set up the alloca. We'll start 370 with ``codegen_expr Ast.For ...`` (see the `full code listing <#code>`_ 371 for the unabridged code): 372 373 .. code-block:: ocaml 374 375 | Ast.For (var_name, start, end_, step, body) -> 376 let the_function = block_parent (insertion_block builder) in 377 378 (* Create an alloca for the variable in the entry block. *) 379 let alloca = create_entry_block_alloca the_function var_name in 380 381 (* Emit the start code first, without 'variable' in scope. *) 382 let start_val = codegen_expr start in 383 384 (* Store the value into the alloca. *) 385 ignore(build_store start_val alloca builder); 386 387 ... 388 389 (* Within the loop, the variable is defined equal to the PHI node. If it 390 * shadows an existing variable, we have to restore it, so save it 391 * now. *) 392 let old_val = 393 try Some (Hashtbl.find named_values var_name) with Not_found -> None 394 in 395 Hashtbl.add named_values var_name alloca; 396 397 ... 398 399 (* Compute the end condition. *) 400 let end_cond = codegen_expr end_ in 401 402 (* Reload, increment, and restore the alloca. This handles the case where 403 * the body of the loop mutates the variable. *) 404 let cur_var = build_load alloca var_name builder in 405 let next_var = build_add cur_var step_val "nextvar" builder in 406 ignore(build_store next_var alloca builder); 407 ... 408 409 This code is virtually identical to the code `before we allowed mutable 410 variables <OCamlLangImpl5.html#forcodegen>`_. The big difference is that 411 we no longer have to construct a PHI node, and we use load/store to 412 access the variable as needed. 413 414 To support mutable argument variables, we need to also make allocas for 415 them. The code for this is also pretty simple: 416 417 .. code-block:: ocaml 418 419 (* Create an alloca for each argument and register the argument in the symbol 420 * table so that references to it will succeed. *) 421 let create_argument_allocas the_function proto = 422 let args = match proto with 423 | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args 424 in 425 Array.iteri (fun i ai -> 426 let var_name = args.(i) in 427 (* Create an alloca for this variable. *) 428 let alloca = create_entry_block_alloca the_function var_name in 429 430 (* Store the initial value into the alloca. *) 431 ignore(build_store ai alloca builder); 432 433 (* Add arguments to variable symbol table. *) 434 Hashtbl.add named_values var_name alloca; 435 ) (params the_function) 436 437 For each argument, we make an alloca, store the input value to the 438 function into the alloca, and register the alloca as the memory location 439 for the argument. This method gets invoked by ``Codegen.codegen_func`` 440 right after it sets up the entry block for the function. 441 442 The final missing piece is adding the mem2reg pass, which allows us to 443 get good codegen once again: 444 445 .. code-block:: ocaml 446 447 let main () = 448 ... 449 let the_fpm = PassManager.create_function Codegen.the_module in 450 451 (* Set up the optimizer pipeline. Start with registering info about how the 452 * target lays out data structures. *) 453 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm; 454 455 (* Promote allocas to registers. *) 456 add_memory_to_register_promotion the_fpm; 457 458 (* Do simple "peephole" optimizations and bit-twiddling optzn. *) 459 add_instruction_combining the_fpm; 460 461 (* reassociate expressions. *) 462 add_reassociation the_fpm; 463 464 It is interesting to see what the code looks like before and after the 465 mem2reg optimization runs. For example, this is the before/after code 466 for our recursive fib function. Before the optimization: 467 468 .. code-block:: llvm 469 470 define double @fib(double %x) { 471 entry: 472 %x1 = alloca double 473 store double %x, double* %x1 474 %x2 = load double* %x1 475 %cmptmp = fcmp ult double %x2, 3.000000e+00 476 %booltmp = uitofp i1 %cmptmp to double 477 %ifcond = fcmp one double %booltmp, 0.000000e+00 478 br i1 %ifcond, label %then, label %else 479 480 then: ; preds = %entry 481 br label %ifcont 482 483 else: ; preds = %entry 484 %x3 = load double* %x1 485 %subtmp = fsub double %x3, 1.000000e+00 486 %calltmp = call double @fib(double %subtmp) 487 %x4 = load double* %x1 488 %subtmp5 = fsub double %x4, 2.000000e+00 489 %calltmp6 = call double @fib(double %subtmp5) 490 %addtmp = fadd double %calltmp, %calltmp6 491 br label %ifcont 492 493 ifcont: ; preds = %else, %then 494 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 495 ret double %iftmp 496 } 497 498 Here there is only one variable (x, the input argument) but you can 499 still see the extremely simple-minded code generation strategy we are 500 using. In the entry block, an alloca is created, and the initial input 501 value is stored into it. Each reference to the variable does a reload 502 from the stack. Also, note that we didn't modify the if/then/else 503 expression, so it still inserts a PHI node. While we could make an 504 alloca for it, it is actually easier to create a PHI node for it, so we 505 still just make the PHI. 506 507 Here is the code after the mem2reg pass runs: 508 509 .. code-block:: llvm 510 511 define double @fib(double %x) { 512 entry: 513 %cmptmp = fcmp ult double %x, 3.000000e+00 514 %booltmp = uitofp i1 %cmptmp to double 515 %ifcond = fcmp one double %booltmp, 0.000000e+00 516 br i1 %ifcond, label %then, label %else 517 518 then: 519 br label %ifcont 520 521 else: 522 %subtmp = fsub double %x, 1.000000e+00 523 %calltmp = call double @fib(double %subtmp) 524 %subtmp5 = fsub double %x, 2.000000e+00 525 %calltmp6 = call double @fib(double %subtmp5) 526 %addtmp = fadd double %calltmp, %calltmp6 527 br label %ifcont 528 529 ifcont: ; preds = %else, %then 530 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 531 ret double %iftmp 532 } 533 534 This is a trivial case for mem2reg, since there are no redefinitions of 535 the variable. The point of showing this is to calm your tension about 536 inserting such blatent inefficiencies :). 537 538 After the rest of the optimizers run, we get: 539 540 .. code-block:: llvm 541 542 define double @fib(double %x) { 543 entry: 544 %cmptmp = fcmp ult double %x, 3.000000e+00 545 %booltmp = uitofp i1 %cmptmp to double 546 %ifcond = fcmp ueq double %booltmp, 0.000000e+00 547 br i1 %ifcond, label %else, label %ifcont 548 549 else: 550 %subtmp = fsub double %x, 1.000000e+00 551 %calltmp = call double @fib(double %subtmp) 552 %subtmp5 = fsub double %x, 2.000000e+00 553 %calltmp6 = call double @fib(double %subtmp5) 554 %addtmp = fadd double %calltmp, %calltmp6 555 ret double %addtmp 556 557 ifcont: 558 ret double 1.000000e+00 559 } 560 561 Here we see that the simplifycfg pass decided to clone the return 562 instruction into the end of the 'else' block. This allowed it to 563 eliminate some branches and the PHI node. 564 565 Now that all symbol table references are updated to use stack variables, 566 we'll add the assignment operator. 567 568 New Assignment Operator 569 ======================= 570 571 With our current framework, adding a new assignment operator is really 572 simple. We will parse it just like any other binary operator, but handle 573 it internally (instead of allowing the user to define it). The first 574 step is to set a precedence: 575 576 .. code-block:: ocaml 577 578 let main () = 579 (* Install standard binary operators. 580 * 1 is the lowest precedence. *) 581 Hashtbl.add Parser.binop_precedence '=' 2; 582 Hashtbl.add Parser.binop_precedence '<' 10; 583 Hashtbl.add Parser.binop_precedence '+' 20; 584 Hashtbl.add Parser.binop_precedence '-' 20; 585 ... 586 587 Now that the parser knows the precedence of the binary operator, it 588 takes care of all the parsing and AST generation. We just need to 589 implement codegen for the assignment operator. This looks like: 590 591 .. code-block:: ocaml 592 593 let rec codegen_expr = function 594 begin match op with 595 | '=' -> 596 (* Special case '=' because we don't want to emit the LHS as an 597 * expression. *) 598 let name = 599 match lhs with 600 | Ast.Variable name -> name 601 | _ -> raise (Error "destination of '=' must be a variable") 602 in 603 604 Unlike the rest of the binary operators, our assignment operator doesn't 605 follow the "emit LHS, emit RHS, do computation" model. As such, it is 606 handled as a special case before the other binary operators are handled. 607 The other strange thing is that it requires the LHS to be a variable. It 608 is invalid to have "(x+1) = expr" - only things like "x = expr" are 609 allowed. 610 611 .. code-block:: ocaml 612 613 (* Codegen the rhs. *) 614 let val_ = codegen_expr rhs in 615 616 (* Lookup the name. *) 617 let variable = try Hashtbl.find named_values name with 618 | Not_found -> raise (Error "unknown variable name") 619 in 620 ignore(build_store val_ variable builder); 621 val_ 622 | _ -> 623 ... 624 625 Once we have the variable, codegen'ing the assignment is 626 straightforward: we emit the RHS of the assignment, create a store, and 627 return the computed value. Returning a value allows for chained 628 assignments like "X = (Y = Z)". 629 630 Now that we have an assignment operator, we can mutate loop variables 631 and arguments. For example, we can now run code like this: 632 633 :: 634 635 # Function to print a double. 636 extern printd(x); 637 638 # Define ':' for sequencing: as a low-precedence operator that ignores operands 639 # and just returns the RHS. 640 def binary : 1 (x y) y; 641 642 def test(x) 643 printd(x) : 644 x = 4 : 645 printd(x); 646 647 test(123); 648 649 When run, this example prints "123" and then "4", showing that we did 650 actually mutate the value! Okay, we have now officially implemented our 651 goal: getting this to work requires SSA construction in the general 652 case. However, to be really useful, we want the ability to define our 653 own local variables, lets add this next! 654 655 User-defined Local Variables 656 ============================ 657 658 Adding var/in is just like any other other extensions we made to 659 Kaleidoscope: we extend the lexer, the parser, the AST and the code 660 generator. The first step for adding our new 'var/in' construct is to 661 extend the lexer. As before, this is pretty trivial, the code looks like 662 this: 663 664 .. code-block:: ocaml 665 666 type token = 667 ... 668 (* var definition *) 669 | Var 670 671 ... 672 673 and lex_ident buffer = parser 674 ... 675 | "in" -> [< 'Token.In; stream >] 676 | "binary" -> [< 'Token.Binary; stream >] 677 | "unary" -> [< 'Token.Unary; stream >] 678 | "var" -> [< 'Token.Var; stream >] 679 ... 680 681 The next step is to define the AST node that we will construct. For 682 var/in, it looks like this: 683 684 .. code-block:: ocaml 685 686 type expr = 687 ... 688 (* variant for var/in. *) 689 | Var of (string * expr option) array * expr 690 ... 691 692 var/in allows a list of names to be defined all at once, and each name 693 can optionally have an initializer value. As such, we capture this 694 information in the VarNames vector. Also, var/in has a body, this body 695 is allowed to access the variables defined by the var/in. 696 697 With this in place, we can define the parser pieces. The first thing we 698 do is add it as a primary expression: 699 700 .. code-block:: ocaml 701 702 (* primary 703 * ::= identifier 704 * ::= numberexpr 705 * ::= parenexpr 706 * ::= ifexpr 707 * ::= forexpr 708 * ::= varexpr *) 709 let rec parse_primary = parser 710 ... 711 (* varexpr 712 * ::= 'var' identifier ('=' expression? 713 * (',' identifier ('=' expression)?)* 'in' expression *) 714 | [< 'Token.Var; 715 (* At least one variable name is required. *) 716 'Token.Ident id ?? "expected identifier after var"; 717 init=parse_var_init; 718 var_names=parse_var_names [(id, init)]; 719 (* At this point, we have to have 'in'. *) 720 'Token.In ?? "expected 'in' keyword after 'var'"; 721 body=parse_expr >] -> 722 Ast.Var (Array.of_list (List.rev var_names), body) 723 724 ... 725 726 and parse_var_init = parser 727 (* read in the optional initializer. *) 728 | [< 'Token.Kwd '='; e=parse_expr >] -> Some e 729 | [< >] -> None 730 731 and parse_var_names accumulator = parser 732 | [< 'Token.Kwd ','; 733 'Token.Ident id ?? "expected identifier list after var"; 734 init=parse_var_init; 735 e=parse_var_names ((id, init) :: accumulator) >] -> e 736 | [< >] -> accumulator 737 738 Now that we can parse and represent the code, we need to support 739 emission of LLVM IR for it. This code starts out with: 740 741 .. code-block:: ocaml 742 743 let rec codegen_expr = function 744 ... 745 | Ast.Var (var_names, body) 746 let old_bindings = ref [] in 747 748 let the_function = block_parent (insertion_block builder) in 749 750 (* Register all variables and emit their initializer. *) 751 Array.iter (fun (var_name, init) -> 752 753 Basically it loops over all the variables, installing them one at a 754 time. For each variable we put into the symbol table, we remember the 755 previous value that we replace in OldBindings. 756 757 .. code-block:: ocaml 758 759 (* Emit the initializer before adding the variable to scope, this 760 * prevents the initializer from referencing the variable itself, and 761 * permits stuff like this: 762 * var a = 1 in 763 * var a = a in ... # refers to outer 'a'. *) 764 let init_val = 765 match init with 766 | Some init -> codegen_expr init 767 (* If not specified, use 0.0. *) 768 | None -> const_float double_type 0.0 769 in 770 771 let alloca = create_entry_block_alloca the_function var_name in 772 ignore(build_store init_val alloca builder); 773 774 (* Remember the old variable binding so that we can restore the binding 775 * when we unrecurse. *) 776 777 begin 778 try 779 let old_value = Hashtbl.find named_values var_name in 780 old_bindings := (var_name, old_value) :: !old_bindings; 781 with Not_found > () 782 end; 783 784 (* Remember this binding. *) 785 Hashtbl.add named_values var_name alloca; 786 ) var_names; 787 788 There are more comments here than code. The basic idea is that we emit 789 the initializer, create the alloca, then update the symbol table to 790 point to it. Once all the variables are installed in the symbol table, 791 we evaluate the body of the var/in expression: 792 793 .. code-block:: ocaml 794 795 (* Codegen the body, now that all vars are in scope. *) 796 let body_val = codegen_expr body in 797 798 Finally, before returning, we restore the previous variable bindings: 799 800 .. code-block:: ocaml 801 802 (* Pop all our variables from scope. *) 803 List.iter (fun (var_name, old_value) -> 804 Hashtbl.add named_values var_name old_value 805 ) !old_bindings; 806 807 (* Return the body computation. *) 808 body_val 809 810 The end result of all of this is that we get properly scoped variable 811 definitions, and we even (trivially) allow mutation of them :). 812 813 With this, we completed what we set out to do. Our nice iterative fib 814 example from the intro compiles and runs just fine. The mem2reg pass 815 optimizes all of our stack variables into SSA registers, inserting PHI 816 nodes where needed, and our front-end remains simple: no "iterated 817 dominance frontier" computation anywhere in sight. 818 819 Full Code Listing 820 ================= 821 822 Here is the complete code listing for our running example, enhanced with 823 mutable variables and var/in support. To build this example, use: 824 825 .. code-block:: bash 826 827 # Compile 828 ocamlbuild toy.byte 829 # Run 830 ./toy.byte 831 832 Here is the code: 833 834 \_tags: 835 :: 836 837 <{lexer,parser}.ml>: use_camlp4, pp(camlp4of) 838 <*.{byte,native}>: g++, use_llvm, use_llvm_analysis 839 <*.{byte,native}>: use_llvm_executionengine, use_llvm_target 840 <*.{byte,native}>: use_llvm_scalar_opts, use_bindings 841 842 myocamlbuild.ml: 843 .. code-block:: ocaml 844 845 open Ocamlbuild_plugin;; 846 847 ocaml_lib ~extern:true "llvm";; 848 ocaml_lib ~extern:true "llvm_analysis";; 849 ocaml_lib ~extern:true "llvm_executionengine";; 850 ocaml_lib ~extern:true "llvm_target";; 851 ocaml_lib ~extern:true "llvm_scalar_opts";; 852 853 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);; 854 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];; 855 856 token.ml: 857 .. code-block:: ocaml 858 859 (*===----------------------------------------------------------------------=== 860 * Lexer Tokens 861 *===----------------------------------------------------------------------===*) 862 863 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of 864 * these others for known things. *) 865 type token = 866 (* commands *) 867 | Def | Extern 868 869 (* primary *) 870 | Ident of string | Number of float 871 872 (* unknown *) 873 | Kwd of char 874 875 (* control *) 876 | If | Then | Else 877 | For | In 878 879 (* operators *) 880 | Binary | Unary 881 882 (* var definition *) 883 | Var 884 885 lexer.ml: 886 .. code-block:: ocaml 887 888 (*===----------------------------------------------------------------------=== 889 * Lexer 890 *===----------------------------------------------------------------------===*) 891 892 let rec lex = parser 893 (* Skip any whitespace. *) 894 | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream 895 896 (* identifier: [a-zA-Z][a-zA-Z0-9] *) 897 | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] -> 898 let buffer = Buffer.create 1 in 899 Buffer.add_char buffer c; 900 lex_ident buffer stream 901 902 (* number: [0-9.]+ *) 903 | [< ' ('0' .. '9' as c); stream >] -> 904 let buffer = Buffer.create 1 in 905 Buffer.add_char buffer c; 906 lex_number buffer stream 907 908 (* Comment until end of line. *) 909 | [< ' ('#'); stream >] -> 910 lex_comment stream 911 912 (* Otherwise, just return the character as its ascii value. *) 913 | [< 'c; stream >] -> 914 [< 'Token.Kwd c; lex stream >] 915 916 (* end of stream. *) 917 | [< >] -> [< >] 918 919 and lex_number buffer = parser 920 | [< ' ('0' .. '9' | '.' as c); stream >] -> 921 Buffer.add_char buffer c; 922 lex_number buffer stream 923 | [< stream=lex >] -> 924 [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >] 925 926 and lex_ident buffer = parser 927 | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] -> 928 Buffer.add_char buffer c; 929 lex_ident buffer stream 930 | [< stream=lex >] -> 931 match Buffer.contents buffer with 932 | "def" -> [< 'Token.Def; stream >] 933 | "extern" -> [< 'Token.Extern; stream >] 934 | "if" -> [< 'Token.If; stream >] 935 | "then" -> [< 'Token.Then; stream >] 936 | "else" -> [< 'Token.Else; stream >] 937 | "for" -> [< 'Token.For; stream >] 938 | "in" -> [< 'Token.In; stream >] 939 | "binary" -> [< 'Token.Binary; stream >] 940 | "unary" -> [< 'Token.Unary; stream >] 941 | "var" -> [< 'Token.Var; stream >] 942 | id -> [< 'Token.Ident id; stream >] 943 944 and lex_comment = parser 945 | [< ' ('\n'); stream=lex >] -> stream 946 | [< 'c; e=lex_comment >] -> e 947 | [< >] -> [< >] 948 949 ast.ml: 950 .. code-block:: ocaml 951 952 (*===----------------------------------------------------------------------=== 953 * Abstract Syntax Tree (aka Parse Tree) 954 *===----------------------------------------------------------------------===*) 955 956 (* expr - Base type for all expression nodes. *) 957 type expr = 958 (* variant for numeric literals like "1.0". *) 959 | Number of float 960 961 (* variant for referencing a variable, like "a". *) 962 | Variable of string 963 964 (* variant for a unary operator. *) 965 | Unary of char * expr 966 967 (* variant for a binary operator. *) 968 | Binary of char * expr * expr 969 970 (* variant for function calls. *) 971 | Call of string * expr array 972 973 (* variant for if/then/else. *) 974 | If of expr * expr * expr 975 976 (* variant for for/in. *) 977 | For of string * expr * expr * expr option * expr 978 979 (* variant for var/in. *) 980 | Var of (string * expr option) array * expr 981 982 (* proto - This type represents the "prototype" for a function, which captures 983 * its name, and its argument names (thus implicitly the number of arguments the 984 * function takes). *) 985 type proto = 986 | Prototype of string * string array 987 | BinOpPrototype of string * string array * int 988 989 (* func - This type represents a function definition itself. *) 990 type func = Function of proto * expr 991 992 parser.ml: 993 .. code-block:: ocaml 994 995 (*===---------------------------------------------------------------------=== 996 * Parser 997 *===---------------------------------------------------------------------===*) 998 999 (* binop_precedence - This holds the precedence for each binary operator that is 1000 * defined *) 1001 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 1002 1003 (* precedence - Get the precedence of the pending binary operator token. *) 1004 let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 1005 1006 (* primary 1007 * ::= identifier 1008 * ::= numberexpr 1009 * ::= parenexpr 1010 * ::= ifexpr 1011 * ::= forexpr 1012 * ::= varexpr *) 1013 let rec parse_primary = parser 1014 (* numberexpr ::= number *) 1015 | [< 'Token.Number n >] -> Ast.Number n 1016 1017 (* parenexpr ::= '(' expression ')' *) 1018 | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e 1019 1020 (* identifierexpr 1021 * ::= identifier 1022 * ::= identifier '(' argumentexpr ')' *) 1023 | [< 'Token.Ident id; stream >] -> 1024 let rec parse_args accumulator = parser 1025 | [< e=parse_expr; stream >] -> 1026 begin parser 1027 | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e 1028 | [< >] -> e :: accumulator 1029 end stream 1030 | [< >] -> accumulator 1031 in 1032 let rec parse_ident id = parser 1033 (* Call. *) 1034 | [< 'Token.Kwd '('; 1035 args=parse_args []; 1036 'Token.Kwd ')' ?? "expected ')'">] -> 1037 Ast.Call (id, Array.of_list (List.rev args)) 1038 1039 (* Simple variable ref. *) 1040 | [< >] -> Ast.Variable id 1041 in 1042 parse_ident id stream 1043 1044 (* ifexpr ::= 'if' expr 'then' expr 'else' expr *) 1045 | [< 'Token.If; c=parse_expr; 1046 'Token.Then ?? "expected 'then'"; t=parse_expr; 1047 'Token.Else ?? "expected 'else'"; e=parse_expr >] -> 1048 Ast.If (c, t, e) 1049 1050 (* forexpr 1051 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *) 1052 | [< 'Token.For; 1053 'Token.Ident id ?? "expected identifier after for"; 1054 'Token.Kwd '=' ?? "expected '=' after for"; 1055 stream >] -> 1056 begin parser 1057 | [< 1058 start=parse_expr; 1059 'Token.Kwd ',' ?? "expected ',' after for"; 1060 end_=parse_expr; 1061 stream >] -> 1062 let step = 1063 begin parser 1064 | [< 'Token.Kwd ','; step=parse_expr >] -> Some step 1065 | [< >] -> None 1066 end stream 1067 in 1068 begin parser 1069 | [< 'Token.In; body=parse_expr >] -> 1070 Ast.For (id, start, end_, step, body) 1071 | [< >] -> 1072 raise (Stream.Error "expected 'in' after for") 1073 end stream 1074 | [< >] -> 1075 raise (Stream.Error "expected '=' after for") 1076 end stream 1077 1078 (* varexpr 1079 * ::= 'var' identifier ('=' expression? 1080 * (',' identifier ('=' expression)?)* 'in' expression *) 1081 | [< 'Token.Var; 1082 (* At least one variable name is required. *) 1083 'Token.Ident id ?? "expected identifier after var"; 1084 init=parse_var_init; 1085 var_names=parse_var_names [(id, init)]; 1086 (* At this point, we have to have 'in'. *) 1087 'Token.In ?? "expected 'in' keyword after 'var'"; 1088 body=parse_expr >] -> 1089 Ast.Var (Array.of_list (List.rev var_names), body) 1090 1091 | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") 1092 1093 (* unary 1094 * ::= primary 1095 * ::= '!' unary *) 1096 and parse_unary = parser 1097 (* If this is a unary operator, read it. *) 1098 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] -> 1099 Ast.Unary (op, operand) 1100 1101 (* If the current token is not an operator, it must be a primary expr. *) 1102 | [< stream >] -> parse_primary stream 1103 1104 (* binoprhs 1105 * ::= ('+' primary)* *) 1106 and parse_bin_rhs expr_prec lhs stream = 1107 match Stream.peek stream with 1108 (* If this is a binop, find its precedence. *) 1109 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> 1110 let token_prec = precedence c in 1111 1112 (* If this is a binop that binds at least as tightly as the current binop, 1113 * consume it, otherwise we are done. *) 1114 if token_prec < expr_prec then lhs else begin 1115 (* Eat the binop. *) 1116 Stream.junk stream; 1117 1118 (* Parse the primary expression after the binary operator. *) 1119 let rhs = parse_unary stream in 1120 1121 (* Okay, we know this is a binop. *) 1122 let rhs = 1123 match Stream.peek stream with 1124 | Some (Token.Kwd c2) -> 1125 (* If BinOp binds less tightly with rhs than the operator after 1126 * rhs, let the pending operator take rhs as its lhs. *) 1127 let next_prec = precedence c2 in 1128 if token_prec < next_prec 1129 then parse_bin_rhs (token_prec + 1) rhs stream 1130 else rhs 1131 | _ -> rhs 1132 in 1133 1134 (* Merge lhs/rhs. *) 1135 let lhs = Ast.Binary (c, lhs, rhs) in 1136 parse_bin_rhs expr_prec lhs stream 1137 end 1138 | _ -> lhs 1139 1140 and parse_var_init = parser 1141 (* read in the optional initializer. *) 1142 | [< 'Token.Kwd '='; e=parse_expr >] -> Some e 1143 | [< >] -> None 1144 1145 and parse_var_names accumulator = parser 1146 | [< 'Token.Kwd ','; 1147 'Token.Ident id ?? "expected identifier list after var"; 1148 init=parse_var_init; 1149 e=parse_var_names ((id, init) :: accumulator) >] -> e 1150 | [< >] -> accumulator 1151 1152 (* expression 1153 * ::= primary binoprhs *) 1154 and parse_expr = parser 1155 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream 1156 1157 (* prototype 1158 * ::= id '(' id* ')' 1159 * ::= binary LETTER number? (id, id) 1160 * ::= unary LETTER number? (id) *) 1161 let parse_prototype = 1162 let rec parse_args accumulator = parser 1163 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 1164 | [< >] -> accumulator 1165 in 1166 let parse_operator = parser 1167 | [< 'Token.Unary >] -> "unary", 1 1168 | [< 'Token.Binary >] -> "binary", 2 1169 in 1170 let parse_binary_precedence = parser 1171 | [< 'Token.Number n >] -> int_of_float n 1172 | [< >] -> 30 1173 in 1174 parser 1175 | [< 'Token.Ident id; 1176 'Token.Kwd '(' ?? "expected '(' in prototype"; 1177 args=parse_args []; 1178 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1179 (* success. *) 1180 Ast.Prototype (id, Array.of_list (List.rev args)) 1181 | [< (prefix, kind)=parse_operator; 1182 'Token.Kwd op ?? "expected an operator"; 1183 (* Read the precedence if present. *) 1184 binary_precedence=parse_binary_precedence; 1185 'Token.Kwd '(' ?? "expected '(' in prototype"; 1186 args=parse_args []; 1187 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1188 let name = prefix ^ (String.make 1 op) in 1189 let args = Array.of_list (List.rev args) in 1190 1191 (* Verify right number of arguments for operator. *) 1192 if Array.length args != kind 1193 then raise (Stream.Error "invalid number of operands for operator") 1194 else 1195 if kind == 1 then 1196 Ast.Prototype (name, args) 1197 else 1198 Ast.BinOpPrototype (name, args, binary_precedence) 1199 | [< >] -> 1200 raise (Stream.Error "expected function name in prototype") 1201 1202 (* definition ::= 'def' prototype expression *) 1203 let parse_definition = parser 1204 | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> 1205 Ast.Function (p, e) 1206 1207 (* toplevelexpr ::= expression *) 1208 let parse_toplevel = parser 1209 | [< e=parse_expr >] -> 1210 (* Make an anonymous proto. *) 1211 Ast.Function (Ast.Prototype ("", [||]), e) 1212 1213 (* external ::= 'extern' prototype *) 1214 let parse_extern = parser 1215 | [< 'Token.Extern; e=parse_prototype >] -> e 1216 1217 codegen.ml: 1218 .. code-block:: ocaml 1219 1220 (*===----------------------------------------------------------------------=== 1221 * Code Generation 1222 *===----------------------------------------------------------------------===*) 1223 1224 open Llvm 1225 1226 exception Error of string 1227 1228 let context = global_context () 1229 let the_module = create_module context "my cool jit" 1230 let builder = builder context 1231 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10 1232 let double_type = double_type context 1233 1234 (* Create an alloca instruction in the entry block of the function. This 1235 * is used for mutable variables etc. *) 1236 let create_entry_block_alloca the_function var_name = 1237 let builder = builder_at context (instr_begin (entry_block the_function)) in 1238 build_alloca double_type var_name builder 1239 1240 let rec codegen_expr = function 1241 | Ast.Number n -> const_float double_type n 1242 | Ast.Variable name -> 1243 let v = try Hashtbl.find named_values name with 1244 | Not_found -> raise (Error "unknown variable name") 1245 in 1246 (* Load the value. *) 1247 build_load v name builder 1248 | Ast.Unary (op, operand) -> 1249 let operand = codegen_expr operand in 1250 let callee = "unary" ^ (String.make 1 op) in 1251 let callee = 1252 match lookup_function callee the_module with 1253 | Some callee -> callee 1254 | None -> raise (Error "unknown unary operator") 1255 in 1256 build_call callee [|operand|] "unop" builder 1257 | Ast.Binary (op, lhs, rhs) -> 1258 begin match op with 1259 | '=' -> 1260 (* Special case '=' because we don't want to emit the LHS as an 1261 * expression. *) 1262 let name = 1263 match lhs with 1264 | Ast.Variable name -> name 1265 | _ -> raise (Error "destination of '=' must be a variable") 1266 in 1267 1268 (* Codegen the rhs. *) 1269 let val_ = codegen_expr rhs in 1270 1271 (* Lookup the name. *) 1272 let variable = try Hashtbl.find named_values name with 1273 | Not_found -> raise (Error "unknown variable name") 1274 in 1275 ignore(build_store val_ variable builder); 1276 val_ 1277 | _ -> 1278 let lhs_val = codegen_expr lhs in 1279 let rhs_val = codegen_expr rhs in 1280 begin 1281 match op with 1282 | '+' -> build_add lhs_val rhs_val "addtmp" builder 1283 | '-' -> build_sub lhs_val rhs_val "subtmp" builder 1284 | '*' -> build_mul lhs_val rhs_val "multmp" builder 1285 | '<' -> 1286 (* Convert bool 0/1 to double 0.0 or 1.0 *) 1287 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in 1288 build_uitofp i double_type "booltmp" builder 1289 | _ -> 1290 (* If it wasn't a builtin binary operator, it must be a user defined 1291 * one. Emit a call to it. *) 1292 let callee = "binary" ^ (String.make 1 op) in 1293 let callee = 1294 match lookup_function callee the_module with 1295 | Some callee -> callee 1296 | None -> raise (Error "binary operator not found!") 1297 in 1298 build_call callee [|lhs_val; rhs_val|] "binop" builder 1299 end 1300 end 1301 | Ast.Call (callee, args) -> 1302 (* Look up the name in the module table. *) 1303 let callee = 1304 match lookup_function callee the_module with 1305 | Some callee -> callee 1306 | None -> raise (Error "unknown function referenced") 1307 in 1308 let params = params callee in 1309 1310 (* If argument mismatch error. *) 1311 if Array.length params == Array.length args then () else 1312 raise (Error "incorrect # arguments passed"); 1313 let args = Array.map codegen_expr args in 1314 build_call callee args "calltmp" builder 1315 | Ast.If (cond, then_, else_) -> 1316 let cond = codegen_expr cond in 1317 1318 (* Convert condition to a bool by comparing equal to 0.0 *) 1319 let zero = const_float double_type 0.0 in 1320 let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in 1321 1322 (* Grab the first block so that we might later add the conditional branch 1323 * to it at the end of the function. *) 1324 let start_bb = insertion_block builder in 1325 let the_function = block_parent start_bb in 1326 1327 let then_bb = append_block context "then" the_function in 1328 1329 (* Emit 'then' value. *) 1330 position_at_end then_bb builder; 1331 let then_val = codegen_expr then_ in 1332 1333 (* Codegen of 'then' can change the current block, update then_bb for the 1334 * phi. We create a new name because one is used for the phi node, and the 1335 * other is used for the conditional branch. *) 1336 let new_then_bb = insertion_block builder in 1337 1338 (* Emit 'else' value. *) 1339 let else_bb = append_block context "else" the_function in 1340 position_at_end else_bb builder; 1341 let else_val = codegen_expr else_ in 1342 1343 (* Codegen of 'else' can change the current block, update else_bb for the 1344 * phi. *) 1345 let new_else_bb = insertion_block builder in 1346 1347 (* Emit merge block. *) 1348 let merge_bb = append_block context "ifcont" the_function in 1349 position_at_end merge_bb builder; 1350 let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in 1351 let phi = build_phi incoming "iftmp" builder in 1352 1353 (* Return to the start block to add the conditional branch. *) 1354 position_at_end start_bb builder; 1355 ignore (build_cond_br cond_val then_bb else_bb builder); 1356 1357 (* Set a unconditional branch at the end of the 'then' block and the 1358 * 'else' block to the 'merge' block. *) 1359 position_at_end new_then_bb builder; ignore (build_br merge_bb builder); 1360 position_at_end new_else_bb builder; ignore (build_br merge_bb builder); 1361 1362 (* Finally, set the builder to the end of the merge block. *) 1363 position_at_end merge_bb builder; 1364 1365 phi 1366 | Ast.For (var_name, start, end_, step, body) -> 1367 (* Output this as: 1368 * var = alloca double 1369 * ... 1370 * start = startexpr 1371 * store start -> var 1372 * goto loop 1373 * loop: 1374 * ... 1375 * bodyexpr 1376 * ... 1377 * loopend: 1378 * step = stepexpr 1379 * endcond = endexpr 1380 * 1381 * curvar = load var 1382 * nextvar = curvar + step 1383 * store nextvar -> var 1384 * br endcond, loop, endloop 1385 * outloop: *) 1386 1387 let the_function = block_parent (insertion_block builder) in 1388 1389 (* Create an alloca for the variable in the entry block. *) 1390 let alloca = create_entry_block_alloca the_function var_name in 1391 1392 (* Emit the start code first, without 'variable' in scope. *) 1393 let start_val = codegen_expr start in 1394 1395 (* Store the value into the alloca. *) 1396 ignore(build_store start_val alloca builder); 1397 1398 (* Make the new basic block for the loop header, inserting after current 1399 * block. *) 1400 let loop_bb = append_block context "loop" the_function in 1401 1402 (* Insert an explicit fall through from the current block to the 1403 * loop_bb. *) 1404 ignore (build_br loop_bb builder); 1405 1406 (* Start insertion in loop_bb. *) 1407 position_at_end loop_bb builder; 1408 1409 (* Within the loop, the variable is defined equal to the PHI node. If it 1410 * shadows an existing variable, we have to restore it, so save it 1411 * now. *) 1412 let old_val = 1413 try Some (Hashtbl.find named_values var_name) with Not_found -> None 1414 in 1415 Hashtbl.add named_values var_name alloca; 1416 1417 (* Emit the body of the loop. This, like any other expr, can change the 1418 * current BB. Note that we ignore the value computed by the body, but 1419 * don't allow an error *) 1420 ignore (codegen_expr body); 1421 1422 (* Emit the step value. *) 1423 let step_val = 1424 match step with 1425 | Some step -> codegen_expr step 1426 (* If not specified, use 1.0. *) 1427 | None -> const_float double_type 1.0 1428 in 1429 1430 (* Compute the end condition. *) 1431 let end_cond = codegen_expr end_ in 1432 1433 (* Reload, increment, and restore the alloca. This handles the case where 1434 * the body of the loop mutates the variable. *) 1435 let cur_var = build_load alloca var_name builder in 1436 let next_var = build_add cur_var step_val "nextvar" builder in 1437 ignore(build_store next_var alloca builder); 1438 1439 (* Convert condition to a bool by comparing equal to 0.0. *) 1440 let zero = const_float double_type 0.0 in 1441 let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in 1442 1443 (* Create the "after loop" block and insert it. *) 1444 let after_bb = append_block context "afterloop" the_function in 1445 1446 (* Insert the conditional branch into the end of loop_end_bb. *) 1447 ignore (build_cond_br end_cond loop_bb after_bb builder); 1448 1449 (* Any new code will be inserted in after_bb. *) 1450 position_at_end after_bb builder; 1451 1452 (* Restore the unshadowed variable. *) 1453 begin match old_val with 1454 | Some old_val -> Hashtbl.add named_values var_name old_val 1455 | None -> () 1456 end; 1457 1458 (* for expr always returns 0.0. *) 1459 const_null double_type 1460 | Ast.Var (var_names, body) -> 1461 let old_bindings = ref [] in 1462 1463 let the_function = block_parent (insertion_block builder) in 1464 1465 (* Register all variables and emit their initializer. *) 1466 Array.iter (fun (var_name, init) -> 1467 (* Emit the initializer before adding the variable to scope, this 1468 * prevents the initializer from referencing the variable itself, and 1469 * permits stuff like this: 1470 * var a = 1 in 1471 * var a = a in ... # refers to outer 'a'. *) 1472 let init_val = 1473 match init with 1474 | Some init -> codegen_expr init 1475 (* If not specified, use 0.0. *) 1476 | None -> const_float double_type 0.0 1477 in 1478 1479 let alloca = create_entry_block_alloca the_function var_name in 1480 ignore(build_store init_val alloca builder); 1481 1482 (* Remember the old variable binding so that we can restore the binding 1483 * when we unrecurse. *) 1484 begin 1485 try 1486 let old_value = Hashtbl.find named_values var_name in 1487 old_bindings := (var_name, old_value) :: !old_bindings; 1488 with Not_found -> () 1489 end; 1490 1491 (* Remember this binding. *) 1492 Hashtbl.add named_values var_name alloca; 1493 ) var_names; 1494 1495 (* Codegen the body, now that all vars are in scope. *) 1496 let body_val = codegen_expr body in 1497 1498 (* Pop all our variables from scope. *) 1499 List.iter (fun (var_name, old_value) -> 1500 Hashtbl.add named_values var_name old_value 1501 ) !old_bindings; 1502 1503 (* Return the body computation. *) 1504 body_val 1505 1506 let codegen_proto = function 1507 | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) -> 1508 (* Make the function type: double(double,double) etc. *) 1509 let doubles = Array.make (Array.length args) double_type in 1510 let ft = function_type double_type doubles in 1511 let f = 1512 match lookup_function name the_module with 1513 | None -> declare_function name ft the_module 1514 1515 (* If 'f' conflicted, there was already something named 'name'. If it 1516 * has a body, don't allow redefinition or reextern. *) 1517 | Some f -> 1518 (* If 'f' already has a body, reject this. *) 1519 if block_begin f <> At_end f then 1520 raise (Error "redefinition of function"); 1521 1522 (* If 'f' took a different number of arguments, reject. *) 1523 if element_type (type_of f) <> ft then 1524 raise (Error "redefinition of function with different # args"); 1525 f 1526 in 1527 1528 (* Set names for all arguments. *) 1529 Array.iteri (fun i a -> 1530 let n = args.(i) in 1531 set_value_name n a; 1532 Hashtbl.add named_values n a; 1533 ) (params f); 1534 f 1535 1536 (* Create an alloca for each argument and register the argument in the symbol 1537 * table so that references to it will succeed. *) 1538 let create_argument_allocas the_function proto = 1539 let args = match proto with 1540 | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args 1541 in 1542 Array.iteri (fun i ai -> 1543 let var_name = args.(i) in 1544 (* Create an alloca for this variable. *) 1545 let alloca = create_entry_block_alloca the_function var_name in 1546 1547 (* Store the initial value into the alloca. *) 1548 ignore(build_store ai alloca builder); 1549 1550 (* Add arguments to variable symbol table. *) 1551 Hashtbl.add named_values var_name alloca; 1552 ) (params the_function) 1553 1554 let codegen_func the_fpm = function 1555 | Ast.Function (proto, body) -> 1556 Hashtbl.clear named_values; 1557 let the_function = codegen_proto proto in 1558 1559 (* If this is an operator, install it. *) 1560 begin match proto with 1561 | Ast.BinOpPrototype (name, args, prec) -> 1562 let op = name.[String.length name - 1] in 1563 Hashtbl.add Parser.binop_precedence op prec; 1564 | _ -> () 1565 end; 1566 1567 (* Create a new basic block to start insertion into. *) 1568 let bb = append_block context "entry" the_function in 1569 position_at_end bb builder; 1570 1571 try 1572 (* Add all arguments to the symbol table and create their allocas. *) 1573 create_argument_allocas the_function proto; 1574 1575 let ret_val = codegen_expr body in 1576 1577 (* Finish off the function. *) 1578 let _ = build_ret ret_val builder in 1579 1580 (* Validate the generated code, checking for consistency. *) 1581 Llvm_analysis.assert_valid_function the_function; 1582 1583 (* Optimize the function. *) 1584 let _ = PassManager.run_function the_function the_fpm in 1585 1586 the_function 1587 with e -> 1588 delete_function the_function; 1589 raise e 1590 1591 toplevel.ml: 1592 .. code-block:: ocaml 1593 1594 (*===----------------------------------------------------------------------=== 1595 * Top-Level parsing and JIT Driver 1596 *===----------------------------------------------------------------------===*) 1597 1598 open Llvm 1599 open Llvm_executionengine 1600 1601 (* top ::= definition | external | expression | ';' *) 1602 let rec main_loop the_fpm the_execution_engine stream = 1603 match Stream.peek stream with 1604 | None -> () 1605 1606 (* ignore top-level semicolons. *) 1607 | Some (Token.Kwd ';') -> 1608 Stream.junk stream; 1609 main_loop the_fpm the_execution_engine stream 1610 1611 | Some token -> 1612 begin 1613 try match token with 1614 | Token.Def -> 1615 let e = Parser.parse_definition stream in 1616 print_endline "parsed a function definition."; 1617 dump_value (Codegen.codegen_func the_fpm e); 1618 | Token.Extern -> 1619 let e = Parser.parse_extern stream in 1620 print_endline "parsed an extern."; 1621 dump_value (Codegen.codegen_proto e); 1622 | _ -> 1623 (* Evaluate a top-level expression into an anonymous function. *) 1624 let e = Parser.parse_toplevel stream in 1625 print_endline "parsed a top-level expr"; 1626 let the_function = Codegen.codegen_func the_fpm e in 1627 dump_value the_function; 1628 1629 (* JIT the function, returning a function pointer. *) 1630 let result = ExecutionEngine.run_function the_function [||] 1631 the_execution_engine in 1632 1633 print_string "Evaluated to "; 1634 print_float (GenericValue.as_float Codegen.double_type result); 1635 print_newline (); 1636 with Stream.Error s | Codegen.Error s -> 1637 (* Skip token for error recovery. *) 1638 Stream.junk stream; 1639 print_endline s; 1640 end; 1641 print_string "ready> "; flush stdout; 1642 main_loop the_fpm the_execution_engine stream 1643 1644 toy.ml: 1645 .. code-block:: ocaml 1646 1647 (*===----------------------------------------------------------------------=== 1648 * Main driver code. 1649 *===----------------------------------------------------------------------===*) 1650 1651 open Llvm 1652 open Llvm_executionengine 1653 open Llvm_target 1654 open Llvm_scalar_opts 1655 1656 let main () = 1657 ignore (initialize_native_target ()); 1658 1659 (* Install standard binary operators. 1660 * 1 is the lowest precedence. *) 1661 Hashtbl.add Parser.binop_precedence '=' 2; 1662 Hashtbl.add Parser.binop_precedence '<' 10; 1663 Hashtbl.add Parser.binop_precedence '+' 20; 1664 Hashtbl.add Parser.binop_precedence '-' 20; 1665 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) 1666 1667 (* Prime the first token. *) 1668 print_string "ready> "; flush stdout; 1669 let stream = Lexer.lex (Stream.of_channel stdin) in 1670 1671 (* Create the JIT. *) 1672 let the_execution_engine = ExecutionEngine.create Codegen.the_module in 1673 let the_fpm = PassManager.create_function Codegen.the_module in 1674 1675 (* Set up the optimizer pipeline. Start with registering info about how the 1676 * target lays out data structures. *) 1677 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm; 1678 1679 (* Promote allocas to registers. *) 1680 add_memory_to_register_promotion the_fpm; 1681 1682 (* Do simple "peephole" optimizations and bit-twiddling optzn. *) 1683 add_instruction_combination the_fpm; 1684 1685 (* reassociate expressions. *) 1686 add_reassociation the_fpm; 1687 1688 (* Eliminate Common SubExpressions. *) 1689 add_gvn the_fpm; 1690 1691 (* Simplify the control flow graph (deleting unreachable blocks, etc). *) 1692 add_cfg_simplification the_fpm; 1693 1694 ignore (PassManager.initialize the_fpm); 1695 1696 (* Run the main "interpreter loop" now. *) 1697 Toplevel.main_loop the_fpm the_execution_engine stream; 1698 1699 (* Print out all the generated code. *) 1700 dump_module Codegen.the_module 1701 ;; 1702 1703 main () 1704 1705 bindings.c 1706 .. code-block:: c 1707 1708 #include <stdio.h> 1709 1710 /* putchard - putchar that takes a double and returns 0. */ 1711 extern double putchard(double X) { 1712 putchar((char)X); 1713 return 0; 1714 } 1715 1716 /* printd - printf that takes a double prints it as "%f\n", returning 0. */ 1717 extern double printd(double X) { 1718 printf("%f\n", X); 1719 return 0; 1720 } 1721 1722 `Next: Conclusion and other useful LLVM tidbits <OCamlLangImpl8.html>`_ 1723 1724