1 ============================================================ 2 Kaleidoscope: Extending the Language: User-defined Operators 3 ============================================================ 4 5 .. contents:: 6 :local: 7 8 Chapter 6 Introduction 9 ====================== 10 11 Welcome to Chapter 6 of the "`Implementing a language with 12 LLVM <index.html>`_" tutorial. At this point in our tutorial, we now 13 have a fully functional language that is fairly minimal, but also 14 useful. There is still one big problem with it, however. Our language 15 doesn't have many useful operators (like division, logical negation, or 16 even any comparisons besides less-than). 17 18 This chapter of the tutorial takes a wild digression into adding 19 user-defined operators to the simple and beautiful Kaleidoscope 20 language. This digression now gives us a simple and ugly language in 21 some ways, but also a powerful one at the same time. One of the great 22 things about creating your own language is that you get to decide what 23 is good or bad. In this tutorial we'll assume that it is okay to use 24 this as a way to show some interesting parsing techniques. 25 26 At the end of this tutorial, we'll run through an example Kaleidoscope 27 application that `renders the Mandelbrot set <#kicking-the-tires>`_. This gives an 28 example of what you can build with Kaleidoscope and its feature set. 29 30 User-defined Operators: the Idea 31 ================================ 32 33 The "operator overloading" that we will add to Kaleidoscope is more 34 general than languages like C++. In C++, you are only allowed to 35 redefine existing operators: you can't programmatically change the 36 grammar, introduce new operators, change precedence levels, etc. In this 37 chapter, we will add this capability to Kaleidoscope, which will let the 38 user round out the set of operators that are supported. 39 40 The point of going into user-defined operators in a tutorial like this 41 is to show the power and flexibility of using a hand-written parser. 42 Thus far, the parser we have been implementing uses recursive descent 43 for most parts of the grammar and operator precedence parsing for the 44 expressions. See `Chapter 2 <OCamlLangImpl2.html>`_ for details. Without 45 using operator precedence parsing, it would be very difficult to allow 46 the programmer to introduce new operators into the grammar: the grammar 47 is dynamically extensible as the JIT runs. 48 49 The two specific features we'll add are programmable unary operators 50 (right now, Kaleidoscope has no unary operators at all) as well as 51 binary operators. An example of this is: 52 53 :: 54 55 # Logical unary not. 56 def unary!(v) 57 if v then 58 0 59 else 60 1; 61 62 # Define > with the same precedence as <. 63 def binary> 10 (LHS RHS) 64 RHS < LHS; 65 66 # Binary "logical or", (note that it does not "short circuit") 67 def binary| 5 (LHS RHS) 68 if LHS then 69 1 70 else if RHS then 71 1 72 else 73 0; 74 75 # Define = with slightly lower precedence than relationals. 76 def binary= 9 (LHS RHS) 77 !(LHS < RHS | LHS > RHS); 78 79 Many languages aspire to being able to implement their standard runtime 80 library in the language itself. In Kaleidoscope, we can implement 81 significant parts of the language in the library! 82 83 We will break down implementation of these features into two parts: 84 implementing support for user-defined binary operators and adding unary 85 operators. 86 87 User-defined Binary Operators 88 ============================= 89 90 Adding support for user-defined binary operators is pretty simple with 91 our current framework. We'll first add support for the unary/binary 92 keywords: 93 94 .. code-block:: ocaml 95 96 type token = 97 ... 98 (* operators *) 99 | Binary | Unary 100 101 ... 102 103 and lex_ident buffer = parser 104 ... 105 | "for" -> [< 'Token.For; stream >] 106 | "in" -> [< 'Token.In; stream >] 107 | "binary" -> [< 'Token.Binary; stream >] 108 | "unary" -> [< 'Token.Unary; stream >] 109 110 This just adds lexer support for the unary and binary keywords, like we 111 did in `previous chapters <OCamlLangImpl5.html#lexer-extensions-for-if-then-else>`_. One nice 112 thing about our current AST, is that we represent binary operators with 113 full generalisation by using their ASCII code as the opcode. For our 114 extended operators, we'll use this same representation, so we don't need 115 any new AST or parser support. 116 117 On the other hand, we have to be able to represent the definitions of 118 these new operators, in the "def binary\| 5" part of the function 119 definition. In our grammar so far, the "name" for the function 120 definition is parsed as the "prototype" production and into the 121 ``Ast.Prototype`` AST node. To represent our new user-defined operators 122 as prototypes, we have to extend the ``Ast.Prototype`` AST node like 123 this: 124 125 .. code-block:: ocaml 126 127 (* proto - This type represents the "prototype" for a function, which captures 128 * its name, and its argument names (thus implicitly the number of arguments the 129 * function takes). *) 130 type proto = 131 | Prototype of string * string array 132 | BinOpPrototype of string * string array * int 133 134 Basically, in addition to knowing a name for the prototype, we now keep 135 track of whether it was an operator, and if it was, what precedence 136 level the operator is at. The precedence is only used for binary 137 operators (as you'll see below, it just doesn't apply for unary 138 operators). Now that we have a way to represent the prototype for a 139 user-defined operator, we need to parse it: 140 141 .. code-block:: ocaml 142 143 (* prototype 144 * ::= id '(' id* ')' 145 * ::= binary LETTER number? (id, id) 146 * ::= unary LETTER number? (id) *) 147 let parse_prototype = 148 let rec parse_args accumulator = parser 149 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 150 | [< >] -> accumulator 151 in 152 let parse_operator = parser 153 | [< 'Token.Unary >] -> "unary", 1 154 | [< 'Token.Binary >] -> "binary", 2 155 in 156 let parse_binary_precedence = parser 157 | [< 'Token.Number n >] -> int_of_float n 158 | [< >] -> 30 159 in 160 parser 161 | [< 'Token.Ident id; 162 'Token.Kwd '(' ?? "expected '(' in prototype"; 163 args=parse_args []; 164 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 165 (* success. *) 166 Ast.Prototype (id, Array.of_list (List.rev args)) 167 | [< (prefix, kind)=parse_operator; 168 'Token.Kwd op ?? "expected an operator"; 169 (* Read the precedence if present. *) 170 binary_precedence=parse_binary_precedence; 171 'Token.Kwd '(' ?? "expected '(' in prototype"; 172 args=parse_args []; 173 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 174 let name = prefix ^ (String.make 1 op) in 175 let args = Array.of_list (List.rev args) in 176 177 (* Verify right number of arguments for operator. *) 178 if Array.length args != kind 179 then raise (Stream.Error "invalid number of operands for operator") 180 else 181 if kind == 1 then 182 Ast.Prototype (name, args) 183 else 184 Ast.BinOpPrototype (name, args, binary_precedence) 185 | [< >] -> 186 raise (Stream.Error "expected function name in prototype") 187 188 This is all fairly straightforward parsing code, and we have already 189 seen a lot of similar code in the past. One interesting part about the 190 code above is the couple lines that set up ``name`` for binary 191 operators. This builds names like "binary@" for a newly defined "@" 192 operator. This then takes advantage of the fact that symbol names in the 193 LLVM symbol table are allowed to have any character in them, including 194 embedded nul characters. 195 196 The next interesting thing to add, is codegen support for these binary 197 operators. Given our current structure, this is a simple addition of a 198 default case for our existing binary operator node: 199 200 .. code-block:: ocaml 201 202 let codegen_expr = function 203 ... 204 | Ast.Binary (op, lhs, rhs) -> 205 let lhs_val = codegen_expr lhs in 206 let rhs_val = codegen_expr rhs in 207 begin 208 match op with 209 | '+' -> build_add lhs_val rhs_val "addtmp" builder 210 | '-' -> build_sub lhs_val rhs_val "subtmp" builder 211 | '*' -> build_mul lhs_val rhs_val "multmp" builder 212 | '<' -> 213 (* Convert bool 0/1 to double 0.0 or 1.0 *) 214 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in 215 build_uitofp i double_type "booltmp" builder 216 | _ -> 217 (* If it wasn't a builtin binary operator, it must be a user defined 218 * one. Emit a call to it. *) 219 let callee = "binary" ^ (String.make 1 op) in 220 let callee = 221 match lookup_function callee the_module with 222 | Some callee -> callee 223 | None -> raise (Error "binary operator not found!") 224 in 225 build_call callee [|lhs_val; rhs_val|] "binop" builder 226 end 227 228 As you can see above, the new code is actually really simple. It just 229 does a lookup for the appropriate operator in the symbol table and 230 generates a function call to it. Since user-defined operators are just 231 built as normal functions (because the "prototype" boils down to a 232 function with the right name) everything falls into place. 233 234 The final piece of code we are missing, is a bit of top level magic: 235 236 .. code-block:: ocaml 237 238 let codegen_func the_fpm = function 239 | Ast.Function (proto, body) -> 240 Hashtbl.clear named_values; 241 let the_function = codegen_proto proto in 242 243 (* If this is an operator, install it. *) 244 begin match proto with 245 | Ast.BinOpPrototype (name, args, prec) -> 246 let op = name.[String.length name - 1] in 247 Hashtbl.add Parser.binop_precedence op prec; 248 | _ -> () 249 end; 250 251 (* Create a new basic block to start insertion into. *) 252 let bb = append_block context "entry" the_function in 253 position_at_end bb builder; 254 ... 255 256 Basically, before codegening a function, if it is a user-defined 257 operator, we register it in the precedence table. This allows the binary 258 operator parsing logic we already have in place to handle it. Since we 259 are working on a fully-general operator precedence parser, this is all 260 we need to do to "extend the grammar". 261 262 Now we have useful user-defined binary operators. This builds a lot on 263 the previous framework we built for other operators. Adding unary 264 operators is a bit more challenging, because we don't have any framework 265 for it yet - lets see what it takes. 266 267 User-defined Unary Operators 268 ============================ 269 270 Since we don't currently support unary operators in the Kaleidoscope 271 language, we'll need to add everything to support them. Above, we added 272 simple support for the 'unary' keyword to the lexer. In addition to 273 that, we need an AST node: 274 275 .. code-block:: ocaml 276 277 type expr = 278 ... 279 (* variant for a unary operator. *) 280 | Unary of char * expr 281 ... 282 283 This AST node is very simple and obvious by now. It directly mirrors the 284 binary operator AST node, except that it only has one child. With this, 285 we need to add the parsing logic. Parsing a unary operator is pretty 286 simple: we'll add a new function to do it: 287 288 .. code-block:: ocaml 289 290 (* unary 291 * ::= primary 292 * ::= '!' unary *) 293 and parse_unary = parser 294 (* If this is a unary operator, read it. *) 295 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] -> 296 Ast.Unary (op, operand) 297 298 (* If the current token is not an operator, it must be a primary expr. *) 299 | [< stream >] -> parse_primary stream 300 301 The grammar we add is pretty straightforward here. If we see a unary 302 operator when parsing a primary operator, we eat the operator as a 303 prefix and parse the remaining piece as another unary operator. This 304 allows us to handle multiple unary operators (e.g. "!!x"). Note that 305 unary operators can't have ambiguous parses like binary operators can, 306 so there is no need for precedence information. 307 308 The problem with this function, is that we need to call ParseUnary from 309 somewhere. To do this, we change previous callers of ParsePrimary to 310 call ``parse_unary`` instead: 311 312 .. code-block:: ocaml 313 314 (* binoprhs 315 * ::= ('+' primary)* *) 316 and parse_bin_rhs expr_prec lhs stream = 317 ... 318 (* Parse the unary expression after the binary operator. *) 319 let rhs = parse_unary stream in 320 ... 321 322 ... 323 324 (* expression 325 * ::= primary binoprhs *) 326 and parse_expr = parser 327 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream 328 329 With these two simple changes, we are now able to parse unary operators 330 and build the AST for them. Next up, we need to add parser support for 331 prototypes, to parse the unary operator prototype. We extend the binary 332 operator code above with: 333 334 .. code-block:: ocaml 335 336 (* prototype 337 * ::= id '(' id* ')' 338 * ::= binary LETTER number? (id, id) 339 * ::= unary LETTER number? (id) *) 340 let parse_prototype = 341 let rec parse_args accumulator = parser 342 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 343 | [< >] -> accumulator 344 in 345 let parse_operator = parser 346 | [< 'Token.Unary >] -> "unary", 1 347 | [< 'Token.Binary >] -> "binary", 2 348 in 349 let parse_binary_precedence = parser 350 | [< 'Token.Number n >] -> int_of_float n 351 | [< >] -> 30 352 in 353 parser 354 | [< 'Token.Ident id; 355 'Token.Kwd '(' ?? "expected '(' in prototype"; 356 args=parse_args []; 357 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 358 (* success. *) 359 Ast.Prototype (id, Array.of_list (List.rev args)) 360 | [< (prefix, kind)=parse_operator; 361 'Token.Kwd op ?? "expected an operator"; 362 (* Read the precedence if present. *) 363 binary_precedence=parse_binary_precedence; 364 'Token.Kwd '(' ?? "expected '(' in prototype"; 365 args=parse_args []; 366 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 367 let name = prefix ^ (String.make 1 op) in 368 let args = Array.of_list (List.rev args) in 369 370 (* Verify right number of arguments for operator. *) 371 if Array.length args != kind 372 then raise (Stream.Error "invalid number of operands for operator") 373 else 374 if kind == 1 then 375 Ast.Prototype (name, args) 376 else 377 Ast.BinOpPrototype (name, args, binary_precedence) 378 | [< >] -> 379 raise (Stream.Error "expected function name in prototype") 380 381 As with binary operators, we name unary operators with a name that 382 includes the operator character. This assists us at code generation 383 time. Speaking of, the final piece we need to add is codegen support for 384 unary operators. It looks like this: 385 386 .. code-block:: ocaml 387 388 let rec codegen_expr = function 389 ... 390 | Ast.Unary (op, operand) -> 391 let operand = codegen_expr operand in 392 let callee = "unary" ^ (String.make 1 op) in 393 let callee = 394 match lookup_function callee the_module with 395 | Some callee -> callee 396 | None -> raise (Error "unknown unary operator") 397 in 398 build_call callee [|operand|] "unop" builder 399 400 This code is similar to, but simpler than, the code for binary 401 operators. It is simpler primarily because it doesn't need to handle any 402 predefined operators. 403 404 Kicking the Tires 405 ================= 406 407 It is somewhat hard to believe, but with a few simple extensions we've 408 covered in the last chapters, we have grown a real-ish language. With 409 this, we can do a lot of interesting things, including I/O, math, and a 410 bunch of other things. For example, we can now add a nice sequencing 411 operator (printd is defined to print out the specified value and a 412 newline): 413 414 :: 415 416 ready> extern printd(x); 417 Read extern: declare double @printd(double) 418 ready> def binary : 1 (x y) 0; # Low-precedence operator that ignores operands. 419 .. 420 ready> printd(123) : printd(456) : printd(789); 421 123.000000 422 456.000000 423 789.000000 424 Evaluated to 0.000000 425 426 We can also define a bunch of other "primitive" operations, such as: 427 428 :: 429 430 # Logical unary not. 431 def unary!(v) 432 if v then 433 0 434 else 435 1; 436 437 # Unary negate. 438 def unary-(v) 439 0-v; 440 441 # Define > with the same precedence as <. 442 def binary> 10 (LHS RHS) 443 RHS < LHS; 444 445 # Binary logical or, which does not short circuit. 446 def binary| 5 (LHS RHS) 447 if LHS then 448 1 449 else if RHS then 450 1 451 else 452 0; 453 454 # Binary logical and, which does not short circuit. 455 def binary& 6 (LHS RHS) 456 if !LHS then 457 0 458 else 459 !!RHS; 460 461 # Define = with slightly lower precedence than relationals. 462 def binary = 9 (LHS RHS) 463 !(LHS < RHS | LHS > RHS); 464 465 Given the previous if/then/else support, we can also define interesting 466 functions for I/O. For example, the following prints out a character 467 whose "density" reflects the value passed in: the lower the value, the 468 denser the character: 469 470 :: 471 472 ready> 473 474 extern putchard(char) 475 def printdensity(d) 476 if d > 8 then 477 putchard(32) # ' ' 478 else if d > 4 then 479 putchard(46) # '.' 480 else if d > 2 then 481 putchard(43) # '+' 482 else 483 putchard(42); # '*' 484 ... 485 ready> printdensity(1): printdensity(2): printdensity(3) : 486 printdensity(4): printdensity(5): printdensity(9): putchard(10); 487 *++.. 488 Evaluated to 0.000000 489 490 Based on these simple primitive operations, we can start to define more 491 interesting things. For example, here's a little function that solves 492 for the number of iterations it takes a function in the complex plane to 493 converge: 494 495 :: 496 497 # determine whether the specific location diverges. 498 # Solve for z = z^2 + c in the complex plane. 499 def mandelconverger(real imag iters creal cimag) 500 if iters > 255 | (real*real + imag*imag > 4) then 501 iters 502 else 503 mandelconverger(real*real - imag*imag + creal, 504 2*real*imag + cimag, 505 iters+1, creal, cimag); 506 507 # return the number of iterations required for the iteration to escape 508 def mandelconverge(real imag) 509 mandelconverger(real, imag, 0, real, imag); 510 511 This "z = z\ :sup:`2`\ + c" function is a beautiful little creature 512 that is the basis for computation of the `Mandelbrot 513 Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our 514 ``mandelconverge`` function returns the number of iterations that it 515 takes for a complex orbit to escape, saturating to 255. This is not a 516 very useful function by itself, but if you plot its value over a 517 two-dimensional plane, you can see the Mandelbrot set. Given that we are 518 limited to using putchard here, our amazing graphical output is limited, 519 but we can whip together something using the density plotter above: 520 521 :: 522 523 # compute and plot the mandelbrot set with the specified 2 dimensional range 524 # info. 525 def mandelhelp(xmin xmax xstep ymin ymax ystep) 526 for y = ymin, y < ymax, ystep in ( 527 (for x = xmin, x < xmax, xstep in 528 printdensity(mandelconverge(x,y))) 529 : putchard(10) 530 ) 531 532 # mandel - This is a convenient helper function for plotting the mandelbrot set 533 # from the specified position with the specified Magnification. 534 def mandel(realstart imagstart realmag imagmag) 535 mandelhelp(realstart, realstart+realmag*78, realmag, 536 imagstart, imagstart+imagmag*40, imagmag); 537 538 Given this, we can try plotting out the mandelbrot set! Lets try it out: 539 540 :: 541 542 ready> mandel(-2.3, -1.3, 0.05, 0.07); 543 *******************************+++++++++++************************************* 544 *************************+++++++++++++++++++++++******************************* 545 **********************+++++++++++++++++++++++++++++**************************** 546 *******************+++++++++++++++++++++.. ...++++++++************************* 547 *****************++++++++++++++++++++++.... ...+++++++++*********************** 548 ***************+++++++++++++++++++++++..... ...+++++++++********************* 549 **************+++++++++++++++++++++++.... ....+++++++++******************** 550 *************++++++++++++++++++++++...... .....++++++++******************* 551 ************+++++++++++++++++++++....... .......+++++++****************** 552 ***********+++++++++++++++++++.... ... .+++++++***************** 553 **********+++++++++++++++++....... .+++++++**************** 554 *********++++++++++++++........... ...+++++++*************** 555 ********++++++++++++............ ...++++++++************** 556 ********++++++++++... .......... .++++++++************** 557 *******+++++++++..... .+++++++++************* 558 *******++++++++...... ..+++++++++************* 559 *******++++++....... ..+++++++++************* 560 *******+++++...... ..+++++++++************* 561 *******.... .... ...+++++++++************* 562 *******.... . ...+++++++++************* 563 *******+++++...... ...+++++++++************* 564 *******++++++....... ..+++++++++************* 565 *******++++++++...... .+++++++++************* 566 *******+++++++++..... ..+++++++++************* 567 ********++++++++++... .......... .++++++++************** 568 ********++++++++++++............ ...++++++++************** 569 *********++++++++++++++.......... ...+++++++*************** 570 **********++++++++++++++++........ .+++++++**************** 571 **********++++++++++++++++++++.... ... ..+++++++**************** 572 ***********++++++++++++++++++++++....... .......++++++++***************** 573 ************+++++++++++++++++++++++...... ......++++++++****************** 574 **************+++++++++++++++++++++++.... ....++++++++******************** 575 ***************+++++++++++++++++++++++..... ...+++++++++********************* 576 *****************++++++++++++++++++++++.... ...++++++++*********************** 577 *******************+++++++++++++++++++++......++++++++************************* 578 *********************++++++++++++++++++++++.++++++++*************************** 579 *************************+++++++++++++++++++++++******************************* 580 ******************************+++++++++++++************************************ 581 ******************************************************************************* 582 ******************************************************************************* 583 ******************************************************************************* 584 Evaluated to 0.000000 585 ready> mandel(-2, -1, 0.02, 0.04); 586 **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++ 587 ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 588 *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++. 589 *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++... 590 *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++..... 591 ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........ 592 **************++++++++++++++++++++++++++++++++++++++++++++++++++++++........... 593 ************+++++++++++++++++++++++++++++++++++++++++++++++++++++.............. 594 ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ . 595 **********++++++++++++++++++++++++++++++++++++++++++++++............. 596 ********+++++++++++++++++++++++++++++++++++++++++++.................. 597 *******+++++++++++++++++++++++++++++++++++++++....................... 598 ******+++++++++++++++++++++++++++++++++++........................... 599 *****++++++++++++++++++++++++++++++++............................ 600 *****++++++++++++++++++++++++++++............................... 601 ****++++++++++++++++++++++++++...... ......................... 602 ***++++++++++++++++++++++++......... ...... ........... 603 ***++++++++++++++++++++++............ 604 **+++++++++++++++++++++.............. 605 **+++++++++++++++++++................ 606 *++++++++++++++++++................. 607 *++++++++++++++++............ ... 608 *++++++++++++++.............. 609 *+++....++++................ 610 *.......... ........... 611 * 612 *.......... ........... 613 *+++....++++................ 614 *++++++++++++++.............. 615 *++++++++++++++++............ ... 616 *++++++++++++++++++................. 617 **+++++++++++++++++++................ 618 **+++++++++++++++++++++.............. 619 ***++++++++++++++++++++++............ 620 ***++++++++++++++++++++++++......... ...... ........... 621 ****++++++++++++++++++++++++++...... ......................... 622 *****++++++++++++++++++++++++++++............................... 623 *****++++++++++++++++++++++++++++++++............................ 624 ******+++++++++++++++++++++++++++++++++++........................... 625 *******+++++++++++++++++++++++++++++++++++++++....................... 626 ********+++++++++++++++++++++++++++++++++++++++++++.................. 627 Evaluated to 0.000000 628 ready> mandel(-0.9, -1.4, 0.02, 0.03); 629 ******************************************************************************* 630 ******************************************************************************* 631 ******************************************************************************* 632 **********+++++++++++++++++++++************************************************ 633 *+++++++++++++++++++++++++++++++++++++++*************************************** 634 +++++++++++++++++++++++++++++++++++++++++++++********************************** 635 ++++++++++++++++++++++++++++++++++++++++++++++++++***************************** 636 ++++++++++++++++++++++++++++++++++++++++++++++++++++++************************* 637 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++********************** 638 +++++++++++++++++++++++++++++++++.........++++++++++++++++++******************* 639 +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++**************** 640 +++++++++++++++++++++++++++++....... ........+++++++++++++++++++************** 641 ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************ 642 +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++********** 643 ++++++++++++++++++++++++++........... ....++++++++++++++++++++++******** 644 ++++++++++++++++++++++++............. .......++++++++++++++++++++++****** 645 +++++++++++++++++++++++............. ........+++++++++++++++++++++++**** 646 ++++++++++++++++++++++........... ..........++++++++++++++++++++++*** 647 ++++++++++++++++++++........... .........++++++++++++++++++++++* 648 ++++++++++++++++++............ ...........++++++++++++++++++++ 649 ++++++++++++++++............... .............++++++++++++++++++ 650 ++++++++++++++................. ...............++++++++++++++++ 651 ++++++++++++.................. .................++++++++++++++ 652 +++++++++.................. .................+++++++++++++ 653 ++++++........ . ......... ..++++++++++++ 654 ++............ ...... ....++++++++++ 655 .............. ...++++++++++ 656 .............. ....+++++++++ 657 .............. .....++++++++ 658 ............. ......++++++++ 659 ........... .......++++++++ 660 ......... ........+++++++ 661 ......... ........+++++++ 662 ......... ....+++++++ 663 ........ ...+++++++ 664 ....... ...+++++++ 665 ....+++++++ 666 .....+++++++ 667 ....+++++++ 668 ....+++++++ 669 ....+++++++ 670 Evaluated to 0.000000 671 ready> ^D 672 673 At this point, you may be starting to realize that Kaleidoscope is a 674 real and powerful language. It may not be self-similar :), but it can be 675 used to plot things that are! 676 677 With this, we conclude the "adding user-defined operators" chapter of 678 the tutorial. We have successfully augmented our language, adding the 679 ability to extend the language in the library, and we have shown how 680 this can be used to build a simple but interesting end-user application 681 in Kaleidoscope. At this point, Kaleidoscope can build a variety of 682 applications that are functional and can call functions with 683 side-effects, but it can't actually define and mutate a variable itself. 684 685 Strikingly, variable mutation is an important feature of some languages, 686 and it is not at all obvious how to `add support for mutable 687 variables <OCamlLangImpl7.html>`_ without having to add an "SSA 688 construction" phase to your front-end. In the next chapter, we will 689 describe how you can add variable mutation without building SSA in your 690 front-end. 691 692 Full Code Listing 693 ================= 694 695 Here is the complete code listing for our running example, enhanced with 696 the if/then/else and for expressions.. To build this example, use: 697 698 .. code-block:: bash 699 700 # Compile 701 ocamlbuild toy.byte 702 # Run 703 ./toy.byte 704 705 Here is the code: 706 707 \_tags: 708 :: 709 710 <{lexer,parser}.ml>: use_camlp4, pp(camlp4of) 711 <*.{byte,native}>: g++, use_llvm, use_llvm_analysis 712 <*.{byte,native}>: use_llvm_executionengine, use_llvm_target 713 <*.{byte,native}>: use_llvm_scalar_opts, use_bindings 714 715 myocamlbuild.ml: 716 .. code-block:: ocaml 717 718 open Ocamlbuild_plugin;; 719 720 ocaml_lib ~extern:true "llvm";; 721 ocaml_lib ~extern:true "llvm_analysis";; 722 ocaml_lib ~extern:true "llvm_executionengine";; 723 ocaml_lib ~extern:true "llvm_target";; 724 ocaml_lib ~extern:true "llvm_scalar_opts";; 725 726 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);; 727 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];; 728 729 token.ml: 730 .. code-block:: ocaml 731 732 (*===----------------------------------------------------------------------=== 733 * Lexer Tokens 734 *===----------------------------------------------------------------------===*) 735 736 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of 737 * these others for known things. *) 738 type token = 739 (* commands *) 740 | Def | Extern 741 742 (* primary *) 743 | Ident of string | Number of float 744 745 (* unknown *) 746 | Kwd of char 747 748 (* control *) 749 | If | Then | Else 750 | For | In 751 752 (* operators *) 753 | Binary | Unary 754 755 lexer.ml: 756 .. code-block:: ocaml 757 758 (*===----------------------------------------------------------------------=== 759 * Lexer 760 *===----------------------------------------------------------------------===*) 761 762 let rec lex = parser 763 (* Skip any whitespace. *) 764 | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream 765 766 (* identifier: [a-zA-Z][a-zA-Z0-9] *) 767 | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] -> 768 let buffer = Buffer.create 1 in 769 Buffer.add_char buffer c; 770 lex_ident buffer stream 771 772 (* number: [0-9.]+ *) 773 | [< ' ('0' .. '9' as c); stream >] -> 774 let buffer = Buffer.create 1 in 775 Buffer.add_char buffer c; 776 lex_number buffer stream 777 778 (* Comment until end of line. *) 779 | [< ' ('#'); stream >] -> 780 lex_comment stream 781 782 (* Otherwise, just return the character as its ascii value. *) 783 | [< 'c; stream >] -> 784 [< 'Token.Kwd c; lex stream >] 785 786 (* end of stream. *) 787 | [< >] -> [< >] 788 789 and lex_number buffer = parser 790 | [< ' ('0' .. '9' | '.' as c); stream >] -> 791 Buffer.add_char buffer c; 792 lex_number buffer stream 793 | [< stream=lex >] -> 794 [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >] 795 796 and lex_ident buffer = parser 797 | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] -> 798 Buffer.add_char buffer c; 799 lex_ident buffer stream 800 | [< stream=lex >] -> 801 match Buffer.contents buffer with 802 | "def" -> [< 'Token.Def; stream >] 803 | "extern" -> [< 'Token.Extern; stream >] 804 | "if" -> [< 'Token.If; stream >] 805 | "then" -> [< 'Token.Then; stream >] 806 | "else" -> [< 'Token.Else; stream >] 807 | "for" -> [< 'Token.For; stream >] 808 | "in" -> [< 'Token.In; stream >] 809 | "binary" -> [< 'Token.Binary; stream >] 810 | "unary" -> [< 'Token.Unary; stream >] 811 | id -> [< 'Token.Ident id; stream >] 812 813 and lex_comment = parser 814 | [< ' ('\n'); stream=lex >] -> stream 815 | [< 'c; e=lex_comment >] -> e 816 | [< >] -> [< >] 817 818 ast.ml: 819 .. code-block:: ocaml 820 821 (*===----------------------------------------------------------------------=== 822 * Abstract Syntax Tree (aka Parse Tree) 823 *===----------------------------------------------------------------------===*) 824 825 (* expr - Base type for all expression nodes. *) 826 type expr = 827 (* variant for numeric literals like "1.0". *) 828 | Number of float 829 830 (* variant for referencing a variable, like "a". *) 831 | Variable of string 832 833 (* variant for a unary operator. *) 834 | Unary of char * expr 835 836 (* variant for a binary operator. *) 837 | Binary of char * expr * expr 838 839 (* variant for function calls. *) 840 | Call of string * expr array 841 842 (* variant for if/then/else. *) 843 | If of expr * expr * expr 844 845 (* variant for for/in. *) 846 | For of string * expr * expr * expr option * expr 847 848 (* proto - This type represents the "prototype" for a function, which captures 849 * its name, and its argument names (thus implicitly the number of arguments the 850 * function takes). *) 851 type proto = 852 | Prototype of string * string array 853 | BinOpPrototype of string * string array * int 854 855 (* func - This type represents a function definition itself. *) 856 type func = Function of proto * expr 857 858 parser.ml: 859 .. code-block:: ocaml 860 861 (*===---------------------------------------------------------------------=== 862 * Parser 863 *===---------------------------------------------------------------------===*) 864 865 (* binop_precedence - This holds the precedence for each binary operator that is 866 * defined *) 867 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 868 869 (* precedence - Get the precedence of the pending binary operator token. *) 870 let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 871 872 (* primary 873 * ::= identifier 874 * ::= numberexpr 875 * ::= parenexpr 876 * ::= ifexpr 877 * ::= forexpr *) 878 let rec parse_primary = parser 879 (* numberexpr ::= number *) 880 | [< 'Token.Number n >] -> Ast.Number n 881 882 (* parenexpr ::= '(' expression ')' *) 883 | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e 884 885 (* identifierexpr 886 * ::= identifier 887 * ::= identifier '(' argumentexpr ')' *) 888 | [< 'Token.Ident id; stream >] -> 889 let rec parse_args accumulator = parser 890 | [< e=parse_expr; stream >] -> 891 begin parser 892 | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e 893 | [< >] -> e :: accumulator 894 end stream 895 | [< >] -> accumulator 896 in 897 let rec parse_ident id = parser 898 (* Call. *) 899 | [< 'Token.Kwd '('; 900 args=parse_args []; 901 'Token.Kwd ')' ?? "expected ')'">] -> 902 Ast.Call (id, Array.of_list (List.rev args)) 903 904 (* Simple variable ref. *) 905 | [< >] -> Ast.Variable id 906 in 907 parse_ident id stream 908 909 (* ifexpr ::= 'if' expr 'then' expr 'else' expr *) 910 | [< 'Token.If; c=parse_expr; 911 'Token.Then ?? "expected 'then'"; t=parse_expr; 912 'Token.Else ?? "expected 'else'"; e=parse_expr >] -> 913 Ast.If (c, t, e) 914 915 (* forexpr 916 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *) 917 | [< 'Token.For; 918 'Token.Ident id ?? "expected identifier after for"; 919 'Token.Kwd '=' ?? "expected '=' after for"; 920 stream >] -> 921 begin parser 922 | [< 923 start=parse_expr; 924 'Token.Kwd ',' ?? "expected ',' after for"; 925 end_=parse_expr; 926 stream >] -> 927 let step = 928 begin parser 929 | [< 'Token.Kwd ','; step=parse_expr >] -> Some step 930 | [< >] -> None 931 end stream 932 in 933 begin parser 934 | [< 'Token.In; body=parse_expr >] -> 935 Ast.For (id, start, end_, step, body) 936 | [< >] -> 937 raise (Stream.Error "expected 'in' after for") 938 end stream 939 | [< >] -> 940 raise (Stream.Error "expected '=' after for") 941 end stream 942 943 | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") 944 945 (* unary 946 * ::= primary 947 * ::= '!' unary *) 948 and parse_unary = parser 949 (* If this is a unary operator, read it. *) 950 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] -> 951 Ast.Unary (op, operand) 952 953 (* If the current token is not an operator, it must be a primary expr. *) 954 | [< stream >] -> parse_primary stream 955 956 (* binoprhs 957 * ::= ('+' primary)* *) 958 and parse_bin_rhs expr_prec lhs stream = 959 match Stream.peek stream with 960 (* If this is a binop, find its precedence. *) 961 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> 962 let token_prec = precedence c in 963 964 (* If this is a binop that binds at least as tightly as the current binop, 965 * consume it, otherwise we are done. *) 966 if token_prec < expr_prec then lhs else begin 967 (* Eat the binop. *) 968 Stream.junk stream; 969 970 (* Parse the unary expression after the binary operator. *) 971 let rhs = parse_unary stream in 972 973 (* Okay, we know this is a binop. *) 974 let rhs = 975 match Stream.peek stream with 976 | Some (Token.Kwd c2) -> 977 (* If BinOp binds less tightly with rhs than the operator after 978 * rhs, let the pending operator take rhs as its lhs. *) 979 let next_prec = precedence c2 in 980 if token_prec < next_prec 981 then parse_bin_rhs (token_prec + 1) rhs stream 982 else rhs 983 | _ -> rhs 984 in 985 986 (* Merge lhs/rhs. *) 987 let lhs = Ast.Binary (c, lhs, rhs) in 988 parse_bin_rhs expr_prec lhs stream 989 end 990 | _ -> lhs 991 992 (* expression 993 * ::= primary binoprhs *) 994 and parse_expr = parser 995 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream 996 997 (* prototype 998 * ::= id '(' id* ')' 999 * ::= binary LETTER number? (id, id) 1000 * ::= unary LETTER number? (id) *) 1001 let parse_prototype = 1002 let rec parse_args accumulator = parser 1003 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 1004 | [< >] -> accumulator 1005 in 1006 let parse_operator = parser 1007 | [< 'Token.Unary >] -> "unary", 1 1008 | [< 'Token.Binary >] -> "binary", 2 1009 in 1010 let parse_binary_precedence = parser 1011 | [< 'Token.Number n >] -> int_of_float n 1012 | [< >] -> 30 1013 in 1014 parser 1015 | [< 'Token.Ident id; 1016 'Token.Kwd '(' ?? "expected '(' in prototype"; 1017 args=parse_args []; 1018 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1019 (* success. *) 1020 Ast.Prototype (id, Array.of_list (List.rev args)) 1021 | [< (prefix, kind)=parse_operator; 1022 'Token.Kwd op ?? "expected an operator"; 1023 (* Read the precedence if present. *) 1024 binary_precedence=parse_binary_precedence; 1025 'Token.Kwd '(' ?? "expected '(' in prototype"; 1026 args=parse_args []; 1027 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1028 let name = prefix ^ (String.make 1 op) in 1029 let args = Array.of_list (List.rev args) in 1030 1031 (* Verify right number of arguments for operator. *) 1032 if Array.length args != kind 1033 then raise (Stream.Error "invalid number of operands for operator") 1034 else 1035 if kind == 1 then 1036 Ast.Prototype (name, args) 1037 else 1038 Ast.BinOpPrototype (name, args, binary_precedence) 1039 | [< >] -> 1040 raise (Stream.Error "expected function name in prototype") 1041 1042 (* definition ::= 'def' prototype expression *) 1043 let parse_definition = parser 1044 | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> 1045 Ast.Function (p, e) 1046 1047 (* toplevelexpr ::= expression *) 1048 let parse_toplevel = parser 1049 | [< e=parse_expr >] -> 1050 (* Make an anonymous proto. *) 1051 Ast.Function (Ast.Prototype ("", [||]), e) 1052 1053 (* external ::= 'extern' prototype *) 1054 let parse_extern = parser 1055 | [< 'Token.Extern; e=parse_prototype >] -> e 1056 1057 codegen.ml: 1058 .. code-block:: ocaml 1059 1060 (*===----------------------------------------------------------------------=== 1061 * Code Generation 1062 *===----------------------------------------------------------------------===*) 1063 1064 open Llvm 1065 1066 exception Error of string 1067 1068 let context = global_context () 1069 let the_module = create_module context "my cool jit" 1070 let builder = builder context 1071 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10 1072 let double_type = double_type context 1073 1074 let rec codegen_expr = function 1075 | Ast.Number n -> const_float double_type n 1076 | Ast.Variable name -> 1077 (try Hashtbl.find named_values name with 1078 | Not_found -> raise (Error "unknown variable name")) 1079 | Ast.Unary (op, operand) -> 1080 let operand = codegen_expr operand in 1081 let callee = "unary" ^ (String.make 1 op) in 1082 let callee = 1083 match lookup_function callee the_module with 1084 | Some callee -> callee 1085 | None -> raise (Error "unknown unary operator") 1086 in 1087 build_call callee [|operand|] "unop" builder 1088 | Ast.Binary (op, lhs, rhs) -> 1089 let lhs_val = codegen_expr lhs in 1090 let rhs_val = codegen_expr rhs in 1091 begin 1092 match op with 1093 | '+' -> build_add lhs_val rhs_val "addtmp" builder 1094 | '-' -> build_sub lhs_val rhs_val "subtmp" builder 1095 | '*' -> build_mul lhs_val rhs_val "multmp" builder 1096 | '<' -> 1097 (* Convert bool 0/1 to double 0.0 or 1.0 *) 1098 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in 1099 build_uitofp i double_type "booltmp" builder 1100 | _ -> 1101 (* If it wasn't a builtin binary operator, it must be a user defined 1102 * one. Emit a call to it. *) 1103 let callee = "binary" ^ (String.make 1 op) in 1104 let callee = 1105 match lookup_function callee the_module with 1106 | Some callee -> callee 1107 | None -> raise (Error "binary operator not found!") 1108 in 1109 build_call callee [|lhs_val; rhs_val|] "binop" builder 1110 end 1111 | Ast.Call (callee, args) -> 1112 (* Look up the name in the module table. *) 1113 let callee = 1114 match lookup_function callee the_module with 1115 | Some callee -> callee 1116 | None -> raise (Error "unknown function referenced") 1117 in 1118 let params = params callee in 1119 1120 (* If argument mismatch error. *) 1121 if Array.length params == Array.length args then () else 1122 raise (Error "incorrect # arguments passed"); 1123 let args = Array.map codegen_expr args in 1124 build_call callee args "calltmp" builder 1125 | Ast.If (cond, then_, else_) -> 1126 let cond = codegen_expr cond in 1127 1128 (* Convert condition to a bool by comparing equal to 0.0 *) 1129 let zero = const_float double_type 0.0 in 1130 let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in 1131 1132 (* Grab the first block so that we might later add the conditional branch 1133 * to it at the end of the function. *) 1134 let start_bb = insertion_block builder in 1135 let the_function = block_parent start_bb in 1136 1137 let then_bb = append_block context "then" the_function in 1138 1139 (* Emit 'then' value. *) 1140 position_at_end then_bb builder; 1141 let then_val = codegen_expr then_ in 1142 1143 (* Codegen of 'then' can change the current block, update then_bb for the 1144 * phi. We create a new name because one is used for the phi node, and the 1145 * other is used for the conditional branch. *) 1146 let new_then_bb = insertion_block builder in 1147 1148 (* Emit 'else' value. *) 1149 let else_bb = append_block context "else" the_function in 1150 position_at_end else_bb builder; 1151 let else_val = codegen_expr else_ in 1152 1153 (* Codegen of 'else' can change the current block, update else_bb for the 1154 * phi. *) 1155 let new_else_bb = insertion_block builder in 1156 1157 (* Emit merge block. *) 1158 let merge_bb = append_block context "ifcont" the_function in 1159 position_at_end merge_bb builder; 1160 let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in 1161 let phi = build_phi incoming "iftmp" builder in 1162 1163 (* Return to the start block to add the conditional branch. *) 1164 position_at_end start_bb builder; 1165 ignore (build_cond_br cond_val then_bb else_bb builder); 1166 1167 (* Set a unconditional branch at the end of the 'then' block and the 1168 * 'else' block to the 'merge' block. *) 1169 position_at_end new_then_bb builder; ignore (build_br merge_bb builder); 1170 position_at_end new_else_bb builder; ignore (build_br merge_bb builder); 1171 1172 (* Finally, set the builder to the end of the merge block. *) 1173 position_at_end merge_bb builder; 1174 1175 phi 1176 | Ast.For (var_name, start, end_, step, body) -> 1177 (* Emit the start code first, without 'variable' in scope. *) 1178 let start_val = codegen_expr start in 1179 1180 (* Make the new basic block for the loop header, inserting after current 1181 * block. *) 1182 let preheader_bb = insertion_block builder in 1183 let the_function = block_parent preheader_bb in 1184 let loop_bb = append_block context "loop" the_function in 1185 1186 (* Insert an explicit fall through from the current block to the 1187 * loop_bb. *) 1188 ignore (build_br loop_bb builder); 1189 1190 (* Start insertion in loop_bb. *) 1191 position_at_end loop_bb builder; 1192 1193 (* Start the PHI node with an entry for start. *) 1194 let variable = build_phi [(start_val, preheader_bb)] var_name builder in 1195 1196 (* Within the loop, the variable is defined equal to the PHI node. If it 1197 * shadows an existing variable, we have to restore it, so save it 1198 * now. *) 1199 let old_val = 1200 try Some (Hashtbl.find named_values var_name) with Not_found -> None 1201 in 1202 Hashtbl.add named_values var_name variable; 1203 1204 (* Emit the body of the loop. This, like any other expr, can change the 1205 * current BB. Note that we ignore the value computed by the body, but 1206 * don't allow an error *) 1207 ignore (codegen_expr body); 1208 1209 (* Emit the step value. *) 1210 let step_val = 1211 match step with 1212 | Some step -> codegen_expr step 1213 (* If not specified, use 1.0. *) 1214 | None -> const_float double_type 1.0 1215 in 1216 1217 let next_var = build_add variable step_val "nextvar" builder in 1218 1219 (* Compute the end condition. *) 1220 let end_cond = codegen_expr end_ in 1221 1222 (* Convert condition to a bool by comparing equal to 0.0. *) 1223 let zero = const_float double_type 0.0 in 1224 let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in 1225 1226 (* Create the "after loop" block and insert it. *) 1227 let loop_end_bb = insertion_block builder in 1228 let after_bb = append_block context "afterloop" the_function in 1229 1230 (* Insert the conditional branch into the end of loop_end_bb. *) 1231 ignore (build_cond_br end_cond loop_bb after_bb builder); 1232 1233 (* Any new code will be inserted in after_bb. *) 1234 position_at_end after_bb builder; 1235 1236 (* Add a new entry to the PHI node for the backedge. *) 1237 add_incoming (next_var, loop_end_bb) variable; 1238 1239 (* Restore the unshadowed variable. *) 1240 begin match old_val with 1241 | Some old_val -> Hashtbl.add named_values var_name old_val 1242 | None -> () 1243 end; 1244 1245 (* for expr always returns 0.0. *) 1246 const_null double_type 1247 1248 let codegen_proto = function 1249 | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) -> 1250 (* Make the function type: double(double,double) etc. *) 1251 let doubles = Array.make (Array.length args) double_type in 1252 let ft = function_type double_type doubles in 1253 let f = 1254 match lookup_function name the_module with 1255 | None -> declare_function name ft the_module 1256 1257 (* If 'f' conflicted, there was already something named 'name'. If it 1258 * has a body, don't allow redefinition or reextern. *) 1259 | Some f -> 1260 (* If 'f' already has a body, reject this. *) 1261 if block_begin f <> At_end f then 1262 raise (Error "redefinition of function"); 1263 1264 (* If 'f' took a different number of arguments, reject. *) 1265 if element_type (type_of f) <> ft then 1266 raise (Error "redefinition of function with different # args"); 1267 f 1268 in 1269 1270 (* Set names for all arguments. *) 1271 Array.iteri (fun i a -> 1272 let n = args.(i) in 1273 set_value_name n a; 1274 Hashtbl.add named_values n a; 1275 ) (params f); 1276 f 1277 1278 let codegen_func the_fpm = function 1279 | Ast.Function (proto, body) -> 1280 Hashtbl.clear named_values; 1281 let the_function = codegen_proto proto in 1282 1283 (* If this is an operator, install it. *) 1284 begin match proto with 1285 | Ast.BinOpPrototype (name, args, prec) -> 1286 let op = name.[String.length name - 1] in 1287 Hashtbl.add Parser.binop_precedence op prec; 1288 | _ -> () 1289 end; 1290 1291 (* Create a new basic block to start insertion into. *) 1292 let bb = append_block context "entry" the_function in 1293 position_at_end bb builder; 1294 1295 try 1296 let ret_val = codegen_expr body in 1297 1298 (* Finish off the function. *) 1299 let _ = build_ret ret_val builder in 1300 1301 (* Validate the generated code, checking for consistency. *) 1302 Llvm_analysis.assert_valid_function the_function; 1303 1304 (* Optimize the function. *) 1305 let _ = PassManager.run_function the_function the_fpm in 1306 1307 the_function 1308 with e -> 1309 delete_function the_function; 1310 raise e 1311 1312 toplevel.ml: 1313 .. code-block:: ocaml 1314 1315 (*===----------------------------------------------------------------------=== 1316 * Top-Level parsing and JIT Driver 1317 *===----------------------------------------------------------------------===*) 1318 1319 open Llvm 1320 open Llvm_executionengine 1321 1322 (* top ::= definition | external | expression | ';' *) 1323 let rec main_loop the_fpm the_execution_engine stream = 1324 match Stream.peek stream with 1325 | None -> () 1326 1327 (* ignore top-level semicolons. *) 1328 | Some (Token.Kwd ';') -> 1329 Stream.junk stream; 1330 main_loop the_fpm the_execution_engine stream 1331 1332 | Some token -> 1333 begin 1334 try match token with 1335 | Token.Def -> 1336 let e = Parser.parse_definition stream in 1337 print_endline "parsed a function definition."; 1338 dump_value (Codegen.codegen_func the_fpm e); 1339 | Token.Extern -> 1340 let e = Parser.parse_extern stream in 1341 print_endline "parsed an extern."; 1342 dump_value (Codegen.codegen_proto e); 1343 | _ -> 1344 (* Evaluate a top-level expression into an anonymous function. *) 1345 let e = Parser.parse_toplevel stream in 1346 print_endline "parsed a top-level expr"; 1347 let the_function = Codegen.codegen_func the_fpm e in 1348 dump_value the_function; 1349 1350 (* JIT the function, returning a function pointer. *) 1351 let result = ExecutionEngine.run_function the_function [||] 1352 the_execution_engine in 1353 1354 print_string "Evaluated to "; 1355 print_float (GenericValue.as_float Codegen.double_type result); 1356 print_newline (); 1357 with Stream.Error s | Codegen.Error s -> 1358 (* Skip token for error recovery. *) 1359 Stream.junk stream; 1360 print_endline s; 1361 end; 1362 print_string "ready> "; flush stdout; 1363 main_loop the_fpm the_execution_engine stream 1364 1365 toy.ml: 1366 .. code-block:: ocaml 1367 1368 (*===----------------------------------------------------------------------=== 1369 * Main driver code. 1370 *===----------------------------------------------------------------------===*) 1371 1372 open Llvm 1373 open Llvm_executionengine 1374 open Llvm_target 1375 open Llvm_scalar_opts 1376 1377 let main () = 1378 ignore (initialize_native_target ()); 1379 1380 (* Install standard binary operators. 1381 * 1 is the lowest precedence. *) 1382 Hashtbl.add Parser.binop_precedence '<' 10; 1383 Hashtbl.add Parser.binop_precedence '+' 20; 1384 Hashtbl.add Parser.binop_precedence '-' 20; 1385 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) 1386 1387 (* Prime the first token. *) 1388 print_string "ready> "; flush stdout; 1389 let stream = Lexer.lex (Stream.of_channel stdin) in 1390 1391 (* Create the JIT. *) 1392 let the_execution_engine = ExecutionEngine.create Codegen.the_module in 1393 let the_fpm = PassManager.create_function Codegen.the_module in 1394 1395 (* Set up the optimizer pipeline. Start with registering info about how the 1396 * target lays out data structures. *) 1397 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm; 1398 1399 (* Do simple "peephole" optimizations and bit-twiddling optzn. *) 1400 add_instruction_combination the_fpm; 1401 1402 (* reassociate expressions. *) 1403 add_reassociation the_fpm; 1404 1405 (* Eliminate Common SubExpressions. *) 1406 add_gvn the_fpm; 1407 1408 (* Simplify the control flow graph (deleting unreachable blocks, etc). *) 1409 add_cfg_simplification the_fpm; 1410 1411 ignore (PassManager.initialize the_fpm); 1412 1413 (* Run the main "interpreter loop" now. *) 1414 Toplevel.main_loop the_fpm the_execution_engine stream; 1415 1416 (* Print out all the generated code. *) 1417 dump_module Codegen.the_module 1418 ;; 1419 1420 main () 1421 1422 bindings.c 1423 .. code-block:: c 1424 1425 #include <stdio.h> 1426 1427 /* putchard - putchar that takes a double and returns 0. */ 1428 extern double putchard(double X) { 1429 putchar((char)X); 1430 return 0; 1431 } 1432 1433 /* printd - printf that takes a double prints it as "%f\n", returning 0. */ 1434 extern double printd(double X) { 1435 printf("%f\n", X); 1436 return 0; 1437 } 1438 1439 `Next: Extending the language: mutable variables / SSA 1440 construction <OCamlLangImpl7.html>`_ 1441 1442