1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 2 "http://www.w3.org/TR/html4/strict.dtd"> 3 4 <html> 5 <head> 6 <title>Kaleidoscope: Extending the Language: User-defined Operators</title> 7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 8 <meta name="author" content="Chris Lattner"> 9 <meta name="author" content="Erick Tryzelaar"> 10 <link rel="stylesheet" href="../llvm.css" type="text/css"> 11 </head> 12 13 <body> 14 15 <h1>Kaleidoscope: Extending the Language: User-defined Operators</h1> 16 17 <ul> 18 <li><a href="index.html">Up to Tutorial Index</a></li> 19 <li>Chapter 6 20 <ol> 21 <li><a href="#intro">Chapter 6 Introduction</a></li> 22 <li><a href="#idea">User-defined Operators: the Idea</a></li> 23 <li><a href="#binary">User-defined Binary Operators</a></li> 24 <li><a href="#unary">User-defined Unary Operators</a></li> 25 <li><a href="#example">Kicking the Tires</a></li> 26 <li><a href="#code">Full Code Listing</a></li> 27 </ol> 28 </li> 29 <li><a href="OCamlLangImpl7.html">Chapter 7</a>: Extending the Language: Mutable 30 Variables / SSA Construction</li> 31 </ul> 32 33 <div class="doc_author"> 34 <p> 35 Written by <a href="mailto:sabre (a] nondot.org">Chris Lattner</a> 36 and <a href="mailto:idadesub (a] users.sourceforge.net">Erick Tryzelaar</a> 37 </p> 38 </div> 39 40 <!-- *********************************************************************** --> 41 <h2><a name="intro">Chapter 6 Introduction</a></h2> 42 <!-- *********************************************************************** --> 43 44 <div> 45 46 <p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language 47 with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully 48 functional language that is fairly minimal, but also useful. There 49 is still one big problem with it, however. Our language doesn't have many 50 useful operators (like division, logical negation, or even any comparisons 51 besides less-than).</p> 52 53 <p>This chapter of the tutorial takes a wild digression into adding user-defined 54 operators to the simple and beautiful Kaleidoscope language. This digression now 55 gives us a simple and ugly language in some ways, but also a powerful one at the 56 same time. One of the great things about creating your own language is that you 57 get to decide what is good or bad. In this tutorial we'll assume that it is 58 okay to use this as a way to show some interesting parsing techniques.</p> 59 60 <p>At the end of this tutorial, we'll run through an example Kaleidoscope 61 application that <a href="#example">renders the Mandelbrot set</a>. This gives 62 an example of what you can build with Kaleidoscope and its feature set.</p> 63 64 </div> 65 66 <!-- *********************************************************************** --> 67 <h2><a name="idea">User-defined Operators: the Idea</a></h2> 68 <!-- *********************************************************************** --> 69 70 <div> 71 72 <p> 73 The "operator overloading" that we will add to Kaleidoscope is more general than 74 languages like C++. In C++, you are only allowed to redefine existing 75 operators: you can't programatically change the grammar, introduce new 76 operators, change precedence levels, etc. In this chapter, we will add this 77 capability to Kaleidoscope, which will let the user round out the set of 78 operators that are supported.</p> 79 80 <p>The point of going into user-defined operators in a tutorial like this is to 81 show the power and flexibility of using a hand-written parser. Thus far, the parser 82 we have been implementing uses recursive descent for most parts of the grammar and 83 operator precedence parsing for the expressions. See <a 84 href="OCamlLangImpl2.html">Chapter 2</a> for details. Without using operator 85 precedence parsing, it would be very difficult to allow the programmer to 86 introduce new operators into the grammar: the grammar is dynamically extensible 87 as the JIT runs.</p> 88 89 <p>The two specific features we'll add are programmable unary operators (right 90 now, Kaleidoscope has no unary operators at all) as well as binary operators. 91 An example of this is:</p> 92 93 <div class="doc_code"> 94 <pre> 95 # Logical unary not. 96 def unary!(v) 97 if v then 98 0 99 else 100 1; 101 102 # Define > with the same precedence as <. 103 def binary> 10 (LHS RHS) 104 RHS < LHS; 105 106 # Binary "logical or", (note that it does not "short circuit") 107 def binary| 5 (LHS RHS) 108 if LHS then 109 1 110 else if RHS then 111 1 112 else 113 0; 114 115 # Define = with slightly lower precedence than relationals. 116 def binary= 9 (LHS RHS) 117 !(LHS < RHS | LHS > RHS); 118 </pre> 119 </div> 120 121 <p>Many languages aspire to being able to implement their standard runtime 122 library in the language itself. In Kaleidoscope, we can implement significant 123 parts of the language in the library!</p> 124 125 <p>We will break down implementation of these features into two parts: 126 implementing support for user-defined binary operators and adding unary 127 operators.</p> 128 129 </div> 130 131 <!-- *********************************************************************** --> 132 <h2><a name="binary">User-defined Binary Operators</a></h2> 133 <!-- *********************************************************************** --> 134 135 <div> 136 137 <p>Adding support for user-defined binary operators is pretty simple with our 138 current framework. We'll first add support for the unary/binary keywords:</p> 139 140 <div class="doc_code"> 141 <pre> 142 type token = 143 ... 144 <b>(* operators *) 145 | Binary | Unary</b> 146 147 ... 148 149 and lex_ident buffer = parser 150 ... 151 | "for" -> [< 'Token.For; stream >] 152 | "in" -> [< 'Token.In; stream >] 153 <b>| "binary" -> [< 'Token.Binary; stream >] 154 | "unary" -> [< 'Token.Unary; stream >]</b> 155 </pre> 156 </div> 157 158 <p>This just adds lexer support for the unary and binary keywords, like we 159 did in <a href="OCamlLangImpl5.html#iflexer">previous chapters</a>. One nice 160 thing about our current AST, is that we represent binary operators with full 161 generalisation by using their ASCII code as the opcode. For our extended 162 operators, we'll use this same representation, so we don't need any new AST or 163 parser support.</p> 164 165 <p>On the other hand, we have to be able to represent the definitions of these 166 new operators, in the "def binary| 5" part of the function definition. In our 167 grammar so far, the "name" for the function definition is parsed as the 168 "prototype" production and into the <tt>Ast.Prototype</tt> AST node. To 169 represent our new user-defined operators as prototypes, we have to extend 170 the <tt>Ast.Prototype</tt> AST node like this:</p> 171 172 <div class="doc_code"> 173 <pre> 174 (* proto - This type represents the "prototype" for a function, which captures 175 * its name, and its argument names (thus implicitly the number of arguments the 176 * function takes). *) 177 type proto = 178 | Prototype of string * string array 179 <b>| BinOpPrototype of string * string array * int</b> 180 </pre> 181 </div> 182 183 <p>Basically, in addition to knowing a name for the prototype, we now keep track 184 of whether it was an operator, and if it was, what precedence level the operator 185 is at. The precedence is only used for binary operators (as you'll see below, 186 it just doesn't apply for unary operators). Now that we have a way to represent 187 the prototype for a user-defined operator, we need to parse it:</p> 188 189 <div class="doc_code"> 190 <pre> 191 (* prototype 192 * ::= id '(' id* ')' 193 <b>* ::= binary LETTER number? (id, id) 194 * ::= unary LETTER number? (id) *)</b> 195 let parse_prototype = 196 let rec parse_args accumulator = parser 197 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 198 | [< >] -> accumulator 199 in 200 let parse_operator = parser 201 | [< 'Token.Unary >] -> "unary", 1 202 | [< 'Token.Binary >] -> "binary", 2 203 in 204 let parse_binary_precedence = parser 205 | [< 'Token.Number n >] -> int_of_float n 206 | [< >] -> 30 207 in 208 parser 209 | [< 'Token.Ident id; 210 'Token.Kwd '(' ?? "expected '(' in prototype"; 211 args=parse_args []; 212 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 213 (* success. *) 214 Ast.Prototype (id, Array.of_list (List.rev args)) 215 <b>| [< (prefix, kind)=parse_operator; 216 'Token.Kwd op ?? "expected an operator"; 217 (* Read the precedence if present. *) 218 binary_precedence=parse_binary_precedence; 219 'Token.Kwd '(' ?? "expected '(' in prototype"; 220 args=parse_args []; 221 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 222 let name = prefix ^ (String.make 1 op) in 223 let args = Array.of_list (List.rev args) in 224 225 (* Verify right number of arguments for operator. *) 226 if Array.length args != kind 227 then raise (Stream.Error "invalid number of operands for operator") 228 else 229 if kind == 1 then 230 Ast.Prototype (name, args) 231 else 232 Ast.BinOpPrototype (name, args, binary_precedence)</b> 233 | [< >] -> 234 raise (Stream.Error "expected function name in prototype") 235 </pre> 236 </div> 237 238 <p>This is all fairly straightforward parsing code, and we have already seen 239 a lot of similar code in the past. One interesting part about the code above is 240 the couple lines that set up <tt>name</tt> for binary operators. This builds 241 names like "binary@" for a newly defined "@" operator. This then takes 242 advantage of the fact that symbol names in the LLVM symbol table are allowed to 243 have any character in them, including embedded nul characters.</p> 244 245 <p>The next interesting thing to add, is codegen support for these binary 246 operators. Given our current structure, this is a simple addition of a default 247 case for our existing binary operator node:</p> 248 249 <div class="doc_code"> 250 <pre> 251 let codegen_expr = function 252 ... 253 | Ast.Binary (op, lhs, rhs) -> 254 let lhs_val = codegen_expr lhs in 255 let rhs_val = codegen_expr rhs in 256 begin 257 match op with 258 | '+' -> build_add lhs_val rhs_val "addtmp" builder 259 | '-' -> build_sub lhs_val rhs_val "subtmp" builder 260 | '*' -> build_mul lhs_val rhs_val "multmp" builder 261 | '<' -> 262 (* Convert bool 0/1 to double 0.0 or 1.0 *) 263 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in 264 build_uitofp i double_type "booltmp" builder 265 <b>| _ -> 266 (* If it wasn't a builtin binary operator, it must be a user defined 267 * one. Emit a call to it. *) 268 let callee = "binary" ^ (String.make 1 op) in 269 let callee = 270 match lookup_function callee the_module with 271 | Some callee -> callee 272 | None -> raise (Error "binary operator not found!") 273 in 274 build_call callee [|lhs_val; rhs_val|] "binop" builder</b> 275 end 276 </pre> 277 </div> 278 279 <p>As you can see above, the new code is actually really simple. It just does 280 a lookup for the appropriate operator in the symbol table and generates a 281 function call to it. Since user-defined operators are just built as normal 282 functions (because the "prototype" boils down to a function with the right 283 name) everything falls into place.</p> 284 285 <p>The final piece of code we are missing, is a bit of top level magic:</p> 286 287 <div class="doc_code"> 288 <pre> 289 let codegen_func the_fpm = function 290 | Ast.Function (proto, body) -> 291 Hashtbl.clear named_values; 292 let the_function = codegen_proto proto in 293 294 <b>(* If this is an operator, install it. *) 295 begin match proto with 296 | Ast.BinOpPrototype (name, args, prec) -> 297 let op = name.[String.length name - 1] in 298 Hashtbl.add Parser.binop_precedence op prec; 299 | _ -> () 300 end;</b> 301 302 (* Create a new basic block to start insertion into. *) 303 let bb = append_block context "entry" the_function in 304 position_at_end bb builder; 305 ... 306 </pre> 307 </div> 308 309 <p>Basically, before codegening a function, if it is a user-defined operator, we 310 register it in the precedence table. This allows the binary operator parsing 311 logic we already have in place to handle it. Since we are working on a 312 fully-general operator precedence parser, this is all we need to do to "extend 313 the grammar".</p> 314 315 <p>Now we have useful user-defined binary operators. This builds a lot 316 on the previous framework we built for other operators. Adding unary operators 317 is a bit more challenging, because we don't have any framework for it yet - lets 318 see what it takes.</p> 319 320 </div> 321 322 <!-- *********************************************************************** --> 323 <h2><a name="unary">User-defined Unary Operators</a></h2> 324 <!-- *********************************************************************** --> 325 326 <div> 327 328 <p>Since we don't currently support unary operators in the Kaleidoscope 329 language, we'll need to add everything to support them. Above, we added simple 330 support for the 'unary' keyword to the lexer. In addition to that, we need an 331 AST node:</p> 332 333 <div class="doc_code"> 334 <pre> 335 type expr = 336 ... 337 (* variant for a unary operator. *) 338 | Unary of char * expr 339 ... 340 </pre> 341 </div> 342 343 <p>This AST node is very simple and obvious by now. It directly mirrors the 344 binary operator AST node, except that it only has one child. With this, we 345 need to add the parsing logic. Parsing a unary operator is pretty simple: we'll 346 add a new function to do it:</p> 347 348 <div class="doc_code"> 349 <pre> 350 (* unary 351 * ::= primary 352 * ::= '!' unary *) 353 and parse_unary = parser 354 (* If this is a unary operator, read it. *) 355 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] -> 356 Ast.Unary (op, operand) 357 358 (* If the current token is not an operator, it must be a primary expr. *) 359 | [< stream >] -> parse_primary stream 360 </pre> 361 </div> 362 363 <p>The grammar we add is pretty straightforward here. If we see a unary 364 operator when parsing a primary operator, we eat the operator as a prefix and 365 parse the remaining piece as another unary operator. This allows us to handle 366 multiple unary operators (e.g. "!!x"). Note that unary operators can't have 367 ambiguous parses like binary operators can, so there is no need for precedence 368 information.</p> 369 370 <p>The problem with this function, is that we need to call ParseUnary from 371 somewhere. To do this, we change previous callers of ParsePrimary to call 372 <tt>parse_unary</tt> instead:</p> 373 374 <div class="doc_code"> 375 <pre> 376 (* binoprhs 377 * ::= ('+' primary)* *) 378 and parse_bin_rhs expr_prec lhs stream = 379 ... 380 <b>(* Parse the unary expression after the binary operator. *) 381 let rhs = parse_unary stream in</b> 382 ... 383 384 ... 385 386 (* expression 387 * ::= primary binoprhs *) 388 and parse_expr = parser 389 | [< lhs=<b>parse_unary</b>; stream >] -> parse_bin_rhs 0 lhs stream 390 </pre> 391 </div> 392 393 <p>With these two simple changes, we are now able to parse unary operators and build the 394 AST for them. Next up, we need to add parser support for prototypes, to parse 395 the unary operator prototype. We extend the binary operator code above 396 with:</p> 397 398 <div class="doc_code"> 399 <pre> 400 (* prototype 401 * ::= id '(' id* ')' 402 * ::= binary LETTER number? (id, id) 403 <b>* ::= unary LETTER number? (id)</b> *) 404 let parse_prototype = 405 let rec parse_args accumulator = parser 406 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 407 | [< >] -> accumulator 408 in 409 <b>let parse_operator = parser 410 | [< 'Token.Unary >] -> "unary", 1 411 | [< 'Token.Binary >] -> "binary", 2 412 in</b> 413 let parse_binary_precedence = parser 414 | [< 'Token.Number n >] -> int_of_float n 415 | [< >] -> 30 416 in 417 parser 418 | [< 'Token.Ident id; 419 'Token.Kwd '(' ?? "expected '(' in prototype"; 420 args=parse_args []; 421 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 422 (* success. *) 423 Ast.Prototype (id, Array.of_list (List.rev args)) 424 <b>| [< (prefix, kind)=parse_operator; 425 'Token.Kwd op ?? "expected an operator"; 426 (* Read the precedence if present. *) 427 binary_precedence=parse_binary_precedence; 428 'Token.Kwd '(' ?? "expected '(' in prototype"; 429 args=parse_args []; 430 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 431 let name = prefix ^ (String.make 1 op) in 432 let args = Array.of_list (List.rev args) in 433 434 (* Verify right number of arguments for operator. *) 435 if Array.length args != kind 436 then raise (Stream.Error "invalid number of operands for operator") 437 else 438 if kind == 1 then 439 Ast.Prototype (name, args) 440 else 441 Ast.BinOpPrototype (name, args, binary_precedence)</b> 442 | [< >] -> 443 raise (Stream.Error "expected function name in prototype") 444 </pre> 445 </div> 446 447 <p>As with binary operators, we name unary operators with a name that includes 448 the operator character. This assists us at code generation time. Speaking of, 449 the final piece we need to add is codegen support for unary operators. It looks 450 like this:</p> 451 452 <div class="doc_code"> 453 <pre> 454 let rec codegen_expr = function 455 ... 456 | Ast.Unary (op, operand) -> 457 let operand = codegen_expr operand in 458 let callee = "unary" ^ (String.make 1 op) in 459 let callee = 460 match lookup_function callee the_module with 461 | Some callee -> callee 462 | None -> raise (Error "unknown unary operator") 463 in 464 build_call callee [|operand|] "unop" builder 465 </pre> 466 </div> 467 468 <p>This code is similar to, but simpler than, the code for binary operators. It 469 is simpler primarily because it doesn't need to handle any predefined operators. 470 </p> 471 472 </div> 473 474 <!-- *********************************************************************** --> 475 <h2><a name="example">Kicking the Tires</a></h2> 476 <!-- *********************************************************************** --> 477 478 <div> 479 480 <p>It is somewhat hard to believe, but with a few simple extensions we've 481 covered in the last chapters, we have grown a real-ish language. With this, we 482 can do a lot of interesting things, including I/O, math, and a bunch of other 483 things. For example, we can now add a nice sequencing operator (printd is 484 defined to print out the specified value and a newline):</p> 485 486 <div class="doc_code"> 487 <pre> 488 ready> <b>extern printd(x);</b> 489 Read extern: declare double @printd(double) 490 ready> <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b> 491 .. 492 ready> <b>printd(123) : printd(456) : printd(789);</b> 493 123.000000 494 456.000000 495 789.000000 496 Evaluated to 0.000000 497 </pre> 498 </div> 499 500 <p>We can also define a bunch of other "primitive" operations, such as:</p> 501 502 <div class="doc_code"> 503 <pre> 504 # Logical unary not. 505 def unary!(v) 506 if v then 507 0 508 else 509 1; 510 511 # Unary negate. 512 def unary-(v) 513 0-v; 514 515 # Define > with the same precedence as <. 516 def binary> 10 (LHS RHS) 517 RHS < LHS; 518 519 # Binary logical or, which does not short circuit. 520 def binary| 5 (LHS RHS) 521 if LHS then 522 1 523 else if RHS then 524 1 525 else 526 0; 527 528 # Binary logical and, which does not short circuit. 529 def binary& 6 (LHS RHS) 530 if !LHS then 531 0 532 else 533 !!RHS; 534 535 # Define = with slightly lower precedence than relationals. 536 def binary = 9 (LHS RHS) 537 !(LHS < RHS | LHS > RHS); 538 539 </pre> 540 </div> 541 542 543 <p>Given the previous if/then/else support, we can also define interesting 544 functions for I/O. For example, the following prints out a character whose 545 "density" reflects the value passed in: the lower the value, the denser the 546 character:</p> 547 548 <div class="doc_code"> 549 <pre> 550 ready> 551 <b> 552 extern putchard(char) 553 def printdensity(d) 554 if d > 8 then 555 putchard(32) # ' ' 556 else if d > 4 then 557 putchard(46) # '.' 558 else if d > 2 then 559 putchard(43) # '+' 560 else 561 putchard(42); # '*'</b> 562 ... 563 ready> <b>printdensity(1): printdensity(2): printdensity(3) : 564 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b> 565 *++.. 566 Evaluated to 0.000000 567 </pre> 568 </div> 569 570 <p>Based on these simple primitive operations, we can start to define more 571 interesting things. For example, here's a little function that solves for the 572 number of iterations it takes a function in the complex plane to 573 converge:</p> 574 575 <div class="doc_code"> 576 <pre> 577 # determine whether the specific location diverges. 578 # Solve for z = z^2 + c in the complex plane. 579 def mandleconverger(real imag iters creal cimag) 580 if iters > 255 | (real*real + imag*imag > 4) then 581 iters 582 else 583 mandleconverger(real*real - imag*imag + creal, 584 2*real*imag + cimag, 585 iters+1, creal, cimag); 586 587 # return the number of iterations required for the iteration to escape 588 def mandleconverge(real imag) 589 mandleconverger(real, imag, 0, real, imag); 590 </pre> 591 </div> 592 593 <p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis 594 for computation of the <a 595 href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our 596 <tt>mandelconverge</tt> function returns the number of iterations that it takes 597 for a complex orbit to escape, saturating to 255. This is not a very useful 598 function by itself, but if you plot its value over a two-dimensional plane, 599 you can see the Mandelbrot set. Given that we are limited to using putchard 600 here, our amazing graphical output is limited, but we can whip together 601 something using the density plotter above:</p> 602 603 <div class="doc_code"> 604 <pre> 605 # compute and plot the mandlebrot set with the specified 2 dimensional range 606 # info. 607 def mandelhelp(xmin xmax xstep ymin ymax ystep) 608 for y = ymin, y < ymax, ystep in ( 609 (for x = xmin, x < xmax, xstep in 610 printdensity(mandleconverge(x,y))) 611 : putchard(10) 612 ) 613 614 # mandel - This is a convenient helper function for ploting the mandelbrot set 615 # from the specified position with the specified Magnification. 616 def mandel(realstart imagstart realmag imagmag) 617 mandelhelp(realstart, realstart+realmag*78, realmag, 618 imagstart, imagstart+imagmag*40, imagmag); 619 </pre> 620 </div> 621 622 <p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p> 623 624 <div class="doc_code"> 625 <pre> 626 ready> <b>mandel(-2.3, -1.3, 0.05, 0.07);</b> 627 *******************************+++++++++++************************************* 628 *************************+++++++++++++++++++++++******************************* 629 **********************+++++++++++++++++++++++++++++**************************** 630 *******************+++++++++++++++++++++.. ...++++++++************************* 631 *****************++++++++++++++++++++++.... ...+++++++++*********************** 632 ***************+++++++++++++++++++++++..... ...+++++++++********************* 633 **************+++++++++++++++++++++++.... ....+++++++++******************** 634 *************++++++++++++++++++++++...... .....++++++++******************* 635 ************+++++++++++++++++++++....... .......+++++++****************** 636 ***********+++++++++++++++++++.... ... .+++++++***************** 637 **********+++++++++++++++++....... .+++++++**************** 638 *********++++++++++++++........... ...+++++++*************** 639 ********++++++++++++............ ...++++++++************** 640 ********++++++++++... .......... .++++++++************** 641 *******+++++++++..... .+++++++++************* 642 *******++++++++...... ..+++++++++************* 643 *******++++++....... ..+++++++++************* 644 *******+++++...... ..+++++++++************* 645 *******.... .... ...+++++++++************* 646 *******.... . ...+++++++++************* 647 *******+++++...... ...+++++++++************* 648 *******++++++....... ..+++++++++************* 649 *******++++++++...... .+++++++++************* 650 *******+++++++++..... ..+++++++++************* 651 ********++++++++++... .......... .++++++++************** 652 ********++++++++++++............ ...++++++++************** 653 *********++++++++++++++.......... ...+++++++*************** 654 **********++++++++++++++++........ .+++++++**************** 655 **********++++++++++++++++++++.... ... ..+++++++**************** 656 ***********++++++++++++++++++++++....... .......++++++++***************** 657 ************+++++++++++++++++++++++...... ......++++++++****************** 658 **************+++++++++++++++++++++++.... ....++++++++******************** 659 ***************+++++++++++++++++++++++..... ...+++++++++********************* 660 *****************++++++++++++++++++++++.... ...++++++++*********************** 661 *******************+++++++++++++++++++++......++++++++************************* 662 *********************++++++++++++++++++++++.++++++++*************************** 663 *************************+++++++++++++++++++++++******************************* 664 ******************************+++++++++++++************************************ 665 ******************************************************************************* 666 ******************************************************************************* 667 ******************************************************************************* 668 Evaluated to 0.000000 669 ready> <b>mandel(-2, -1, 0.02, 0.04);</b> 670 **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++ 671 ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 672 *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++. 673 *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++... 674 *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++..... 675 ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........ 676 **************++++++++++++++++++++++++++++++++++++++++++++++++++++++........... 677 ************+++++++++++++++++++++++++++++++++++++++++++++++++++++.............. 678 ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ . 679 **********++++++++++++++++++++++++++++++++++++++++++++++............. 680 ********+++++++++++++++++++++++++++++++++++++++++++.................. 681 *******+++++++++++++++++++++++++++++++++++++++....................... 682 ******+++++++++++++++++++++++++++++++++++........................... 683 *****++++++++++++++++++++++++++++++++............................ 684 *****++++++++++++++++++++++++++++............................... 685 ****++++++++++++++++++++++++++...... ......................... 686 ***++++++++++++++++++++++++......... ...... ........... 687 ***++++++++++++++++++++++............ 688 **+++++++++++++++++++++.............. 689 **+++++++++++++++++++................ 690 *++++++++++++++++++................. 691 *++++++++++++++++............ ... 692 *++++++++++++++.............. 693 *+++....++++................ 694 *.......... ........... 695 * 696 *.......... ........... 697 *+++....++++................ 698 *++++++++++++++.............. 699 *++++++++++++++++............ ... 700 *++++++++++++++++++................. 701 **+++++++++++++++++++................ 702 **+++++++++++++++++++++.............. 703 ***++++++++++++++++++++++............ 704 ***++++++++++++++++++++++++......... ...... ........... 705 ****++++++++++++++++++++++++++...... ......................... 706 *****++++++++++++++++++++++++++++............................... 707 *****++++++++++++++++++++++++++++++++............................ 708 ******+++++++++++++++++++++++++++++++++++........................... 709 *******+++++++++++++++++++++++++++++++++++++++....................... 710 ********+++++++++++++++++++++++++++++++++++++++++++.................. 711 Evaluated to 0.000000 712 ready> <b>mandel(-0.9, -1.4, 0.02, 0.03);</b> 713 ******************************************************************************* 714 ******************************************************************************* 715 ******************************************************************************* 716 **********+++++++++++++++++++++************************************************ 717 *+++++++++++++++++++++++++++++++++++++++*************************************** 718 +++++++++++++++++++++++++++++++++++++++++++++********************************** 719 ++++++++++++++++++++++++++++++++++++++++++++++++++***************************** 720 ++++++++++++++++++++++++++++++++++++++++++++++++++++++************************* 721 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++********************** 722 +++++++++++++++++++++++++++++++++.........++++++++++++++++++******************* 723 +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++**************** 724 +++++++++++++++++++++++++++++....... ........+++++++++++++++++++************** 725 ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************ 726 +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++********** 727 ++++++++++++++++++++++++++........... ....++++++++++++++++++++++******** 728 ++++++++++++++++++++++++............. .......++++++++++++++++++++++****** 729 +++++++++++++++++++++++............. ........+++++++++++++++++++++++**** 730 ++++++++++++++++++++++........... ..........++++++++++++++++++++++*** 731 ++++++++++++++++++++........... .........++++++++++++++++++++++* 732 ++++++++++++++++++............ ...........++++++++++++++++++++ 733 ++++++++++++++++............... .............++++++++++++++++++ 734 ++++++++++++++................. ...............++++++++++++++++ 735 ++++++++++++.................. .................++++++++++++++ 736 +++++++++.................. .................+++++++++++++ 737 ++++++........ . ......... ..++++++++++++ 738 ++............ ...... ....++++++++++ 739 .............. ...++++++++++ 740 .............. ....+++++++++ 741 .............. .....++++++++ 742 ............. ......++++++++ 743 ........... .......++++++++ 744 ......... ........+++++++ 745 ......... ........+++++++ 746 ......... ....+++++++ 747 ........ ...+++++++ 748 ....... ...+++++++ 749 ....+++++++ 750 .....+++++++ 751 ....+++++++ 752 ....+++++++ 753 ....+++++++ 754 Evaluated to 0.000000 755 ready> <b>^D</b> 756 </pre> 757 </div> 758 759 <p>At this point, you may be starting to realize that Kaleidoscope is a real 760 and powerful language. It may not be self-similar :), but it can be used to 761 plot things that are!</p> 762 763 <p>With this, we conclude the "adding user-defined operators" chapter of the 764 tutorial. We have successfully augmented our language, adding the ability to 765 extend the language in the library, and we have shown how this can be used to 766 build a simple but interesting end-user application in Kaleidoscope. At this 767 point, Kaleidoscope can build a variety of applications that are functional and 768 can call functions with side-effects, but it can't actually define and mutate a 769 variable itself.</p> 770 771 <p>Strikingly, variable mutation is an important feature of some 772 languages, and it is not at all obvious how to <a href="OCamlLangImpl7.html">add 773 support for mutable variables</a> without having to add an "SSA construction" 774 phase to your front-end. In the next chapter, we will describe how you can 775 add variable mutation without building SSA in your front-end.</p> 776 777 </div> 778 779 780 <!-- *********************************************************************** --> 781 <h2><a name="code">Full Code Listing</a></h2> 782 <!-- *********************************************************************** --> 783 784 <div> 785 786 <p> 787 Here is the complete code listing for our running example, enhanced with the 788 if/then/else and for expressions.. To build this example, use: 789 </p> 790 791 <div class="doc_code"> 792 <pre> 793 # Compile 794 ocamlbuild toy.byte 795 # Run 796 ./toy.byte 797 </pre> 798 </div> 799 800 <p>Here is the code:</p> 801 802 <dl> 803 <dt>_tags:</dt> 804 <dd class="doc_code"> 805 <pre> 806 <{lexer,parser}.ml>: use_camlp4, pp(camlp4of) 807 <*.{byte,native}>: g++, use_llvm, use_llvm_analysis 808 <*.{byte,native}>: use_llvm_executionengine, use_llvm_target 809 <*.{byte,native}>: use_llvm_scalar_opts, use_bindings 810 </pre> 811 </dd> 812 813 <dt>myocamlbuild.ml:</dt> 814 <dd class="doc_code"> 815 <pre> 816 open Ocamlbuild_plugin;; 817 818 ocaml_lib ~extern:true "llvm";; 819 ocaml_lib ~extern:true "llvm_analysis";; 820 ocaml_lib ~extern:true "llvm_executionengine";; 821 ocaml_lib ~extern:true "llvm_target";; 822 ocaml_lib ~extern:true "llvm_scalar_opts";; 823 824 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);; 825 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];; 826 </pre> 827 </dd> 828 829 <dt>token.ml:</dt> 830 <dd class="doc_code"> 831 <pre> 832 (*===----------------------------------------------------------------------=== 833 * Lexer Tokens 834 *===----------------------------------------------------------------------===*) 835 836 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of 837 * these others for known things. *) 838 type token = 839 (* commands *) 840 | Def | Extern 841 842 (* primary *) 843 | Ident of string | Number of float 844 845 (* unknown *) 846 | Kwd of char 847 848 (* control *) 849 | If | Then | Else 850 | For | In 851 852 (* operators *) 853 | Binary | Unary 854 </pre> 855 </dd> 856 857 <dt>lexer.ml:</dt> 858 <dd class="doc_code"> 859 <pre> 860 (*===----------------------------------------------------------------------=== 861 * Lexer 862 *===----------------------------------------------------------------------===*) 863 864 let rec lex = parser 865 (* Skip any whitespace. *) 866 | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream 867 868 (* identifier: [a-zA-Z][a-zA-Z0-9] *) 869 | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] -> 870 let buffer = Buffer.create 1 in 871 Buffer.add_char buffer c; 872 lex_ident buffer stream 873 874 (* number: [0-9.]+ *) 875 | [< ' ('0' .. '9' as c); stream >] -> 876 let buffer = Buffer.create 1 in 877 Buffer.add_char buffer c; 878 lex_number buffer stream 879 880 (* Comment until end of line. *) 881 | [< ' ('#'); stream >] -> 882 lex_comment stream 883 884 (* Otherwise, just return the character as its ascii value. *) 885 | [< 'c; stream >] -> 886 [< 'Token.Kwd c; lex stream >] 887 888 (* end of stream. *) 889 | [< >] -> [< >] 890 891 and lex_number buffer = parser 892 | [< ' ('0' .. '9' | '.' as c); stream >] -> 893 Buffer.add_char buffer c; 894 lex_number buffer stream 895 | [< stream=lex >] -> 896 [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >] 897 898 and lex_ident buffer = parser 899 | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] -> 900 Buffer.add_char buffer c; 901 lex_ident buffer stream 902 | [< stream=lex >] -> 903 match Buffer.contents buffer with 904 | "def" -> [< 'Token.Def; stream >] 905 | "extern" -> [< 'Token.Extern; stream >] 906 | "if" -> [< 'Token.If; stream >] 907 | "then" -> [< 'Token.Then; stream >] 908 | "else" -> [< 'Token.Else; stream >] 909 | "for" -> [< 'Token.For; stream >] 910 | "in" -> [< 'Token.In; stream >] 911 | "binary" -> [< 'Token.Binary; stream >] 912 | "unary" -> [< 'Token.Unary; stream >] 913 | id -> [< 'Token.Ident id; stream >] 914 915 and lex_comment = parser 916 | [< ' ('\n'); stream=lex >] -> stream 917 | [< 'c; e=lex_comment >] -> e 918 | [< >] -> [< >] 919 </pre> 920 </dd> 921 922 <dt>ast.ml:</dt> 923 <dd class="doc_code"> 924 <pre> 925 (*===----------------------------------------------------------------------=== 926 * Abstract Syntax Tree (aka Parse Tree) 927 *===----------------------------------------------------------------------===*) 928 929 (* expr - Base type for all expression nodes. *) 930 type expr = 931 (* variant for numeric literals like "1.0". *) 932 | Number of float 933 934 (* variant for referencing a variable, like "a". *) 935 | Variable of string 936 937 (* variant for a unary operator. *) 938 | Unary of char * expr 939 940 (* variant for a binary operator. *) 941 | Binary of char * expr * expr 942 943 (* variant for function calls. *) 944 | Call of string * expr array 945 946 (* variant for if/then/else. *) 947 | If of expr * expr * expr 948 949 (* variant for for/in. *) 950 | For of string * expr * expr * expr option * expr 951 952 (* proto - This type represents the "prototype" for a function, which captures 953 * its name, and its argument names (thus implicitly the number of arguments the 954 * function takes). *) 955 type proto = 956 | Prototype of string * string array 957 | BinOpPrototype of string * string array * int 958 959 (* func - This type represents a function definition itself. *) 960 type func = Function of proto * expr 961 </pre> 962 </dd> 963 964 <dt>parser.ml:</dt> 965 <dd class="doc_code"> 966 <pre> 967 (*===---------------------------------------------------------------------=== 968 * Parser 969 *===---------------------------------------------------------------------===*) 970 971 (* binop_precedence - This holds the precedence for each binary operator that is 972 * defined *) 973 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 974 975 (* precedence - Get the precedence of the pending binary operator token. *) 976 let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 977 978 (* primary 979 * ::= identifier 980 * ::= numberexpr 981 * ::= parenexpr 982 * ::= ifexpr 983 * ::= forexpr *) 984 let rec parse_primary = parser 985 (* numberexpr ::= number *) 986 | [< 'Token.Number n >] -> Ast.Number n 987 988 (* parenexpr ::= '(' expression ')' *) 989 | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e 990 991 (* identifierexpr 992 * ::= identifier 993 * ::= identifier '(' argumentexpr ')' *) 994 | [< 'Token.Ident id; stream >] -> 995 let rec parse_args accumulator = parser 996 | [< e=parse_expr; stream >] -> 997 begin parser 998 | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e 999 | [< >] -> e :: accumulator 1000 end stream 1001 | [< >] -> accumulator 1002 in 1003 let rec parse_ident id = parser 1004 (* Call. *) 1005 | [< 'Token.Kwd '('; 1006 args=parse_args []; 1007 'Token.Kwd ')' ?? "expected ')'">] -> 1008 Ast.Call (id, Array.of_list (List.rev args)) 1009 1010 (* Simple variable ref. *) 1011 | [< >] -> Ast.Variable id 1012 in 1013 parse_ident id stream 1014 1015 (* ifexpr ::= 'if' expr 'then' expr 'else' expr *) 1016 | [< 'Token.If; c=parse_expr; 1017 'Token.Then ?? "expected 'then'"; t=parse_expr; 1018 'Token.Else ?? "expected 'else'"; e=parse_expr >] -> 1019 Ast.If (c, t, e) 1020 1021 (* forexpr 1022 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *) 1023 | [< 'Token.For; 1024 'Token.Ident id ?? "expected identifier after for"; 1025 'Token.Kwd '=' ?? "expected '=' after for"; 1026 stream >] -> 1027 begin parser 1028 | [< 1029 start=parse_expr; 1030 'Token.Kwd ',' ?? "expected ',' after for"; 1031 end_=parse_expr; 1032 stream >] -> 1033 let step = 1034 begin parser 1035 | [< 'Token.Kwd ','; step=parse_expr >] -> Some step 1036 | [< >] -> None 1037 end stream 1038 in 1039 begin parser 1040 | [< 'Token.In; body=parse_expr >] -> 1041 Ast.For (id, start, end_, step, body) 1042 | [< >] -> 1043 raise (Stream.Error "expected 'in' after for") 1044 end stream 1045 | [< >] -> 1046 raise (Stream.Error "expected '=' after for") 1047 end stream 1048 1049 | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") 1050 1051 (* unary 1052 * ::= primary 1053 * ::= '!' unary *) 1054 and parse_unary = parser 1055 (* If this is a unary operator, read it. *) 1056 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] -> 1057 Ast.Unary (op, operand) 1058 1059 (* If the current token is not an operator, it must be a primary expr. *) 1060 | [< stream >] -> parse_primary stream 1061 1062 (* binoprhs 1063 * ::= ('+' primary)* *) 1064 and parse_bin_rhs expr_prec lhs stream = 1065 match Stream.peek stream with 1066 (* If this is a binop, find its precedence. *) 1067 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> 1068 let token_prec = precedence c in 1069 1070 (* If this is a binop that binds at least as tightly as the current binop, 1071 * consume it, otherwise we are done. *) 1072 if token_prec < expr_prec then lhs else begin 1073 (* Eat the binop. *) 1074 Stream.junk stream; 1075 1076 (* Parse the unary expression after the binary operator. *) 1077 let rhs = parse_unary stream in 1078 1079 (* Okay, we know this is a binop. *) 1080 let rhs = 1081 match Stream.peek stream with 1082 | Some (Token.Kwd c2) -> 1083 (* If BinOp binds less tightly with rhs than the operator after 1084 * rhs, let the pending operator take rhs as its lhs. *) 1085 let next_prec = precedence c2 in 1086 if token_prec < next_prec 1087 then parse_bin_rhs (token_prec + 1) rhs stream 1088 else rhs 1089 | _ -> rhs 1090 in 1091 1092 (* Merge lhs/rhs. *) 1093 let lhs = Ast.Binary (c, lhs, rhs) in 1094 parse_bin_rhs expr_prec lhs stream 1095 end 1096 | _ -> lhs 1097 1098 (* expression 1099 * ::= primary binoprhs *) 1100 and parse_expr = parser 1101 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream 1102 1103 (* prototype 1104 * ::= id '(' id* ')' 1105 * ::= binary LETTER number? (id, id) 1106 * ::= unary LETTER number? (id) *) 1107 let parse_prototype = 1108 let rec parse_args accumulator = parser 1109 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 1110 | [< >] -> accumulator 1111 in 1112 let parse_operator = parser 1113 | [< 'Token.Unary >] -> "unary", 1 1114 | [< 'Token.Binary >] -> "binary", 2 1115 in 1116 let parse_binary_precedence = parser 1117 | [< 'Token.Number n >] -> int_of_float n 1118 | [< >] -> 30 1119 in 1120 parser 1121 | [< 'Token.Ident id; 1122 'Token.Kwd '(' ?? "expected '(' in prototype"; 1123 args=parse_args []; 1124 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1125 (* success. *) 1126 Ast.Prototype (id, Array.of_list (List.rev args)) 1127 | [< (prefix, kind)=parse_operator; 1128 'Token.Kwd op ?? "expected an operator"; 1129 (* Read the precedence if present. *) 1130 binary_precedence=parse_binary_precedence; 1131 'Token.Kwd '(' ?? "expected '(' in prototype"; 1132 args=parse_args []; 1133 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 1134 let name = prefix ^ (String.make 1 op) in 1135 let args = Array.of_list (List.rev args) in 1136 1137 (* Verify right number of arguments for operator. *) 1138 if Array.length args != kind 1139 then raise (Stream.Error "invalid number of operands for operator") 1140 else 1141 if kind == 1 then 1142 Ast.Prototype (name, args) 1143 else 1144 Ast.BinOpPrototype (name, args, binary_precedence) 1145 | [< >] -> 1146 raise (Stream.Error "expected function name in prototype") 1147 1148 (* definition ::= 'def' prototype expression *) 1149 let parse_definition = parser 1150 | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> 1151 Ast.Function (p, e) 1152 1153 (* toplevelexpr ::= expression *) 1154 let parse_toplevel = parser 1155 | [< e=parse_expr >] -> 1156 (* Make an anonymous proto. *) 1157 Ast.Function (Ast.Prototype ("", [||]), e) 1158 1159 (* external ::= 'extern' prototype *) 1160 let parse_extern = parser 1161 | [< 'Token.Extern; e=parse_prototype >] -> e 1162 </pre> 1163 </dd> 1164 1165 <dt>codegen.ml:</dt> 1166 <dd class="doc_code"> 1167 <pre> 1168 (*===----------------------------------------------------------------------=== 1169 * Code Generation 1170 *===----------------------------------------------------------------------===*) 1171 1172 open Llvm 1173 1174 exception Error of string 1175 1176 let context = global_context () 1177 let the_module = create_module context "my cool jit" 1178 let builder = builder context 1179 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10 1180 let double_type = double_type context 1181 1182 let rec codegen_expr = function 1183 | Ast.Number n -> const_float double_type n 1184 | Ast.Variable name -> 1185 (try Hashtbl.find named_values name with 1186 | Not_found -> raise (Error "unknown variable name")) 1187 | Ast.Unary (op, operand) -> 1188 let operand = codegen_expr operand in 1189 let callee = "unary" ^ (String.make 1 op) in 1190 let callee = 1191 match lookup_function callee the_module with 1192 | Some callee -> callee 1193 | None -> raise (Error "unknown unary operator") 1194 in 1195 build_call callee [|operand|] "unop" builder 1196 | Ast.Binary (op, lhs, rhs) -> 1197 let lhs_val = codegen_expr lhs in 1198 let rhs_val = codegen_expr rhs in 1199 begin 1200 match op with 1201 | '+' -> build_add lhs_val rhs_val "addtmp" builder 1202 | '-' -> build_sub lhs_val rhs_val "subtmp" builder 1203 | '*' -> build_mul lhs_val rhs_val "multmp" builder 1204 | '<' -> 1205 (* Convert bool 0/1 to double 0.0 or 1.0 *) 1206 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in 1207 build_uitofp i double_type "booltmp" builder 1208 | _ -> 1209 (* If it wasn't a builtin binary operator, it must be a user defined 1210 * one. Emit a call to it. *) 1211 let callee = "binary" ^ (String.make 1 op) in 1212 let callee = 1213 match lookup_function callee the_module with 1214 | Some callee -> callee 1215 | None -> raise (Error "binary operator not found!") 1216 in 1217 build_call callee [|lhs_val; rhs_val|] "binop" builder 1218 end 1219 | Ast.Call (callee, args) -> 1220 (* Look up the name in the module table. *) 1221 let callee = 1222 match lookup_function callee the_module with 1223 | Some callee -> callee 1224 | None -> raise (Error "unknown function referenced") 1225 in 1226 let params = params callee in 1227 1228 (* If argument mismatch error. *) 1229 if Array.length params == Array.length args then () else 1230 raise (Error "incorrect # arguments passed"); 1231 let args = Array.map codegen_expr args in 1232 build_call callee args "calltmp" builder 1233 | Ast.If (cond, then_, else_) -> 1234 let cond = codegen_expr cond in 1235 1236 (* Convert condition to a bool by comparing equal to 0.0 *) 1237 let zero = const_float double_type 0.0 in 1238 let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in 1239 1240 (* Grab the first block so that we might later add the conditional branch 1241 * to it at the end of the function. *) 1242 let start_bb = insertion_block builder in 1243 let the_function = block_parent start_bb in 1244 1245 let then_bb = append_block context "then" the_function in 1246 1247 (* Emit 'then' value. *) 1248 position_at_end then_bb builder; 1249 let then_val = codegen_expr then_ in 1250 1251 (* Codegen of 'then' can change the current block, update then_bb for the 1252 * phi. We create a new name because one is used for the phi node, and the 1253 * other is used for the conditional branch. *) 1254 let new_then_bb = insertion_block builder in 1255 1256 (* Emit 'else' value. *) 1257 let else_bb = append_block context "else" the_function in 1258 position_at_end else_bb builder; 1259 let else_val = codegen_expr else_ in 1260 1261 (* Codegen of 'else' can change the current block, update else_bb for the 1262 * phi. *) 1263 let new_else_bb = insertion_block builder in 1264 1265 (* Emit merge block. *) 1266 let merge_bb = append_block context "ifcont" the_function in 1267 position_at_end merge_bb builder; 1268 let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in 1269 let phi = build_phi incoming "iftmp" builder in 1270 1271 (* Return to the start block to add the conditional branch. *) 1272 position_at_end start_bb builder; 1273 ignore (build_cond_br cond_val then_bb else_bb builder); 1274 1275 (* Set a unconditional branch at the end of the 'then' block and the 1276 * 'else' block to the 'merge' block. *) 1277 position_at_end new_then_bb builder; ignore (build_br merge_bb builder); 1278 position_at_end new_else_bb builder; ignore (build_br merge_bb builder); 1279 1280 (* Finally, set the builder to the end of the merge block. *) 1281 position_at_end merge_bb builder; 1282 1283 phi 1284 | Ast.For (var_name, start, end_, step, body) -> 1285 (* Emit the start code first, without 'variable' in scope. *) 1286 let start_val = codegen_expr start in 1287 1288 (* Make the new basic block for the loop header, inserting after current 1289 * block. *) 1290 let preheader_bb = insertion_block builder in 1291 let the_function = block_parent preheader_bb in 1292 let loop_bb = append_block context "loop" the_function in 1293 1294 (* Insert an explicit fall through from the current block to the 1295 * loop_bb. *) 1296 ignore (build_br loop_bb builder); 1297 1298 (* Start insertion in loop_bb. *) 1299 position_at_end loop_bb builder; 1300 1301 (* Start the PHI node with an entry for start. *) 1302 let variable = build_phi [(start_val, preheader_bb)] var_name builder in 1303 1304 (* Within the loop, the variable is defined equal to the PHI node. If it 1305 * shadows an existing variable, we have to restore it, so save it 1306 * now. *) 1307 let old_val = 1308 try Some (Hashtbl.find named_values var_name) with Not_found -> None 1309 in 1310 Hashtbl.add named_values var_name variable; 1311 1312 (* Emit the body of the loop. This, like any other expr, can change the 1313 * current BB. Note that we ignore the value computed by the body, but 1314 * don't allow an error *) 1315 ignore (codegen_expr body); 1316 1317 (* Emit the step value. *) 1318 let step_val = 1319 match step with 1320 | Some step -> codegen_expr step 1321 (* If not specified, use 1.0. *) 1322 | None -> const_float double_type 1.0 1323 in 1324 1325 let next_var = build_add variable step_val "nextvar" builder in 1326 1327 (* Compute the end condition. *) 1328 let end_cond = codegen_expr end_ in 1329 1330 (* Convert condition to a bool by comparing equal to 0.0. *) 1331 let zero = const_float double_type 0.0 in 1332 let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in 1333 1334 (* Create the "after loop" block and insert it. *) 1335 let loop_end_bb = insertion_block builder in 1336 let after_bb = append_block context "afterloop" the_function in 1337 1338 (* Insert the conditional branch into the end of loop_end_bb. *) 1339 ignore (build_cond_br end_cond loop_bb after_bb builder); 1340 1341 (* Any new code will be inserted in after_bb. *) 1342 position_at_end after_bb builder; 1343 1344 (* Add a new entry to the PHI node for the backedge. *) 1345 add_incoming (next_var, loop_end_bb) variable; 1346 1347 (* Restore the unshadowed variable. *) 1348 begin match old_val with 1349 | Some old_val -> Hashtbl.add named_values var_name old_val 1350 | None -> () 1351 end; 1352 1353 (* for expr always returns 0.0. *) 1354 const_null double_type 1355 1356 let codegen_proto = function 1357 | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) -> 1358 (* Make the function type: double(double,double) etc. *) 1359 let doubles = Array.make (Array.length args) double_type in 1360 let ft = function_type double_type doubles in 1361 let f = 1362 match lookup_function name the_module with 1363 | None -> declare_function name ft the_module 1364 1365 (* If 'f' conflicted, there was already something named 'name'. If it 1366 * has a body, don't allow redefinition or reextern. *) 1367 | Some f -> 1368 (* If 'f' already has a body, reject this. *) 1369 if block_begin f <> At_end f then 1370 raise (Error "redefinition of function"); 1371 1372 (* If 'f' took a different number of arguments, reject. *) 1373 if element_type (type_of f) <> ft then 1374 raise (Error "redefinition of function with different # args"); 1375 f 1376 in 1377 1378 (* Set names for all arguments. *) 1379 Array.iteri (fun i a -> 1380 let n = args.(i) in 1381 set_value_name n a; 1382 Hashtbl.add named_values n a; 1383 ) (params f); 1384 f 1385 1386 let codegen_func the_fpm = function 1387 | Ast.Function (proto, body) -> 1388 Hashtbl.clear named_values; 1389 let the_function = codegen_proto proto in 1390 1391 (* If this is an operator, install it. *) 1392 begin match proto with 1393 | Ast.BinOpPrototype (name, args, prec) -> 1394 let op = name.[String.length name - 1] in 1395 Hashtbl.add Parser.binop_precedence op prec; 1396 | _ -> () 1397 end; 1398 1399 (* Create a new basic block to start insertion into. *) 1400 let bb = append_block context "entry" the_function in 1401 position_at_end bb builder; 1402 1403 try 1404 let ret_val = codegen_expr body in 1405 1406 (* Finish off the function. *) 1407 let _ = build_ret ret_val builder in 1408 1409 (* Validate the generated code, checking for consistency. *) 1410 Llvm_analysis.assert_valid_function the_function; 1411 1412 (* Optimize the function. *) 1413 let _ = PassManager.run_function the_function the_fpm in 1414 1415 the_function 1416 with e -> 1417 delete_function the_function; 1418 raise e 1419 </pre> 1420 </dd> 1421 1422 <dt>toplevel.ml:</dt> 1423 <dd class="doc_code"> 1424 <pre> 1425 (*===----------------------------------------------------------------------=== 1426 * Top-Level parsing and JIT Driver 1427 *===----------------------------------------------------------------------===*) 1428 1429 open Llvm 1430 open Llvm_executionengine 1431 1432 (* top ::= definition | external | expression | ';' *) 1433 let rec main_loop the_fpm the_execution_engine stream = 1434 match Stream.peek stream with 1435 | None -> () 1436 1437 (* ignore top-level semicolons. *) 1438 | Some (Token.Kwd ';') -> 1439 Stream.junk stream; 1440 main_loop the_fpm the_execution_engine stream 1441 1442 | Some token -> 1443 begin 1444 try match token with 1445 | Token.Def -> 1446 let e = Parser.parse_definition stream in 1447 print_endline "parsed a function definition."; 1448 dump_value (Codegen.codegen_func the_fpm e); 1449 | Token.Extern -> 1450 let e = Parser.parse_extern stream in 1451 print_endline "parsed an extern."; 1452 dump_value (Codegen.codegen_proto e); 1453 | _ -> 1454 (* Evaluate a top-level expression into an anonymous function. *) 1455 let e = Parser.parse_toplevel stream in 1456 print_endline "parsed a top-level expr"; 1457 let the_function = Codegen.codegen_func the_fpm e in 1458 dump_value the_function; 1459 1460 (* JIT the function, returning a function pointer. *) 1461 let result = ExecutionEngine.run_function the_function [||] 1462 the_execution_engine in 1463 1464 print_string "Evaluated to "; 1465 print_float (GenericValue.as_float Codegen.double_type result); 1466 print_newline (); 1467 with Stream.Error s | Codegen.Error s -> 1468 (* Skip token for error recovery. *) 1469 Stream.junk stream; 1470 print_endline s; 1471 end; 1472 print_string "ready> "; flush stdout; 1473 main_loop the_fpm the_execution_engine stream 1474 </pre> 1475 </dd> 1476 1477 <dt>toy.ml:</dt> 1478 <dd class="doc_code"> 1479 <pre> 1480 (*===----------------------------------------------------------------------=== 1481 * Main driver code. 1482 *===----------------------------------------------------------------------===*) 1483 1484 open Llvm 1485 open Llvm_executionengine 1486 open Llvm_target 1487 open Llvm_scalar_opts 1488 1489 let main () = 1490 ignore (initialize_native_target ()); 1491 1492 (* Install standard binary operators. 1493 * 1 is the lowest precedence. *) 1494 Hashtbl.add Parser.binop_precedence '<' 10; 1495 Hashtbl.add Parser.binop_precedence '+' 20; 1496 Hashtbl.add Parser.binop_precedence '-' 20; 1497 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) 1498 1499 (* Prime the first token. *) 1500 print_string "ready> "; flush stdout; 1501 let stream = Lexer.lex (Stream.of_channel stdin) in 1502 1503 (* Create the JIT. *) 1504 let the_execution_engine = ExecutionEngine.create Codegen.the_module in 1505 let the_fpm = PassManager.create_function Codegen.the_module in 1506 1507 (* Set up the optimizer pipeline. Start with registering info about how the 1508 * target lays out data structures. *) 1509 TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm; 1510 1511 (* Do simple "peephole" optimizations and bit-twiddling optzn. *) 1512 add_instruction_combination the_fpm; 1513 1514 (* reassociate expressions. *) 1515 add_reassociation the_fpm; 1516 1517 (* Eliminate Common SubExpressions. *) 1518 add_gvn the_fpm; 1519 1520 (* Simplify the control flow graph (deleting unreachable blocks, etc). *) 1521 add_cfg_simplification the_fpm; 1522 1523 ignore (PassManager.initialize the_fpm); 1524 1525 (* Run the main "interpreter loop" now. *) 1526 Toplevel.main_loop the_fpm the_execution_engine stream; 1527 1528 (* Print out all the generated code. *) 1529 dump_module Codegen.the_module 1530 ;; 1531 1532 main () 1533 </pre> 1534 </dd> 1535 1536 <dt>bindings.c</dt> 1537 <dd class="doc_code"> 1538 <pre> 1539 #include <stdio.h> 1540 1541 /* putchard - putchar that takes a double and returns 0. */ 1542 extern double putchard(double X) { 1543 putchar((char)X); 1544 return 0; 1545 } 1546 1547 /* printd - printf that takes a double prints it as "%f\n", returning 0. */ 1548 extern double printd(double X) { 1549 printf("%f\n", X); 1550 return 0; 1551 } 1552 </pre> 1553 </dd> 1554 </dl> 1555 1556 <a href="OCamlLangImpl7.html">Next: Extending the language: mutable variables / 1557 SSA construction</a> 1558 </div> 1559 1560 <!-- *********************************************************************** --> 1561 <hr> 1562 <address> 1563 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img 1564 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a> 1565 <a href="http://validator.w3.org/check/referer"><img 1566 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a> 1567 1568 <a href="mailto:sabre (a] nondot.org">Chris Lattner</a><br> 1569 <a href="mailto:idadesub (a] users.sourceforge.net">Erick Tryzelaar</a><br> 1570 <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> 1571 Last modified: $Date: 2011-04-22 20:30:22 -0400 (Fri, 22 Apr 2011) $ 1572 </address> 1573 </body> 1574 </html> 1575