Home | History | Annotate | Download | only in tutorial
      1 ============================================================
      2 Kaleidoscope: Extending the Language: User-defined Operators
      3 ============================================================
      4 
      5 .. contents::
      6    :local:
      7 
      8 Chapter 6 Introduction
      9 ======================
     10 
     11 Welcome to Chapter 6 of the "`Implementing a language with
     12 LLVM <index.html>`_" tutorial. At this point in our tutorial, we now
     13 have a fully functional language that is fairly minimal, but also
     14 useful. There is still one big problem with it, however. Our language
     15 doesn't have many useful operators (like division, logical negation, or
     16 even any comparisons besides less-than).
     17 
     18 This chapter of the tutorial takes a wild digression into adding
     19 user-defined operators to the simple and beautiful Kaleidoscope
     20 language. This digression now gives us a simple and ugly language in
     21 some ways, but also a powerful one at the same time. One of the great
     22 things about creating your own language is that you get to decide what
     23 is good or bad. In this tutorial we'll assume that it is okay to use
     24 this as a way to show some interesting parsing techniques.
     25 
     26 At the end of this tutorial, we'll run through an example Kaleidoscope
     27 application that `renders the Mandelbrot set <#example>`_. This gives an
     28 example of what you can build with Kaleidoscope and its feature set.
     29 
     30 User-defined Operators: the Idea
     31 ================================
     32 
     33 The "operator overloading" that we will add to Kaleidoscope is more
     34 general than languages like C++. In C++, you are only allowed to
     35 redefine existing operators: you can't programatically change the
     36 grammar, introduce new operators, change precedence levels, etc. In this
     37 chapter, we will add this capability to Kaleidoscope, which will let the
     38 user round out the set of operators that are supported.
     39 
     40 The point of going into user-defined operators in a tutorial like this
     41 is to show the power and flexibility of using a hand-written parser.
     42 Thus far, the parser we have been implementing uses recursive descent
     43 for most parts of the grammar and operator precedence parsing for the
     44 expressions. See `Chapter 2 <OCamlLangImpl2.html>`_ for details. Without
     45 using operator precedence parsing, it would be very difficult to allow
     46 the programmer to introduce new operators into the grammar: the grammar
     47 is dynamically extensible as the JIT runs.
     48 
     49 The two specific features we'll add are programmable unary operators
     50 (right now, Kaleidoscope has no unary operators at all) as well as
     51 binary operators. An example of this is:
     52 
     53 ::
     54 
     55     # Logical unary not.
     56     def unary!(v)
     57       if v then
     58         0
     59       else
     60         1;
     61 
     62     # Define > with the same precedence as <.
     63     def binary> 10 (LHS RHS)
     64       RHS < LHS;
     65 
     66     # Binary "logical or", (note that it does not "short circuit")
     67     def binary| 5 (LHS RHS)
     68       if LHS then
     69         1
     70       else if RHS then
     71         1
     72       else
     73         0;
     74 
     75     # Define = with slightly lower precedence than relationals.
     76     def binary= 9 (LHS RHS)
     77       !(LHS < RHS | LHS > RHS);
     78 
     79 Many languages aspire to being able to implement their standard runtime
     80 library in the language itself. In Kaleidoscope, we can implement
     81 significant parts of the language in the library!
     82 
     83 We will break down implementation of these features into two parts:
     84 implementing support for user-defined binary operators and adding unary
     85 operators.
     86 
     87 User-defined Binary Operators
     88 =============================
     89 
     90 Adding support for user-defined binary operators is pretty simple with
     91 our current framework. We'll first add support for the unary/binary
     92 keywords:
     93 
     94 .. code-block:: ocaml
     95 
     96     type token =
     97       ...
     98       (* operators *)
     99       | Binary | Unary
    100 
    101     ...
    102 
    103     and lex_ident buffer = parser
    104       ...
    105           | "for" -> [< 'Token.For; stream >]
    106           | "in" -> [< 'Token.In; stream >]
    107           | "binary" -> [< 'Token.Binary; stream >]
    108           | "unary" -> [< 'Token.Unary; stream >]
    109 
    110 This just adds lexer support for the unary and binary keywords, like we
    111 did in `previous chapters <OCamlLangImpl5.html#iflexer>`_. One nice
    112 thing about our current AST, is that we represent binary operators with
    113 full generalisation by using their ASCII code as the opcode. For our
    114 extended operators, we'll use this same representation, so we don't need
    115 any new AST or parser support.
    116 
    117 On the other hand, we have to be able to represent the definitions of
    118 these new operators, in the "def binary\| 5" part of the function
    119 definition. In our grammar so far, the "name" for the function
    120 definition is parsed as the "prototype" production and into the
    121 ``Ast.Prototype`` AST node. To represent our new user-defined operators
    122 as prototypes, we have to extend the ``Ast.Prototype`` AST node like
    123 this:
    124 
    125 .. code-block:: ocaml
    126 
    127     (* proto - This type represents the "prototype" for a function, which captures
    128      * its name, and its argument names (thus implicitly the number of arguments the
    129      * function takes). *)
    130     type proto =
    131       | Prototype of string * string array
    132       | BinOpPrototype of string * string array * int
    133 
    134 Basically, in addition to knowing a name for the prototype, we now keep
    135 track of whether it was an operator, and if it was, what precedence
    136 level the operator is at. The precedence is only used for binary
    137 operators (as you'll see below, it just doesn't apply for unary
    138 operators). Now that we have a way to represent the prototype for a
    139 user-defined operator, we need to parse it:
    140 
    141 .. code-block:: ocaml
    142 
    143     (* prototype
    144      *   ::= id '(' id* ')'
    145      *   ::= binary LETTER number? (id, id)
    146      *   ::= unary LETTER number? (id) *)
    147     let parse_prototype =
    148       let rec parse_args accumulator = parser
    149         | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    150         | [< >] -> accumulator
    151       in
    152       let parse_operator = parser
    153         | [< 'Token.Unary >] -> "unary", 1
    154         | [< 'Token.Binary >] -> "binary", 2
    155       in
    156       let parse_binary_precedence = parser
    157         | [< 'Token.Number n >] -> int_of_float n
    158         | [< >] -> 30
    159       in
    160       parser
    161       | [< 'Token.Ident id;
    162            'Token.Kwd '(' ?? "expected '(' in prototype";
    163            args=parse_args [];
    164            'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    165           (* success. *)
    166           Ast.Prototype (id, Array.of_list (List.rev args))
    167       | [< (prefix, kind)=parse_operator;
    168            'Token.Kwd op ?? "expected an operator";
    169            (* Read the precedence if present. *)
    170            binary_precedence=parse_binary_precedence;
    171            'Token.Kwd '(' ?? "expected '(' in prototype";
    172             args=parse_args [];
    173            'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    174           let name = prefix ^ (String.make 1 op) in
    175           let args = Array.of_list (List.rev args) in
    176 
    177           (* Verify right number of arguments for operator. *)
    178           if Array.length args != kind
    179           then raise (Stream.Error "invalid number of operands for operator")
    180           else
    181             if kind == 1 then
    182               Ast.Prototype (name, args)
    183             else
    184               Ast.BinOpPrototype (name, args, binary_precedence)
    185       | [< >] ->
    186           raise (Stream.Error "expected function name in prototype")
    187 
    188 This is all fairly straightforward parsing code, and we have already
    189 seen a lot of similar code in the past. One interesting part about the
    190 code above is the couple lines that set up ``name`` for binary
    191 operators. This builds names like "binary@" for a newly defined "@"
    192 operator. This then takes advantage of the fact that symbol names in the
    193 LLVM symbol table are allowed to have any character in them, including
    194 embedded nul characters.
    195 
    196 The next interesting thing to add, is codegen support for these binary
    197 operators. Given our current structure, this is a simple addition of a
    198 default case for our existing binary operator node:
    199 
    200 .. code-block:: ocaml
    201 
    202     let codegen_expr = function
    203       ...
    204       | Ast.Binary (op, lhs, rhs) ->
    205           let lhs_val = codegen_expr lhs in
    206           let rhs_val = codegen_expr rhs in
    207           begin
    208             match op with
    209             | '+' -> build_add lhs_val rhs_val "addtmp" builder
    210             | '-' -> build_sub lhs_val rhs_val "subtmp" builder
    211             | '*' -> build_mul lhs_val rhs_val "multmp" builder
    212             | '<' ->
    213                 (* Convert bool 0/1 to double 0.0 or 1.0 *)
    214                 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    215                 build_uitofp i double_type "booltmp" builder
    216             | _ ->
    217                 (* If it wasn't a builtin binary operator, it must be a user defined
    218                  * one. Emit a call to it. *)
    219                 let callee = "binary" ^ (String.make 1 op) in
    220                 let callee =
    221                   match lookup_function callee the_module with
    222                   | Some callee -> callee
    223                   | None -> raise (Error "binary operator not found!")
    224                 in
    225                 build_call callee [|lhs_val; rhs_val|] "binop" builder
    226           end
    227 
    228 As you can see above, the new code is actually really simple. It just
    229 does a lookup for the appropriate operator in the symbol table and
    230 generates a function call to it. Since user-defined operators are just
    231 built as normal functions (because the "prototype" boils down to a
    232 function with the right name) everything falls into place.
    233 
    234 The final piece of code we are missing, is a bit of top level magic:
    235 
    236 .. code-block:: ocaml
    237 
    238     let codegen_func the_fpm = function
    239       | Ast.Function (proto, body) ->
    240           Hashtbl.clear named_values;
    241           let the_function = codegen_proto proto in
    242 
    243           (* If this is an operator, install it. *)
    244           begin match proto with
    245           | Ast.BinOpPrototype (name, args, prec) ->
    246               let op = name.[String.length name - 1] in
    247               Hashtbl.add Parser.binop_precedence op prec;
    248           | _ -> ()
    249           end;
    250 
    251           (* Create a new basic block to start insertion into. *)
    252           let bb = append_block context "entry" the_function in
    253           position_at_end bb builder;
    254           ...
    255 
    256 Basically, before codegening a function, if it is a user-defined
    257 operator, we register it in the precedence table. This allows the binary
    258 operator parsing logic we already have in place to handle it. Since we
    259 are working on a fully-general operator precedence parser, this is all
    260 we need to do to "extend the grammar".
    261 
    262 Now we have useful user-defined binary operators. This builds a lot on
    263 the previous framework we built for other operators. Adding unary
    264 operators is a bit more challenging, because we don't have any framework
    265 for it yet - lets see what it takes.
    266 
    267 User-defined Unary Operators
    268 ============================
    269 
    270 Since we don't currently support unary operators in the Kaleidoscope
    271 language, we'll need to add everything to support them. Above, we added
    272 simple support for the 'unary' keyword to the lexer. In addition to
    273 that, we need an AST node:
    274 
    275 .. code-block:: ocaml
    276 
    277     type expr =
    278       ...
    279       (* variant for a unary operator. *)
    280       | Unary of char * expr
    281       ...
    282 
    283 This AST node is very simple and obvious by now. It directly mirrors the
    284 binary operator AST node, except that it only has one child. With this,
    285 we need to add the parsing logic. Parsing a unary operator is pretty
    286 simple: we'll add a new function to do it:
    287 
    288 .. code-block:: ocaml
    289 
    290     (* unary
    291      *   ::= primary
    292      *   ::= '!' unary *)
    293     and parse_unary = parser
    294       (* If this is a unary operator, read it. *)
    295       | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
    296           Ast.Unary (op, operand)
    297 
    298       (* If the current token is not an operator, it must be a primary expr. *)
    299       | [< stream >] -> parse_primary stream
    300 
    301 The grammar we add is pretty straightforward here. If we see a unary
    302 operator when parsing a primary operator, we eat the operator as a
    303 prefix and parse the remaining piece as another unary operator. This
    304 allows us to handle multiple unary operators (e.g. "!!x"). Note that
    305 unary operators can't have ambiguous parses like binary operators can,
    306 so there is no need for precedence information.
    307 
    308 The problem with this function, is that we need to call ParseUnary from
    309 somewhere. To do this, we change previous callers of ParsePrimary to
    310 call ``parse_unary`` instead:
    311 
    312 .. code-block:: ocaml
    313 
    314     (* binoprhs
    315      *   ::= ('+' primary)* *)
    316     and parse_bin_rhs expr_prec lhs stream =
    317             ...
    318             (* Parse the unary expression after the binary operator. *)
    319             let rhs = parse_unary stream in
    320             ...
    321 
    322     ...
    323 
    324     (* expression
    325      *   ::= primary binoprhs *)
    326     and parse_expr = parser
    327       | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
    328 
    329 With these two simple changes, we are now able to parse unary operators
    330 and build the AST for them. Next up, we need to add parser support for
    331 prototypes, to parse the unary operator prototype. We extend the binary
    332 operator code above with:
    333 
    334 .. code-block:: ocaml
    335 
    336     (* prototype
    337      *   ::= id '(' id* ')'
    338      *   ::= binary LETTER number? (id, id)
    339      *   ::= unary LETTER number? (id) *)
    340     let parse_prototype =
    341       let rec parse_args accumulator = parser
    342         | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    343         | [< >] -> accumulator
    344       in
    345       let parse_operator = parser
    346         | [< 'Token.Unary >] -> "unary", 1
    347         | [< 'Token.Binary >] -> "binary", 2
    348       in
    349       let parse_binary_precedence = parser
    350         | [< 'Token.Number n >] -> int_of_float n
    351         | [< >] -> 30
    352       in
    353       parser
    354       | [< 'Token.Ident id;
    355            'Token.Kwd '(' ?? "expected '(' in prototype";
    356            args=parse_args [];
    357            'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    358           (* success. *)
    359           Ast.Prototype (id, Array.of_list (List.rev args))
    360       | [< (prefix, kind)=parse_operator;
    361            'Token.Kwd op ?? "expected an operator";
    362            (* Read the precedence if present. *)
    363            binary_precedence=parse_binary_precedence;
    364            'Token.Kwd '(' ?? "expected '(' in prototype";
    365             args=parse_args [];
    366            'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    367           let name = prefix ^ (String.make 1 op) in
    368           let args = Array.of_list (List.rev args) in
    369 
    370           (* Verify right number of arguments for operator. *)
    371           if Array.length args != kind
    372           then raise (Stream.Error "invalid number of operands for operator")
    373           else
    374             if kind == 1 then
    375               Ast.Prototype (name, args)
    376             else
    377               Ast.BinOpPrototype (name, args, binary_precedence)
    378       | [< >] ->
    379           raise (Stream.Error "expected function name in prototype")
    380 
    381 As with binary operators, we name unary operators with a name that
    382 includes the operator character. This assists us at code generation
    383 time. Speaking of, the final piece we need to add is codegen support for
    384 unary operators. It looks like this:
    385 
    386 .. code-block:: ocaml
    387 
    388     let rec codegen_expr = function
    389       ...
    390       | Ast.Unary (op, operand) ->
    391           let operand = codegen_expr operand in
    392           let callee = "unary" ^ (String.make 1 op) in
    393           let callee =
    394             match lookup_function callee the_module with
    395             | Some callee -> callee
    396             | None -> raise (Error "unknown unary operator")
    397           in
    398           build_call callee [|operand|] "unop" builder
    399 
    400 This code is similar to, but simpler than, the code for binary
    401 operators. It is simpler primarily because it doesn't need to handle any
    402 predefined operators.
    403 
    404 Kicking the Tires
    405 =================
    406 
    407 It is somewhat hard to believe, but with a few simple extensions we've
    408 covered in the last chapters, we have grown a real-ish language. With
    409 this, we can do a lot of interesting things, including I/O, math, and a
    410 bunch of other things. For example, we can now add a nice sequencing
    411 operator (printd is defined to print out the specified value and a
    412 newline):
    413 
    414 ::
    415 
    416     ready> extern printd(x);
    417     Read extern: declare double @printd(double)
    418     ready> def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.
    419     ..
    420     ready> printd(123) : printd(456) : printd(789);
    421     123.000000
    422     456.000000
    423     789.000000
    424     Evaluated to 0.000000
    425 
    426 We can also define a bunch of other "primitive" operations, such as:
    427 
    428 ::
    429 
    430     # Logical unary not.
    431     def unary!(v)
    432       if v then
    433         0
    434       else
    435         1;
    436 
    437     # Unary negate.
    438     def unary-(v)
    439       0-v;
    440 
    441     # Define > with the same precedence as <.
    442     def binary> 10 (LHS RHS)
    443       RHS < LHS;
    444 
    445     # Binary logical or, which does not short circuit.
    446     def binary| 5 (LHS RHS)
    447       if LHS then
    448         1
    449       else if RHS then
    450         1
    451       else
    452         0;
    453 
    454     # Binary logical and, which does not short circuit.
    455     def binary& 6 (LHS RHS)
    456       if !LHS then
    457         0
    458       else
    459         !!RHS;
    460 
    461     # Define = with slightly lower precedence than relationals.
    462     def binary = 9 (LHS RHS)
    463       !(LHS < RHS | LHS > RHS);
    464 
    465 Given the previous if/then/else support, we can also define interesting
    466 functions for I/O. For example, the following prints out a character
    467 whose "density" reflects the value passed in: the lower the value, the
    468 denser the character:
    469 
    470 ::
    471 
    472     ready>
    473 
    474     extern putchard(char)
    475     def printdensity(d)
    476       if d > 8 then
    477         putchard(32)  # ' '
    478       else if d > 4 then
    479         putchard(46)  # '.'
    480       else if d > 2 then
    481         putchard(43)  # '+'
    482       else
    483         putchard(42); # '*'
    484     ...
    485     ready> printdensity(1): printdensity(2): printdensity(3) :
    486               printdensity(4): printdensity(5): printdensity(9): putchard(10);
    487     *++..
    488     Evaluated to 0.000000
    489 
    490 Based on these simple primitive operations, we can start to define more
    491 interesting things. For example, here's a little function that solves
    492 for the number of iterations it takes a function in the complex plane to
    493 converge:
    494 
    495 ::
    496 
    497     # determine whether the specific location diverges.
    498     # Solve for z = z^2 + c in the complex plane.
    499     def mandleconverger(real imag iters creal cimag)
    500       if iters > 255 | (real*real + imag*imag > 4) then
    501         iters
    502       else
    503         mandleconverger(real*real - imag*imag + creal,
    504                         2*real*imag + cimag,
    505                         iters+1, creal, cimag);
    506 
    507     # return the number of iterations required for the iteration to escape
    508     def mandleconverge(real imag)
    509       mandleconverger(real, imag, 0, real, imag);
    510 
    511 This "z = z\ :sup:`2`\  + c" function is a beautiful little creature
    512 that is the basis for computation of the `Mandelbrot
    513 Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our
    514 ``mandelconverge`` function returns the number of iterations that it
    515 takes for a complex orbit to escape, saturating to 255. This is not a
    516 very useful function by itself, but if you plot its value over a
    517 two-dimensional plane, you can see the Mandelbrot set. Given that we are
    518 limited to using putchard here, our amazing graphical output is limited,
    519 but we can whip together something using the density plotter above:
    520 
    521 ::
    522 
    523     # compute and plot the mandlebrot set with the specified 2 dimensional range
    524     # info.
    525     def mandelhelp(xmin xmax xstep   ymin ymax ystep)
    526       for y = ymin, y < ymax, ystep in (
    527         (for x = xmin, x < xmax, xstep in
    528            printdensity(mandleconverge(x,y)))
    529         : putchard(10)
    530       )
    531 
    532     # mandel - This is a convenient helper function for plotting the mandelbrot set
    533     # from the specified position with the specified Magnification.
    534     def mandel(realstart imagstart realmag imagmag)
    535       mandelhelp(realstart, realstart+realmag*78, realmag,
    536                  imagstart, imagstart+imagmag*40, imagmag);
    537 
    538 Given this, we can try plotting out the mandlebrot set! Lets try it out:
    539 
    540 ::
    541 
    542     ready> mandel(-2.3, -1.3, 0.05, 0.07);
    543     *******************************+++++++++++*************************************
    544     *************************+++++++++++++++++++++++*******************************
    545     **********************+++++++++++++++++++++++++++++****************************
    546     *******************+++++++++++++++++++++.. ...++++++++*************************
    547     *****************++++++++++++++++++++++.... ...+++++++++***********************
    548     ***************+++++++++++++++++++++++.....   ...+++++++++*********************
    549     **************+++++++++++++++++++++++....     ....+++++++++********************
    550     *************++++++++++++++++++++++......      .....++++++++*******************
    551     ************+++++++++++++++++++++.......       .......+++++++******************
    552     ***********+++++++++++++++++++....                ... .+++++++*****************
    553     **********+++++++++++++++++.......                     .+++++++****************
    554     *********++++++++++++++...........                    ...+++++++***************
    555     ********++++++++++++............                      ...++++++++**************
    556     ********++++++++++... ..........                        .++++++++**************
    557     *******+++++++++.....                                   .+++++++++*************
    558     *******++++++++......                                  ..+++++++++*************
    559     *******++++++.......                                   ..+++++++++*************
    560     *******+++++......                                     ..+++++++++*************
    561     *******.... ....                                      ...+++++++++*************
    562     *******.... .                                         ...+++++++++*************
    563     *******+++++......                                    ...+++++++++*************
    564     *******++++++.......                                   ..+++++++++*************
    565     *******++++++++......                                   .+++++++++*************
    566     *******+++++++++.....                                  ..+++++++++*************
    567     ********++++++++++... ..........                        .++++++++**************
    568     ********++++++++++++............                      ...++++++++**************
    569     *********++++++++++++++..........                     ...+++++++***************
    570     **********++++++++++++++++........                     .+++++++****************
    571     **********++++++++++++++++++++....                ... ..+++++++****************
    572     ***********++++++++++++++++++++++.......       .......++++++++*****************
    573     ************+++++++++++++++++++++++......      ......++++++++******************
    574     **************+++++++++++++++++++++++....      ....++++++++********************
    575     ***************+++++++++++++++++++++++.....   ...+++++++++*********************
    576     *****************++++++++++++++++++++++....  ...++++++++***********************
    577     *******************+++++++++++++++++++++......++++++++*************************
    578     *********************++++++++++++++++++++++.++++++++***************************
    579     *************************+++++++++++++++++++++++*******************************
    580     ******************************+++++++++++++************************************
    581     *******************************************************************************
    582     *******************************************************************************
    583     *******************************************************************************
    584     Evaluated to 0.000000
    585     ready> mandel(-2, -1, 0.02, 0.04);
    586     **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
    587     ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    588     *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
    589     *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
    590     *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
    591     ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
    592     **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
    593     ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
    594     ***********++++++++++++++++++++++++++++++++++++++++++++++++++........        .
    595     **********++++++++++++++++++++++++++++++++++++++++++++++.............
    596     ********+++++++++++++++++++++++++++++++++++++++++++..................
    597     *******+++++++++++++++++++++++++++++++++++++++.......................
    598     ******+++++++++++++++++++++++++++++++++++...........................
    599     *****++++++++++++++++++++++++++++++++............................
    600     *****++++++++++++++++++++++++++++...............................
    601     ****++++++++++++++++++++++++++......   .........................
    602     ***++++++++++++++++++++++++.........     ......    ...........
    603     ***++++++++++++++++++++++............
    604     **+++++++++++++++++++++..............
    605     **+++++++++++++++++++................
    606     *++++++++++++++++++.................
    607     *++++++++++++++++............ ...
    608     *++++++++++++++..............
    609     *+++....++++................
    610     *..........  ...........
    611     *
    612     *..........  ...........
    613     *+++....++++................
    614     *++++++++++++++..............
    615     *++++++++++++++++............ ...
    616     *++++++++++++++++++.................
    617     **+++++++++++++++++++................
    618     **+++++++++++++++++++++..............
    619     ***++++++++++++++++++++++............
    620     ***++++++++++++++++++++++++.........     ......    ...........
    621     ****++++++++++++++++++++++++++......   .........................
    622     *****++++++++++++++++++++++++++++...............................
    623     *****++++++++++++++++++++++++++++++++............................
    624     ******+++++++++++++++++++++++++++++++++++...........................
    625     *******+++++++++++++++++++++++++++++++++++++++.......................
    626     ********+++++++++++++++++++++++++++++++++++++++++++..................
    627     Evaluated to 0.000000
    628     ready> mandel(-0.9, -1.4, 0.02, 0.03);
    629     *******************************************************************************
    630     *******************************************************************************
    631     *******************************************************************************
    632     **********+++++++++++++++++++++************************************************
    633     *+++++++++++++++++++++++++++++++++++++++***************************************
    634     +++++++++++++++++++++++++++++++++++++++++++++**********************************
    635     ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
    636     ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
    637     +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
    638     +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
    639     +++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
    640     +++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
    641     ++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
    642     +++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
    643     ++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
    644     ++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
    645     +++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
    646     ++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
    647     ++++++++++++++++++++...........                .........++++++++++++++++++++++*
    648     ++++++++++++++++++............                  ...........++++++++++++++++++++
    649     ++++++++++++++++...............                 .............++++++++++++++++++
    650     ++++++++++++++.................                 ...............++++++++++++++++
    651     ++++++++++++..................                  .................++++++++++++++
    652     +++++++++..................                      .................+++++++++++++
    653     ++++++........        .                               .........  ..++++++++++++
    654     ++............                                         ......    ....++++++++++
    655     ..............                                                    ...++++++++++
    656     ..............                                                    ....+++++++++
    657     ..............                                                    .....++++++++
    658     .............                                                    ......++++++++
    659     ...........                                                     .......++++++++
    660     .........                                                       ........+++++++
    661     .........                                                       ........+++++++
    662     .........                                                           ....+++++++
    663     ........                                                             ...+++++++
    664     .......                                                              ...+++++++
    665                                                                         ....+++++++
    666                                                                        .....+++++++
    667                                                                         ....+++++++
    668                                                                         ....+++++++
    669                                                                         ....+++++++
    670     Evaluated to 0.000000
    671     ready> ^D
    672 
    673 At this point, you may be starting to realize that Kaleidoscope is a
    674 real and powerful language. It may not be self-similar :), but it can be
    675 used to plot things that are!
    676 
    677 With this, we conclude the "adding user-defined operators" chapter of
    678 the tutorial. We have successfully augmented our language, adding the
    679 ability to extend the language in the library, and we have shown how
    680 this can be used to build a simple but interesting end-user application
    681 in Kaleidoscope. At this point, Kaleidoscope can build a variety of
    682 applications that are functional and can call functions with
    683 side-effects, but it can't actually define and mutate a variable itself.
    684 
    685 Strikingly, variable mutation is an important feature of some languages,
    686 and it is not at all obvious how to `add support for mutable
    687 variables <OCamlLangImpl7.html>`_ without having to add an "SSA
    688 construction" phase to your front-end. In the next chapter, we will
    689 describe how you can add variable mutation without building SSA in your
    690 front-end.
    691 
    692 Full Code Listing
    693 =================
    694 
    695 Here is the complete code listing for our running example, enhanced with
    696 the if/then/else and for expressions.. To build this example, use:
    697 
    698 .. code-block:: bash
    699 
    700     # Compile
    701     ocamlbuild toy.byte
    702     # Run
    703     ./toy.byte
    704 
    705 Here is the code:
    706 
    707 \_tags:
    708     ::
    709 
    710         <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    711         <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    712         <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    713         <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    714 
    715 myocamlbuild.ml:
    716     .. code-block:: ocaml
    717 
    718         open Ocamlbuild_plugin;;
    719 
    720         ocaml_lib ~extern:true "llvm";;
    721         ocaml_lib ~extern:true "llvm_analysis";;
    722         ocaml_lib ~extern:true "llvm_executionengine";;
    723         ocaml_lib ~extern:true "llvm_target";;
    724         ocaml_lib ~extern:true "llvm_scalar_opts";;
    725 
    726         flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
    727         dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    728 
    729 token.ml:
    730     .. code-block:: ocaml
    731 
    732         (*===----------------------------------------------------------------------===
    733          * Lexer Tokens
    734          *===----------------------------------------------------------------------===*)
    735 
    736         (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    737          * these others for known things. *)
    738         type token =
    739           (* commands *)
    740           | Def | Extern
    741 
    742           (* primary *)
    743           | Ident of string | Number of float
    744 
    745           (* unknown *)
    746           | Kwd of char
    747 
    748           (* control *)
    749           | If | Then | Else
    750           | For | In
    751 
    752           (* operators *)
    753           | Binary | Unary
    754 
    755 lexer.ml:
    756     .. code-block:: ocaml
    757 
    758         (*===----------------------------------------------------------------------===
    759          * Lexer
    760          *===----------------------------------------------------------------------===*)
    761 
    762         let rec lex = parser
    763           (* Skip any whitespace. *)
    764           | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    765 
    766           (* identifier: [a-zA-Z][a-zA-Z0-9] *)
    767           | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    768               let buffer = Buffer.create 1 in
    769               Buffer.add_char buffer c;
    770               lex_ident buffer stream
    771 
    772           (* number: [0-9.]+ *)
    773           | [< ' ('0' .. '9' as c); stream >] ->
    774               let buffer = Buffer.create 1 in
    775               Buffer.add_char buffer c;
    776               lex_number buffer stream
    777 
    778           (* Comment until end of line. *)
    779           | [< ' ('#'); stream >] ->
    780               lex_comment stream
    781 
    782           (* Otherwise, just return the character as its ascii value. *)
    783           | [< 'c; stream >] ->
    784               [< 'Token.Kwd c; lex stream >]
    785 
    786           (* end of stream. *)
    787           | [< >] -> [< >]
    788 
    789         and lex_number buffer = parser
    790           | [< ' ('0' .. '9' | '.' as c); stream >] ->
    791               Buffer.add_char buffer c;
    792               lex_number buffer stream
    793           | [< stream=lex >] ->
    794               [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    795 
    796         and lex_ident buffer = parser
    797           | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    798               Buffer.add_char buffer c;
    799               lex_ident buffer stream
    800           | [< stream=lex >] ->
    801               match Buffer.contents buffer with
    802               | "def" -> [< 'Token.Def; stream >]
    803               | "extern" -> [< 'Token.Extern; stream >]
    804               | "if" -> [< 'Token.If; stream >]
    805               | "then" -> [< 'Token.Then; stream >]
    806               | "else" -> [< 'Token.Else; stream >]
    807               | "for" -> [< 'Token.For; stream >]
    808               | "in" -> [< 'Token.In; stream >]
    809               | "binary" -> [< 'Token.Binary; stream >]
    810               | "unary" -> [< 'Token.Unary; stream >]
    811               | id -> [< 'Token.Ident id; stream >]
    812 
    813         and lex_comment = parser
    814           | [< ' ('\n'); stream=lex >] -> stream
    815           | [< 'c; e=lex_comment >] -> e
    816           | [< >] -> [< >]
    817 
    818 ast.ml:
    819     .. code-block:: ocaml
    820 
    821         (*===----------------------------------------------------------------------===
    822          * Abstract Syntax Tree (aka Parse Tree)
    823          *===----------------------------------------------------------------------===*)
    824 
    825         (* expr - Base type for all expression nodes. *)
    826         type expr =
    827           (* variant for numeric literals like "1.0". *)
    828           | Number of float
    829 
    830           (* variant for referencing a variable, like "a". *)
    831           | Variable of string
    832 
    833           (* variant for a unary operator. *)
    834           | Unary of char * expr
    835 
    836           (* variant for a binary operator. *)
    837           | Binary of char * expr * expr
    838 
    839           (* variant for function calls. *)
    840           | Call of string * expr array
    841 
    842           (* variant for if/then/else. *)
    843           | If of expr * expr * expr
    844 
    845           (* variant for for/in. *)
    846           | For of string * expr * expr * expr option * expr
    847 
    848         (* proto - This type represents the "prototype" for a function, which captures
    849          * its name, and its argument names (thus implicitly the number of arguments the
    850          * function takes). *)
    851         type proto =
    852           | Prototype of string * string array
    853           | BinOpPrototype of string * string array * int
    854 
    855         (* func - This type represents a function definition itself. *)
    856         type func = Function of proto * expr
    857 
    858 parser.ml:
    859     .. code-block:: ocaml
    860 
    861         (*===---------------------------------------------------------------------===
    862          * Parser
    863          *===---------------------------------------------------------------------===*)
    864 
    865         (* binop_precedence - This holds the precedence for each binary operator that is
    866          * defined *)
    867         let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    868 
    869         (* precedence - Get the precedence of the pending binary operator token. *)
    870         let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    871 
    872         (* primary
    873          *   ::= identifier
    874          *   ::= numberexpr
    875          *   ::= parenexpr
    876          *   ::= ifexpr
    877          *   ::= forexpr *)
    878         let rec parse_primary = parser
    879           (* numberexpr ::= number *)
    880           | [< 'Token.Number n >] -> Ast.Number n
    881 
    882           (* parenexpr ::= '(' expression ')' *)
    883           | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    884 
    885           (* identifierexpr
    886            *   ::= identifier
    887            *   ::= identifier '(' argumentexpr ')' *)
    888           | [< 'Token.Ident id; stream >] ->
    889               let rec parse_args accumulator = parser
    890                 | [< e=parse_expr; stream >] ->
    891                     begin parser
    892                       | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    893                       | [< >] -> e :: accumulator
    894                     end stream
    895                 | [< >] -> accumulator
    896               in
    897               let rec parse_ident id = parser
    898                 (* Call. *)
    899                 | [< 'Token.Kwd '(';
    900                      args=parse_args [];
    901                      'Token.Kwd ')' ?? "expected ')'">] ->
    902                     Ast.Call (id, Array.of_list (List.rev args))
    903 
    904                 (* Simple variable ref. *)
    905                 | [< >] -> Ast.Variable id
    906               in
    907               parse_ident id stream
    908 
    909           (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    910           | [< 'Token.If; c=parse_expr;
    911                'Token.Then ?? "expected 'then'"; t=parse_expr;
    912                'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    913               Ast.If (c, t, e)
    914 
    915           (* forexpr
    916                 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
    917           | [< 'Token.For;
    918                'Token.Ident id ?? "expected identifier after for";
    919                'Token.Kwd '=' ?? "expected '=' after for";
    920                stream >] ->
    921               begin parser
    922                 | [<
    923                      start=parse_expr;
    924                      'Token.Kwd ',' ?? "expected ',' after for";
    925                      end_=parse_expr;
    926                      stream >] ->
    927                     let step =
    928                       begin parser
    929                       | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
    930                       | [< >] -> None
    931                       end stream
    932                     in
    933                     begin parser
    934                     | [< 'Token.In; body=parse_expr >] ->
    935                         Ast.For (id, start, end_, step, body)
    936                     | [< >] ->
    937                         raise (Stream.Error "expected 'in' after for")
    938                     end stream
    939                 | [< >] ->
    940                     raise (Stream.Error "expected '=' after for")
    941               end stream
    942 
    943           | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    944 
    945         (* unary
    946          *   ::= primary
    947          *   ::= '!' unary *)
    948         and parse_unary = parser
    949           (* If this is a unary operator, read it. *)
    950           | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
    951               Ast.Unary (op, operand)
    952 
    953           (* If the current token is not an operator, it must be a primary expr. *)
    954           | [< stream >] -> parse_primary stream
    955 
    956         (* binoprhs
    957          *   ::= ('+' primary)* *)
    958         and parse_bin_rhs expr_prec lhs stream =
    959           match Stream.peek stream with
    960           (* If this is a binop, find its precedence. *)
    961           | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    962               let token_prec = precedence c in
    963 
    964               (* If this is a binop that binds at least as tightly as the current binop,
    965                * consume it, otherwise we are done. *)
    966               if token_prec < expr_prec then lhs else begin
    967                 (* Eat the binop. *)
    968                 Stream.junk stream;
    969 
    970                 (* Parse the unary expression after the binary operator. *)
    971                 let rhs = parse_unary stream in
    972 
    973                 (* Okay, we know this is a binop. *)
    974                 let rhs =
    975                   match Stream.peek stream with
    976                   | Some (Token.Kwd c2) ->
    977                       (* If BinOp binds less tightly with rhs than the operator after
    978                        * rhs, let the pending operator take rhs as its lhs. *)
    979                       let next_prec = precedence c2 in
    980                       if token_prec < next_prec
    981                       then parse_bin_rhs (token_prec + 1) rhs stream
    982                       else rhs
    983                   | _ -> rhs
    984                 in
    985 
    986                 (* Merge lhs/rhs. *)
    987                 let lhs = Ast.Binary (c, lhs, rhs) in
    988                 parse_bin_rhs expr_prec lhs stream
    989               end
    990           | _ -> lhs
    991 
    992         (* expression
    993          *   ::= primary binoprhs *)
    994         and parse_expr = parser
    995           | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
    996 
    997         (* prototype
    998          *   ::= id '(' id* ')'
    999          *   ::= binary LETTER number? (id, id)
   1000          *   ::= unary LETTER number? (id) *)
   1001         let parse_prototype =
   1002           let rec parse_args accumulator = parser
   1003             | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
   1004             | [< >] -> accumulator
   1005           in
   1006           let parse_operator = parser
   1007             | [< 'Token.Unary >] -> "unary", 1
   1008             | [< 'Token.Binary >] -> "binary", 2
   1009           in
   1010           let parse_binary_precedence = parser
   1011             | [< 'Token.Number n >] -> int_of_float n
   1012             | [< >] -> 30
   1013           in
   1014           parser
   1015           | [< 'Token.Ident id;
   1016                'Token.Kwd '(' ?? "expected '(' in prototype";
   1017                args=parse_args [];
   1018                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
   1019               (* success. *)
   1020               Ast.Prototype (id, Array.of_list (List.rev args))
   1021           | [< (prefix, kind)=parse_operator;
   1022                'Token.Kwd op ?? "expected an operator";
   1023                (* Read the precedence if present. *)
   1024                binary_precedence=parse_binary_precedence;
   1025                'Token.Kwd '(' ?? "expected '(' in prototype";
   1026                 args=parse_args [];
   1027                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
   1028               let name = prefix ^ (String.make 1 op) in
   1029               let args = Array.of_list (List.rev args) in
   1030 
   1031               (* Verify right number of arguments for operator. *)
   1032               if Array.length args != kind
   1033               then raise (Stream.Error "invalid number of operands for operator")
   1034               else
   1035                 if kind == 1 then
   1036                   Ast.Prototype (name, args)
   1037                 else
   1038                   Ast.BinOpPrototype (name, args, binary_precedence)
   1039           | [< >] ->
   1040               raise (Stream.Error "expected function name in prototype")
   1041 
   1042         (* definition ::= 'def' prototype expression *)
   1043         let parse_definition = parser
   1044           | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
   1045               Ast.Function (p, e)
   1046 
   1047         (* toplevelexpr ::= expression *)
   1048         let parse_toplevel = parser
   1049           | [< e=parse_expr >] ->
   1050               (* Make an anonymous proto. *)
   1051               Ast.Function (Ast.Prototype ("", [||]), e)
   1052 
   1053         (*  external ::= 'extern' prototype *)
   1054         let parse_extern = parser
   1055           | [< 'Token.Extern; e=parse_prototype >] -> e
   1056 
   1057 codegen.ml:
   1058     .. code-block:: ocaml
   1059 
   1060         (*===----------------------------------------------------------------------===
   1061          * Code Generation
   1062          *===----------------------------------------------------------------------===*)
   1063 
   1064         open Llvm
   1065 
   1066         exception Error of string
   1067 
   1068         let context = global_context ()
   1069         let the_module = create_module context "my cool jit"
   1070         let builder = builder context
   1071         let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
   1072         let double_type = double_type context
   1073 
   1074         let rec codegen_expr = function
   1075           | Ast.Number n -> const_float double_type n
   1076           | Ast.Variable name ->
   1077               (try Hashtbl.find named_values name with
   1078                 | Not_found -> raise (Error "unknown variable name"))
   1079           | Ast.Unary (op, operand) ->
   1080               let operand = codegen_expr operand in
   1081               let callee = "unary" ^ (String.make 1 op) in
   1082               let callee =
   1083                 match lookup_function callee the_module with
   1084                 | Some callee -> callee
   1085                 | None -> raise (Error "unknown unary operator")
   1086               in
   1087               build_call callee [|operand|] "unop" builder
   1088           | Ast.Binary (op, lhs, rhs) ->
   1089               let lhs_val = codegen_expr lhs in
   1090               let rhs_val = codegen_expr rhs in
   1091               begin
   1092                 match op with
   1093                 | '+' -> build_add lhs_val rhs_val "addtmp" builder
   1094                 | '-' -> build_sub lhs_val rhs_val "subtmp" builder
   1095                 | '*' -> build_mul lhs_val rhs_val "multmp" builder
   1096                 | '<' ->
   1097                     (* Convert bool 0/1 to double 0.0 or 1.0 *)
   1098                     let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
   1099                     build_uitofp i double_type "booltmp" builder
   1100                 | _ ->
   1101                     (* If it wasn't a builtin binary operator, it must be a user defined
   1102                      * one. Emit a call to it. *)
   1103                     let callee = "binary" ^ (String.make 1 op) in
   1104                     let callee =
   1105                       match lookup_function callee the_module with
   1106                       | Some callee -> callee
   1107                       | None -> raise (Error "binary operator not found!")
   1108                     in
   1109                     build_call callee [|lhs_val; rhs_val|] "binop" builder
   1110               end
   1111           | Ast.Call (callee, args) ->
   1112               (* Look up the name in the module table. *)
   1113               let callee =
   1114                 match lookup_function callee the_module with
   1115                 | Some callee -> callee
   1116                 | None -> raise (Error "unknown function referenced")
   1117               in
   1118               let params = params callee in
   1119 
   1120               (* If argument mismatch error. *)
   1121               if Array.length params == Array.length args then () else
   1122                 raise (Error "incorrect # arguments passed");
   1123               let args = Array.map codegen_expr args in
   1124               build_call callee args "calltmp" builder
   1125           | Ast.If (cond, then_, else_) ->
   1126               let cond = codegen_expr cond in
   1127 
   1128               (* Convert condition to a bool by comparing equal to 0.0 *)
   1129               let zero = const_float double_type 0.0 in
   1130               let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
   1131 
   1132               (* Grab the first block so that we might later add the conditional branch
   1133                * to it at the end of the function. *)
   1134               let start_bb = insertion_block builder in
   1135               let the_function = block_parent start_bb in
   1136 
   1137               let then_bb = append_block context "then" the_function in
   1138 
   1139               (* Emit 'then' value. *)
   1140               position_at_end then_bb builder;
   1141               let then_val = codegen_expr then_ in
   1142 
   1143               (* Codegen of 'then' can change the current block, update then_bb for the
   1144                * phi. We create a new name because one is used for the phi node, and the
   1145                * other is used for the conditional branch. *)
   1146               let new_then_bb = insertion_block builder in
   1147 
   1148               (* Emit 'else' value. *)
   1149               let else_bb = append_block context "else" the_function in
   1150               position_at_end else_bb builder;
   1151               let else_val = codegen_expr else_ in
   1152 
   1153               (* Codegen of 'else' can change the current block, update else_bb for the
   1154                * phi. *)
   1155               let new_else_bb = insertion_block builder in
   1156 
   1157               (* Emit merge block. *)
   1158               let merge_bb = append_block context "ifcont" the_function in
   1159               position_at_end merge_bb builder;
   1160               let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
   1161               let phi = build_phi incoming "iftmp" builder in
   1162 
   1163               (* Return to the start block to add the conditional branch. *)
   1164               position_at_end start_bb builder;
   1165               ignore (build_cond_br cond_val then_bb else_bb builder);
   1166 
   1167               (* Set a unconditional branch at the end of the 'then' block and the
   1168                * 'else' block to the 'merge' block. *)
   1169               position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
   1170               position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
   1171 
   1172               (* Finally, set the builder to the end of the merge block. *)
   1173               position_at_end merge_bb builder;
   1174 
   1175               phi
   1176           | Ast.For (var_name, start, end_, step, body) ->
   1177               (* Emit the start code first, without 'variable' in scope. *)
   1178               let start_val = codegen_expr start in
   1179 
   1180               (* Make the new basic block for the loop header, inserting after current
   1181                * block. *)
   1182               let preheader_bb = insertion_block builder in
   1183               let the_function = block_parent preheader_bb in
   1184               let loop_bb = append_block context "loop" the_function in
   1185 
   1186               (* Insert an explicit fall through from the current block to the
   1187                * loop_bb. *)
   1188               ignore (build_br loop_bb builder);
   1189 
   1190               (* Start insertion in loop_bb. *)
   1191               position_at_end loop_bb builder;
   1192 
   1193               (* Start the PHI node with an entry for start. *)
   1194               let variable = build_phi [(start_val, preheader_bb)] var_name builder in
   1195 
   1196               (* Within the loop, the variable is defined equal to the PHI node. If it
   1197                * shadows an existing variable, we have to restore it, so save it
   1198                * now. *)
   1199               let old_val =
   1200                 try Some (Hashtbl.find named_values var_name) with Not_found -> None
   1201               in
   1202               Hashtbl.add named_values var_name variable;
   1203 
   1204               (* Emit the body of the loop.  This, like any other expr, can change the
   1205                * current BB.  Note that we ignore the value computed by the body, but
   1206                * don't allow an error *)
   1207               ignore (codegen_expr body);
   1208 
   1209               (* Emit the step value. *)
   1210               let step_val =
   1211                 match step with
   1212                 | Some step -> codegen_expr step
   1213                 (* If not specified, use 1.0. *)
   1214                 | None -> const_float double_type 1.0
   1215               in
   1216 
   1217               let next_var = build_add variable step_val "nextvar" builder in
   1218 
   1219               (* Compute the end condition. *)
   1220               let end_cond = codegen_expr end_ in
   1221 
   1222               (* Convert condition to a bool by comparing equal to 0.0. *)
   1223               let zero = const_float double_type 0.0 in
   1224               let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
   1225 
   1226               (* Create the "after loop" block and insert it. *)
   1227               let loop_end_bb = insertion_block builder in
   1228               let after_bb = append_block context "afterloop" the_function in
   1229 
   1230               (* Insert the conditional branch into the end of loop_end_bb. *)
   1231               ignore (build_cond_br end_cond loop_bb after_bb builder);
   1232 
   1233               (* Any new code will be inserted in after_bb. *)
   1234               position_at_end after_bb builder;
   1235 
   1236               (* Add a new entry to the PHI node for the backedge. *)
   1237               add_incoming (next_var, loop_end_bb) variable;
   1238 
   1239               (* Restore the unshadowed variable. *)
   1240               begin match old_val with
   1241               | Some old_val -> Hashtbl.add named_values var_name old_val
   1242               | None -> ()
   1243               end;
   1244 
   1245               (* for expr always returns 0.0. *)
   1246               const_null double_type
   1247 
   1248         let codegen_proto = function
   1249           | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
   1250               (* Make the function type: double(double,double) etc. *)
   1251               let doubles = Array.make (Array.length args) double_type in
   1252               let ft = function_type double_type doubles in
   1253               let f =
   1254                 match lookup_function name the_module with
   1255                 | None -> declare_function name ft the_module
   1256 
   1257                 (* If 'f' conflicted, there was already something named 'name'. If it
   1258                  * has a body, don't allow redefinition or reextern. *)
   1259                 | Some f ->
   1260                     (* If 'f' already has a body, reject this. *)
   1261                     if block_begin f <> At_end f then
   1262                       raise (Error "redefinition of function");
   1263 
   1264                     (* If 'f' took a different number of arguments, reject. *)
   1265                     if element_type (type_of f) <> ft then
   1266                       raise (Error "redefinition of function with different # args");
   1267                     f
   1268               in
   1269 
   1270               (* Set names for all arguments. *)
   1271               Array.iteri (fun i a ->
   1272                 let n = args.(i) in
   1273                 set_value_name n a;
   1274                 Hashtbl.add named_values n a;
   1275               ) (params f);
   1276               f
   1277 
   1278         let codegen_func the_fpm = function
   1279           | Ast.Function (proto, body) ->
   1280               Hashtbl.clear named_values;
   1281               let the_function = codegen_proto proto in
   1282 
   1283               (* If this is an operator, install it. *)
   1284               begin match proto with
   1285               | Ast.BinOpPrototype (name, args, prec) ->
   1286                   let op = name.[String.length name - 1] in
   1287                   Hashtbl.add Parser.binop_precedence op prec;
   1288               | _ -> ()
   1289               end;
   1290 
   1291               (* Create a new basic block to start insertion into. *)
   1292               let bb = append_block context "entry" the_function in
   1293               position_at_end bb builder;
   1294 
   1295               try
   1296                 let ret_val = codegen_expr body in
   1297 
   1298                 (* Finish off the function. *)
   1299                 let _ = build_ret ret_val builder in
   1300 
   1301                 (* Validate the generated code, checking for consistency. *)
   1302                 Llvm_analysis.assert_valid_function the_function;
   1303 
   1304                 (* Optimize the function. *)
   1305                 let _ = PassManager.run_function the_function the_fpm in
   1306 
   1307                 the_function
   1308               with e ->
   1309                 delete_function the_function;
   1310                 raise e
   1311 
   1312 toplevel.ml:
   1313     .. code-block:: ocaml
   1314 
   1315         (*===----------------------------------------------------------------------===
   1316          * Top-Level parsing and JIT Driver
   1317          *===----------------------------------------------------------------------===*)
   1318 
   1319         open Llvm
   1320         open Llvm_executionengine
   1321 
   1322         (* top ::= definition | external | expression | ';' *)
   1323         let rec main_loop the_fpm the_execution_engine stream =
   1324           match Stream.peek stream with
   1325           | None -> ()
   1326 
   1327           (* ignore top-level semicolons. *)
   1328           | Some (Token.Kwd ';') ->
   1329               Stream.junk stream;
   1330               main_loop the_fpm the_execution_engine stream
   1331 
   1332           | Some token ->
   1333               begin
   1334                 try match token with
   1335                 | Token.Def ->
   1336                     let e = Parser.parse_definition stream in
   1337                     print_endline "parsed a function definition.";
   1338                     dump_value (Codegen.codegen_func the_fpm e);
   1339                 | Token.Extern ->
   1340                     let e = Parser.parse_extern stream in
   1341                     print_endline "parsed an extern.";
   1342                     dump_value (Codegen.codegen_proto e);
   1343                 | _ ->
   1344                     (* Evaluate a top-level expression into an anonymous function. *)
   1345                     let e = Parser.parse_toplevel stream in
   1346                     print_endline "parsed a top-level expr";
   1347                     let the_function = Codegen.codegen_func the_fpm e in
   1348                     dump_value the_function;
   1349 
   1350                     (* JIT the function, returning a function pointer. *)
   1351                     let result = ExecutionEngine.run_function the_function [||]
   1352                       the_execution_engine in
   1353 
   1354                     print_string "Evaluated to ";
   1355                     print_float (GenericValue.as_float Codegen.double_type result);
   1356                     print_newline ();
   1357                 with Stream.Error s | Codegen.Error s ->
   1358                   (* Skip token for error recovery. *)
   1359                   Stream.junk stream;
   1360                   print_endline s;
   1361               end;
   1362               print_string "ready> "; flush stdout;
   1363               main_loop the_fpm the_execution_engine stream
   1364 
   1365 toy.ml:
   1366     .. code-block:: ocaml
   1367 
   1368         (*===----------------------------------------------------------------------===
   1369          * Main driver code.
   1370          *===----------------------------------------------------------------------===*)
   1371 
   1372         open Llvm
   1373         open Llvm_executionengine
   1374         open Llvm_target
   1375         open Llvm_scalar_opts
   1376 
   1377         let main () =
   1378           ignore (initialize_native_target ());
   1379 
   1380           (* Install standard binary operators.
   1381            * 1 is the lowest precedence. *)
   1382           Hashtbl.add Parser.binop_precedence '<' 10;
   1383           Hashtbl.add Parser.binop_precedence '+' 20;
   1384           Hashtbl.add Parser.binop_precedence '-' 20;
   1385           Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
   1386 
   1387           (* Prime the first token. *)
   1388           print_string "ready> "; flush stdout;
   1389           let stream = Lexer.lex (Stream.of_channel stdin) in
   1390 
   1391           (* Create the JIT. *)
   1392           let the_execution_engine = ExecutionEngine.create Codegen.the_module in
   1393           let the_fpm = PassManager.create_function Codegen.the_module in
   1394 
   1395           (* Set up the optimizer pipeline.  Start with registering info about how the
   1396            * target lays out data structures. *)
   1397           DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
   1398 
   1399           (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
   1400           add_instruction_combination the_fpm;
   1401 
   1402           (* reassociate expressions. *)
   1403           add_reassociation the_fpm;
   1404 
   1405           (* Eliminate Common SubExpressions. *)
   1406           add_gvn the_fpm;
   1407 
   1408           (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
   1409           add_cfg_simplification the_fpm;
   1410 
   1411           ignore (PassManager.initialize the_fpm);
   1412 
   1413           (* Run the main "interpreter loop" now. *)
   1414           Toplevel.main_loop the_fpm the_execution_engine stream;
   1415 
   1416           (* Print out all the generated code. *)
   1417           dump_module Codegen.the_module
   1418         ;;
   1419 
   1420         main ()
   1421 
   1422 bindings.c
   1423     .. code-block:: c
   1424 
   1425         #include <stdio.h>
   1426 
   1427         /* putchard - putchar that takes a double and returns 0. */
   1428         extern double putchard(double X) {
   1429           putchar((char)X);
   1430           return 0;
   1431         }
   1432 
   1433         /* printd - printf that takes a double prints it as "%f\n", returning 0. */
   1434         extern double printd(double X) {
   1435           printf("%f\n", X);
   1436           return 0;
   1437         }
   1438 
   1439 `Next: Extending the language: mutable variables / SSA
   1440 construction <OCamlLangImpl7.html>`_
   1441 
   1442