Home | History | Annotate | Download | only in tutorial
      1 ============================================================
      2 Kaleidoscope: Extending the Language: User-defined Operators
      3 ============================================================
      4 
      5 .. contents::
      6    :local:
      7 
      8 Chapter 6 Introduction
      9 ======================
     10 
     11 Welcome to Chapter 6 of the "`Implementing a language with
     12 LLVM <index.html>`_" tutorial. At this point in our tutorial, we now
     13 have a fully functional language that is fairly minimal, but also
     14 useful. There is still one big problem with it, however. Our language
     15 doesn't have many useful operators (like division, logical negation, or
     16 even any comparisons besides less-than).
     17 
     18 This chapter of the tutorial takes a wild digression into adding
     19 user-defined operators to the simple and beautiful Kaleidoscope
     20 language. This digression now gives us a simple and ugly language in
     21 some ways, but also a powerful one at the same time. One of the great
     22 things about creating your own language is that you get to decide what
     23 is good or bad. In this tutorial we'll assume that it is okay to use
     24 this as a way to show some interesting parsing techniques.
     25 
     26 At the end of this tutorial, we'll run through an example Kaleidoscope
     27 application that `renders the Mandelbrot set <#example>`_. This gives an
     28 example of what you can build with Kaleidoscope and its feature set.
     29 
     30 User-defined Operators: the Idea
     31 ================================
     32 
     33 The "operator overloading" that we will add to Kaleidoscope is more
     34 general than languages like C++. In C++, you are only allowed to
     35 redefine existing operators: you can't programatically change the
     36 grammar, introduce new operators, change precedence levels, etc. In this
     37 chapter, we will add this capability to Kaleidoscope, which will let the
     38 user round out the set of operators that are supported.
     39 
     40 The point of going into user-defined operators in a tutorial like this
     41 is to show the power and flexibility of using a hand-written parser.
     42 Thus far, the parser we have been implementing uses recursive descent
     43 for most parts of the grammar and operator precedence parsing for the
     44 expressions. See `Chapter 2 <LangImpl2.html>`_ for details. Without
     45 using operator precedence parsing, it would be very difficult to allow
     46 the programmer to introduce new operators into the grammar: the grammar
     47 is dynamically extensible as the JIT runs.
     48 
     49 The two specific features we'll add are programmable unary operators
     50 (right now, Kaleidoscope has no unary operators at all) as well as
     51 binary operators. An example of this is:
     52 
     53 ::
     54 
     55     # Logical unary not.
     56     def unary!(v)
     57       if v then
     58         0
     59       else
     60         1;
     61 
     62     # Define > with the same precedence as <.
     63     def binary> 10 (LHS RHS)
     64       RHS < LHS;
     65 
     66     # Binary "logical or", (note that it does not "short circuit")
     67     def binary| 5 (LHS RHS)
     68       if LHS then
     69         1
     70       else if RHS then
     71         1
     72       else
     73         0;
     74 
     75     # Define = with slightly lower precedence than relationals.
     76     def binary= 9 (LHS RHS)
     77       !(LHS < RHS | LHS > RHS);
     78 
     79 Many languages aspire to being able to implement their standard runtime
     80 library in the language itself. In Kaleidoscope, we can implement
     81 significant parts of the language in the library!
     82 
     83 We will break down implementation of these features into two parts:
     84 implementing support for user-defined binary operators and adding unary
     85 operators.
     86 
     87 User-defined Binary Operators
     88 =============================
     89 
     90 Adding support for user-defined binary operators is pretty simple with
     91 our current framework. We'll first add support for the unary/binary
     92 keywords:
     93 
     94 .. code-block:: c++
     95 
     96     enum Token {
     97       ...
     98       // operators
     99       tok_binary = -11, tok_unary = -12
    100     };
    101     ...
    102     static int gettok() {
    103     ...
    104         if (IdentifierStr == "for") return tok_for;
    105         if (IdentifierStr == "in") return tok_in;
    106         if (IdentifierStr == "binary") return tok_binary;
    107         if (IdentifierStr == "unary") return tok_unary;
    108         return tok_identifier;
    109 
    110 This just adds lexer support for the unary and binary keywords, like we
    111 did in `previous chapters <LangImpl5.html#iflexer>`_. One nice thing
    112 about our current AST, is that we represent binary operators with full
    113 generalisation by using their ASCII code as the opcode. For our extended
    114 operators, we'll use this same representation, so we don't need any new
    115 AST or parser support.
    116 
    117 On the other hand, we have to be able to represent the definitions of
    118 these new operators, in the "def binary\| 5" part of the function
    119 definition. In our grammar so far, the "name" for the function
    120 definition is parsed as the "prototype" production and into the
    121 ``PrototypeAST`` AST node. To represent our new user-defined operators
    122 as prototypes, we have to extend the ``PrototypeAST`` AST node like
    123 this:
    124 
    125 .. code-block:: c++
    126 
    127     /// PrototypeAST - This class represents the "prototype" for a function,
    128     /// which captures its argument names as well as if it is an operator.
    129     class PrototypeAST {
    130       std::string Name;
    131       std::vector<std::string> Args;
    132       bool isOperator;
    133       unsigned Precedence;  // Precedence if a binary op.
    134     public:
    135       PrototypeAST(const std::string &name, const std::vector<std::string> &args,
    136                    bool isoperator = false, unsigned prec = 0)
    137       : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
    138 
    139       bool isUnaryOp() const { return isOperator && Args.size() == 1; }
    140       bool isBinaryOp() const { return isOperator && Args.size() == 2; }
    141 
    142       char getOperatorName() const {
    143         assert(isUnaryOp() || isBinaryOp());
    144         return Name[Name.size()-1];
    145       }
    146 
    147       unsigned getBinaryPrecedence() const { return Precedence; }
    148 
    149       Function *Codegen();
    150     };
    151 
    152 Basically, in addition to knowing a name for the prototype, we now keep
    153 track of whether it was an operator, and if it was, what precedence
    154 level the operator is at. The precedence is only used for binary
    155 operators (as you'll see below, it just doesn't apply for unary
    156 operators). Now that we have a way to represent the prototype for a
    157 user-defined operator, we need to parse it:
    158 
    159 .. code-block:: c++
    160 
    161     /// prototype
    162     ///   ::= id '(' id* ')'
    163     ///   ::= binary LETTER number? (id, id)
    164     static PrototypeAST *ParsePrototype() {
    165       std::string FnName;
    166 
    167       unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
    168       unsigned BinaryPrecedence = 30;
    169 
    170       switch (CurTok) {
    171       default:
    172         return ErrorP("Expected function name in prototype");
    173       case tok_identifier:
    174         FnName = IdentifierStr;
    175         Kind = 0;
    176         getNextToken();
    177         break;
    178       case tok_binary:
    179         getNextToken();
    180         if (!isascii(CurTok))
    181           return ErrorP("Expected binary operator");
    182         FnName = "binary";
    183         FnName += (char)CurTok;
    184         Kind = 2;
    185         getNextToken();
    186 
    187         // Read the precedence if present.
    188         if (CurTok == tok_number) {
    189           if (NumVal < 1 || NumVal > 100)
    190             return ErrorP("Invalid precedecnce: must be 1..100");
    191           BinaryPrecedence = (unsigned)NumVal;
    192           getNextToken();
    193         }
    194         break;
    195       }
    196 
    197       if (CurTok != '(')
    198         return ErrorP("Expected '(' in prototype");
    199 
    200       std::vector<std::string> ArgNames;
    201       while (getNextToken() == tok_identifier)
    202         ArgNames.push_back(IdentifierStr);
    203       if (CurTok != ')')
    204         return ErrorP("Expected ')' in prototype");
    205 
    206       // success.
    207       getNextToken();  // eat ')'.
    208 
    209       // Verify right number of names for operator.
    210       if (Kind && ArgNames.size() != Kind)
    211         return ErrorP("Invalid number of operands for operator");
    212 
    213       return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
    214     }
    215 
    216 This is all fairly straightforward parsing code, and we have already
    217 seen a lot of similar code in the past. One interesting part about the
    218 code above is the couple lines that set up ``FnName`` for binary
    219 operators. This builds names like "binary@" for a newly defined "@"
    220 operator. This then takes advantage of the fact that symbol names in the
    221 LLVM symbol table are allowed to have any character in them, including
    222 embedded nul characters.
    223 
    224 The next interesting thing to add, is codegen support for these binary
    225 operators. Given our current structure, this is a simple addition of a
    226 default case for our existing binary operator node:
    227 
    228 .. code-block:: c++
    229 
    230     Value *BinaryExprAST::Codegen() {
    231       Value *L = LHS->Codegen();
    232       Value *R = RHS->Codegen();
    233       if (L == 0 || R == 0) return 0;
    234 
    235       switch (Op) {
    236       case '+': return Builder.CreateFAdd(L, R, "addtmp");
    237       case '-': return Builder.CreateFSub(L, R, "subtmp");
    238       case '*': return Builder.CreateFMul(L, R, "multmp");
    239       case '<':
    240         L = Builder.CreateFCmpULT(L, R, "cmptmp");
    241         // Convert bool 0/1 to double 0.0 or 1.0
    242         return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    243                                     "booltmp");
    244       default: break;
    245       }
    246 
    247       // If it wasn't a builtin binary operator, it must be a user defined one. Emit
    248       // a call to it.
    249       Function *F = TheModule->getFunction(std::string("binary")+Op);
    250       assert(F && "binary operator not found!");
    251 
    252       Value *Ops[2] = { L, R };
    253       return Builder.CreateCall(F, Ops, "binop");
    254     }
    255 
    256 As you can see above, the new code is actually really simple. It just
    257 does a lookup for the appropriate operator in the symbol table and
    258 generates a function call to it. Since user-defined operators are just
    259 built as normal functions (because the "prototype" boils down to a
    260 function with the right name) everything falls into place.
    261 
    262 The final piece of code we are missing, is a bit of top-level magic:
    263 
    264 .. code-block:: c++
    265 
    266     Function *FunctionAST::Codegen() {
    267       NamedValues.clear();
    268 
    269       Function *TheFunction = Proto->Codegen();
    270       if (TheFunction == 0)
    271         return 0;
    272 
    273       // If this is an operator, install it.
    274       if (Proto->isBinaryOp())
    275         BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
    276 
    277       // Create a new basic block to start insertion into.
    278       BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    279       Builder.SetInsertPoint(BB);
    280 
    281       if (Value *RetVal = Body->Codegen()) {
    282         ...
    283 
    284 Basically, before codegening a function, if it is a user-defined
    285 operator, we register it in the precedence table. This allows the binary
    286 operator parsing logic we already have in place to handle it. Since we
    287 are working on a fully-general operator precedence parser, this is all
    288 we need to do to "extend the grammar".
    289 
    290 Now we have useful user-defined binary operators. This builds a lot on
    291 the previous framework we built for other operators. Adding unary
    292 operators is a bit more challenging, because we don't have any framework
    293 for it yet - lets see what it takes.
    294 
    295 User-defined Unary Operators
    296 ============================
    297 
    298 Since we don't currently support unary operators in the Kaleidoscope
    299 language, we'll need to add everything to support them. Above, we added
    300 simple support for the 'unary' keyword to the lexer. In addition to
    301 that, we need an AST node:
    302 
    303 .. code-block:: c++
    304 
    305     /// UnaryExprAST - Expression class for a unary operator.
    306     class UnaryExprAST : public ExprAST {
    307       char Opcode;
    308       ExprAST *Operand;
    309     public:
    310       UnaryExprAST(char opcode, ExprAST *operand)
    311         : Opcode(opcode), Operand(operand) {}
    312       virtual Value *Codegen();
    313     };
    314 
    315 This AST node is very simple and obvious by now. It directly mirrors the
    316 binary operator AST node, except that it only has one child. With this,
    317 we need to add the parsing logic. Parsing a unary operator is pretty
    318 simple: we'll add a new function to do it:
    319 
    320 .. code-block:: c++
    321 
    322     /// unary
    323     ///   ::= primary
    324     ///   ::= '!' unary
    325     static ExprAST *ParseUnary() {
    326       // If the current token is not an operator, it must be a primary expr.
    327       if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    328         return ParsePrimary();
    329 
    330       // If this is a unary operator, read it.
    331       int Opc = CurTok;
    332       getNextToken();
    333       if (ExprAST *Operand = ParseUnary())
    334         return new UnaryExprAST(Opc, Operand);
    335       return 0;
    336     }
    337 
    338 The grammar we add is pretty straightforward here. If we see a unary
    339 operator when parsing a primary operator, we eat the operator as a
    340 prefix and parse the remaining piece as another unary operator. This
    341 allows us to handle multiple unary operators (e.g. "!!x"). Note that
    342 unary operators can't have ambiguous parses like binary operators can,
    343 so there is no need for precedence information.
    344 
    345 The problem with this function, is that we need to call ParseUnary from
    346 somewhere. To do this, we change previous callers of ParsePrimary to
    347 call ParseUnary instead:
    348 
    349 .. code-block:: c++
    350 
    351     /// binoprhs
    352     ///   ::= ('+' unary)*
    353     static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    354       ...
    355         // Parse the unary expression after the binary operator.
    356         ExprAST *RHS = ParseUnary();
    357         if (!RHS) return 0;
    358       ...
    359     }
    360     /// expression
    361     ///   ::= unary binoprhs
    362     ///
    363     static ExprAST *ParseExpression() {
    364       ExprAST *LHS = ParseUnary();
    365       if (!LHS) return 0;
    366 
    367       return ParseBinOpRHS(0, LHS);
    368     }
    369 
    370 With these two simple changes, we are now able to parse unary operators
    371 and build the AST for them. Next up, we need to add parser support for
    372 prototypes, to parse the unary operator prototype. We extend the binary
    373 operator code above with:
    374 
    375 .. code-block:: c++
    376 
    377     /// prototype
    378     ///   ::= id '(' id* ')'
    379     ///   ::= binary LETTER number? (id, id)
    380     ///   ::= unary LETTER (id)
    381     static PrototypeAST *ParsePrototype() {
    382       std::string FnName;
    383 
    384       unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
    385       unsigned BinaryPrecedence = 30;
    386 
    387       switch (CurTok) {
    388       default:
    389         return ErrorP("Expected function name in prototype");
    390       case tok_identifier:
    391         FnName = IdentifierStr;
    392         Kind = 0;
    393         getNextToken();
    394         break;
    395       case tok_unary:
    396         getNextToken();
    397         if (!isascii(CurTok))
    398           return ErrorP("Expected unary operator");
    399         FnName = "unary";
    400         FnName += (char)CurTok;
    401         Kind = 1;
    402         getNextToken();
    403         break;
    404       case tok_binary:
    405         ...
    406 
    407 As with binary operators, we name unary operators with a name that
    408 includes the operator character. This assists us at code generation
    409 time. Speaking of, the final piece we need to add is codegen support for
    410 unary operators. It looks like this:
    411 
    412 .. code-block:: c++
    413 
    414     Value *UnaryExprAST::Codegen() {
    415       Value *OperandV = Operand->Codegen();
    416       if (OperandV == 0) return 0;
    417 
    418       Function *F = TheModule->getFunction(std::string("unary")+Opcode);
    419       if (F == 0)
    420         return ErrorV("Unknown unary operator");
    421 
    422       return Builder.CreateCall(F, OperandV, "unop");
    423     }
    424 
    425 This code is similar to, but simpler than, the code for binary
    426 operators. It is simpler primarily because it doesn't need to handle any
    427 predefined operators.
    428 
    429 Kicking the Tires
    430 =================
    431 
    432 It is somewhat hard to believe, but with a few simple extensions we've
    433 covered in the last chapters, we have grown a real-ish language. With
    434 this, we can do a lot of interesting things, including I/O, math, and a
    435 bunch of other things. For example, we can now add a nice sequencing
    436 operator (printd is defined to print out the specified value and a
    437 newline):
    438 
    439 ::
    440 
    441     ready> extern printd(x);
    442     Read extern:
    443     declare double @printd(double)
    444 
    445     ready> def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.
    446     ..
    447     ready> printd(123) : printd(456) : printd(789);
    448     123.000000
    449     456.000000
    450     789.000000
    451     Evaluated to 0.000000
    452 
    453 We can also define a bunch of other "primitive" operations, such as:
    454 
    455 ::
    456 
    457     # Logical unary not.
    458     def unary!(v)
    459       if v then
    460         0
    461       else
    462         1;
    463 
    464     # Unary negate.
    465     def unary-(v)
    466       0-v;
    467 
    468     # Define > with the same precedence as <.
    469     def binary> 10 (LHS RHS)
    470       RHS < LHS;
    471 
    472     # Binary logical or, which does not short circuit.
    473     def binary| 5 (LHS RHS)
    474       if LHS then
    475         1
    476       else if RHS then
    477         1
    478       else
    479         0;
    480 
    481     # Binary logical and, which does not short circuit.
    482     def binary& 6 (LHS RHS)
    483       if !LHS then
    484         0
    485       else
    486         !!RHS;
    487 
    488     # Define = with slightly lower precedence than relationals.
    489     def binary = 9 (LHS RHS)
    490       !(LHS < RHS | LHS > RHS);
    491 
    492     # Define ':' for sequencing: as a low-precedence operator that ignores operands
    493     # and just returns the RHS.
    494     def binary : 1 (x y) y;
    495 
    496 Given the previous if/then/else support, we can also define interesting
    497 functions for I/O. For example, the following prints out a character
    498 whose "density" reflects the value passed in: the lower the value, the
    499 denser the character:
    500 
    501 ::
    502 
    503     ready>
    504 
    505     extern putchard(char)
    506     def printdensity(d)
    507       if d > 8 then
    508         putchard(32)  # ' '
    509       else if d > 4 then
    510         putchard(46)  # '.'
    511       else if d > 2 then
    512         putchard(43)  # '+'
    513       else
    514         putchard(42); # '*'
    515     ...
    516     ready> printdensity(1): printdensity(2): printdensity(3):
    517            printdensity(4): printdensity(5): printdensity(9):
    518            putchard(10);
    519     **++.
    520     Evaluated to 0.000000
    521 
    522 Based on these simple primitive operations, we can start to define more
    523 interesting things. For example, here's a little function that solves
    524 for the number of iterations it takes a function in the complex plane to
    525 converge:
    526 
    527 ::
    528 
    529     # Determine whether the specific location diverges.
    530     # Solve for z = z^2 + c in the complex plane.
    531     def mandleconverger(real imag iters creal cimag)
    532       if iters > 255 | (real*real + imag*imag > 4) then
    533         iters
    534       else
    535         mandleconverger(real*real - imag*imag + creal,
    536                         2*real*imag + cimag,
    537                         iters+1, creal, cimag);
    538 
    539     # Return the number of iterations required for the iteration to escape
    540     def mandleconverge(real imag)
    541       mandleconverger(real, imag, 0, real, imag);
    542 
    543 This "``z = z2 + c``" function is a beautiful little creature that is
    544 the basis for computation of the `Mandelbrot
    545 Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our
    546 ``mandelconverge`` function returns the number of iterations that it
    547 takes for a complex orbit to escape, saturating to 255. This is not a
    548 very useful function by itself, but if you plot its value over a
    549 two-dimensional plane, you can see the Mandelbrot set. Given that we are
    550 limited to using putchard here, our amazing graphical output is limited,
    551 but we can whip together something using the density plotter above:
    552 
    553 ::
    554 
    555     # Compute and plot the mandlebrot set with the specified 2 dimensional range
    556     # info.
    557     def mandelhelp(xmin xmax xstep   ymin ymax ystep)
    558       for y = ymin, y < ymax, ystep in (
    559         (for x = xmin, x < xmax, xstep in
    560            printdensity(mandleconverge(x,y)))
    561         : putchard(10)
    562       )
    563 
    564     # mandel - This is a convenient helper function for plotting the mandelbrot set
    565     # from the specified position with the specified Magnification.
    566     def mandel(realstart imagstart realmag imagmag)
    567       mandelhelp(realstart, realstart+realmag*78, realmag,
    568                  imagstart, imagstart+imagmag*40, imagmag);
    569 
    570 Given this, we can try plotting out the mandlebrot set! Lets try it out:
    571 
    572 ::
    573 
    574     ready> mandel(-2.3, -1.3, 0.05, 0.07);
    575     *******************************+++++++++++*************************************
    576     *************************+++++++++++++++++++++++*******************************
    577     **********************+++++++++++++++++++++++++++++****************************
    578     *******************+++++++++++++++++++++.. ...++++++++*************************
    579     *****************++++++++++++++++++++++.... ...+++++++++***********************
    580     ***************+++++++++++++++++++++++.....   ...+++++++++*********************
    581     **************+++++++++++++++++++++++....     ....+++++++++********************
    582     *************++++++++++++++++++++++......      .....++++++++*******************
    583     ************+++++++++++++++++++++.......       .......+++++++******************
    584     ***********+++++++++++++++++++....                ... .+++++++*****************
    585     **********+++++++++++++++++.......                     .+++++++****************
    586     *********++++++++++++++...........                    ...+++++++***************
    587     ********++++++++++++............                      ...++++++++**************
    588     ********++++++++++... ..........                        .++++++++**************
    589     *******+++++++++.....                                   .+++++++++*************
    590     *******++++++++......                                  ..+++++++++*************
    591     *******++++++.......                                   ..+++++++++*************
    592     *******+++++......                                     ..+++++++++*************
    593     *******.... ....                                      ...+++++++++*************
    594     *******.... .                                         ...+++++++++*************
    595     *******+++++......                                    ...+++++++++*************
    596     *******++++++.......                                   ..+++++++++*************
    597     *******++++++++......                                   .+++++++++*************
    598     *******+++++++++.....                                  ..+++++++++*************
    599     ********++++++++++... ..........                        .++++++++**************
    600     ********++++++++++++............                      ...++++++++**************
    601     *********++++++++++++++..........                     ...+++++++***************
    602     **********++++++++++++++++........                     .+++++++****************
    603     **********++++++++++++++++++++....                ... ..+++++++****************
    604     ***********++++++++++++++++++++++.......       .......++++++++*****************
    605     ************+++++++++++++++++++++++......      ......++++++++******************
    606     **************+++++++++++++++++++++++....      ....++++++++********************
    607     ***************+++++++++++++++++++++++.....   ...+++++++++*********************
    608     *****************++++++++++++++++++++++....  ...++++++++***********************
    609     *******************+++++++++++++++++++++......++++++++*************************
    610     *********************++++++++++++++++++++++.++++++++***************************
    611     *************************+++++++++++++++++++++++*******************************
    612     ******************************+++++++++++++************************************
    613     *******************************************************************************
    614     *******************************************************************************
    615     *******************************************************************************
    616     Evaluated to 0.000000
    617     ready> mandel(-2, -1, 0.02, 0.04);
    618     **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
    619     ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    620     *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
    621     *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
    622     *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
    623     ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
    624     **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
    625     ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
    626     ***********++++++++++++++++++++++++++++++++++++++++++++++++++........        .
    627     **********++++++++++++++++++++++++++++++++++++++++++++++.............
    628     ********+++++++++++++++++++++++++++++++++++++++++++..................
    629     *******+++++++++++++++++++++++++++++++++++++++.......................
    630     ******+++++++++++++++++++++++++++++++++++...........................
    631     *****++++++++++++++++++++++++++++++++............................
    632     *****++++++++++++++++++++++++++++...............................
    633     ****++++++++++++++++++++++++++......   .........................
    634     ***++++++++++++++++++++++++.........     ......    ...........
    635     ***++++++++++++++++++++++............
    636     **+++++++++++++++++++++..............
    637     **+++++++++++++++++++................
    638     *++++++++++++++++++.................
    639     *++++++++++++++++............ ...
    640     *++++++++++++++..............
    641     *+++....++++................
    642     *..........  ...........
    643     *
    644     *..........  ...........
    645     *+++....++++................
    646     *++++++++++++++..............
    647     *++++++++++++++++............ ...
    648     *++++++++++++++++++.................
    649     **+++++++++++++++++++................
    650     **+++++++++++++++++++++..............
    651     ***++++++++++++++++++++++............
    652     ***++++++++++++++++++++++++.........     ......    ...........
    653     ****++++++++++++++++++++++++++......   .........................
    654     *****++++++++++++++++++++++++++++...............................
    655     *****++++++++++++++++++++++++++++++++............................
    656     ******+++++++++++++++++++++++++++++++++++...........................
    657     *******+++++++++++++++++++++++++++++++++++++++.......................
    658     ********+++++++++++++++++++++++++++++++++++++++++++..................
    659     Evaluated to 0.000000
    660     ready> mandel(-0.9, -1.4, 0.02, 0.03);
    661     *******************************************************************************
    662     *******************************************************************************
    663     *******************************************************************************
    664     **********+++++++++++++++++++++************************************************
    665     *+++++++++++++++++++++++++++++++++++++++***************************************
    666     +++++++++++++++++++++++++++++++++++++++++++++**********************************
    667     ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
    668     ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
    669     +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
    670     +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
    671     +++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
    672     +++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
    673     ++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
    674     +++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
    675     ++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
    676     ++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
    677     +++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
    678     ++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
    679     ++++++++++++++++++++...........                .........++++++++++++++++++++++*
    680     ++++++++++++++++++............                  ...........++++++++++++++++++++
    681     ++++++++++++++++...............                 .............++++++++++++++++++
    682     ++++++++++++++.................                 ...............++++++++++++++++
    683     ++++++++++++..................                  .................++++++++++++++
    684     +++++++++..................                      .................+++++++++++++
    685     ++++++........        .                               .........  ..++++++++++++
    686     ++............                                         ......    ....++++++++++
    687     ..............                                                    ...++++++++++
    688     ..............                                                    ....+++++++++
    689     ..............                                                    .....++++++++
    690     .............                                                    ......++++++++
    691     ...........                                                     .......++++++++
    692     .........                                                       ........+++++++
    693     .........                                                       ........+++++++
    694     .........                                                           ....+++++++
    695     ........                                                             ...+++++++
    696     .......                                                              ...+++++++
    697                                                                         ....+++++++
    698                                                                        .....+++++++
    699                                                                         ....+++++++
    700                                                                         ....+++++++
    701                                                                         ....+++++++
    702     Evaluated to 0.000000
    703     ready> ^D
    704 
    705 At this point, you may be starting to realize that Kaleidoscope is a
    706 real and powerful language. It may not be self-similar :), but it can be
    707 used to plot things that are!
    708 
    709 With this, we conclude the "adding user-defined operators" chapter of
    710 the tutorial. We have successfully augmented our language, adding the
    711 ability to extend the language in the library, and we have shown how
    712 this can be used to build a simple but interesting end-user application
    713 in Kaleidoscope. At this point, Kaleidoscope can build a variety of
    714 applications that are functional and can call functions with
    715 side-effects, but it can't actually define and mutate a variable itself.
    716 
    717 Strikingly, variable mutation is an important feature of some languages,
    718 and it is not at all obvious how to `add support for mutable
    719 variables <LangImpl7.html>`_ without having to add an "SSA construction"
    720 phase to your front-end. In the next chapter, we will describe how you
    721 can add variable mutation without building SSA in your front-end.
    722 
    723 Full Code Listing
    724 =================
    725 
    726 Here is the complete code listing for our running example, enhanced with
    727 the if/then/else and for expressions.. To build this example, use:
    728 
    729 .. code-block:: bash
    730 
    731     # Compile
    732     clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    733     # Run
    734     ./toy
    735 
    736 On some platforms, you will need to specify -rdynamic or
    737 -Wl,--export-dynamic when linking. This ensures that symbols defined in
    738 the main executable are exported to the dynamic linker and so are
    739 available for symbol resolution at run time. This is not needed if you
    740 compile your support code into a shared library, although doing that
    741 will cause problems on Windows.
    742 
    743 Here is the code:
    744 
    745 .. code-block:: c++
    746 
    747     #include "llvm/DerivedTypes.h"
    748     #include "llvm/ExecutionEngine/ExecutionEngine.h"
    749     #include "llvm/ExecutionEngine/JIT.h"
    750     #include "llvm/IRBuilder.h"
    751     #include "llvm/LLVMContext.h"
    752     #include "llvm/Module.h"
    753     #include "llvm/PassManager.h"
    754     #include "llvm/Analysis/Verifier.h"
    755     #include "llvm/Analysis/Passes.h"
    756     #include "llvm/DataLayout.h"
    757     #include "llvm/Transforms/Scalar.h"
    758     #include "llvm/Support/TargetSelect.h"
    759     #include <cstdio>
    760     #include <string>
    761     #include <map>
    762     #include <vector>
    763     using namespace llvm;
    764 
    765     //===----------------------------------------------------------------------===//
    766     // Lexer
    767     //===----------------------------------------------------------------------===//
    768 
    769     // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    770     // of these for known things.
    771     enum Token {
    772       tok_eof = -1,
    773 
    774       // commands
    775       tok_def = -2, tok_extern = -3,
    776 
    777       // primary
    778       tok_identifier = -4, tok_number = -5,
    779 
    780       // control
    781       tok_if = -6, tok_then = -7, tok_else = -8,
    782       tok_for = -9, tok_in = -10,
    783 
    784       // operators
    785       tok_binary = -11, tok_unary = -12
    786     };
    787 
    788     static std::string IdentifierStr;  // Filled in if tok_identifier
    789     static double NumVal;              // Filled in if tok_number
    790 
    791     /// gettok - Return the next token from standard input.
    792     static int gettok() {
    793       static int LastChar = ' ';
    794 
    795       // Skip any whitespace.
    796       while (isspace(LastChar))
    797         LastChar = getchar();
    798 
    799       if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    800         IdentifierStr = LastChar;
    801         while (isalnum((LastChar = getchar())))
    802           IdentifierStr += LastChar;
    803 
    804         if (IdentifierStr == "def") return tok_def;
    805         if (IdentifierStr == "extern") return tok_extern;
    806         if (IdentifierStr == "if") return tok_if;
    807         if (IdentifierStr == "then") return tok_then;
    808         if (IdentifierStr == "else") return tok_else;
    809         if (IdentifierStr == "for") return tok_for;
    810         if (IdentifierStr == "in") return tok_in;
    811         if (IdentifierStr == "binary") return tok_binary;
    812         if (IdentifierStr == "unary") return tok_unary;
    813         return tok_identifier;
    814       }
    815 
    816       if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    817         std::string NumStr;
    818         do {
    819           NumStr += LastChar;
    820           LastChar = getchar();
    821         } while (isdigit(LastChar) || LastChar == '.');
    822 
    823         NumVal = strtod(NumStr.c_str(), 0);
    824         return tok_number;
    825       }
    826 
    827       if (LastChar == '#') {
    828         // Comment until end of line.
    829         do LastChar = getchar();
    830         while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    831 
    832         if (LastChar != EOF)
    833           return gettok();
    834       }
    835 
    836       // Check for end of file.  Don't eat the EOF.
    837       if (LastChar == EOF)
    838         return tok_eof;
    839 
    840       // Otherwise, just return the character as its ascii value.
    841       int ThisChar = LastChar;
    842       LastChar = getchar();
    843       return ThisChar;
    844     }
    845 
    846     //===----------------------------------------------------------------------===//
    847     // Abstract Syntax Tree (aka Parse Tree)
    848     //===----------------------------------------------------------------------===//
    849 
    850     /// ExprAST - Base class for all expression nodes.
    851     class ExprAST {
    852     public:
    853       virtual ~ExprAST() {}
    854       virtual Value *Codegen() = 0;
    855     };
    856 
    857     /// NumberExprAST - Expression class for numeric literals like "1.0".
    858     class NumberExprAST : public ExprAST {
    859       double Val;
    860     public:
    861       NumberExprAST(double val) : Val(val) {}
    862       virtual Value *Codegen();
    863     };
    864 
    865     /// VariableExprAST - Expression class for referencing a variable, like "a".
    866     class VariableExprAST : public ExprAST {
    867       std::string Name;
    868     public:
    869       VariableExprAST(const std::string &name) : Name(name) {}
    870       virtual Value *Codegen();
    871     };
    872 
    873     /// UnaryExprAST - Expression class for a unary operator.
    874     class UnaryExprAST : public ExprAST {
    875       char Opcode;
    876       ExprAST *Operand;
    877     public:
    878       UnaryExprAST(char opcode, ExprAST *operand)
    879         : Opcode(opcode), Operand(operand) {}
    880       virtual Value *Codegen();
    881     };
    882 
    883     /// BinaryExprAST - Expression class for a binary operator.
    884     class BinaryExprAST : public ExprAST {
    885       char Op;
    886       ExprAST *LHS, *RHS;
    887     public:
    888       BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
    889         : Op(op), LHS(lhs), RHS(rhs) {}
    890       virtual Value *Codegen();
    891     };
    892 
    893     /// CallExprAST - Expression class for function calls.
    894     class CallExprAST : public ExprAST {
    895       std::string Callee;
    896       std::vector<ExprAST*> Args;
    897     public:
    898       CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    899         : Callee(callee), Args(args) {}
    900       virtual Value *Codegen();
    901     };
    902 
    903     /// IfExprAST - Expression class for if/then/else.
    904     class IfExprAST : public ExprAST {
    905       ExprAST *Cond, *Then, *Else;
    906     public:
    907       IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    908       : Cond(cond), Then(then), Else(_else) {}
    909       virtual Value *Codegen();
    910     };
    911 
    912     /// ForExprAST - Expression class for for/in.
    913     class ForExprAST : public ExprAST {
    914       std::string VarName;
    915       ExprAST *Start, *End, *Step, *Body;
    916     public:
    917       ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
    918                  ExprAST *step, ExprAST *body)
    919         : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
    920       virtual Value *Codegen();
    921     };
    922 
    923     /// PrototypeAST - This class represents the "prototype" for a function,
    924     /// which captures its name, and its argument names (thus implicitly the number
    925     /// of arguments the function takes), as well as if it is an operator.
    926     class PrototypeAST {
    927       std::string Name;
    928       std::vector<std::string> Args;
    929       bool isOperator;
    930       unsigned Precedence;  // Precedence if a binary op.
    931     public:
    932       PrototypeAST(const std::string &name, const std::vector<std::string> &args,
    933                    bool isoperator = false, unsigned prec = 0)
    934       : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
    935 
    936       bool isUnaryOp() const { return isOperator && Args.size() == 1; }
    937       bool isBinaryOp() const { return isOperator && Args.size() == 2; }
    938 
    939       char getOperatorName() const {
    940         assert(isUnaryOp() || isBinaryOp());
    941         return Name[Name.size()-1];
    942       }
    943 
    944       unsigned getBinaryPrecedence() const { return Precedence; }
    945 
    946       Function *Codegen();
    947     };
    948 
    949     /// FunctionAST - This class represents a function definition itself.
    950     class FunctionAST {
    951       PrototypeAST *Proto;
    952       ExprAST *Body;
    953     public:
    954       FunctionAST(PrototypeAST *proto, ExprAST *body)
    955         : Proto(proto), Body(body) {}
    956 
    957       Function *Codegen();
    958     };
    959 
    960     //===----------------------------------------------------------------------===//
    961     // Parser
    962     //===----------------------------------------------------------------------===//
    963 
    964     /// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    965     /// token the parser is looking at.  getNextToken reads another token from the
    966     /// lexer and updates CurTok with its results.
    967     static int CurTok;
    968     static int getNextToken() {
    969       return CurTok = gettok();
    970     }
    971 
    972     /// BinopPrecedence - This holds the precedence for each binary operator that is
    973     /// defined.
    974     static std::map<char, int> BinopPrecedence;
    975 
    976     /// GetTokPrecedence - Get the precedence of the pending binary operator token.
    977     static int GetTokPrecedence() {
    978       if (!isascii(CurTok))
    979         return -1;
    980 
    981       // Make sure it's a declared binop.
    982       int TokPrec = BinopPrecedence[CurTok];
    983       if (TokPrec <= 0) return -1;
    984       return TokPrec;
    985     }
    986 
    987     /// Error* - These are little helper functions for error handling.
    988     ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    989     PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    990     FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    991 
    992     static ExprAST *ParseExpression();
    993 
    994     /// identifierexpr
    995     ///   ::= identifier
    996     ///   ::= identifier '(' expression* ')'
    997     static ExprAST *ParseIdentifierExpr() {
    998       std::string IdName = IdentifierStr;
    999 
   1000       getNextToken();  // eat identifier.
   1001 
   1002       if (CurTok != '(') // Simple variable ref.
   1003         return new VariableExprAST(IdName);
   1004 
   1005       // Call.
   1006       getNextToken();  // eat (
   1007       std::vector<ExprAST*> Args;
   1008       if (CurTok != ')') {
   1009         while (1) {
   1010           ExprAST *Arg = ParseExpression();
   1011           if (!Arg) return 0;
   1012           Args.push_back(Arg);
   1013 
   1014           if (CurTok == ')') break;
   1015 
   1016           if (CurTok != ',')
   1017             return Error("Expected ')' or ',' in argument list");
   1018           getNextToken();
   1019         }
   1020       }
   1021 
   1022       // Eat the ')'.
   1023       getNextToken();
   1024 
   1025       return new CallExprAST(IdName, Args);
   1026     }
   1027 
   1028     /// numberexpr ::= number
   1029     static ExprAST *ParseNumberExpr() {
   1030       ExprAST *Result = new NumberExprAST(NumVal);
   1031       getNextToken(); // consume the number
   1032       return Result;
   1033     }
   1034 
   1035     /// parenexpr ::= '(' expression ')'
   1036     static ExprAST *ParseParenExpr() {
   1037       getNextToken();  // eat (.
   1038       ExprAST *V = ParseExpression();
   1039       if (!V) return 0;
   1040 
   1041       if (CurTok != ')')
   1042         return Error("expected ')'");
   1043       getNextToken();  // eat ).
   1044       return V;
   1045     }
   1046 
   1047     /// ifexpr ::= 'if' expression 'then' expression 'else' expression
   1048     static ExprAST *ParseIfExpr() {
   1049       getNextToken();  // eat the if.
   1050 
   1051       // condition.
   1052       ExprAST *Cond = ParseExpression();
   1053       if (!Cond) return 0;
   1054 
   1055       if (CurTok != tok_then)
   1056         return Error("expected then");
   1057       getNextToken();  // eat the then
   1058 
   1059       ExprAST *Then = ParseExpression();
   1060       if (Then == 0) return 0;
   1061 
   1062       if (CurTok != tok_else)
   1063         return Error("expected else");
   1064 
   1065       getNextToken();
   1066 
   1067       ExprAST *Else = ParseExpression();
   1068       if (!Else) return 0;
   1069 
   1070       return new IfExprAST(Cond, Then, Else);
   1071     }
   1072 
   1073     /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
   1074     static ExprAST *ParseForExpr() {
   1075       getNextToken();  // eat the for.
   1076 
   1077       if (CurTok != tok_identifier)
   1078         return Error("expected identifier after for");
   1079 
   1080       std::string IdName = IdentifierStr;
   1081       getNextToken();  // eat identifier.
   1082 
   1083       if (CurTok != '=')
   1084         return Error("expected '=' after for");
   1085       getNextToken();  // eat '='.
   1086 
   1087 
   1088       ExprAST *Start = ParseExpression();
   1089       if (Start == 0) return 0;
   1090       if (CurTok != ',')
   1091         return Error("expected ',' after for start value");
   1092       getNextToken();
   1093 
   1094       ExprAST *End = ParseExpression();
   1095       if (End == 0) return 0;
   1096 
   1097       // The step value is optional.
   1098       ExprAST *Step = 0;
   1099       if (CurTok == ',') {
   1100         getNextToken();
   1101         Step = ParseExpression();
   1102         if (Step == 0) return 0;
   1103       }
   1104 
   1105       if (CurTok != tok_in)
   1106         return Error("expected 'in' after for");
   1107       getNextToken();  // eat 'in'.
   1108 
   1109       ExprAST *Body = ParseExpression();
   1110       if (Body == 0) return 0;
   1111 
   1112       return new ForExprAST(IdName, Start, End, Step, Body);
   1113     }
   1114 
   1115     /// primary
   1116     ///   ::= identifierexpr
   1117     ///   ::= numberexpr
   1118     ///   ::= parenexpr
   1119     ///   ::= ifexpr
   1120     ///   ::= forexpr
   1121     static ExprAST *ParsePrimary() {
   1122       switch (CurTok) {
   1123       default: return Error("unknown token when expecting an expression");
   1124       case tok_identifier: return ParseIdentifierExpr();
   1125       case tok_number:     return ParseNumberExpr();
   1126       case '(':            return ParseParenExpr();
   1127       case tok_if:         return ParseIfExpr();
   1128       case tok_for:        return ParseForExpr();
   1129       }
   1130     }
   1131 
   1132     /// unary
   1133     ///   ::= primary
   1134     ///   ::= '!' unary
   1135     static ExprAST *ParseUnary() {
   1136       // If the current token is not an operator, it must be a primary expr.
   1137       if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
   1138         return ParsePrimary();
   1139 
   1140       // If this is a unary operator, read it.
   1141       int Opc = CurTok;
   1142       getNextToken();
   1143       if (ExprAST *Operand = ParseUnary())
   1144         return new UnaryExprAST(Opc, Operand);
   1145       return 0;
   1146     }
   1147 
   1148     /// binoprhs
   1149     ///   ::= ('+' unary)*
   1150     static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
   1151       // If this is a binop, find its precedence.
   1152       while (1) {
   1153         int TokPrec = GetTokPrecedence();
   1154 
   1155         // If this is a binop that binds at least as tightly as the current binop,
   1156         // consume it, otherwise we are done.
   1157         if (TokPrec < ExprPrec)
   1158           return LHS;
   1159 
   1160         // Okay, we know this is a binop.
   1161         int BinOp = CurTok;
   1162         getNextToken();  // eat binop
   1163 
   1164         // Parse the unary expression after the binary operator.
   1165         ExprAST *RHS = ParseUnary();
   1166         if (!RHS) return 0;
   1167 
   1168         // If BinOp binds less tightly with RHS than the operator after RHS, let
   1169         // the pending operator take RHS as its LHS.
   1170         int NextPrec = GetTokPrecedence();
   1171         if (TokPrec < NextPrec) {
   1172           RHS = ParseBinOpRHS(TokPrec+1, RHS);
   1173           if (RHS == 0) return 0;
   1174         }
   1175 
   1176         // Merge LHS/RHS.
   1177         LHS = new BinaryExprAST(BinOp, LHS, RHS);
   1178       }
   1179     }
   1180 
   1181     /// expression
   1182     ///   ::= unary binoprhs
   1183     ///
   1184     static ExprAST *ParseExpression() {
   1185       ExprAST *LHS = ParseUnary();
   1186       if (!LHS) return 0;
   1187 
   1188       return ParseBinOpRHS(0, LHS);
   1189     }
   1190 
   1191     /// prototype
   1192     ///   ::= id '(' id* ')'
   1193     ///   ::= binary LETTER number? (id, id)
   1194     ///   ::= unary LETTER (id)
   1195     static PrototypeAST *ParsePrototype() {
   1196       std::string FnName;
   1197 
   1198       unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
   1199       unsigned BinaryPrecedence = 30;
   1200 
   1201       switch (CurTok) {
   1202       default:
   1203         return ErrorP("Expected function name in prototype");
   1204       case tok_identifier:
   1205         FnName = IdentifierStr;
   1206         Kind = 0;
   1207         getNextToken();
   1208         break;
   1209       case tok_unary:
   1210         getNextToken();
   1211         if (!isascii(CurTok))
   1212           return ErrorP("Expected unary operator");
   1213         FnName = "unary";
   1214         FnName += (char)CurTok;
   1215         Kind = 1;
   1216         getNextToken();
   1217         break;
   1218       case tok_binary:
   1219         getNextToken();
   1220         if (!isascii(CurTok))
   1221           return ErrorP("Expected binary operator");
   1222         FnName = "binary";
   1223         FnName += (char)CurTok;
   1224         Kind = 2;
   1225         getNextToken();
   1226 
   1227         // Read the precedence if present.
   1228         if (CurTok == tok_number) {
   1229           if (NumVal < 1 || NumVal > 100)
   1230             return ErrorP("Invalid precedecnce: must be 1..100");
   1231           BinaryPrecedence = (unsigned)NumVal;
   1232           getNextToken();
   1233         }
   1234         break;
   1235       }
   1236 
   1237       if (CurTok != '(')
   1238         return ErrorP("Expected '(' in prototype");
   1239 
   1240       std::vector<std::string> ArgNames;
   1241       while (getNextToken() == tok_identifier)
   1242         ArgNames.push_back(IdentifierStr);
   1243       if (CurTok != ')')
   1244         return ErrorP("Expected ')' in prototype");
   1245 
   1246       // success.
   1247       getNextToken();  // eat ')'.
   1248 
   1249       // Verify right number of names for operator.
   1250       if (Kind && ArgNames.size() != Kind)
   1251         return ErrorP("Invalid number of operands for operator");
   1252 
   1253       return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
   1254     }
   1255 
   1256     /// definition ::= 'def' prototype expression
   1257     static FunctionAST *ParseDefinition() {
   1258       getNextToken();  // eat def.
   1259       PrototypeAST *Proto = ParsePrototype();
   1260       if (Proto == 0) return 0;
   1261 
   1262       if (ExprAST *E = ParseExpression())
   1263         return new FunctionAST(Proto, E);
   1264       return 0;
   1265     }
   1266 
   1267     /// toplevelexpr ::= expression
   1268     static FunctionAST *ParseTopLevelExpr() {
   1269       if (ExprAST *E = ParseExpression()) {
   1270         // Make an anonymous proto.
   1271         PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
   1272         return new FunctionAST(Proto, E);
   1273       }
   1274       return 0;
   1275     }
   1276 
   1277     /// external ::= 'extern' prototype
   1278     static PrototypeAST *ParseExtern() {
   1279       getNextToken();  // eat extern.
   1280       return ParsePrototype();
   1281     }
   1282 
   1283     //===----------------------------------------------------------------------===//
   1284     // Code Generation
   1285     //===----------------------------------------------------------------------===//
   1286 
   1287     static Module *TheModule;
   1288     static IRBuilder<> Builder(getGlobalContext());
   1289     static std::map<std::string, Value*> NamedValues;
   1290     static FunctionPassManager *TheFPM;
   1291 
   1292     Value *ErrorV(const char *Str) { Error(Str); return 0; }
   1293 
   1294     Value *NumberExprAST::Codegen() {
   1295       return ConstantFP::get(getGlobalContext(), APFloat(Val));
   1296     }
   1297 
   1298     Value *VariableExprAST::Codegen() {
   1299       // Look this variable up in the function.
   1300       Value *V = NamedValues[Name];
   1301       return V ? V : ErrorV("Unknown variable name");
   1302     }
   1303 
   1304     Value *UnaryExprAST::Codegen() {
   1305       Value *OperandV = Operand->Codegen();
   1306       if (OperandV == 0) return 0;
   1307 
   1308       Function *F = TheModule->getFunction(std::string("unary")+Opcode);
   1309       if (F == 0)
   1310         return ErrorV("Unknown unary operator");
   1311 
   1312       return Builder.CreateCall(F, OperandV, "unop");
   1313     }
   1314 
   1315     Value *BinaryExprAST::Codegen() {
   1316       Value *L = LHS->Codegen();
   1317       Value *R = RHS->Codegen();
   1318       if (L == 0 || R == 0) return 0;
   1319 
   1320       switch (Op) {
   1321       case '+': return Builder.CreateFAdd(L, R, "addtmp");
   1322       case '-': return Builder.CreateFSub(L, R, "subtmp");
   1323       case '*': return Builder.CreateFMul(L, R, "multmp");
   1324       case '<':
   1325         L = Builder.CreateFCmpULT(L, R, "cmptmp");
   1326         // Convert bool 0/1 to double 0.0 or 1.0
   1327         return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
   1328                                     "booltmp");
   1329       default: break;
   1330       }
   1331 
   1332       // If it wasn't a builtin binary operator, it must be a user defined one. Emit
   1333       // a call to it.
   1334       Function *F = TheModule->getFunction(std::string("binary")+Op);
   1335       assert(F && "binary operator not found!");
   1336 
   1337       Value *Ops[2] = { L, R };
   1338       return Builder.CreateCall(F, Ops, "binop");
   1339     }
   1340 
   1341     Value *CallExprAST::Codegen() {
   1342       // Look up the name in the global module table.
   1343       Function *CalleeF = TheModule->getFunction(Callee);
   1344       if (CalleeF == 0)
   1345         return ErrorV("Unknown function referenced");
   1346 
   1347       // If argument mismatch error.
   1348       if (CalleeF->arg_size() != Args.size())
   1349         return ErrorV("Incorrect # arguments passed");
   1350 
   1351       std::vector<Value*> ArgsV;
   1352       for (unsigned i = 0, e = Args.size(); i != e; ++i) {
   1353         ArgsV.push_back(Args[i]->Codegen());
   1354         if (ArgsV.back() == 0) return 0;
   1355       }
   1356 
   1357       return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
   1358     }
   1359 
   1360     Value *IfExprAST::Codegen() {
   1361       Value *CondV = Cond->Codegen();
   1362       if (CondV == 0) return 0;
   1363 
   1364       // Convert condition to a bool by comparing equal to 0.0.
   1365       CondV = Builder.CreateFCmpONE(CondV,
   1366                                   ConstantFP::get(getGlobalContext(), APFloat(0.0)),
   1367                                     "ifcond");
   1368 
   1369       Function *TheFunction = Builder.GetInsertBlock()->getParent();
   1370 
   1371       // Create blocks for the then and else cases.  Insert the 'then' block at the
   1372       // end of the function.
   1373       BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
   1374       BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
   1375       BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
   1376 
   1377       Builder.CreateCondBr(CondV, ThenBB, ElseBB);
   1378 
   1379       // Emit then value.
   1380       Builder.SetInsertPoint(ThenBB);
   1381 
   1382       Value *ThenV = Then->Codegen();
   1383       if (ThenV == 0) return 0;
   1384 
   1385       Builder.CreateBr(MergeBB);
   1386       // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
   1387       ThenBB = Builder.GetInsertBlock();
   1388 
   1389       // Emit else block.
   1390       TheFunction->getBasicBlockList().push_back(ElseBB);
   1391       Builder.SetInsertPoint(ElseBB);
   1392 
   1393       Value *ElseV = Else->Codegen();
   1394       if (ElseV == 0) return 0;
   1395 
   1396       Builder.CreateBr(MergeBB);
   1397       // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
   1398       ElseBB = Builder.GetInsertBlock();
   1399 
   1400       // Emit merge block.
   1401       TheFunction->getBasicBlockList().push_back(MergeBB);
   1402       Builder.SetInsertPoint(MergeBB);
   1403       PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
   1404                                       "iftmp");
   1405 
   1406       PN->addIncoming(ThenV, ThenBB);
   1407       PN->addIncoming(ElseV, ElseBB);
   1408       return PN;
   1409     }
   1410 
   1411     Value *ForExprAST::Codegen() {
   1412       // Output this as:
   1413       //   ...
   1414       //   start = startexpr
   1415       //   goto loop
   1416       // loop:
   1417       //   variable = phi [start, loopheader], [nextvariable, loopend]
   1418       //   ...
   1419       //   bodyexpr
   1420       //   ...
   1421       // loopend:
   1422       //   step = stepexpr
   1423       //   nextvariable = variable + step
   1424       //   endcond = endexpr
   1425       //   br endcond, loop, endloop
   1426       // outloop:
   1427 
   1428       // Emit the start code first, without 'variable' in scope.
   1429       Value *StartVal = Start->Codegen();
   1430       if (StartVal == 0) return 0;
   1431 
   1432       // Make the new basic block for the loop header, inserting after current
   1433       // block.
   1434       Function *TheFunction = Builder.GetInsertBlock()->getParent();
   1435       BasicBlock *PreheaderBB = Builder.GetInsertBlock();
   1436       BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
   1437 
   1438       // Insert an explicit fall through from the current block to the LoopBB.
   1439       Builder.CreateBr(LoopBB);
   1440 
   1441       // Start insertion in LoopBB.
   1442       Builder.SetInsertPoint(LoopBB);
   1443 
   1444       // Start the PHI node with an entry for Start.
   1445       PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
   1446       Variable->addIncoming(StartVal, PreheaderBB);
   1447 
   1448       // Within the loop, the variable is defined equal to the PHI node.  If it
   1449       // shadows an existing variable, we have to restore it, so save it now.
   1450       Value *OldVal = NamedValues[VarName];
   1451       NamedValues[VarName] = Variable;
   1452 
   1453       // Emit the body of the loop.  This, like any other expr, can change the
   1454       // current BB.  Note that we ignore the value computed by the body, but don't
   1455       // allow an error.
   1456       if (Body->Codegen() == 0)
   1457         return 0;
   1458 
   1459       // Emit the step value.
   1460       Value *StepVal;
   1461       if (Step) {
   1462         StepVal = Step->Codegen();
   1463         if (StepVal == 0) return 0;
   1464       } else {
   1465         // If not specified, use 1.0.
   1466         StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
   1467       }
   1468 
   1469       Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
   1470 
   1471       // Compute the end condition.
   1472       Value *EndCond = End->Codegen();
   1473       if (EndCond == 0) return EndCond;
   1474 
   1475       // Convert condition to a bool by comparing equal to 0.0.
   1476       EndCond = Builder.CreateFCmpONE(EndCond,
   1477                                   ConstantFP::get(getGlobalContext(), APFloat(0.0)),
   1478                                       "loopcond");
   1479 
   1480       // Create the "after loop" block and insert it.
   1481       BasicBlock *LoopEndBB = Builder.GetInsertBlock();
   1482       BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
   1483 
   1484       // Insert the conditional branch into the end of LoopEndBB.
   1485       Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
   1486 
   1487       // Any new code will be inserted in AfterBB.
   1488       Builder.SetInsertPoint(AfterBB);
   1489 
   1490       // Add a new entry to the PHI node for the backedge.
   1491       Variable->addIncoming(NextVar, LoopEndBB);
   1492 
   1493       // Restore the unshadowed variable.
   1494       if (OldVal)
   1495         NamedValues[VarName] = OldVal;
   1496       else
   1497         NamedValues.erase(VarName);
   1498 
   1499 
   1500       // for expr always returns 0.0.
   1501       return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
   1502     }
   1503 
   1504     Function *PrototypeAST::Codegen() {
   1505       // Make the function type:  double(double,double) etc.
   1506       std::vector<Type*> Doubles(Args.size(),
   1507                                  Type::getDoubleTy(getGlobalContext()));
   1508       FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
   1509                                            Doubles, false);
   1510 
   1511       Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
   1512 
   1513       // If F conflicted, there was already something named 'Name'.  If it has a
   1514       // body, don't allow redefinition or reextern.
   1515       if (F->getName() != Name) {
   1516         // Delete the one we just made and get the existing one.
   1517         F->eraseFromParent();
   1518         F = TheModule->getFunction(Name);
   1519 
   1520         // If F already has a body, reject this.
   1521         if (!F->empty()) {
   1522           ErrorF("redefinition of function");
   1523           return 0;
   1524         }
   1525 
   1526         // If F took a different number of args, reject.
   1527         if (F->arg_size() != Args.size()) {
   1528           ErrorF("redefinition of function with different # args");
   1529           return 0;
   1530         }
   1531       }
   1532 
   1533       // Set names for all arguments.
   1534       unsigned Idx = 0;
   1535       for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
   1536            ++AI, ++Idx) {
   1537         AI->setName(Args[Idx]);
   1538 
   1539         // Add arguments to variable symbol table.
   1540         NamedValues[Args[Idx]] = AI;
   1541       }
   1542 
   1543       return F;
   1544     }
   1545 
   1546     Function *FunctionAST::Codegen() {
   1547       NamedValues.clear();
   1548 
   1549       Function *TheFunction = Proto->Codegen();
   1550       if (TheFunction == 0)
   1551         return 0;
   1552 
   1553       // If this is an operator, install it.
   1554       if (Proto->isBinaryOp())
   1555         BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
   1556 
   1557       // Create a new basic block to start insertion into.
   1558       BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
   1559       Builder.SetInsertPoint(BB);
   1560 
   1561       if (Value *RetVal = Body->Codegen()) {
   1562         // Finish off the function.
   1563         Builder.CreateRet(RetVal);
   1564 
   1565         // Validate the generated code, checking for consistency.
   1566         verifyFunction(*TheFunction);
   1567 
   1568         // Optimize the function.
   1569         TheFPM->run(*TheFunction);
   1570 
   1571         return TheFunction;
   1572       }
   1573 
   1574       // Error reading body, remove function.
   1575       TheFunction->eraseFromParent();
   1576 
   1577       if (Proto->isBinaryOp())
   1578         BinopPrecedence.erase(Proto->getOperatorName());
   1579       return 0;
   1580     }
   1581 
   1582     //===----------------------------------------------------------------------===//
   1583     // Top-Level parsing and JIT Driver
   1584     //===----------------------------------------------------------------------===//
   1585 
   1586     static ExecutionEngine *TheExecutionEngine;
   1587 
   1588     static void HandleDefinition() {
   1589       if (FunctionAST *F = ParseDefinition()) {
   1590         if (Function *LF = F->Codegen()) {
   1591           fprintf(stderr, "Read function definition:");
   1592           LF->dump();
   1593         }
   1594       } else {
   1595         // Skip token for error recovery.
   1596         getNextToken();
   1597       }
   1598     }
   1599 
   1600     static void HandleExtern() {
   1601       if (PrototypeAST *P = ParseExtern()) {
   1602         if (Function *F = P->Codegen()) {
   1603           fprintf(stderr, "Read extern: ");
   1604           F->dump();
   1605         }
   1606       } else {
   1607         // Skip token for error recovery.
   1608         getNextToken();
   1609       }
   1610     }
   1611 
   1612     static void HandleTopLevelExpression() {
   1613       // Evaluate a top-level expression into an anonymous function.
   1614       if (FunctionAST *F = ParseTopLevelExpr()) {
   1615         if (Function *LF = F->Codegen()) {
   1616           // JIT the function, returning a function pointer.
   1617           void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
   1618 
   1619           // Cast it to the right type (takes no arguments, returns a double) so we
   1620           // can call it as a native function.
   1621           double (*FP)() = (double (*)())(intptr_t)FPtr;
   1622           fprintf(stderr, "Evaluated to %f\n", FP());
   1623         }
   1624       } else {
   1625         // Skip token for error recovery.
   1626         getNextToken();
   1627       }
   1628     }
   1629 
   1630     /// top ::= definition | external | expression | ';'
   1631     static void MainLoop() {
   1632       while (1) {
   1633         fprintf(stderr, "ready> ");
   1634         switch (CurTok) {
   1635         case tok_eof:    return;
   1636         case ';':        getNextToken(); break;  // ignore top-level semicolons.
   1637         case tok_def:    HandleDefinition(); break;
   1638         case tok_extern: HandleExtern(); break;
   1639         default:         HandleTopLevelExpression(); break;
   1640         }
   1641       }
   1642     }
   1643 
   1644     //===----------------------------------------------------------------------===//
   1645     // "Library" functions that can be "extern'd" from user code.
   1646     //===----------------------------------------------------------------------===//
   1647 
   1648     /// putchard - putchar that takes a double and returns 0.
   1649     extern "C"
   1650     double putchard(double X) {
   1651       putchar((char)X);
   1652       return 0;
   1653     }
   1654 
   1655     /// printd - printf that takes a double prints it as "%f\n", returning 0.
   1656     extern "C"
   1657     double printd(double X) {
   1658       printf("%f\n", X);
   1659       return 0;
   1660     }
   1661 
   1662     //===----------------------------------------------------------------------===//
   1663     // Main driver code.
   1664     //===----------------------------------------------------------------------===//
   1665 
   1666     int main() {
   1667       InitializeNativeTarget();
   1668       LLVMContext &Context = getGlobalContext();
   1669 
   1670       // Install standard binary operators.
   1671       // 1 is lowest precedence.
   1672       BinopPrecedence['<'] = 10;
   1673       BinopPrecedence['+'] = 20;
   1674       BinopPrecedence['-'] = 20;
   1675       BinopPrecedence['*'] = 40;  // highest.
   1676 
   1677       // Prime the first token.
   1678       fprintf(stderr, "ready> ");
   1679       getNextToken();
   1680 
   1681       // Make the module, which holds all the code.
   1682       TheModule = new Module("my cool jit", Context);
   1683 
   1684       // Create the JIT.  This takes ownership of the module.
   1685       std::string ErrStr;
   1686       TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
   1687       if (!TheExecutionEngine) {
   1688         fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
   1689         exit(1);
   1690       }
   1691 
   1692       FunctionPassManager OurFPM(TheModule);
   1693 
   1694       // Set up the optimizer pipeline.  Start with registering info about how the
   1695       // target lays out data structures.
   1696       OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
   1697       // Provide basic AliasAnalysis support for GVN.
   1698       OurFPM.add(createBasicAliasAnalysisPass());
   1699       // Do simple "peephole" optimizations and bit-twiddling optzns.
   1700       OurFPM.add(createInstructionCombiningPass());
   1701       // Reassociate expressions.
   1702       OurFPM.add(createReassociatePass());
   1703       // Eliminate Common SubExpressions.
   1704       OurFPM.add(createGVNPass());
   1705       // Simplify the control flow graph (deleting unreachable blocks, etc).
   1706       OurFPM.add(createCFGSimplificationPass());
   1707 
   1708       OurFPM.doInitialization();
   1709 
   1710       // Set the global so the code gen can use this.
   1711       TheFPM = &OurFPM;
   1712 
   1713       // Run the main "interpreter loop" now.
   1714       MainLoop();
   1715 
   1716       TheFPM = 0;
   1717 
   1718       // Print out all of the generated code.
   1719       TheModule->dump();
   1720 
   1721       return 0;
   1722     }
   1723 
   1724 `Next: Extending the language: mutable variables / SSA
   1725 construction <LangImpl7.html>`_
   1726 
   1727