=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================

**Warning: This text is currently out of date due to ORC API updates.**

**The example code has been updated and can be used. The text will be updated
once the API churn dies down.**

Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like optimization, lazy compilation and remote
execution.

The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and to teach you how to recombine
them to build a custom JIT that is suited to your use-case.

The structure of the tutorial is:

- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
  introduce some of the basic concepts of the ORC JIT APIs, including the
  idea of an ORC *Layer*.

- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
  a new layer that will optimize IR and generated code.

- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
  Compile-On-Demand layer to lazily compile IR.

- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
  replacing the Compile-On-Demand layer with a custom layer that uses the ORC
  Compile Callbacks API directly to defer IR-generation until functions are
  called.

- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
  a remote process with reduced privileges using the JIT Remote APIs.

To provide input for our JIT we will use the Kaleidoscope REPL from
`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language with LLVM"
tutorial, with one minor modification: we will remove the FunctionPassManager
from the code for that chapter and replace it with optimization support in our
JIT class in Chapter #2.

Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
These tutorials don't assume any experience with these earlier APIs, but
readers acquainted with them will see many familiar elements. Where appropriate
we will make this connection with the earlier APIs explicit to help people who
are transitioning from them to ORC.

JIT API Basics
==============

The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will be:

1. ``Handle addModule(Module &M)`` -- Make the given IR module available for
   execution.
2. ``JITSymbol findSymbol(const std::string &Name)`` -- Search for pointers to
   symbols (functions or variables) that have been added to the JIT.
3. ``void removeModule(Handle H)`` -- Remove a module from the JIT, releasing
   any memory that had been used for the compiled code.

A basic use-case for this API, executing the 'main' function from a module,
will look like the following (getSymbolAddress is a convenience wrapper around
findSymbol that is defined later in this chapter):

.. code-block:: c++

  std::unique_ptr<Module> M = buildModule();
  JIT J;
  Handle H = J.addModule(*M);
  // Look up 'main' and cast its address to a callable function pointer.
  int (*Main)(int, char*[]) = (int(*)(int, char*[]))J.getSymbolAddress("main");
  // Main takes (argc, argv); here we assume the host program's own arguments
  // are in scope and simply forward them.
  int Result = Main(argc, argv);
  J.removeModule(H);

The APIs that we build in these tutorials will all be variations on this simple
theme. Behind the API we will refine the implementation of the JIT to add
support for optimization and lazy compilation. Eventually we will extend the
API itself to allow higher-level program representations (e.g. ASTs) to be
added to the JIT.
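
To make the shape of this API concrete, here is a minimal sketch that uses no
LLVM at all. The names ``ToyJIT``, ``ToyModule`` and ``ToyFn`` are hypothetical,
and a "module" here is just a table of ordinary function pointers rather than
IR that gets compiled. It only illustrates the add, look up, call, remove
lifecycle described above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for compiled code: an ordinary function pointer.
using ToyFn = int (*)();
// Hypothetical stand-in for an IR module: symbol names mapped to "code".
using ToyModule = std::map<std::string, ToyFn>;

class ToyJIT {
public:
  using Handle = std::uint64_t;

  // Make the module's symbols available for lookup and return a handle to it.
  Handle addModule(ToyModule M) {
    Handle H = NextHandle++;
    Modules.emplace(H, std::move(M));
    return H;
  }

  // Search every live module; return nullptr if the name is unknown.
  ToyFn findSymbol(const std::string &Name) const {
    for (const auto &Entry : Modules) {
      auto It = Entry.second.find(Name);
      if (It != Entry.second.end())
        return It->second;
    }
    return nullptr;
  }

  // Drop the module and everything it owned.
  void removeModule(Handle H) { Modules.erase(H); }

private:
  Handle NextHandle = 1;
  std::map<Handle, ToyModule> Modules;
};

inline int answer() { return 42; }

// Add a module, look up 'main', call it, then remove the module again.
inline int runToyJIT() {
  ToyModule M;
  M["main"] = &answer;

  ToyJIT J;
  ToyJIT::Handle H = J.addModule(std::move(M));
  ToyFn Main = J.findSymbol("main");
  assert(Main != nullptr);
  int Result = Main();
  J.removeModule(H);
  // Once the module is gone its symbols are no longer visible.
  assert(J.findSymbol("main") == nullptr);
  return Result;
}
```

Calling ``runToyJIT()`` walks through the whole lifecycle once. The real JIT
differs in every detail, but the handle returned from addModule plays the same
role: it is the only thing the client needs to keep in order to remove the
module later.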

KaleidoscopeJIT
===============

In the previous section we described our API. Now we examine a simple
implementation of it: the KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the findSymbol method of our JIT class to find and execute the code for the
expression, and then use the removeModule method to remove the code again
(since there's no way to re-invoke an anonymous expression). In later chapters
of this tutorial we'll modify the REPL to enable new interactions with our JIT
class, but for now we will take this setup for granted and focus our attention on
the implementation of our JIT itself.

Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:

.. code-block:: c++

  #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
  #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ExecutionEngine/ExecutionEngine.h"
  #include "llvm/ExecutionEngine/JITSymbol.h"
  #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
  #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
  #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/Mangler.h"
  #include "llvm/Support/DynamicLibrary.h"
  #include "llvm/Support/raw_ostream.h"
  #include "llvm/Target/TargetMachine.h"
  #include <algorithm>
  #include <memory>
  #include <string>
  #include <vector>

  namespace llvm {
  namespace orc {

  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;

  public:
    using ModuleHandle = decltype(CompileLayer)::ModuleHandleT;

Our class begins with four members: a TargetMachine, TM, which will be used to
build our LLVM compiler instance; a DataLayout, DL, which will be used for
symbol mangling (more on that later); and two ORC *layers*: ObjectLayer, an
RTDyldObjectLinkingLayer, and CompileLayer, an IRCompileLayer. We'll be talking
more about layers in the next chapter, but for now you can think of them as
analogous to LLVM Passes: they wrap up useful JIT utilities behind an
easy-to-compose interface. The first layer, ObjectLayer, is the foundation of
our JIT: it takes in-memory object files produced by a compiler and links them
on the fly to make them executable. This JIT-on-top-of-a-linker design was
introduced in MCJIT; however, the linker was hidden inside the MCJIT class. In
ORC we expose the linker so that clients can access and configure it directly
if they need to. In this tutorial our ObjectLayer will just be used to support
the next layer in our stack: the CompileLayer, which will be responsible for
taking LLVM IR, compiling it, and passing the resulting in-memory object files
down to the object linking layer below.
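
The stacking described above can be pictured with a small LLVM-free sketch.
Everything in it is hypothetical: ``ToyObjectLayer`` merely records the
"objects" it is given, and ``ToyCompiler`` "compiles" by wrapping a string. The
point is only the shape, in which an upper layer holds a reference to the layer
below it and forwards its output downward.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bottom layer: accepts "object files" and records them as linked.
class ToyObjectLayer {
public:
  void addObject(std::string Obj) { Linked.push_back(std::move(Obj)); }
  const std::vector<std::string> &linked() const { return Linked; }

private:
  std::vector<std::string> Linked;
};

// Hypothetical upper layer: "compiles" IR with the supplied functor and hands
// the result to whichever layer sits below it.
template <typename BaseLayerT, typename CompileFtor> class ToyCompileLayer {
public:
  ToyCompileLayer(BaseLayerT &Base, CompileFtor C)
      : BaseLayer(Base), Compile(std::move(C)) {}

  void addModule(const std::string &IR) { BaseLayer.addObject(Compile(IR)); }

private:
  BaseLayerT &BaseLayer;
  CompileFtor Compile;
};

// Hypothetical compiler: turns "IR" text into an "object" by wrapping it.
struct ToyCompiler {
  std::string operator()(const std::string &IR) const {
    return "obj(" + IR + ")";
  }
};

// Build the two-layer stack, add one module, and return what reached the
// bottom layer.
inline std::string runToyLayers() {
  ToyObjectLayer ObjectLayer;
  ToyCompileLayer<ToyObjectLayer, ToyCompiler> CompileLayer(ObjectLayer,
                                                            ToyCompiler());
  CompileLayer.addModule("define i32 @main()");
  return ObjectLayer.linked().front();
}
```

Because the upper layer is a template over the layer beneath it, a different
bottom layer or a different compile functor can be swapped in without changing
the upper layer's code.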

That's it for member variables. After that we have a single type alias:
ModuleHandle. This is the handle type that will be returned from our JIT's
addModule method, and can be passed to the removeModule method to remove a
module. The IRCompileLayer class already provides a convenient handle type
(IRCompileLayer::ModuleHandleT), so we just alias our ModuleHandle to this.

.. code-block:: c++

  KaleidoscopeJIT()
      : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
        ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
        CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
    llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
  }

  TargetMachine &getTargetMachine() { return *TM; }

Next up we have our class constructor. We begin by initializing TM using the
EngineBuilder::selectTarget helper method which constructs a TargetMachine for
the current process. Then we use our newly created TargetMachine to initialize
DL, our DataLayout. After that we need to initialize our ObjectLayer. The
ObjectLayer requires a function object that will build a JIT memory manager for
each module that is added (a JIT memory manager manages memory allocations,
memory permissions, and registration of exception handlers for JIT'd code). For
this we use a lambda that returns a SectionMemoryManager, an off-the-shelf
utility that provides all the basic memory management functionality required for
this chapter. Next we initialize our CompileLayer. The CompileLayer needs two
things: (1) A reference to our object layer, and (2) a compiler instance to use
to perform the actual compilation from IR to object files. We use the
off-the-shelf SimpleCompiler instance for now. Finally, in the body of the
constructor, we call the DynamicLibrary::LoadLibraryPermanently method with a
nullptr argument. Normally the LoadLibraryPermanently method is called with the
path of a dynamic library to load, but when passed a null pointer it will 'load'
the host process itself, making its exported symbols available for execution.
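
The "function object that builds a memory manager for each module" idea can be
sketched without LLVM as follows. ``ToyLinkingLayer`` and ``ToyMemoryManager``
are hypothetical names; the sketch only shows that the factory passed to the
constructor is invoked once per added module, so each module ends up with its
own manager.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical memory manager: owns the bytes backing one module's code.
struct ToyMemoryManager {
  std::vector<char> Buffer;
};

class ToyLinkingLayer {
public:
  using MemMgrFactory = std::function<std::shared_ptr<ToyMemoryManager>()>;

  explicit ToyLinkingLayer(MemMgrFactory F) : GetMemMgr(std::move(F)) {}

  // Called once per module: ask the factory for a fresh manager and keep it
  // alive for as long as the layer exists.
  std::shared_ptr<ToyMemoryManager> addModule() {
    Managers.push_back(GetMemMgr());
    return Managers.back();
  }

private:
  MemMgrFactory GetMemMgr;
  std::vector<std::shared_ptr<ToyMemoryManager>> Managers;
};

// Returns true if two modules added to the same layer get distinct managers.
inline bool modulesGetDistinctManagers() {
  ToyLinkingLayer Layer([] { return std::make_shared<ToyMemoryManager>(); });
  std::shared_ptr<ToyMemoryManager> A = Layer.addModule();
  std::shared_ptr<ToyMemoryManager> B = Layer.addModule();
  return A != B;
}
```

Passing a factory rather than a single manager leaves the allocation policy to
the client: the lambda could just as well hand back the same shared manager
every time if that suited the use case.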

.. code-block:: c++

  ModuleHandle addModule(std::unique_ptr<Module> M) {
    // Build our symbol resolver:
    // Lambda 1: Look back into the JIT itself to find symbols that are part of
    //           the same "logical dylib".
    // Lambda 2: Search for external symbols in the host process.
    auto Resolver = createLambdaResolver(
        [&](const std::string &Name) {
          if (auto Sym = CompileLayer.findSymbol(Name, false))
            return Sym;
          return JITSymbol(nullptr);
        },
        [](const std::string &Name) {
          if (auto SymAddr =
                RTDyldMemoryManager::getSymbolAddressInProcess(Name))
            return JITSymbol(SymAddr, JITSymbolFlags::Exported);
          return JITSymbol(nullptr);
        });

    // Add the module to the JIT with the resolver we created above. The
    // ObjectLayer creates the memory manager for it via the factory lambda
    // passed to its constructor.
    return cantFail(CompileLayer.addModule(std::move(M),
                                           std::move(Resolver)));
  }

Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution. In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them straight to the CompileLayer, which will immediately
compile them. In later chapters we will teach our JIT to defer compilation
of individual functions until they're actually called.

To add our module to the CompileLayer we need to supply both the module and a
symbol resolver. The symbol resolver is responsible for supplying the JIT with
an address for each *external symbol* in the module we are adding. External
symbols are any symbols not defined within the module itself, including calls to
functions outside the JIT and calls to functions defined in other modules that
have already been added to the JIT. (It may seem as though modules added to the
JIT should know about one another by default, but since we would still have to
supply a symbol resolver for references to code outside the JIT it turns out to
be easier to re-use this one mechanism for all symbol resolution.) This has the
added benefit that the user has full control over the symbol resolution
process. Should we search for definitions within the JIT first, then fall back
on external definitions? Or should we prefer external definitions where
available and only JIT code if we don't already have an available
implementation? By using a single symbol resolution scheme we are free to choose
whatever makes the most sense for any given use case.

Building a symbol resolver is made especially easy by the *createLambdaResolver*
function. This function takes two lambdas [3]_ and returns a JITSymbolResolver
instance. The first lambda is used as the implementation of the resolver's
findSymbolInLogicalDylib method, which searches for symbol definitions that
should be thought of as being part of the same "logical" dynamic library as this
Module. If you are familiar with static linking, this means that
findSymbolInLogicalDylib should expose symbols with common linkage and hidden
visibility. If all this sounds foreign you can ignore the details and just
remember that this is the first method that the linker will use to try to find a
symbol definition. If the findSymbolInLogicalDylib method returns a null result
then the linker will call the second symbol resolver method, called findSymbol,
which searches for symbols that should be thought of as external to (but
visible from) the module and its logical dylib.

In this tutorial we will adopt the following simple scheme: all modules added to
the JIT will behave as if they were linked into a single, ever-growing logical
dylib. To implement this our first lambda (the one defining
findSymbolInLogicalDylib) will just search for JIT'd code by calling the
CompileLayer's findSymbol method. If we don't find a symbol in the JIT itself
we'll fall back to our second lambda, which implements findSymbol. This will use
the RTDyldMemoryManager::getSymbolAddressInProcess method to search for the
symbol within the program itself. If we can't find a symbol definition via
either of these paths, the JIT will refuse to accept our module, returning a
"symbol not found" error.
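
The two-stage lookup described above can be sketched in plain C++ as follows.
This is an illustration only: ``ToyResolver``, ``ToyAddress`` and the two
symbol tables are hypothetical, addresses are made-up numbers, and an address
of 0 stands in for the null result.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

using ToyAddress = std::uint64_t; // 0 means "not found" in this sketch.
using ToyLookup = std::function<ToyAddress(const std::string &)>;
using ToySymbolTable = std::map<std::string, ToyAddress>;

// Hypothetical resolver built from two callables, tried in order.
class ToyResolver {
public:
  ToyResolver(ToyLookup InLogicalDylib, ToyLookup External)
      : FindInLogicalDylib(std::move(InLogicalDylib)),
        FindExternal(std::move(External)) {}

  ToyAddress resolve(const std::string &Name) const {
    // Stage 1: symbols that belong to the same "logical dylib".
    if (ToyAddress Addr = FindInLogicalDylib(Name))
      return Addr;
    // Stage 2: symbols that are external to it, e.g. in the host process.
    return FindExternal(Name);
  }

private:
  ToyLookup FindInLogicalDylib;
  ToyLookup FindExternal;
};

inline ToyAddress lookupIn(const ToySymbolTable &Table,
                           const std::string &Name) {
  auto It = Table.find(Name);
  return It == Table.end() ? 0 : It->second;
}

// Resolve a name against a made-up JIT table first, then a made-up process
// table.
inline ToyAddress resolveDemo(const std::string &Name) {
  const ToySymbolTable JITSymbols = {{"foo", 0x1000}};
  const ToySymbolTable ProcessSymbols = {{"foo", 0x2000}, {"sin", 0x3000}};
  ToyResolver R(
      [&](const std::string &N) { return lookupIn(JITSymbols, N); },
      [&](const std::string &N) { return lookupIn(ProcessSymbols, N); });
  return R.resolve(Name);
}
```

Here ``foo`` is defined in both tables and the JIT's definition wins, ``sin``
is found only by the fallback, and an unknown name yields 0. Swapping the two
callables would implement the opposite policy, preferring external definitions
where they exist.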

Now that we've built our symbol resolver, we're ready to add our module to the
JIT. We do this by calling the CompileLayer's addModule method. The addModule
method returns an ``Expected<CompileLayer::ModuleHandle>``, since in more
advanced JIT configurations it could fail. In our basic configuration we know
that it will always succeed so we use the cantFail utility to assert that no
error occurred, and extract the handle value. Since we have already typedef'd
our ModuleHandle type to be the same as the CompileLayer's handle type, we can
return the unwrapped handle directly.
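
The "assert success, then unwrap" pattern can be sketched with a toy
value-or-error type. ``ToyExpected``, ``toyCantFail`` and ``toyAddModule`` are
hypothetical and far simpler than LLVM's ``Expected`` and ``cantFail``; in
particular this sketch always aborts on failure, which is a simplifying
assumption rather than a description of LLVM's behaviour.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical, simplified value-or-error wrapper.
template <typename T> class ToyExpected {
public:
  // Success: wrap a value.
  ToyExpected(T V) : Value(std::move(V)), HasValue(true) {}

  // Failure: carry a message instead of a value.
  static ToyExpected failure(std::string Msg) {
    ToyExpected E;
    E.Message = std::move(Msg);
    return E;
  }

  explicit operator bool() const { return HasValue; }
  T &get() { return Value; }
  const std::string &message() const { return Message; }

private:
  ToyExpected() : Value(), HasValue(false) {}

  T Value;
  std::string Message;
  bool HasValue;
};

// Unwrap a result that the caller knows cannot have failed; abort otherwise.
template <typename T> T toyCantFail(ToyExpected<T> E) {
  if (!E) {
    std::cerr << "toyCantFail: " << E.message() << "\n";
    std::abort();
  }
  return std::move(E.get());
}

using ToyHandle = int;

// Hypothetical fallible operation returning a made-up handle value.
inline ToyExpected<ToyHandle> toyAddModule(bool Succeed) {
  if (Succeed)
    return ToyExpected<ToyHandle>(7);
  return ToyExpected<ToyHandle>::failure("could not add module");
}
```

A caller that is certain of success writes ``toyCantFail(toyAddModule(true))``
and gets the bare handle back, while a caller that is not certain tests the
result with ``if (!Result)`` and inspects the message instead.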

.. code-block:: c++

  JITSymbol findSymbol(const std::string Name) {
    std::string MangledName;
    raw_string_ostream MangledNameStream(MangledName);
    Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
    return CompileLayer.findSymbol(MangledNameStream.str(), true);
  }

  JITTargetAddress getSymbolAddress(const std::string Name) {
    return cantFail(findSymbol(Name).getAddress());
  }

  void removeModule(ModuleHandle H) {
    cantFail(CompileLayer.removeModule(H));
  }

Now that we can add code to our JIT, we need a way to find the symbols we've
added to it. To do that we call the findSymbol method on our CompileLayer, but
with a twist: we have to *mangle* the name of the symbol we're searching for
first. The ORC JIT components use mangled symbols internally the same way a
static compiler and linker would, rather than using plain IR symbol names. This
allows JIT'd code to interoperate easily with precompiled code in the
application or shared libraries. The kind of mangling will depend on the
DataLayout, which in turn depends on the target platform. To allow us to remain
portable and search based on the un-mangled name, we just reproduce this
mangling ourselves.
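
As a rough illustration of why the mangled and un-mangled names can differ,
here is a deliberately simplified sketch. ``toyMangle`` is a hypothetical
helper that only prepends a global prefix character; the real Mangler handles
more cases than this. The prefixes used in the example are the familiar ones:
an underscore for C symbols on Mach-O platforms and no prefix on ELF platforms.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified mangler: prepend the platform's global symbol
// prefix, where '\0' means "this platform uses no prefix".
inline std::string toyMangle(const std::string &Name, char GlobalPrefix) {
  if (GlobalPrefix == '\0')
    return Name;
  return std::string(1, GlobalPrefix) + Name;
}

// The linker-level name of 'main' under two prefix conventions.
inline std::string machOName() { return toyMangle("main", '_'); }
inline std::string elfName() { return toyMangle("main", '\0'); }
```

A lookup for the plain name ``main`` would miss on a platform where the symbol
is actually stored as ``_main``, which is why findSymbol applies the mangling
before searching.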

Next we have a convenience function, getSymbolAddress, which returns the address
of a given symbol. Like CompileLayer's addModule function, JITSymbol's getAddress
function is allowed to fail [4]_. However, we know that it will not in our simple
example, so we wrap it in a call to cantFail.

We now come to the last method in our JIT API: removeModule. This method is
responsible for destructing the MemoryManager and SymbolResolver that were
added with a given module, freeing any resources they were using in the
process. In our Kaleidoscope demo we rely on this method to remove the module
representing the most recent top-level expression, preventing it from being
treated as a duplicate definition when the next top-level expression is
entered. It is generally good to free any module that you know you won't need
to call further, just to free up the resources dedicated to it. However, you
don't strictly need to do this: all resources will be cleaned up when your
JIT class is destructed, if they haven't been freed before then. Like
``CompileLayer::addModule`` and ``JITSymbol::getAddress``, removeModule may
fail in general but will never fail in our example, so we wrap it in a call to
cantFail.

This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
executable within the context of your JIT process. In the next chapter we'll
look at how to extend this JIT to produce better quality code, and in the
process take a deeper look at the ORC layer concept.

`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_

Full Code Listing
=================

Here is the complete code listing for our running example. To build this
example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
   :language: c++

.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
       simplifying assumption: symbols cannot be re-defined. This will make it
       impossible to re-define symbols in the REPL, but will make our symbol
       lookup logic simpler. Re-introducing support for symbol redefinition is
       left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
       original tutorials will be a helpful reference).

.. [2] +-----------------------------+-----------------------------------------------+
       |         File                |               Reason for inclusion            |
       +=============================+===============================================+
       |      STLExtras.h            | LLVM utilities that are useful when working   |
       |                             | with the STL.                                 |
       +-----------------------------+-----------------------------------------------+
       |   ExecutionEngine.h         | Access to the EngineBuilder::selectTarget     |
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       |                             | Access to the                                 |
       | RTDyldMemoryManager.h       | RTDyldMemoryManager::getSymbolAddressInProcess|
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       |    CompileUtils.h           | Provides the SimpleCompiler class.            |
       +-----------------------------+-----------------------------------------------+
       |   IRCompileLayer.h          | Provides the IRCompileLayer class.            |
       +-----------------------------+-----------------------------------------------+
       |                             | Access to the createLambdaResolver function,  |
       |   LambdaResolver.h          | which provides easy construction of symbol    |
       |                             | resolvers.                                    |
       +-----------------------------+-----------------------------------------------+
       |  RTDyldObjectLinkingLayer.h | Provides the RTDyldObjectLinkingLayer class.  |
       +-----------------------------+-----------------------------------------------+
       |       Mangler.h             | Provides the Mangler class for platform       |
       |                             | specific name-mangling.                       |
       +-----------------------------+-----------------------------------------------+
       |   DynamicLibrary.h          | Provides the DynamicLibrary class, which      |
       |                             | makes symbols in the host process searchable. |
       +-----------------------------+-----------------------------------------------+
       |                             | A fast output stream class. We use the        |
       |     raw_ostream.h           | raw_string_ostream subclass for symbol        |
       |                             | mangling.                                     |
       +-----------------------------+-----------------------------------------------+
       |   TargetMachine.h           | LLVM target machine description class.        |
       +-----------------------------+-----------------------------------------------+

.. [3] Actually they don't have to be lambdas: any object with a call operator
       will do, including plain old functions or std::functions.

.. [4] ``JITSymbol::getAddress`` will force the JIT to compile the definition of
       the symbol if it hasn't already been compiled, and since the compilation
       process could fail, getAddress must be able to return this failure.