===============================
MCJIT Design and Implementation
===============================

Introduction
============

This document describes the internal workings of the MCJIT execution
engine and the RuntimeDyld component.  It is intended as a high-level
overview of the implementation, showing the flow and interactions of
objects throughout the code generation and dynamic loading process.

Engine Creation
===============

In most cases, an EngineBuilder object is used to create an instance of
the MCJIT execution engine.  The EngineBuilder takes an llvm::Module
object as an argument to its constructor.  The client may then set various
options that will later be passed along to the MCJIT engine,
including the selection of MCJIT as the engine type to be created.
Of particular interest is the EngineBuilder::setMCJITMemoryManager
function.  If the client does not explicitly create a memory manager at
this time, a default memory manager (specifically SectionMemoryManager)
will be created when the MCJIT engine is instantiated.

Once the options have been set, a client calls EngineBuilder::create to
create an instance of the MCJIT engine.  If the client does not use the
form of this function that takes a TargetMachine as a parameter, a new
TargetMachine will be created based on the target triple associated with
the Module that was used to create the EngineBuilder.

.. image:: MCJIT-engine-builder.png

EngineBuilder::create will call the static MCJIT::createJIT function,
passing in its pointers to the module, memory manager and target machine
objects, all of which will subsequently be owned by the MCJIT object.

The MCJIT class has a member variable, Dyld, which contains an instance of
the RuntimeDyld wrapper class.  This member will be used for
communications between MCJIT and the actual RuntimeDyldImpl object that
gets created when an object is loaded.

.. image:: MCJIT-creation.png

Upon creation, MCJIT holds a pointer to the Module object that it received
from EngineBuilder, but it does not immediately generate code for this
module.  Code generation is deferred until either the
MCJIT::finalizeObject method is called explicitly or a function that
requires the code to have been generated, such as
MCJIT::getPointerToFunction, is called.
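
The deferral described above can be pictured with a small, self-contained
C++ model.  This is only a sketch under stated assumptions: ToyEngine and
all of its members are hypothetical stand-ins rather than LLVM classes, and
the address it returns is a placeholder.  It shows that either trigger
causes generation and that generation happens at most once.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for MCJIT; this is not an LLVM class.
class ToyEngine {
public:
  explicit ToyEngine(std::string moduleName)
      : ModuleName(std::move(moduleName)) {}

  // Explicit trigger, analogous to MCJIT::finalizeObject.
  void finalizeObject() { generateIfNeeded(); }

  // Implicit trigger, analogous to MCJIT::getPointerToFunction: an
  // address cannot be handed out until code exists.
  std::uint64_t getPointerToFunction(const std::string &name) {
    generateIfNeeded();
    // Placeholder value; a real engine would return the address of the
    // loaded function.
    return 0x1000 + name.size();
  }

  const std::string &moduleName() const { return ModuleName; }
  int generationCount() const { return Generations; }

private:
  void generateIfNeeded() {
    if (Generated)
      return;       // code generation runs at most once
    ++Generations;  // stands in for emitting and loading the object
    Generated = true;
  }

  std::string ModuleName;
  bool Generated = false;
  int Generations = 0;
};
```

Constructing a ToyEngine does no work.  The first call to either method
performs the (simulated) generation, and later calls reuse the result.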

Code Generation
===============

When code generation is triggered, as described above, MCJIT will first
attempt to retrieve an object image from its ObjectCache member, if one
has been set.  If a cached object image cannot be retrieved, MCJIT will
call its emitObject method.  MCJIT::emitObject uses a local PassManager
instance and creates a new ObjectBufferStream instance, both of which it
passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
on the Module with which it was created.

.. image:: MCJIT-load.png

The PassManager::run call causes the MC code generation mechanisms to emit
a complete relocatable binary object image (in either ELF or MachO
format, depending on the target) into the ObjectBufferStream object, which
is flushed to complete the process.  If an ObjectCache is being used, the
image will be passed to the ObjectCache here.
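
The cache-then-emit flow described above can be sketched as follows.  This
is a simplified model, not the MCJIT source: ToyObjectCache, emitObject and
obtainObject are hypothetical names, and the "object image" is just a
string.  The point is the ordering: consult the cache if one is set, emit
only on a miss, then hand the fresh image to the cache.

```cpp
#include <map>
#include <string>

typedef std::string ObjectBytes;  // stands in for a raw object image

// Hypothetical cache keyed by a module identifier.
class ToyObjectCache {
public:
  // Returns true and fills 'out' if an image is cached for moduleId.
  bool getObject(const std::string &moduleId, ObjectBytes &out) const {
    auto it = Store.find(moduleId);
    if (it == Store.end())
      return false;
    out = it->second;
    return true;
  }

  // Called after a fresh image has been emitted.
  void notifyObjectCompiled(const std::string &moduleId,
                            const ObjectBytes &image) {
    Store[moduleId] = image;
  }

private:
  std::map<std::string, ObjectBytes> Store;
};

struct LoadResult {
  ObjectBytes Image;
  bool FromCache;
};

// Stand-in for running the code generation pipeline on a module.
ObjectBytes emitObject(const std::string &moduleId) {
  return "OBJ:" + moduleId;
}

// 'cache' may be null, mirroring "if one has been set".
LoadResult obtainObject(const std::string &moduleId, ToyObjectCache *cache) {
  ObjectBytes image;
  if (cache && cache->getObject(moduleId, image))
    return {image, true};
  image = emitObject(moduleId);
  if (cache)
    cache->notifyObjectCompiled(moduleId, image);
  return {image, false};
}
```

With a cache installed, the first request for a module emits and stores the
image, and a second request for the same module is served from the cache.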

At this point, the ObjectBufferStream contains the raw object image.
Before the code can be executed, the code and data sections from this
image must be loaded into suitable memory, relocations must be applied, and
memory permissions and code cache invalidation (if required) must be
completed.

Object Loading
==============

Once an object image has been obtained, either through code generation or
by retrieval from an ObjectCache, it is passed to RuntimeDyld to
be loaded.  The RuntimeDyld wrapper class examines the object to determine
its file format and creates an instance of either RuntimeDyldELF or
RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
class) and calls the RuntimeDyldImpl::loadObject method to perform the
actual loading.
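
As an illustration of what "examining the object to determine its file
format" can involve, the sketch below classifies a buffer by its leading
magic bytes.  It is an assumption-laden example, not the check RuntimeDyld
actually performs: detectFormat and ObjectFormat are hypothetical names, and
only the standard ELF magic and the thin (non-fat) Mach-O magic values are
recognized.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class ObjectFormat { ELF, MachO, Unknown };

// Classifies a raw object image by its first four bytes.
ObjectFormat detectFormat(const unsigned char *data, std::size_t size) {
  if (size < 4)
    return ObjectFormat::Unknown;

  // ELF files begin with 0x7F 'E' 'L' 'F'.
  static const unsigned char ElfMagic[4] = {0x7F, 'E', 'L', 'F'};
  if (std::memcmp(data, ElfMagic, 4) == 0)
    return ObjectFormat::ELF;

  // Read the first word as big-endian, then accept the 32-bit and 64-bit
  // Mach-O magic numbers in either byte order.
  std::uint32_t magic = (static_cast<std::uint32_t>(data[0]) << 24) |
                        (static_cast<std::uint32_t>(data[1]) << 16) |
                        (static_cast<std::uint32_t>(data[2]) << 8) |
                        static_cast<std::uint32_t>(data[3]);
  if (magic == 0xFEEDFACEu || magic == 0xFEEDFACFu ||
      magic == 0xCEFAEDFEu || magic == 0xCFFAEDFEu)
    return ObjectFormat::MachO;

  return ObjectFormat::Unknown;
}
```

A dispatcher built on such a check would then construct the matching
format-specific loader, in the way the wrapper class chooses between
RuntimeDyldELF and RuntimeDyldMachO.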

.. image:: MCJIT-dyld-load.png

RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
from the ObjectBuffer it received.  ObjectImage, which wraps the
ObjectFile class, is a helper class which parses the binary object image
and provides access to the information contained in the format-specific
headers, including section, symbol and relocation information.

RuntimeDyldImpl::loadObject then iterates through the symbols in the
image.  Information about common symbols is collected for later use.  For
each function or data symbol, the associated section is loaded into memory
and the symbol is stored in a symbol table map data structure.  When the
iteration is complete, a section is emitted for the common symbols.

Next, RuntimeDyldImpl::loadObject iterates through the sections in the
object image and for each section iterates through the relocations for
that section.  For each relocation, it calls the format-specific
processRelocationRef method, which will examine the relocation and store
it in one of two data structures, a section-based relocation list map and
an external symbol relocation map.
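
The two relocation stores described above can be modeled with ordinary
maps.  The sketch below is a simplification with hypothetical names
(RelocationEntry, RelocationTables, recordRelocation); the real
processRelocationRef implementations are format-specific and also handle
details such as symbol offsets and stubs.  A relocation whose target symbol
is defined in a loaded section is filed under that section's ID, and
anything else is filed under the symbol's name for later lookup.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Where a relocation must be applied.
struct RelocationEntry {
  unsigned SectionID;    // section containing the bytes to patch
  std::uint64_t Offset;  // offset of the patch site within that section
  std::int64_t Addend;   // constant added to the target address
};

struct RelocationTables {
  // Keyed by the ID of the section that holds the referenced symbol.
  std::map<unsigned, std::vector<RelocationEntry>> SectionRelocs;
  // Keyed by the name of a symbol not defined in this object.
  std::map<std::string, std::vector<RelocationEntry>> ExternalRelocs;
};

// 'localSymbols' maps each locally defined symbol to its section ID.
void recordRelocation(RelocationTables &tables,
                      const std::map<std::string, unsigned> &localSymbols,
                      const std::string &targetSymbol,
                      const RelocationEntry &entry) {
  auto it = localSymbols.find(targetSymbol);
  if (it != localSymbols.end())
    tables.SectionRelocs[it->second].push_back(entry);
  else
    tables.ExternalRelocs[targetSymbol].push_back(entry);
}
```

Keeping external references separate means they can be resolved by name
once, after which every entry in the corresponding list is patched with the
same address.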

.. image:: MCJIT-load-object.png

When RuntimeDyldImpl::loadObject returns, all of the code and data
sections for the object will have been loaded into memory allocated by the
memory manager and relocation information will have been prepared, but the
relocations have not yet been applied and the generated code is still not
ready to be executed.

[Currently (as of August 2013) the MCJIT engine will immediately apply
relocations when loadObject completes.  However, this shouldn't be
happening.  Because the code may have been generated for a remote target,
the client should be given a chance to re-map the section addresses before
relocations are applied.  It is possible to apply relocations multiple
times, but in the case where addresses are to be re-mapped, this first
application is wasted effort.]

Address Remapping
=================

At any time after initial code has been generated and before
finalizeObject is called, the client can remap the addresses of sections in
the object.  Typically this is done because the code was generated for an
external process and is being mapped into that process's address space.
The client remaps the section address by calling MCJIT::mapSectionAddress.
This should happen before the section memory is copied to its new
location.

When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
RuntimeDyldImpl (via its Dyld member).  RuntimeDyldImpl stores the new
address in an internal data structure but does not update the code at this
time, since other sections are likely to change.
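
The bookkeeping involved in remapping can be sketched as below.  ToyDyld and
SectionAddresses are hypothetical names, and addresses are plain integers
rather than real pointers.  The model only records the new target address
and deliberately touches no code bytes, matching the description that the
code is not updated at this time.

```cpp
#include <cstdint>
#include <vector>

struct SectionAddresses {
  std::uint64_t HostAddress;  // where the section bytes live locally
  std::uint64_t LoadAddress;  // where the section is expected to execute
};

// Hypothetical model of the loader's section address table.
class ToyDyld {
public:
  // A freshly loaded section is assumed to run where it was loaded.
  unsigned addSection(std::uint64_t hostAddress) {
    Sections.push_back({hostAddress, hostAddress});
    return static_cast<unsigned>(Sections.size() - 1);
  }

  // Records a new target address for the section found at hostAddress.
  // Returns false if no such section is known.  No relocation is applied.
  bool mapSectionAddress(std::uint64_t hostAddress,
                         std::uint64_t targetAddress) {
    for (auto &S : Sections) {
      if (S.HostAddress == hostAddress) {
        S.LoadAddress = targetAddress;
        return true;
      }
    }
    return false;
  }

  std::uint64_t loadAddress(unsigned id) const {
    return Sections.at(id).LoadAddress;
  }

private:
  std::vector<SectionAddresses> Sections;
};
```

Because only the table changes, a client can remap several sections in any
order and pay for relocation processing once, when finalization runs.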

When the client is finished remapping section addresses, it will call
MCJIT::finalizeObject to complete the remapping process.

Final Preparations
==================

When MCJIT::finalizeObject is called, MCJIT calls
RuntimeDyld::resolveRelocations.  This function will attempt to locate any
external symbols and then apply all relocations for the object.

External symbols are resolved by calling the memory manager's
getPointerToNamedFunction method.  The memory manager will return the
address of the requested symbol in the target address space.  (Note, this
may not be a valid pointer in the host process.)  RuntimeDyld will then
iterate through the list of relocations it has stored which are associated
with this symbol and invoke the resolveRelocation method which, through a
format-specific implementation, will apply the relocation to the loaded
section memory.

Next, RuntimeDyld::resolveRelocations iterates through the list of
sections and for each section iterates through a list of relocations that
have been saved which reference that section and calls resolveRelocation
for each entry in this list.  The relocation list here is a list of
relocations for which the symbol associated with the relocation is located
in the section associated with the list.  Each of these relocations will
have a target location at which the relocation will be applied that is
likely located in a different section.
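
To make "applying a relocation" concrete, the sketch below patches a
section's bytes with a resolved address.  It is a toy under stated
assumptions: Section, Relocation, resolveRelocation and readPatched are
hypothetical, and only one relocation kind is modeled, a 64-bit absolute
value (symbol address plus addend) stored little-endian.  Real
implementations support many format- and target-specific relocation types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Section {
  std::vector<unsigned char> Bytes;  // host copy of the section contents
  std::uint64_t LoadAddress;         // address at which it will execute
};

struct Relocation {
  unsigned SectionID;    // section containing the patch site
  std::uint64_t Offset;  // offset of the patch site in that section
  std::int64_t Addend;   // constant added to the symbol address
};

// Writes (symbolAddress + addend) as 8 little-endian bytes at the patch
// site.  symbolAddress is expressed in the target address space.
void resolveRelocation(std::vector<Section> &sections, const Relocation &R,
                       std::uint64_t symbolAddress) {
  Section &S = sections.at(R.SectionID);
  assert(R.Offset + 8 <= S.Bytes.size() && "patch site out of range");
  std::uint64_t value = symbolAddress + static_cast<std::uint64_t>(R.Addend);
  for (int i = 0; i < 8; ++i)
    S.Bytes[R.Offset + i] = static_cast<unsigned char>(value >> (8 * i));
}

// Reads back the 8 little-endian bytes at 'offset' for inspection.
std::uint64_t readPatched(const Section &S, std::uint64_t offset) {
  std::uint64_t value = 0;
  for (int i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(S.Bytes[offset + i]) << (8 * i);
  return value;
}
```

Note that the value written is derived from the load address of the section
holding the symbol, not from its host address, which is why relocations
have to be applied after any remapping.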

.. image:: MCJIT-resolve-relocations.png

Once relocations have been applied as described above, MCJIT calls
RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
passes the section data to the memory manager's registerEHFrames method.
This allows the memory manager to call any desired target-specific
functions, such as registering the EH frame information with a debugger.

Finally, MCJIT calls the memory manager's finalizeMemory method.  In this
method, the memory manager will invalidate the target code cache, if
necessary, and apply final permissions to the memory pages it has
allocated for code and data memory.