<html>
<head>
    <title>Dalvik Porting Guide</title>
</head>

<body>
<h1>Dalvik Porting Guide</h1>

<p>
The Dalvik virtual machine is intended to run on a variety of platforms.
The baseline system is expected to be a variant of UNIX (Linux, BSD, Mac
OS X) running the GNU C compiler.  Little-endian CPUs have been exercised
the most heavily, but big-endian systems are explicitly supported.
</p><p>
There are two general categories of work: porting to a Linux system
with a previously unseen CPU architecture, and porting to a different
operating system.  This document covers the former.
</p><p>
Basic familiarity with the Android platform, source code structure, and
build system is assumed.
</p>


<h2>Core Libraries</h2>

<p>
The native code in the core libraries (chiefly <code>dalvik/libcore</code>,
but also <code>dalvik/vm/native</code>) is written in C/C++ and is expected
to work without modification in a Linux environment.  Much of the code
comes directly from the Apache Harmony project.
</p><p>
The core libraries pull in code from many other projects, including
OpenSSL, zlib, and ICU.  These will also need to be ported before the VM
can be used.
</p>


<h2>JNI Call Bridge</h2>

<p>
Most of the Dalvik VM runtime is written in portable C.  The one
non-portable component of the runtime is the JNI call bridge.  Simply put,
this converts an array of integers into function arguments of various
types, and calls a function.  This must be done according to the C calling
conventions for the platform.  The task could be as simple as pushing all
of the arguments onto the stack, or involve complex rules for register
assignment and stack alignment.
</p><p>
To ease porting to new platforms, the <a href="http://sourceware.org/libffi/">
open-source FFI library</a> (Foreign Function Interface) is used when a
custom bridge is unavailable.  FFI is not as fast as a native implementation,
and the optional performance improvements it does offer are not used, so
writing a replacement is a good first step.
</p><p>
The code lives in <code>dalvik/vm/arch/*</code>, with the FFI-based version
in the "generic" directory.  There are two source files for each architecture.
One defines the call bridge itself:
</p><p><blockquote>
<code>void dvmPlatformInvoke(void* pEnv, ClassObject* clazz, int argInfo,
int argc, const u4* argv, const char* signature, void* func,
JValue* pReturn)</code>
</blockquote></p><p>
This will invoke a C/C++ function declared:
</p><p><blockquote>
    <code>return_type func(JNIEnv* pEnv, Object* this [, <i>args</i>])<br></code>
</blockquote>or (for a "static" method):<blockquote>
    <code>return_type func(JNIEnv* pEnv, ClassObject* clazz [, <i>args</i>])</code>
</blockquote></p><p>
The role of <code>dvmPlatformInvoke</code> is to pass the values in
<code>argv</code> according to the platform's C calling conventions, call
the method, and then place the return value into <code>pReturn</code> (a
union that holds all of the basic JNI types).  The code may use the method
signature (a DEX "shorty" signature, with one character for the return type
and one per argument) to determine how to handle the values.
</p><p>
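As a rough illustration of the signature handling, the sketch below walks
a "shorty" string and counts the 32-bit <code>argv</code> slots that the
declared arguments occupy.  It assumes that <code>J</code> (long) and
<code>D</code> (double) arguments take two slots and everything else takes
one.  The function name <code>count_arg_words</code> is made up for this
example and does not come from the VM sources.
</p>

```c
#include <stddef.h>

/*
 * Count the 32-bit slots used by the arguments named in a shorty
 * signature.  shorty[0] is the return type; each later character
 * describes one argument.
 *
 * Assumption for this sketch: 'J' (long) and 'D' (double) use two
 * slots, every other type uses one.
 */
size_t count_arg_words(const char* shorty)
{
    size_t words = 0;

    if (shorty[0] == '\0')
        return 0;               /* malformed: no return type at all */

    for (const char* p = shorty + 1; *p != '\0'; p++)
        words += (*p == 'J' || *p == 'D') ? 2 : 1;

    return words;
}
```

<p>
For example, the shorty <code>VJI</code> (void return, one long, one int)
yields three words.
</p>

<p>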
The other source file involved here defines a 32-bit "hint".  The hint
is computed when the method's class is loaded, and passed in as the
"argInfo" argument.  The hint can be used to avoid scanning the ASCII
method signature for things like the return value, total argument size,
or inter-argument 64-bit alignment restrictions.
</p>


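<p>
The following sketch shows one way such a hint could be packed and unpacked.
The bit layout, the <code>ReturnCode</code> values, and all of the function
names are invented for this example; the layouts actually used are defined
per architecture under <code>dalvik/vm/arch</code> and may look quite
different.
</p>

```c
#include <stdint.h>

/*
 * Hypothetical hint layout, chosen only for this sketch:
 *   bits 0-15   total argument size, in 32-bit words
 *   bits 16-18  return type code (see ReturnCode)
 */
enum ReturnCode { RET_VOID = 0, RET_32 = 1, RET_64 = 2,
                  RET_FLOAT = 3, RET_DOUBLE = 4 };

static enum ReturnCode return_code(char c)
{
    switch (c) {
    case 'V': return RET_VOID;
    case 'J': return RET_64;
    case 'F': return RET_FLOAT;
    case 'D': return RET_DOUBLE;
    default:  return RET_32;    /* assumes 32-bit ints and references */
    }
}

/* Scan the shorty once, at class load time, and pack the result. */
uint32_t compute_hint(const char* shorty)
{
    uint32_t words = 0;

    if (shorty[0] == '\0')
        return 0;
    for (const char* p = shorty + 1; *p != '\0'; p++)
        words += (*p == 'J' || *p == 'D') ? 2 : 1;

    return ((uint32_t) return_code(shorty[0]) << 16) | (words & 0xffff);
}

/* What a bridge would read back instead of rescanning the signature. */
uint32_t hint_arg_words(uint32_t hint)   { return hint & 0xffff; }
uint32_t hint_return_code(uint32_t hint) { return (hint >> 16) & 0x7; }
```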
<h2>Interpreter</h2>

<p>
The Dalvik runtime includes two interpreters, labeled "portable" and "fast".
The portable interpreter is largely contained within a single C function,
and should compile on any system that supports gcc.  (If you don't have gcc,
you may need to disable the "threaded" execution model, which relies on
gcc's "goto table" implementation; look for the THREADED_INTERP define.)
</p><p>
The fast interpreter uses hand-coded assembly fragments.  If none are
available for the current architecture, the build system will create an
interpreter out of C "stubs".  The resulting "all stubs" interpreter is
quite a bit slower than the portable interpreter, making "fast" something
of a misnomer.
</p><p>
The fast interpreter is enabled by default.  On platforms without native
support, you may want to switch to the portable interpreter.  This can
be controlled with the <code>dalvik.vm.execution-mode</code> system
property.  For example, if you run:
</p><p><blockquote>
<code>adb shell "echo dalvik.vm.execution-mode = int:portable >> /data/local.prop"</code>
</blockquote></p><p>
and then reboot, the Android app framework will start the VM with the
portable interpreter enabled.
</p>


<h3>Mterp Interpreter Structure</h3>

<p>
There may be significant performance advantages to rewriting the
interpreter core in assembly language, using architecture-specific
optimizations.  In Dalvik this can be done one instruction at a time.
</p><p>
The simplest way to implement an interpreter is to have a large "switch"
statement.  After each instruction is handled, the interpreter returns to
the top of the loop, fetches the next instruction, and jumps to the
appropriate label.
</p><p>
An improvement on this is called "threaded" execution.  The instruction
fetch and dispatch are included at the end of every instruction handler.
This makes the interpreter a little larger overall, but you get to avoid
the (potentially expensive) branch back to the top of the switch statement.
</p><p>
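To make the difference concrete, here is a small sketch of both styles for
a three-opcode toy instruction set.  The opcodes and function names are
invented for this example and have nothing to do with the real Dalvik opcode
table.  The threaded version relies on gcc's "labels as values" extension, so
it assumes gcc or a compatible compiler, and both functions assume the input
contains only the three toy opcodes.
</p>

```c
#include <stdint.h>

enum { OP_HALT = 0, OP_INC = 1, OP_DEC = 2 };   /* toy opcodes */

/* Switch dispatch: every handler branches back to the top of the loop. */
int run_switch(const uint8_t* pc)
{
    int acc = 0;

    for (;;) {
        switch (*pc++) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        default:      return acc;   /* not reached for valid input */
        }
    }
}

/* Threaded dispatch: each handler ends with its own fetch and jump. */
int run_threaded(const uint8_t* pc)
{
    /* gcc extension: a table of label addresses, indexed by opcode */
    static void* const handlers[] = { &&op_halt, &&op_inc, &&op_dec };
    int acc = 0;

#define DISPATCH() goto *handlers[*pc++]
    DISPATCH();
op_inc:
    acc++;
    DISPATCH();
op_dec:
    acc--;
    DISPATCH();
op_halt:
    return acc;
#undef DISPATCH
}
```

<p>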
Dalvik mterp goes one step further, using a computed goto instead of a goto
table.  Instead of looking up the address in a table, which requires an
extra memory fetch on every instruction, mterp computes the handler's
address by multiplying the opcode number by a fixed value.  By default,
each handler is allowed 64 bytes of space.
</p><p>
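The address arithmetic can be sketched in C as follows.  This is only an
illustration: <code>HANDLER_SIZE</code>, <code>handler_offset</code>, and
<code>handler_address</code> are made-up names, and the real interpreter
performs the equivalent computation in assembly.
</p>

```c
#include <stddef.h>
#include <stdint.h>

#define HANDLER_SIZE 64     /* bytes reserved per handler */

/* Byte offset of a handler from the start of the handler block. */
size_t handler_offset(uint8_t opcode)
{
    return (size_t) opcode * HANDLER_SIZE;
}

/* Address of a handler, given the address of the first one. */
const void* handler_address(const void* base, uint8_t opcode)
{
    return (const char*) base + handler_offset(opcode);
}
```

<p>
With the default size, the handler for opcode 2 starts 128 bytes past the
beginning of the block, and no table lookup is needed to find it.
</p>

<p>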
Not all handlers fit in 64 bytes.  Those that don't can have subroutines
or simply continue on to additional code outside the basic space.  Some of
this is handled automatically by Dalvik, but there's no portable way to detect
overflow of a 64-byte handler until the VM starts executing.
</p><p>
The choice of 64 bytes is somewhat arbitrary, but has worked out well for
ARM and x86.
</p><p>
In the course of development it's useful to have C and assembly
implementations of each handler, and be able to flip back and forth
between them when hunting problems down.  In mterp this is relatively
straightforward.  You can always see the files being fed to the compiler
and assembler for your platform by looking in the
<code>dalvik/vm/mterp/out</code> directory.
</p><p>
The interpreter sources live in <code>dalvik/vm/mterp</code>.  If you
haven't yet, you should read <code>dalvik/vm/mterp/README.txt</code> now.
</p>


<h3>Getting Started With Mterp</h3>

<p>
Getting started:
<ol>
<li>Decide on the name of your architecture.  For the sake of discussion,
let's call it <code>myarch</code>.
<li>Make a copy of <code>dalvik/vm/mterp/config-allstubs</code> to
<code>dalvik/vm/mterp/config-myarch</code>.
<li>Create a <code>dalvik/vm/mterp/myarch</code> directory to hold your
source files.
<li>Add <code>myarch</code> to the list in
<code>dalvik/vm/mterp/rebuild.sh</code>.
<li>Make sure <code>dalvik/vm/Android.mk</code> will find the files for
your architecture.  If <code>$(TARGET_ARCH)</code> is configured this
will happen automatically.
</ol>
</p><p>
You now have the basic framework in place.  Whenever you make a change, you
need to perform two steps: regenerate the mterp output, and build the
core VM library.  (It's two steps because we didn't want the build system
to require Python 2.5, which you will nonetheless need in order to
regenerate the output.)
<ol>
<li>In the <code>dalvik/vm/mterp</code> directory, regenerate the contents
of the files in <code>dalvik/vm/mterp/out</code> by executing
<code>./rebuild.sh</code>.  Note there are two files, one in C and one
in assembly.
<li>In the <code>dalvik</code> directory, regenerate the
<code>libdvm.so</code> library with <code>mm</code>.  You can also use
<code>make libdvm</code> from the top of the tree.
</ol>
</p><p>
This will leave you with an updated libdvm.so, which can be pushed out to
a device with <code>adb sync</code> or <code>adb push</code>.  If you're
using the emulator, you need to add <code>make snod</code> (System image,
NO Dependency check) to rebuild the system image file.  You should not
need to do a top-level "make" and rebuild the dependent binaries.
</p><p>
At this point you have an "all stubs" interpreter.  You can see how it
works by examining <code>dalvik/vm/mterp/cstubs/entry.c</code>.  The
code runs in a loop, pulling out the next opcode, and invoking the
handler through a function pointer.  Each handler takes a "glue" argument
that contains all of the useful state.
</p><p>
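The following is a minimal sketch of that kind of loop, not a copy of
<code>entry.c</code>.  The <code>Glue</code> layout, the two toy opcodes, and
every name in it are assumptions made for the example, and it assumes the
input contains only valid opcodes.
</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the interpreter state ("glue"). */
typedef struct Glue {
    const uint8_t* pc;  /* next instruction to execute */
    int acc;            /* toy accumulator, for demonstration only */
    bool done;          /* set by a handler to stop the loop */
} Glue;

typedef void (*Handler)(Glue* glue);

enum { OP_HALT = 0, OP_INC = 1, NUM_OPS = 2 };  /* toy opcodes */

/* Each handler receives the glue, does its work, and returns. */
static void op_halt(Glue* glue) { glue->done = true; }
static void op_inc(Glue* glue)  { glue->acc++; glue->pc++; }

static const Handler kHandlers[NUM_OPS] = { op_halt, op_inc };

/* Fetch an opcode, call its handler through the table, repeat. */
int run_stubs(const uint8_t* code)
{
    Glue glue = { code, 0, false };

    while (!glue.done) {
        uint8_t opcode = *glue.pc;
        kHandlers[opcode](&glue);
    }
    return glue.acc;
}
```

<p>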
Your goal is to replace the entry method, exit method, and each individual
instruction with custom implementations.  The first thing you need to do
is create an entry function that calls the handler for the first instruction.
After that, the instructions chain together, so you don't need a loop.
(Look at the ARM or x86 implementation to see how they work.)
</p><p>
Once you have that, you need something to jump to.  You can't branch
directly to the C stub because it's expecting to be called with a "glue"
argument and then return.  We need a C stub "wrapper" that does the
setup and jumps directly to the next handler.  We write this in assembly
and then add it to the config file definition.
</p><p>
To see how this works, create a file called
<code>dalvik/vm/mterp/myarch/stub.S</code> that contains one line:
<pre>
/* stub for ${opcode} */
</pre>
Then, in <code>dalvik/vm/mterp/config-myarch</code>, add this below the
<code>handler-size</code> directive:
<pre>
# source for the instruction table stub
asm-stub myarch/stub.S
</pre>
</p><p>
Regenerate the sources with <code>./rebuild.sh</code>, and take a look
inside <code>dalvik/vm/mterp/out/InterpAsm-myarch.S</code>.  You should
see 256 copies of the stub function in a single large block after the
<code>dvmAsmInstructionStart</code> label.  The <code>stub.S</code>
code will be used anywhere you don't provide an assembly implementation.
</p><p>
Note that each block begins with a <code>.balign 64</code> directive.
This is what pads each handler out to 64 bytes.  Note also that the
<code>${opcode}</code> text changed into an opcode name, which should
be used to call the C implementation (<code>dvmMterp_${opcode}</code>).
</p><p>
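The substitution itself is performed by the generation script.  Purely as
an illustration of the text replacement, the C function below expands
<code>${opcode}</code> in a template string.  <code>expand_opcode</code> is
a hypothetical helper written for this guide, not part of the build.
</p>

```c
#include <string.h>

/*
 * Copy tmpl into out, replacing every "${opcode}" with name.
 * Returns 0 on success, or -1 if the result (including the
 * terminating NUL) does not fit in outSize bytes.
 */
int expand_opcode(const char* tmpl, const char* name,
                  char* out, size_t outSize)
{
    static const char kToken[] = "${opcode}";
    const size_t tokenLen = sizeof(kToken) - 1;
    const size_t nameLen = strlen(name);
    size_t used = 0;

    if (outSize == 0)
        return -1;

    while (*tmpl != '\0') {
        if (strncmp(tmpl, kToken, tokenLen) == 0) {
            if (used + nameLen >= outSize)
                return -1;
            memcpy(out + used, name, nameLen);
            used += nameLen;
            tmpl += tokenLen;
        } else {
            if (used + 1 >= outSize)
                return -1;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

<p>
Expanding the one-line <code>stub.S</code> above with the name
<code>OP_NOP</code> produces <code>/* stub for OP_NOP */</code>, and
expanding <code>dvmMterp_${opcode}</code> produces
<code>dvmMterp_OP_NOP</code>.
</p>

<p>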
The actual contents of <code>stub.S</code> are up to you to define.
See <code>entry.S</code> and <code>stub.S</code> in the <code>armv5te</code>
or <code>x86</code> directories for working examples.
</p><p>
If you're working on a variation of an existing architecture, you may be
able to use most of the existing code and just provide replacements for
a few instructions.  Look at the <code>armv4t</code> implementation as
an example.
</p>


<h3>Replacing Stubs</h3>

<p>
There are roughly 230 Dalvik opcodes, including some that are inserted by
<a href="dexopt.html">dexopt</a> and aren't described in the
<a href="dalvik-bytecode.html">Dalvik bytecode</a> documentation.  Each
one must perform the appropriate actions, fetch the next opcode, and
branch to the next handler.  The actions performed by the assembly version
must exactly match those performed by the C version (in
<code>dalvik/vm/mterp/c/OP_*</code>).
</p><p>
You can customize the set of "optimized" instructions for your platform,
because optimized DEX files are not expected to work on multiple devices.
Adding, removing, or redefining instructions is beyond the scope of this
document, and for simplicity it's best to stick with the basic set defined
by the portable interpreter.
</p><p>
Once you have written a handler that looks like it should work, add
it to the config file.  For example, suppose we have a working version
of <code>OP_NOP</code>.  For demonstration purposes, fake it for now by
putting this into <code>dalvik/vm/mterp/myarch/OP_NOP.S</code>:
<pre>
/* This is my NOP handler */
</pre>
</p><p>
Then, in the <code>op-start</code> section of <code>config-myarch</code>, add:
<pre>
    op OP_NOP myarch
</pre>
</p><p>
This tells the generation script to use the assembly version from the
<code>myarch</code> directory instead of the C version from the <code>c</code>
directory.
</p><p>
Execute <code>./rebuild.sh</code>.  Look at <code>InterpAsm-myarch.S</code>
and <code>InterpC-myarch.c</code> in the <code>out</code> directory.  You
will see that the <code>OP_NOP</code> stub wrapper has been replaced with our
new code in the assembly file, and the C stub implementation is no longer
included.
</p><p>
As you implement instructions, the C version and corresponding stub wrapper
will disappear from the output files.  Eventually you will have a 100%
assembly interpreter.  You may find it saves a little time to examine
the output of your compiler for some of the operations.  The
<a href="porting-proto.c.txt">porting-proto.c</a> sample code can be
helpful here.
</p>


<h3>Interpreter Switching</h3>

<p>
The Dalvik VM actually includes a third interpreter implementation: the debug
interpreter.  This is a variation of the portable interpreter that includes
support for debugging and profiling.
</p><p>
When a debugger attaches, or a profiling feature is enabled, the VM
will switch interpreters at a convenient point.  This is done at the
same time as the GC safe point check: on a backward branch, a method
return, or an exception throw.  Similarly, when the debugger detaches
or profiling is discontinued, execution transfers back to the "fast" or
"portable" interpreter.
</p><p>
Your entry function needs to test the "entryPoint" value in the "glue"
pointer to determine where execution should begin.  Your exit function
will need to return a boolean that indicates whether the interpreter is
exiting (because we reached the "bottom" of a thread stack) or wants to
switch to the other implementation.
</p><p>
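As a loose sketch of the control flow, the code below models a caller that
keeps running one interpreter or the other until one reports that it is
exiting.  Everything here is hypothetical, including the struct layout, the
enum values, and the choice that <code>true</code> means "switch" and
<code>false</code> means "exit".
</p>

```c
#include <stdbool.h>

/* All names below are hypothetical stand-ins for this sketch. */
typedef enum { ENTRY_INSTR, ENTRY_RETURN, ENTRY_THROW } EntryPoint;

typedef struct Glue {
    EntryPoint entryPoint;  /* where execution should begin */
    bool wantSwitch;        /* e.g. set when a debugger attaches */
    int runs;               /* counts interpreter entries, for the demo */
} Glue;

/*
 * Toy "fast" interpreter: reports true if it wants to hand control to
 * the other implementation, false if it is exiting.
 */
static bool run_fast(Glue* glue)
{
    glue->runs++;
    if (glue->wantSwitch) {
        glue->wantSwitch = false;
        glue->entryPoint = ENTRY_RETURN;    /* switch at a method return */
        return true;
    }
    return false;
}

/* Toy "debug" interpreter: simply runs to completion. */
static bool run_debug(Glue* glue)
{
    glue->runs++;
    return false;
}

/* Caller's view: alternate until an interpreter says it is done. */
int run_until_exit(Glue* glue)
{
    bool useDebug = false;

    for (;;) {
        bool change = useDebug ? run_debug(glue) : run_fast(glue);
        if (!change)
            return glue->runs;
        useDebug = !useDebug;
    }
}
```

<p>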
See the <code>entry.S</code> file in <code>x86</code> or <code>armv5te</code>
for examples.
</p>


<h3>Testing</h3>

<p>
A number of VM tests can be found in <code>dalvik/tests</code>.  The most
useful during interpreter development is <code>003-omnibus-opcodes</code>,
which tests many different instructions.
</p><p>
The basic invocation is:
<pre>
$ cd dalvik/tests
$ ./run-test 003
</pre>
</p><p>
This will run test 003 on an attached device or emulator.  You can run
the test against your desktop VM by specifying <code>--reference</code>
if you suspect the test may be faulty.  You can also use
<code>--portable</code> and <code>--fast</code> to explicitly specify
one Dalvik interpreter or the other.
</p><p>
Some instructions are replaced by <code>dexopt</code>, notably when
"quickening" field accesses and method invocations.  To ensure
that you are testing the basic form of the instruction, add the
<code>--no-optimize</code> option.
</p><p>
There is no built-in instruction tracing mechanism.  If you want
to know for sure that your implementation of an opcode handler
is being used, the easiest approach is to insert a "printf"
call.  For an example, look at <code>common_squeak</code> in
<code>dalvik/vm/mterp/armv5te/footer.S</code>.
</p><p>
At some point you need to ensure that debuggers and profiling work with
your interpreter.  The easiest way to do this is to simply connect a
debugger or toggle profiling.  (A future test suite may include some
tests for this.)
</p>

<p>
<address>Copyright &copy; 2009 The Android Open Source Project</address>

</body>
</html>