page.title=Dalvik
pdk.version=1.0
doc.type=porting
@jd:body

<div id="qv-wrapper">
<div id="qv">
<h2>In this document</h2>
<a name="toc"/>
<ul>
<li><a href="#dalvikCoreLibraries">Core Libraries</a></li>
<li><a href="#dalvikJNICallBridge">JNI Call Bridge</a></li>
<li><a href="#dalvikInterpreter">Interpreter</a></li>
</ul>
</div>
</div>

<p>
The Dalvik virtual machine is intended to run on a variety of platforms.
The baseline system is expected to be a variant of UNIX (Linux, BSD, Mac
OS X) running the GNU C compiler. Little-endian CPUs have been exercised
the most heavily, but big-endian systems are explicitly supported.
</p><p>
There are two general categories of work: porting to a Linux system
with a previously unseen CPU architecture, and porting to a different
operating system. This document covers the former.
</p>


<a name="dalvikCoreLibraries"></a><h3>Core Libraries</h3>

<p>
The native code in the core libraries (chiefly <code>dalvik/libcore</code>,
but also <code>dalvik/vm/native</code>) is written in C/C++ and is expected
to work without modification in a Linux environment. Much of the code
comes directly from the Apache Harmony project.
</p><p>
The core libraries pull in code from many other projects, including
OpenSSL, zlib, and ICU. These will also need to be ported before the VM
can be used.
</p>


<a name="dalvikJNICallBridge"></a><h3>JNI Call Bridge</h3>

<p>
Most of the Dalvik VM runtime is written in portable C. The one
non-portable component of the runtime is the JNI call bridge. Simply put,
this converts an array of integers into function arguments of various
types, and calls a function. This must be done according to the C calling
conventions for the platform. The task could be as simple as pushing all
of the arguments onto the stack, or involve complex rules for register
assignment and stack alignment.
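</p><p>
As a rough illustration of what "converting an array of integers into
function arguments" means, the sketch below handles just two hard-coded
signatures in plain C. It is not the Dalvik implementation: the names
<code>sketchInvoke</code>, <code>SketchValue</code>, <code>GenericFn</code>,
<code>readWide</code>, and the two sample target functions are hypothetical,
and the assumption that a 64-bit argument occupies two consecutive 32-bit
slots in native byte order is made only for this example. A real bridge must
handle every signature, which is why it is normally written in assembly or
built on FFI.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the union that receives the return value. */
typedef union {
    int32_t i;
    int64_t j;
} SketchValue;

/* Generic function pointer type used to carry the target function. */
typedef void (*GenericFn)(void);

/* Reassemble a 64-bit value from two consecutive 32-bit slots
 * (assumption: the slots hold the value in native byte order). */
static int64_t readWide(const uint32_t* slots) {
    int64_t v;
    memcpy(&v, slots, sizeof v);
    return v;
}

/* Call func with arguments taken from argv, as described by shorty
 * (return type character first, then one character per argument).
 * Returns 0 on success, -1 for a signature this sketch does not handle. */
static int sketchInvoke(void* env, void* self, const uint32_t* argv,
                        const char* shorty, GenericFn func, SketchValue* ret) {
    if (strcmp(shorty, "II") == 0) {            /* int f(int) */
        int32_t (*f)(void*, void*, int32_t) =
            (int32_t (*)(void*, void*, int32_t))func;
        ret->i = f(env, self, (int32_t)argv[0]);
        return 0;
    }
    if (strcmp(shorty, "JIJ") == 0) {           /* long f(int, long) */
        int64_t (*f)(void*, void*, int32_t, int64_t) =
            (int64_t (*)(void*, void*, int32_t, int64_t))func;
        /* The int uses slot 0; the long uses slots 1 and 2. */
        ret->j = f(env, self, (int32_t)argv[0], readWide(&argv[1]));
        return 0;
    }
    return -1;
}

/* Sample targets with the (env, this-or-class, args...) shape. */
static int32_t sampleAddOne(void* env, void* self, int32_t x) {
    (void)env; (void)self;
    return x + 1;
}

static int64_t sampleScale(void* env, void* self, int32_t n, int64_t v) {
    (void)env; (void)self;
    return n * v;
}
```

Calling <code>sketchInvoke</code> with the shorty <code>"II"</code>, the
<code>sampleAddOne</code> target, and a one-element <code>argv</code> stores
the incremented value in the <code>i</code> member of the result. Every
additional signature would need another hand-written branch in C, which is
the limitation a platform-specific bridge removes.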
</p><p>
To ease porting to new platforms, the <a href="http://sourceware.org/libffi/">
open-source FFI library</a> (Foreign Function Interface) is used when a
custom bridge is unavailable. FFI is not as fast as a native implementation,
and the optional performance improvements it does offer are not used, so
writing a replacement is a good first step.
</p><p>
The code lives in <code>dalvik/vm/arch/*</code>, with the FFI-based version
in the "generic" directory. There are two source files for each architecture.
One defines the call bridge itself:
</p><p><blockquote>
<code>void dvmPlatformInvoke(void* pEnv, ClassObject* clazz, int argInfo,
int argc, const u4* argv, const char* signature, void* func,
JValue* pReturn)</code>
</blockquote></p><p>
This will invoke a C/C++ function declared:
</p><p><blockquote>
<code>return_type func(JNIEnv* pEnv, Object* this [, <i>args</i>])<br></code>
</blockquote>or (for a "static" method):<blockquote>
<code>return_type func(JNIEnv* pEnv, ClassObject* clazz [, <i>args</i>])</code>
</blockquote></p><p>
The role of <code>dvmPlatformInvoke</code> is to convert the values in
<code>argv</code> into C-style calling conventions, call the method, and
then place the return value into <code>pReturn</code> (a union that holds
all of the basic JNI types). The code may use the method signature
(a DEX "shorty" signature, with one character for the return type and one
per argument) to determine how to handle the values.
</p><p>
The other source file involved here defines a 32-bit "hint". The hint
is computed when the method's class is loaded, and passed in as the
<code>argInfo</code> argument. The hint can be used to avoid scanning the ASCII
method signature for things like the return value, total argument size,
or inter-argument 64-bit alignment restrictions.
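</p><p>
To make the idea of a hint concrete, here is a minimal sketch that scans a
shorty signature once and packs the return type and the total argument size
into 32 bits. The bit layout (return type character in the top 8 bits,
argument size in 32-bit words in the low 16 bits) and all of the
<code>sketch</code>/<code>SK_</code> names are assumptions made for this
example. The real layout is defined per architecture and may differ.

```c
#include <stdint.h>

/* Hypothetical layout: return type in bits 24-31, word count in bits 0-15. */
#define SK_RET_SHIFT 24
#define SK_SIZE_MASK 0xffffu

/* Scan the shorty once, as would happen at class-load time.
 * shorty[0] is the return type; each later character is one argument.
 * 'J' (long) and 'D' (double) take two 32-bit words; all others take one. */
static uint32_t sketchComputeHint(const char* shorty) {
    uint32_t words = 0;
    const char* p;
    for (p = shorty + 1; *p != '\0'; p++) {
        words += (*p == 'J' || *p == 'D') ? 2 : 1;
    }
    return ((uint32_t)(unsigned char)shorty[0] << SK_RET_SHIFT)
            | (words & SK_SIZE_MASK);
}

/* Accessors the call bridge could use instead of rescanning the string. */
static char sketchHintReturnType(uint32_t hint) {
    return (char)(hint >> SK_RET_SHIFT);
}

static uint32_t sketchHintArgWords(uint32_t hint) {
    return hint & SK_SIZE_MASK;
}
```

For the shorty <code>"JIDL"</code> (returns a long; takes an int, a double,
and an object reference) the sketch records a return type of <code>J</code>
and an argument size of four words. The count covers only the declared
arguments, not the <code>pEnv</code> and <code>this</code>/<code>clazz</code>
parameters that every call receives.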
</p>

<a name="dalvikInterpreter"></a><h3>Interpreter</h3>

<p>
The Dalvik runtime includes two interpreters, labeled "portable" and "fast".
The portable interpreter is largely contained within a single C function,
and should compile on any system that supports gcc. (If you don't have gcc,
you may need to disable the "threaded" execution model, which relies on
gcc's "goto table" implementation; look for the THREADED_INTERP define.)
</p><p>
The fast interpreter uses hand-coded assembly fragments. If none are
available for the current architecture, the build system will create an
interpreter out of C "stubs". The resulting "all stubs" interpreter is
quite a bit slower than the portable interpreter, making "fast" something
of a misnomer.
</p><p>
The fast interpreter is enabled by default. On platforms without native
support, you may want to switch to the portable interpreter. This can
be controlled with the <code>dalvik.vm.execution-mode</code> system
property. For example, if you run:
</p><p><blockquote>
<code>adb shell "echo dalvik.vm.execution-mode = int:portable >> /data/local.prop"</code>
</blockquote></p><p>
and reboot, the Android app framework will start the VM with the portable
interpreter enabled.
</p>


<h3>Mterp Interpreter Structure</h3>

<p>
There may be significant performance advantages to rewriting the
interpreter core in assembly language, using architecture-specific
optimizations. In Dalvik this can be done one instruction at a time.
</p><p>
The simplest way to implement an interpreter is to have a large "switch"
statement. After each instruction is handled, the interpreter returns to
the top of the loop, fetches the next instruction, and jumps to the
appropriate label.
</p><p>
An improvement on this is called "threaded" execution.
The instruction fetch and dispatch are included at the end of every
instruction handler.
This makes the interpreter a little larger overall, but you get to avoid
the (potentially expensive) branch back to the top of the switch statement.
</p><p>
Dalvik mterp goes one step further, using a computed goto instead of a goto
table. Instead of looking up the address in a table, which requires an
extra memory fetch on every instruction, mterp multiplies the opcode number
by a fixed value. By default, each handler is allowed 64 bytes of space.
</p><p>
Not all handlers fit in 64 bytes. Those that don't can have subroutines
or simply continue on to additional code outside the basic space. Some of
this is handled automatically by Dalvik, but there's no portable way to detect
overflow of a 64-byte handler until the VM starts executing.
</p><p>
The choice of 64 bytes is somewhat arbitrary, but has worked out well for
ARM and x86.
</p><p>
In the course of development it's useful to have C and assembly
implementations of each handler, and be able to flip back and forth
between them when hunting problems down. In mterp this is relatively
straightforward. You can always see the files being fed to the compiler
and assembler for your platform by looking in the
<code>dalvik/vm/mterp/out</code> directory.
</p><p>
The interpreter sources live in <code>dalvik/vm/mterp</code>. If you
haven't yet, you should read <code>dalvik/vm/mterp/README.txt</code> now.
</p>


<h3>Getting Started With Mterp</h3>

<p>
Getting started:
<ol>
<li>Decide on the name of your architecture. For the sake of discussion,
let's call it <code>myarch</code>.
<li>Make a copy of <code>dalvik/vm/mterp/config-allstubs</code> to
<code>dalvik/vm/mterp/config-myarch</code>.
<li>Create a <code>dalvik/vm/mterp/myarch</code> directory to hold your
source files.
<li>Add <code>myarch</code> to the list in
<code>dalvik/vm/mterp/rebuild.sh</code>.
<li>Make sure <code>dalvik/vm/Android.mk</code> will find the files for
your architecture. If <code>$(TARGET_ARCH)</code> is configured this
will happen automatically.
</ol>
</p><p>
You now have the basic framework in place. Whenever you make a change, you
need to perform two steps: regenerate the mterp output, and build the
core VM library. (These are separate steps because we didn't want the build
system to require Python 2.5, which you will need for the regeneration step.)
<ol>
<li>In the <code>dalvik/vm/mterp</code> directory, regenerate the contents
of the files in <code>dalvik/vm/mterp/out</code> by executing
<code>./rebuild.sh</code>. Note there are two files, one in C and one
in assembly.
<li>In the <code>dalvik</code> directory, regenerate the
<code>libdvm.so</code> library with <code>mm</code>. You can also use
<code>make libdvm</code> from the top of the tree.
</ol>
</p><p>
This will leave you with an updated <code>libdvm.so</code>, which can be
pushed out to a device with <code>adb sync</code> or <code>adb push</code>.
If you're using the emulator, you need to add <code>make snod</code>
(System image, NO Dependency check) to rebuild the system image file. You
should not need to do a top-level "make" and rebuild the dependent binaries.
</p><p>
At this point you have an "all stubs" interpreter. You can see how it
works by examining <code>dalvik/vm/mterp/cstubs/entry.c</code>. The
code runs in a loop, pulling out the next opcode, and invoking the
handler through a function pointer. Each handler takes a "glue" argument
that contains all of the useful state.
</p><p>
Your goal is to replace the entry method, exit method, and each individual
instruction with custom implementations.
The first thing you need to do
is create an entry function that calls the handler for the first instruction.
After that, the instructions chain together, so you don't need a loop.
(Look at the ARM or x86 implementation to see how they work.)
</p><p>
Once you have that, you need something to jump to. You can't branch
directly to the C stub because it's expecting to be called with a "glue"
argument and then return. We need a C stub "wrapper" that does the
setup and jumps directly to the next handler. We write this in assembly
and then add it to the config file definition.
</p><p>
To see how this works, create a file called
<code>dalvik/vm/mterp/myarch/stub.S</code> that contains one line:
<pre>
/* stub for ${opcode} */
</pre>
Then, in <code>dalvik/vm/mterp/config-myarch</code>, add this below the
<code>handler-size</code> directive:
<pre>
# source for the instruction table stub
asm-stub myarch/stub.S
</pre>
</p><p>
Regenerate the sources with <code>./rebuild.sh</code>, and take a look
inside <code>dalvik/vm/mterp/out/InterpAsm-myarch.S</code>. You should
see 256 copies of the stub function in a single large block after the
<code>dvmAsmInstructionStart</code> label. The <code>stub.S</code>
code will be used anywhere you don't provide an assembly implementation.
</p><p>
Note that each block begins with a <code>.balign 64</code> directive.
This is what pads each handler out to 64 bytes. Note also that the
<code>${opcode}</code> text changed into an opcode name, which should
be used to call the C implementation (<code>dvmMterp_${opcode}</code>).
</p><p>
The actual contents of <code>stub.S</code> are up to you to define.
See <code>entry.S</code> and <code>stub.S</code> in the <code>armv5te</code>
or <code>x86</code> directories for working examples.
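</p><p>
The 64-byte padding is what lets mterp compute a handler's address instead
of loading it from a table. The arithmetic can be sketched in C as follows.
The names <code>sketchHandlerAddress</code>, <code>sketchHandlerFits</code>,
and <code>SK_HANDLER_SIZE</code> are hypothetical, and the base address
stands in for wherever the <code>dvmAsmInstructionStart</code> label ends up.
The real dispatch is done in assembly, not with C code like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Space reserved for each handler; 64 bytes is the default named above. */
enum { SK_HANDLER_SIZE = 64 };

/* Address of the handler for an opcode: base plus opcode times the fixed
 * handler size. No table lookup (and so no extra memory fetch) is needed. */
static uintptr_t sketchHandlerAddress(uintptr_t tableBase, uint8_t opcode) {
    return tableBase + (uintptr_t)opcode * SK_HANDLER_SIZE;
}

/* A handler whose code is larger than the reserved space must move the
 * excess into a subroutine or into code outside the basic block. */
static int sketchHandlerFits(size_t codeBytes) {
    return codeBytes <= SK_HANDLER_SIZE;
}
```

With a base of <code>0x1000</code>, opcode 1 maps to <code>0x1040</code> and
opcode 255 maps to <code>0x4fc0</code>. On most CPUs the multiply by 64 can
be done as a left shift by 6, which suits a power-of-two handler size.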
</p><p>
If you're working on a variation of an existing architecture, you may be
able to use most of the existing code and just provide replacements for
a few instructions. Look at the <code>armv4t</code> implementation as
an example.
</p>


<h3>Replacing Stubs</h3>

<p>
There are roughly 230 Dalvik opcodes, including some that are inserted by
<a href="dexopt.html">dexopt</a> and aren't described in the
<a href="dalvik-bytecode.html">Dalvik bytecode</a> documentation. Each
one must perform the appropriate actions, fetch the next opcode, and
branch to the next handler. The actions performed by the assembly version
must exactly match those performed by the C version (in
<code>dalvik/vm/mterp/c/OP_*</code>).
</p><p>
It is possible to customize the set of "optimized" instructions for your
platform. This is possible because optimized DEX files are not expected
to work on multiple devices. Adding, removing, or redefining instructions
is beyond the scope of this document, and for simplicity it's best to stick
with the basic set defined by the portable interpreter.
</p><p>
Once you have written a handler that looks like it should work, add
it to the config file. For example, suppose we have a working version
of <code>OP_NOP</code>. For demonstration purposes, fake it for now by
putting this into <code>dalvik/vm/mterp/myarch/OP_NOP.S</code>:
<pre>
/* This is my NOP handler */
</pre>
</p><p>
Then, in the <code>op-start</code> section of <code>config-myarch</code>, add:
<pre>
    op OP_NOP myarch
</pre>
</p><p>
This tells the generation script to use the assembly version from the
<code>myarch</code> directory instead of the C version from the <code>c</code>
directory.
</p><p>
Execute <code>./rebuild.sh</code>.
Look at <code>InterpAsm-myarch.S</code>
and <code>InterpC-myarch.c</code> in the <code>out</code> directory. You
will see that the <code>OP_NOP</code> stub wrapper has been replaced with our
new code in the assembly file, and the C stub implementation is no longer
included.
</p><p>
As you implement instructions, the C version and corresponding stub wrapper
will disappear from the output files. Eventually you will have a 100%
assembly interpreter.
</p>


<h3>Interpreter Switching</h3>

<p>
The Dalvik VM actually includes a third interpreter implementation: the debug
interpreter. This is a variation of the portable interpreter that includes
support for debugging and profiling.
</p><p>
When a debugger attaches, or a profiling feature is enabled, the VM
will switch interpreters at a convenient point. This is done at the
same time as the GC safe point check: on a backward branch, a method
return, or an exception throw. Similarly, when the debugger detaches
or profiling is discontinued, execution transfers back to the "fast" or
"portable" interpreter.
</p><p>
Your entry function needs to test the "entryPoint" value in the "glue"
pointer to determine where execution should begin. Your exit function
will need to return a boolean that indicates whether the interpreter is
exiting (because we reached the "bottom" of a thread stack) or wants to
switch to the other implementation.
</p><p>
See the <code>entry.S</code> file in <code>x86</code> or <code>armv5te</code>
for examples.
</p>


<h3>Testing</h3>

<p>
A number of VM tests can be found in <code>dalvik/tests</code>. The most
useful during interpreter development is <code>003-omnibus-opcodes</code>,
which tests many different instructions.
</p><p>
The basic invocation is:
<pre>
$ cd dalvik/tests
$ ./run-test 003
</pre>
</p><p>
This will run test 003 on an attached device or emulator. You can run
the test against your desktop VM by specifying <code>--reference</code>
if you suspect the test may be faulty. You can also use
<code>--portable</code> and <code>--fast</code> to explicitly specify
one Dalvik interpreter or the other.
</p><p>
Some instructions are replaced by <code>dexopt</code>, notably when
"quickening" field accesses and method invocations. To ensure
that you are testing the basic form of the instruction, add the
<code>--no-optimize</code> option.
</p><p>
There is no built-in instruction tracing mechanism. If you want
to know for sure that your implementation of an opcode handler
is being used, the easiest approach is to insert a "printf"
call. For an example, look at <code>common_squeak</code> in
<code>dalvik/vm/mterp/armv5te/footer.S</code>.
</p><p>
At some point you need to ensure that debuggers and profiling work with
your interpreter. The easiest way to do this is to simply connect a
debugger or toggle profiling. (A future test suite may include some
tests for this.)
</p>