rt "mterp" README

NOTE: Find rebuilding instructions at the bottom of this file.


==== Overview ====

Every configuration has a "config-*" file that controls how the sources
are generated. The sources are written into the "out" directory, where
they are picked up by the Android build system.

The best way to become familiar with the interpreter is to look at the
generated files in the "out" directory.


==== Config file format ====

The config files are parsed from top to bottom. Each line in the file
may be blank, hold a comment (line starts with '#'), or be a command.

The commands are:

  handler-style <computed-goto|jump-table>

    Specify which style of interpreter to generate. In computed-goto,
    each handler is allocated a fixed region, allowing transitions to
    be done via table-start-address + (opcode * handler-size). With
    jump-table style, handlers may be of any length, and the generated
    table is an array of pointers to the handlers. This command is required,
    and must be the first command in the config file.

  handler-size <bytes>

    Specify the size of the fixed region, in bytes. On most platforms
    this will need to be a power of 2. For jump-table implementations,
    this command is ignored.

  import <filename>

    The specified file is included immediately, in its entirety. No
    substitutions are performed. ".cpp" and ".h" files are copied to the
    C output, ".S" files are copied to the asm output.

  asm-alt-stub <filename>

    When present, this command will cause the generation of an alternate
    set of entry points (for computed-goto interpreters) or an alternate
    jump table (for jump-table interpreters).

  fallback-stub <filename>

    Specifies a file to be used for the special FALLBACK tag on the "op"
    command below. Intended to be used to transfer control to an alternate
    interpreter to single-step a not-yet-implemented opcode.
    Note: should not be used on RETURN-class instructions.

  op-start <directory>

    Indicates the start of the opcode list. Must precede any "op"
    commands. The specified directory is the default location to pull
    instruction files from.

  op <opcode> <directory>|FALLBACK

    Can only appear after "op-start" and before "op-end". Overrides the
    default source file location of the specified opcode. The opcode
    definition will come from the specified file, e.g. "op OP_NOP arm"
    will load from "arm/OP_NOP.S". A substitution dictionary will be
    applied (see below). If the special "FALLBACK" token is used instead of
    a directory name, the source file specified in fallback-stub will instead
    be used for this opcode.

  alt <opcode> <directory>

    Can only appear after "op-start" and before "op-end". Similar to the
    "op" command above, but denotes a source file to override the entry
    in the alternate handler table. The opcode definition will come from
    the specified file, e.g. "alt OP_NOP arm" will load from
    "arm/ALT_OP_NOP.S". A substitution dictionary will be applied
    (see below).

  op-end

    Indicates the end of the opcode list. All kNumPackedOpcodes
    opcodes are emitted when this is seen, followed by any code that
    didn't fit inside the fixed-size instruction handler space.

The order of "op" and "alt" directives is not significant; the generation
tool will extract ordering info from the VM sources.

Typically the form in which most opcodes currently exist is used in
the "op-start" directive.

==== Instruction file format ====

The assembly instruction files are simply fragments of assembly sources.
The starting label will be provided by the generation tool, as will
declarations for the segment type and alignment. The expected target
assembler is GNU "as", but others will work (may require fiddling with
some of the pseudo-ops emitted by the generation tool).

A substitution dictionary is applied to all opcode fragments as they are
appended to the output. Substitutions can look like "$value" or "${value}".

The dictionary always includes:

  $opcode - opcode name, e.g. "OP_NOP"
  $opnum - opcode number, e.g. 0 for OP_NOP
  $handler_size_bytes - max size of an instruction handler, in bytes
  $handler_size_bits - max size of an instruction handler, log 2

Both C and assembly sources will be passed through the C pre-processor,
so you can take advantage of C-style comments and preprocessor directives
like "#define".

Some generator operations are available.

  %include "filename" [subst-dict]

    Includes the file, which should look like "arm/OP_NOP.S". You can
    specify values for the substitution dictionary, using standard Python
    syntax. For example, this:
      %include "arm/unop.S" {"result":"r1"}
    would insert "arm/unop.S" at the current file position, replacing
    occurrences of "$result" with "r1".

  %default <subst-dict>

    Specify default substitution dictionary values, using standard Python
    syntax. Useful if you want to have a "base" version and variants.

  %break

    Identifies the split between the main portion of the instruction
    handler (which must fit in "handler-size" bytes) and the "sister"
    code, which is appended to the end of the instruction handler block.
    In jump table implementations, %break is ignored.

The generation tool does *not* print a warning if your instructions
exceed "handler-size", but the VM will abort on startup if it detects an
oversized handler. On architectures with fixed-width instructions this
is easy to work with; on others you will need to count bytes.


==== Using C constants from assembly sources ====

The file "art/runtime/asm_support.h" has some definitions for constant
values, structure sizes, and struct member offsets.
The format is fairly restricted, as simple macros are used to massage it
for use with both C (where it is verified) and assembly (where the
definitions are used).

If a constant in the file becomes out of sync, the VM will log an error
message and abort during startup.


==== Development tips ====

If you need to debug the initial piece of an opcode handler, and your
debug code expands it beyond the handler size limit, you can insert a
generic header at the top:

    b       ${opcode}_start
%break
${opcode}_start:

If you already have a %break, it's okay to leave it in place -- the second
%break is ignored.


==== Rebuilding ====

If you change any of the source file fragments, you need to rebuild the
combined source files in the "out" directory. Make sure the files in
"out" are editable, then:

    $ cd mterp
    $ ./rebuild.sh

The ultimate goal is to have the build system generate the necessary
output files without requiring this separate step, but we're not yet
ready to require Python in the build.

==== Interpreter Control ====

The mterp fast interpreter achieves much of its performance advantage
over the C++ interpreter through its efficient mechanism of
transitioning from one Dalvik bytecode to the next. Mterp for ARM targets
uses a computed-goto mechanism, in which the handler entrypoints are
located at the base of the handler table + (opcode * 128).

In normal operation, the dedicated register rIBASE
(r8 for ARM, edx for x86) holds a mainHandlerTable. If we need to switch
to a mode that requires inter-instruction checking, rIBASE is changed
to altHandlerTable. Note that this change is not immediate. What is actually
changed is the value of curHandlerTable - which is part of the interpBreak
structure.
Rather than explicitly check for changes, each thread will
blindly refresh rIBASE at backward branches, exception throws and returns.
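
The dispatch arithmetic and the delayed table switch described above can
be modelled in a few lines of Python. This is only an illustrative sketch,
not the interpreter's code: the class, the method names and the table
addresses (0x10000, 0x20000) are hypothetical. The only values taken from
the text are the 128-byte handler size for ARM and the names rIBASE and
curHandlerTable.

```python
# Illustrative model only; class, method names and addresses are hypothetical.
HANDLER_SIZE_BYTES = 128  # handler region size for ARM, per the text above
HANDLER_SIZE_BITS = HANDLER_SIZE_BYTES.bit_length() - 1  # log2(128) == 7


def handler_address(ibase, opcode):
    """Computed-goto entry point: table base + (opcode * handler size)."""
    return ibase + (opcode << HANDLER_SIZE_BITS)


class InterpThread:
    """Toy stand-in for one thread's rIBASE and curHandlerTable."""

    def __init__(self, main_table, alt_table):
        self.main_table = main_table
        self.alt_table = alt_table
        self.cur_handler_table = main_table  # models curHandlerTable
        self.ribase = main_table             # models the rIBASE register

    def request_alt_table(self):
        # Only the in-memory value changes; rIBASE is left untouched.
        self.cur_handler_table = self.alt_table

    def refresh_ibase(self):
        # Models the refresh at a backward branch, exception throw or return.
        self.ribase = self.cur_handler_table

    def dispatch(self, opcode):
        # Resolve the handler against whatever rIBASE currently holds.
        return handler_address(self.ribase, opcode)


if __name__ == "__main__":
    thread = InterpThread(main_table=0x10000, alt_table=0x20000)
    thread.request_alt_table()
    print(hex(thread.dispatch(1)))  # still resolved against the main table
    thread.refresh_ibase()
    print(hex(thread.dispatch(1)))  # now resolved against the alternate table
```

In this model a request to switch tables has no effect on dispatch until
the next refresh point, which is the trade-off the text describes: no
per-instruction check, at the cost of a short delay before the alternate
handlers take over.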