ART "mterp" README

NOTE: Rebuilding instructions are in the "Rebuilding" section near the
end of this file.


==== Overview ====

Every configuration has a "config-*" file that controls how the sources
are generated. The sources are written into the "out" directory, where
they are picked up by the Android build system.

The best way to become familiar with the interpreter is to look at the
generated files in the "out" directory.


==== Config file format ====

The config files are parsed from top to bottom. Each line in the file
may be blank, hold a comment (line starts with '#'), or be a command.

The commands are:

  handler-style <computed-goto|jump-table>

    Specify which style of interpreter to generate. In computed-goto,
    each handler is allocated a fixed region, allowing transitions to
    be done via table-start-address + (opcode * handler-size). With
    jump-table style, handlers may be of any length, and the generated
    table is an array of pointers to the handlers. This command is required,
    and must be the first command in the config file.

  handler-size <bytes>

    Specify the size of the fixed region, in bytes. On most platforms
    this will need to be a power of 2. For jump-table implementations,
    this command is ignored.

  import <filename>

    The specified file is included immediately, in its entirety. No
    substitutions are performed. ".cpp" and ".h" files are copied to the
    C output, ".S" files are copied to the asm output.

  asm-alt-stub <filename>

    When present, this command will cause the generation of an alternate
    set of entry points (for computed-goto interpreters) or an alternate
    jump table (for jump-table interpreters).

  fallback-stub <filename>

    Specifies a file to be used for the special FALLBACK tag on the "op"
    command below. Intended to be used to transfer control to an alternate
    interpreter to single-step a not-yet-implemented opcode. Note: should
    not be used on RETURN-class instructions.
  op-start <directory>

    Indicates the start of the opcode list. Must precede any "op"
    commands. The specified directory is the default location to pull
    instruction files from.

  op <opcode> <directory>|FALLBACK

    Can only appear after "op-start" and before "op-end". Overrides the
    default source file location of the specified opcode. The opcode
    definition will come from a file in the specified directory, e.g.
    "op OP_NOP arm" will load from "arm/OP_NOP.S". A substitution
    dictionary will be applied (see below). If the special "FALLBACK"
    token is used instead of a directory name, the source file specified
    in fallback-stub will instead be used for this opcode.

  alt <opcode> <directory>

    Can only appear after "op-start" and before "op-end". Similar to the
    "op" command above, but denotes a source file to override the entry
    in the alternate handler table. The opcode definition will come from
    a file in the specified directory, e.g. "alt OP_NOP arm" will load
    from "arm/ALT_OP_NOP.S". A substitution dictionary will be applied
    (see below).

  op-end

    Indicates the end of the opcode list. All kNumPackedOpcodes
    opcodes are emitted when this is seen, followed by any code that
    didn't fit inside the fixed-size instruction handler space.

The order of "op" and "alt" directives is not significant; the generation
tool will extract ordering info from the VM sources.

Typically, "op-start" names the directory holding the form in which most
opcodes currently exist, and "op" or "alt" lines override only the
exceptions.
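
The ordering rules above can be illustrated with a small checker. This is
only a sketch and not the generation tool itself: the sample config text,
the file name "arm/fallback.S", and the function name parse_config are
made up for illustration, and the sketch assumes that leading whitespace
before a command or a '#' is tolerated.

```python
# Sketch only: checks the ordering rules described in this section.
# SAMPLE_CONFIG and every name here are illustrative assumptions.

SAMPLE_CONFIG = """\
# hypothetical config, for illustration only
handler-style computed-goto
handler-size 128
fallback-stub arm/fallback.S

op-start arm
    op OP_NOP FALLBACK
    alt OP_MOVE arm
op-end
"""


def parse_config(text):
    """Return a list of (command, args) pairs, enforcing command order."""
    commands = []
    in_op_list = False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        cmd, *args = line.split()
        # handler-style is required and must be the first command.
        if not commands and cmd != "handler-style":
            raise ValueError(f"line {lineno}: handler-style must come first")
        if cmd == "op-start":
            in_op_list = True
        elif cmd in ("op", "alt") and not in_op_list:
            raise ValueError(f"line {lineno}: '{cmd}' outside op-start/op-end")
        elif cmd == "op-end":
            in_op_list = False
        commands.append((cmd, args))
    return commands


if __name__ == "__main__":
    for cmd, args in parse_config(SAMPLE_CONFIG):
        print(cmd, args)
```

The checker only validates ordering; it does not read instruction files or
emit any output sources.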


==== Instruction file format ====

The assembly instruction files are simply fragments of assembly sources.
The starting label will be provided by the generation tool, as will
declarations for the segment type and alignment. The expected target
assembler is GNU "as"; others should work, but may require adjusting
some of the pseudo-ops emitted by the generation tool.

A substitution dictionary is applied to all opcode fragments as they are
appended to the output. Substitutions can look like "$value" or "${value}".

The dictionary always includes:

  $opcode - opcode name, e.g. "OP_NOP"
  $opnum - opcode number, e.g. 0 for OP_NOP
  $handler_size_bytes - max size of an instruction handler, in bytes
  $handler_size_bits - max size of an instruction handler, log 2
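
As an illustration of the "$value" and "${value}" forms, the sketch below
uses Python's string.Template, whose placeholder syntax happens to match.
Whether the generation tool uses this class is an assumption; the
fragment text and the handler size of 128 are made up for the example.

```python
from string import Template

# Assumed value, as if the config said "handler-size 128".
handler_size_bytes = 128

# The four entries the dictionary always includes.
subst = {
    "opcode": "OP_NOP",
    "opnum": 0,
    "handler_size_bytes": handler_size_bytes,
    # log 2 of the byte size; exact because the size is a power of 2.
    "handler_size_bits": handler_size_bytes.bit_length() - 1,
}

# Hypothetical fragment, not taken from any real handler.
fragment = (
    "/* $opcode = $opnum, up to $handler_size_bytes bytes */\n"
    "${opcode}_start:\n"
)

expanded = Template(fragment).substitute(subst)
print(expanded)
```

The braces matter in "${opcode}_start": written as "$opcode_start", the
whole of "opcode_start" would be read as the name to look up.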

Both C and assembly sources will be passed through the C pre-processor,
so you can take advantage of C-style comments and preprocessor directives
like "#define".

Some generator operations are available.

  %include "filename" [subst-dict]

    Includes the file, which should look like "arm/OP_NOP.S". You can
    specify values for the substitution dictionary, using standard Python
    syntax. For example, this:
      %include "arm/unop.S" {"result":"r1"}
    would insert "arm/unop.S" at the current file position, replacing
    occurrences of "$result" with "r1".

  %default <subst-dict>

    Specify default substitution dictionary values, using standard Python
    syntax. Useful if you want to have a "base" version and variants.

  %break

    Identifies the split between the main portion of the instruction
    handler (which must fit in "handler-size" bytes) and the "sister"
    code, which is appended to the end of the instruction handler block.
    In jump table implementations, %break is ignored.

The generation tool does *not* print a warning if your instructions
exceed "handler-size", but the VM will abort on startup if it detects an
oversized handler. On architectures with fixed-width instructions this
is easy to work with; on others you will need to count bytes.
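
The interaction between %include and %default can be sketched as follows.
This is not the generation tool's code: the regular expressions, the
function names, and the one-line "unop" body are hypothetical, and the
sketch assumes that values passed on the %include line take precedence
over %default values.

```python
import ast
import re
from string import Template

INCLUDE_RE = re.compile(r'^%include\s+"([^"]+)"\s*(\{.*\})?\s*$')
DEFAULT_RE = re.compile(r'^%default\s+(\{.*\})\s*$')


def parse_include(line):
    """Split an %include line into (filename, dict of passed values)."""
    m = INCLUDE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an %include line: {line!r}")
    filename, dict_text = m.groups()
    # The dictionary uses Python literal syntax, so literal_eval suffices.
    values = ast.literal_eval(dict_text) if dict_text else {}
    return filename, values


def expand_included(lines, passed):
    """Apply %default values, then the passed values, to the other lines."""
    defaults = {}
    body = []
    for line in lines:
        m = DEFAULT_RE.match(line.strip())
        if m:
            defaults.update(ast.literal_eval(m.group(1)))
        else:
            body.append(line)
    values = {**defaults, **passed}  # assumption: passed values win
    return [Template(text).substitute(values) for text in body]


# Hypothetical contents for a "base" file such as arm/unop.S.
unop_lines = ['%default {"result":"r0"}', "    mov     $result, r3"]

filename, passed = parse_include('%include "arm/unop.S" {"result":"r1"}')
print(filename, expand_included(unop_lines, passed))
print(expand_included(unop_lines, {}))
```

With the passed dictionary the line expands using "r1"; with no
dictionary it falls back to the "r0" default, which is the "base version
and variants" pattern mentioned under %default.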


==== Using C constants from assembly sources ====

The file "art/runtime/asm_support.h" has some definitions for constant
values, structure sizes, and struct member offsets. The format is fairly
restricted, as simple macros are used to massage it for use with both C
(where it is verified) and assembly (where the definitions are used).

If a constant in the file falls out of sync, the VM will log an error
message and abort during startup.


==== Development tips ====

If you need to debug the initial piece of an opcode handler, and your
debug code expands it beyond the handler size limit, you can insert a
generic header at the top:

    b       ${opcode}_start
%break
${opcode}_start:

If you already have a %break, it's okay to leave it in place -- the second
%break is ignored.
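
The effect of the header, including the ignored second %break, can be
sketched as below. The function name split_at_break and the sample
fragment are made up for illustration; this is not the generation tool's
code.

```python
def split_at_break(fragment):
    """Split a fragment into (main, sister) lines at the first %break.

    Any later %break lines are dropped without further effect.
    """
    main, sister = [], []
    target = main
    for line in fragment.splitlines():
        if line.strip() == "%break":
            target = sister  # only the first %break changes anything
            continue
        target.append(line)
    return main, sister


# Hypothetical handler for OP_NOP after substitution, with the debug
# header added in front of a fragment that already had its own %break.
fragment = (
    "    b       OP_NOP_start\n"
    "%break\n"
    "OP_NOP_start:\n"
    "    @ debug code and original handler body\n"
    "%break\n"
    "    @ original sister code\n"
)

main, sister = split_at_break(fragment)
print(main)
print(sister)
```

Only the branch stays in the fixed-size region; the label, the debug
code, and the original sister code all end up in the sister block.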


==== Rebuilding ====

If you change any of the source file fragments, you need to rebuild the
combined source files in the "out" directory. Make sure the files in
"out" are editable, then:

    $ cd mterp
    $ ./rebuild.sh

The ultimate goal is to have the build system generate the necessary
output files without requiring this separate step, but we're not yet
ready to require Python in the build.


==== Interpreter Control ====

The mterp fast interpreter achieves much of its performance advantage
over the C++ interpreter through its efficient mechanism of
transitioning from one Dalvik bytecode to the next. Mterp for ARM targets
uses a computed-goto mechanism, in which each handler entrypoint is
located at the base of the handler table + (opcode * 128).
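
The address arithmetic can be written out as a short sketch. The table
base 0x10000 is an arbitrary value chosen for the example; the handler
size of 128 comes from the formula above.

```python
HANDLER_SIZE_BYTES = 128

# A power-of-2 size lets the multiply be done as a shift.
assert HANDLER_SIZE_BYTES & (HANDLER_SIZE_BYTES - 1) == 0
HANDLER_SIZE_BITS = HANDLER_SIZE_BYTES.bit_length() - 1  # log 2 of the size


def handler_address(table_base, opcode):
    """Entry point of the handler for `opcode` in a computed-goto table."""
    return table_base + (opcode << HANDLER_SIZE_BITS)


TABLE_BASE = 0x10000  # arbitrary example value

for opcode in (0x00, 0x01, 0x0E):
    print(hex(opcode), hex(handler_address(TABLE_BASE, opcode)))
```

No table lookup is needed to find a handler, which is why every handler
must fit in its fixed-size region.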

In normal operation, the dedicated register rIBASE
(r8 for ARM, edx for x86) holds the address of mainHandlerTable. If we
need to switch to a mode that requires inter-instruction checking, rIBASE
is changed to point to altHandlerTable. Note that this change is not
immediate. What actually changes is the value of curHandlerTable, which
is part of the interpBreak structure. Rather than explicitly check for
changes, each thread blindly refreshes rIBASE at backward branches,
exception throws and returns.
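
The delayed switch can be modelled with a toy simulation. This is a
sketch of the idea only: the class, its attribute names, and the event
names are invented for the example and do not correspond to the
runtime's actual data structures.

```python
# Events at which the thread reloads rIBASE, per the paragraph above.
REFRESH_POINTS = {"backward-branch", "throw", "return"}


class ThreadModel:
    def __init__(self, main_table, alt_table):
        self.main_table = main_table
        self.alt_table = alt_table
        self.cur_handler_table = main_table  # models curHandlerTable
        self.r_ibase = main_table            # models the rIBASE register

    def request_checking_mode(self):
        # Only the in-memory field changes; the register is untouched.
        self.cur_handler_table = self.alt_table

    def execute(self, event):
        """Run one step and return the table it was dispatched through."""
        used = self.r_ibase
        if event in REFRESH_POINTS:
            # Blind refresh: reload without checking whether it changed.
            self.r_ibase = self.cur_handler_table
        return used


thread = ThreadModel("mainHandlerTable", "altHandlerTable")
thread.request_checking_mode()
trace = [thread.execute(e) for e in
         ("insn", "insn", "backward-branch", "insn")]
print(trace)
```

In this model, instructions keep dispatching through the old table until
the first refresh point has been passed, which is the non-immediate
switch described above.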