README.html
1 <?xml version="1.0" encoding="utf-8" ?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4 <head>
5 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
6 <meta name="generator" content="Docutils 0.6: http://docutils.sourceforge.net/" />
7 <title>libbcc: A Versatile Bitcode Execution Engine for Mobile Devices</title>
8 <style type="text/css">
9
10 /*
11 :Author: David Goodger (goodger (a] python.org)
12 :Id: $Id: html4css1.css 5951 2009-05-18 18:03:10Z milde $
13 :Copyright: This stylesheet has been placed in the public domain.
14
15 Default cascading style sheet for the HTML output of Docutils.
16
17 See http://docutils.sf.net/docs/howto/html-stylesheets.html for how to
18 customize this style sheet.
19 */
20
21 /* used to remove borders from tables and images */
22 .borderless, table.borderless td, table.borderless th {
23 border: 0 }
24
25 table.borderless td, table.borderless th {
26 /* Override padding for "table.docutils td" with "! important".
27 The right padding separates the table cells. */
28 padding: 0 0.5em 0 0 ! important }
29
30 .first {
31 /* Override more specific margin styles with "! important". */
32 margin-top: 0 ! important }
33
34 .last, .with-subtitle {
35 margin-bottom: 0 ! important }
36
37 .hidden {
38 display: none }
39
40 a.toc-backref {
41 text-decoration: none ;
42 color: black }
43
44 blockquote.epigraph {
45 margin: 2em 5em ; }
46
47 dl.docutils dd {
48 margin-bottom: 0.5em }
49
50 /* Uncomment (and remove this text!) to get bold-faced definition list terms
51 dl.docutils dt {
52 font-weight: bold }
53 */
54
55 div.abstract {
56 margin: 2em 5em }
57
58 div.abstract p.topic-title {
59 font-weight: bold ;
60 text-align: center }
61
62 div.admonition, div.attention, div.caution, div.danger, div.error,
63 div.hint, div.important, div.note, div.tip, div.warning {
64 margin: 2em ;
65 border: medium outset ;
66 padding: 1em }
67
68 div.admonition p.admonition-title, div.hint p.admonition-title,
69 div.important p.admonition-title, div.note p.admonition-title,
70 div.tip p.admonition-title {
71 font-weight: bold ;
72 font-family: sans-serif }
73
74 div.attention p.admonition-title, div.caution p.admonition-title,
75 div.danger p.admonition-title, div.error p.admonition-title,
76 div.warning p.admonition-title {
77 color: red ;
78 font-weight: bold ;
79 font-family: sans-serif }
80
81 /* Uncomment (and remove this text!) to get reduced vertical space in
82 compound paragraphs.
83 div.compound .compound-first, div.compound .compound-middle {
84 margin-bottom: 0.5em }
85
86 div.compound .compound-last, div.compound .compound-middle {
87 margin-top: 0.5em }
88 */
89
90 div.dedication {
91 margin: 2em 5em ;
92 text-align: center ;
93 font-style: italic }
94
95 div.dedication p.topic-title {
96 font-weight: bold ;
97 font-style: normal }
98
99 div.figure {
100 margin-left: 2em ;
101 margin-right: 2em }
102
103 div.footer, div.header {
104 clear: both;
105 font-size: smaller }
106
107 div.line-block {
108 display: block ;
109 margin-top: 1em ;
110 margin-bottom: 1em }
111
112 div.line-block div.line-block {
113 margin-top: 0 ;
114 margin-bottom: 0 ;
115 margin-left: 1.5em }
116
117 div.sidebar {
118 margin: 0 0 0.5em 1em ;
119 border: medium outset ;
120 padding: 1em ;
121 background-color: #ffffee ;
122 width: 40% ;
123 float: right ;
124 clear: right }
125
126 div.sidebar p.rubric {
127 font-family: sans-serif ;
128 font-size: medium }
129
130 div.system-messages {
131 margin: 5em }
132
133 div.system-messages h1 {
134 color: red }
135
136 div.system-message {
137 border: medium outset ;
138 padding: 1em }
139
140 div.system-message p.system-message-title {
141 color: red ;
142 font-weight: bold }
143
144 div.topic {
145 margin: 2em }
146
147 h1.section-subtitle, h2.section-subtitle, h3.section-subtitle,
148 h4.section-subtitle, h5.section-subtitle, h6.section-subtitle {
149 margin-top: 0.4em }
150
151 h1.title {
152 text-align: center }
153
154 h2.subtitle {
155 text-align: center }
156
157 hr.docutils {
158 width: 75% }
159
160 img.align-left, .figure.align-left{
161 clear: left ;
162 float: left ;
163 margin-right: 1em }
164
165 img.align-right, .figure.align-right {
166 clear: right ;
167 float: right ;
168 margin-left: 1em }
169
170 .align-left {
171 text-align: left }
172
173 .align-center {
174 clear: both ;
175 text-align: center }
176
177 .align-right {
178 text-align: right }
179
180 /* reset inner alignment in figures */
181 div.align-right {
182 text-align: left }
183
184 /* div.align-center * { */
185 /* text-align: left } */
186
187 ol.simple, ul.simple {
188 margin-bottom: 1em }
189
190 ol.arabic {
191 list-style: decimal }
192
193 ol.loweralpha {
194 list-style: lower-alpha }
195
196 ol.upperalpha {
197 list-style: upper-alpha }
198
199 ol.lowerroman {
200 list-style: lower-roman }
201
202 ol.upperroman {
203 list-style: upper-roman }
204
205 p.attribution {
206 text-align: right ;
207 margin-left: 50% }
208
209 p.caption {
210 font-style: italic }
211
212 p.credits {
213 font-style: italic ;
214 font-size: smaller }
215
216 p.label {
217 white-space: nowrap }
218
219 p.rubric {
220 font-weight: bold ;
221 font-size: larger ;
222 color: maroon ;
223 text-align: center }
224
225 p.sidebar-title {
226 font-family: sans-serif ;
227 font-weight: bold ;
228 font-size: larger }
229
230 p.sidebar-subtitle {
231 font-family: sans-serif ;
232 font-weight: bold }
233
234 p.topic-title {
235 font-weight: bold }
236
237 pre.address {
238 margin-bottom: 0 ;
239 margin-top: 0 ;
240 font: inherit }
241
242 pre.literal-block, pre.doctest-block {
243 margin-left: 2em ;
244 margin-right: 2em }
245
246 span.classifier {
247 font-family: sans-serif ;
248 font-style: oblique }
249
250 span.classifier-delimiter {
251 font-family: sans-serif ;
252 font-weight: bold }
253
254 span.interpreted {
255 font-family: sans-serif }
256
257 span.option {
258 white-space: nowrap }
259
260 span.pre {
261 white-space: pre }
262
263 span.problematic {
264 color: red }
265
266 span.section-subtitle {
267 /* font-size relative to parent (h1..h6 element) */
268 font-size: 80% }
269
270 table.citation {
271 border-left: solid 1px gray;
272 margin-left: 1px }
273
274 table.docinfo {
275 margin: 2em 4em }
276
277 table.docutils {
278 margin-top: 0.5em ;
279 margin-bottom: 0.5em }
280
281 table.footnote {
282 border-left: solid 1px black;
283 margin-left: 1px }
284
285 table.docutils td, table.docutils th,
286 table.docinfo td, table.docinfo th {
287 padding-left: 0.5em ;
288 padding-right: 0.5em ;
289 vertical-align: top }
290
291 table.docutils th.field-name, table.docinfo th.docinfo-name {
292 font-weight: bold ;
293 text-align: left ;
294 white-space: nowrap ;
295 padding-left: 0 }
296
297 h1 tt.docutils, h2 tt.docutils, h3 tt.docutils,
298 h4 tt.docutils, h5 tt.docutils, h6 tt.docutils {
299 font-size: 100% }
300
301 ul.auto-toc {
302 list-style-type: none }
303
304 </style>
305 </head>
306 <body>
307 <div class="document" id="libbcc-a-versatile-bitcode-execution-engine-for-mobile-devices">
308 <h1 class="title">libbcc: A Versatile Bitcode Execution Engine for Mobile Devices</h1>
309
310 <div class="section" id="introduction">
311 <h1>Introduction</h1>
312 <p>libbcc is an LLVM bitcode execution engine that compiles the bitcode
313 to an in-memory executable. libbcc is versatile because:</p>
314 <ul class="simple">
315 <li>it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
316 compilation.</li>
317 <li>Android devices demand fast start-up time, small size, and high
318 performance <em>at the same time</em>. libbcc attempts to address these
319 design constraints.</li>
320 <li>it supports on-device linking. Each device vendor can supply its
321 own runtime bitcode library (lib*.bc) that differentiates its
322 system. Specialization becomes ecosystem-friendly.</li>
323 </ul>
324 <p>libbcc provides:</p>
325 <ul class="simple">
326 <li>a <em>just-in-time bitcode compiler</em>, which translates the LLVM bitcode
327 into machine code</li>
328 <li>a <em>caching mechanism</em>, which can:<ul>
329 <li>after each compilation, serialize the in-memory executable into a
330 cache file. Note that the compilation is triggered by a cache
331 miss.</li>
332 <li>load from the cache file upon cache-hit.</li>
333 </ul>
334 </li>
335 </ul>
336 <p>Highlights of libbcc are:</p>
337 <ul>
338 <li><p class="first">libbcc supports bitcode from various language frontends, such as
339 RenderScript and GLSL (pixelflinger2).</p>
340 </li>
341 <li><p class="first">libbcc strives to balance library size, launch time, and
342 steady-state performance:</p>
343 <ul>
344 <li><p class="first">The size of libbcc is aggressively reduced for mobile devices. We
345 customize and improve upon the default Execution Engine from
346 upstream. Without these changes, libbcc's execution engine could easily be
347 at least twice as large.</p>
348 </li>
349 <li><p class="first">To reduce launch time, we support caching of
350 binaries. Just-in-Time compilation is oftentimes Just-too-Late
351 if the given apps are performance-sensitive. Thus, we implemented
352 AOT to get the best of both worlds: fast launch time and high
353 steady-state performance.</p>
354 <p>AOT is also important for projects such as NDK on LLVM with
355 portability enhancement. The launch time reduction after we
356 implemented AOT is significant:</p>
357 <pre class="literal-block">
358 Apps           libbcc without AOT       libbcc with AOT
359                launch time in libbcc    launch time in libbcc
360 App_1          1218ms                    9ms
361 App_2           842ms                    4ms
362 Wallpaper:
363   MagicSmoke    182ms                    3ms
364   Halo          127ms                    3ms
365   Balls         149ms                    3ms
366   SceneGraph    146ms                   90ms
367   Model         104ms                    4ms
368   Fountain       57ms                    3ms
369 </pre>
370 <p>AOT also masks the launch-time overhead of on-device linking,
371 which makes on-device linking practical.</p>
372 </li>
373 <li><p class="first">For steady-state performance, we enable VFP3 and aggressive
374 optimizations.</p>
375 </li>
376 </ul>
377 </li>
378 <li><p class="first">Currently we disable Lazy JITting.</p>
379 </li>
380 </ul>
381 </div>
382 <div class="section" id="api">
383 <h1>API</h1>
384 <p><strong>Basic:</strong></p>
385 <ul class="simple">
386 <li><strong>bccCreateScript</strong> - Create a new bcc script</li>
387 <li><strong>bccRegisterSymbolCallback</strong> - Register the callback function for external
388 symbol lookup</li>
389 <li><strong>bccReadBC</strong> - Set the source bitcode for compilation</li>
390 <li><strong>bccReadModule</strong> - Set the llvm::Module for compilation</li>
391 <li><strong>bccLinkBC</strong> - Set the library bitcode for linking</li>
392 <li><strong>bccPrepareExecutable</strong> - <em>deprecated</em> - Use bccPrepareExecutableEx instead</li>
393 <li><strong>bccPrepareExecutableEx</strong> - Create the in-memory executable by either
394 just-in-time compilation or cache loading</li>
395 <li><strong>bccGetFuncAddr</strong> - Get the entry address of the function</li>
396 <li><strong>bccDisposeScript</strong> - Destroy bcc script and release the resources</li>
397 <li><strong>bccGetError</strong> - <em>deprecated</em> - Don't use this</li>
398 </ul>
399 <p><strong>Reflection:</strong></p>
400 <ul class="simple">
401 <li><strong>bccGetExportVarCount</strong> - Get the count of exported variables</li>
402 <li><strong>bccGetExportVarList</strong> - Get the addresses of exported variables</li>
403 <li><strong>bccGetExportFuncCount</strong> - Get the count of exported functions</li>
404 <li><strong>bccGetExportFuncList</strong> - Get the addresses of exported functions</li>
405 <li><strong>bccGetPragmaCount</strong> - Get the count of pragmas</li>
406 <li><strong>bccGetPragmaList</strong> - Get the pragmas</li>
407 </ul>
408 <p><strong>Debug:</strong></p>
409 <ul class="simple">
410 <li><strong>bccGetFuncCount</strong> - Get the count of functions (including non-exported)</li>
411 <li><strong>bccGetFuncInfoList</strong> - Get the function information (name, base, size)</li>
412 </ul>
413 </div>
414 <div class="section" id="cache-file-format">
415 <h1>Cache File Format</h1>
416 <p>A cache file (denoted as *.oBCC) for libbcc consists of several sections:
417 header, string pool, dependencies table, relocation table, exported
418 variable list, exported function list, pragma list, function information
419 table, and bcc context. Every section should be aligned to the word size.
420 Here is a brief description of each section:</p>
421 <ul class="simple">
422 <li><strong>Header</strong> (MCO_Header) - The header of a cache file. It contains the
423 magic word, version, machine integer type information (the endianness,
424 the size of off_t, size_t, and ptr_t), and the size
425 and offset of other sections. The header section is guaranteed
426 to be at the beginning of the cache file.</li>
427 <li><strong>String Pool</strong> (MCO_StringPool) - A collection of serialized
428 variable-length strings. A strp_index elsewhere in the cache file
429 is the index of a string in this string pool.</li>
430 <li><strong>Dependencies Table</strong> (MCO_DependencyTable) - The dependencies table.
431 This table stores the resource name (or file path), the resource
432 type (whether in an APK or on the file system), and the SHA1 checksum.</li>
433 <li><strong>Relocation Table</strong> (MCO_RelocationTable) - <em>not enabled</em></li>
434 <li><strong>Exported Variable List</strong> (MCO_ExportVarList) -
435 The list of the addresses of exported variables.</li>
436 <li><strong>Exported Function List</strong> (MCO_ExportFuncList) -
437 The list of the addresses of exported functions.</li>
438 <li><strong>Pragma List</strong> (MCO_PragmaList) - The list of pragma key-value pairs.</li>
439 <li><strong>Function Information Table</strong> (MCO_FuncTable) - This is a table of
440 function information, such as function name, function entry address,
441 and function binary size. The table should be ordered by
442 function name.</li>
443 <li><strong>Context</strong> - The context of the in-memory executable, including
444 the code and the data. The offset of the context should be aligned to
445 the page size, so that we can mmap the context directly into memory.</li>
446 </ul>
447 <p>For further information, see <a class="reference external" href="include/bcc/bcc_cache.h">bcc_cache.h</a>,
448 <a class="reference external" href="lib/bcc/CacheReader.cpp">CacheReader.cpp</a>, and
449 <a class="reference external" href="lib/bcc/CacheWriter.cpp">CacheWriter.cpp</a>.</p>
450 </div>
451 <div class="section" id="jit-ed-code-calling-conventions">
452 <h1>JIT'ed Code Calling Conventions</h1>
453 <ol class="arabic">
454 <li><p class="first">Calls from Execution Environment or from/to within script:</p>
455 <p>On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
456 Any remaining arguments are passed on the stack.</p>
457 <p>For ext_vec_types such as float2, a set of registers will be used. In the case
458 of float2, a register pair will be used. Specifically, if float2 is the first
459 argument in the function prototype, float2.x will go into r0, and float2.y
460 into r1.</p>
461 <p>Note: the stack will be aligned to the coarsest-grained argument. In the case of
462 float2 above as an argument, the parameter stack will be aligned to an 8-byte
463 boundary (if the sizes of other arguments are no greater than 8 bytes).</p>
464 </li>
465 <li><p class="first">Calls from/to a separate compilation unit: (E.g., calls to Execution
466 Environment if those runtime library callees are not compiled using LLVM.)</p>
467 <p>On ARM, we use hardfp. Note that double will be placed in a register pair.</p>
468 </li>
469 </ol>
470 </div>
471 </div>
472 </body>
473 </html>
474
README.rst
1 ===============================================================
2 libbcc: A Versatile Bitcode Execution Engine for Mobile Devices
3 ===============================================================
4
5
6 Introduction
7 ------------
8
9 libbcc is an LLVM bitcode execution engine that compiles the bitcode
10 to an in-memory executable. libbcc is versatile because:
11
12 * it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
13 compilation.
14
15 * Android devices demand fast start-up time, small size, and high
16 performance *at the same time*. libbcc attempts to address these
17 design constraints.
18
19 * it supports on-device linking. Each device vendor can supply its
20 own runtime bitcode library (lib*.bc) that differentiates its
21 system. Specialization becomes ecosystem-friendly.
22
23 libbcc provides:
24
25 * a *just-in-time bitcode compiler*, which translates the LLVM bitcode
26 into machine code
27
28 * a *caching mechanism*, which can:
29
30 * after each compilation, serialize the in-memory executable into a
31 cache file. Note that the compilation is triggered by a cache
32 miss.
33 * load from the cache file upon cache-hit.
34
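The cache-miss and cache-hit paths described above can be sketched as follows. This is an illustrative model only, not libbcc code: ``cache_slot``, ``fake_compile``, and ``prepare_executable`` are hypothetical names, and an in-memory slot stands in for the cache file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a cache file: holds one serialized executable. */
struct cache_slot {
    int valid;        /* nonzero once an image has been stored */
    char image[64];   /* the "serialized executable" */
};

static int compile_count = 0;   /* how many times the compiler ran */

/* Hypothetical compiler: turns "bitcode" into an "executable" image. */
static void fake_compile(const char *bitcode, char *out, size_t out_size) {
    compile_count++;
    snprintf(out, out_size, "exe(%s)", bitcode);
}

/* Returns 1 on a cache hit and 0 on a cache miss. */
static int prepare_executable(struct cache_slot *cache, const char *bitcode,
                              char *exe, size_t exe_size) {
    assert(cache != NULL && exe != NULL && exe_size > 0);
    if (cache->valid) {
        /* Cache hit: load the stored image without compiling. */
        snprintf(exe, exe_size, "%s", cache->image);
        return 1;
    }
    /* Cache miss: compile, then serialize the result into the cache. */
    fake_compile(bitcode, exe, exe_size);
    assert(strlen(exe) < sizeof cache->image);
    snprintf(cache->image, sizeof cache->image, "%s", exe);
    cache->valid = 1;
    return 0;
}
```

Calling ``prepare_executable`` twice with the same slot compiles once: the first call misses and stores the image, and the second call loads it.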
35 Highlights of libbcc are:
36
37 * libbcc supports bitcode from various language frontends, such as
38 RenderScript and GLSL (pixelflinger2).
39
40 * libbcc strives to balance library size, launch time, and
41 steady-state performance:
42
43 * The size of libbcc is aggressively reduced for mobile devices. We
44 customize and improve upon the default Execution Engine from
45 upstream. Without these changes, libbcc's execution engine could easily be
46 at least twice as large.
47
48 * To reduce launch time, we support caching of
49 binaries. Just-in-Time compilation is oftentimes Just-too-Late
50 if the given apps are performance-sensitive. Thus, we implemented
51 AOT to get the best of both worlds: fast launch time and high
52 steady-state performance.
53
54 AOT is also important for projects such as NDK on LLVM with
55 portability enhancement. The launch time reduction after we
56 implemented AOT is significant::
57
58
59 Apps           libbcc without AOT       libbcc with AOT
60                launch time in libbcc    launch time in libbcc
61 App_1          1218ms                    9ms
62 App_2           842ms                    4ms
63 Wallpaper:
64   MagicSmoke    182ms                    3ms
65   Halo          127ms                    3ms
66   Balls         149ms                    3ms
67   SceneGraph    146ms                   90ms
68   Model         104ms                    4ms
69   Fountain       57ms                    3ms
70
71 AOT also masks the launch-time overhead of on-device linking,
72 which makes on-device linking practical.
73
74 * For steady-state performance, we enable VFP3 and aggressive
75 optimizations.
76
77 * Currently we disable Lazy JITting.
78
79
80
81 API
82 ---
83
84 **Basic:**
85
86 * **bccCreateScript** - Create a new bcc script
87
88 * **bccRegisterSymbolCallback** - Register the callback function for external
89 symbol lookup
90
91 * **bccReadBC** - Set the source bitcode for compilation
92
93 * **bccReadModule** - Set the llvm::Module for compilation
94
95 * **bccLinkBC** - Set the library bitcode for linking
96
97 * **bccPrepareExecutable** - *deprecated* - Use bccPrepareExecutableEx instead
98
99 * **bccPrepareExecutableEx** - Create the in-memory executable by either
100 just-in-time compilation or cache loading
101
102 * **bccGetFuncAddr** - Get the entry address of the function
103
104 * **bccDisposeScript** - Destroy bcc script and release the resources
105
106 * **bccGetError** - *deprecated* - Don't use this
107
108
109 **Reflection:**
110
111 * **bccGetExportVarCount** - Get the count of exported variables
112
113 * **bccGetExportVarList** - Get the addresses of exported variables
114
115 * **bccGetExportFuncCount** - Get the count of exported functions
116
117 * **bccGetExportFuncList** - Get the addresses of exported functions
118
119 * **bccGetPragmaCount** - Get the count of pragmas
120
121 * **bccGetPragmaList** - Get the pragmas
122
123
124 **Debug:**
125
126 * **bccGetFuncCount** - Get the count of functions (including non-exported)
127
128 * **bccGetFuncInfoList** - Get the function information (name, base, size)
129
130
131
132 Cache File Format
133 -----------------
134
135 A cache file (denoted as \*.oBCC) for libbcc consists of several sections:
136 header, string pool, dependencies table, relocation table, exported
137 variable list, exported function list, pragma list, function information
138 table, and bcc context. Every section should be aligned to the word size.
139 Here is a brief description of each section:
140
141 * **Header** (MCO_Header) - The header of a cache file. It contains the
142 magic word, version, machine integer type information (the endianness,
143 the size of off_t, size_t, and ptr_t), and the size
144 and offset of other sections. The header section is guaranteed
145 to be at the beginning of the cache file.
146
147 * **String Pool** (MCO_StringPool) - A collection of serialized
148 variable-length strings. A strp_index elsewhere in the cache file
149 is the index of a string in this string pool.
150
151 * **Dependencies Table** (MCO_DependencyTable) - The dependencies table.
152 This table stores the resource name (or file path), the resource
153 type (whether in an APK or on the file system), and the SHA1 checksum.
154
155 * **Relocation Table** (MCO_RelocationTable) - *not enabled*
156
157 * **Exported Variable List** (MCO_ExportVarList) -
158 The list of the addresses of exported variables.
159
160 * **Exported Function List** (MCO_ExportFuncList) -
161 The list of the addresses of exported functions.
162
163 * **Pragma List** (MCO_PragmaList) - The list of pragma key-value pairs.
164
165 * **Function Information Table** (MCO_FuncTable) - This is a table of
166 function information, such as function name, function entry address,
167 and function binary size. The table should be ordered by
168 function name.
169
170 * **Context** - The context of the in-memory executable, including
171 the code and the data. The offset of the context should be aligned to
172 the page size, so that we can mmap the context directly into memory.
173
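The alignment requirements above (word-size alignment for sections, page-size alignment for the context) come down to rounding an offset up to a multiple of a power of two. A minimal sketch, assuming power-of-two alignments; ``align_up`` and ``context_offset`` are hypothetical helper names, not functions from the libbcc sources:

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment.
   alignment must be a nonzero power of two. */
static size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical layout helper: given the offset where the last
   word-aligned section ends, return where the context would start
   so that it can be mapped at a page boundary. */
static size_t context_offset(size_t end_of_sections, size_t page_size) {
    return align_up(end_of_sections, page_size);
}
```

For example, with an assumed page size of 4096 bytes, sections ending at offset 4097 would place the context at offset 8192. Real code would query the page size at run time rather than hard-coding it.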
174 For further information, see `bcc_cache.h <include/bcc/bcc_cache.h>`_,
175 `CacheReader.cpp <lib/bcc/CacheReader.cpp>`_, and
176 `CacheWriter.cpp <lib/bcc/CacheWriter.cpp>`_.
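
One plausible reason for keeping the function information table ordered by function name is that a lookup by name can then use binary search. The sketch below illustrates that idea with the standard ``bsearch``; ``struct func_info`` and ``find_func`` are hypothetical, and the real entry layout is whatever bcc_cache.h defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry (name, entry address, binary size). */
struct func_info {
    const char *name;
    void *addr;
    size_t size;
};

/* bsearch comparator: the key is a function name, the element an entry. */
static int compare_name(const void *key, const void *elem) {
    const struct func_info *info = elem;
    return strcmp((const char *)key, info->name);
}

/* Returns the entry for name, or NULL if it is absent.
   The table must already be sorted by name (strcmp order). */
static const struct func_info *find_func(const struct func_info *table,
                                         size_t count, const char *name) {
    if (count == 0)
        return NULL;
    assert(table != NULL && name != NULL);
    return bsearch(name, table, count, sizeof table[0], compare_name);
}
```

The lookup takes O(log n) comparisons, but it only works if the writer keeps the table sorted in the same order the comparator uses.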
177
178
179
180 JIT'ed Code Calling Conventions
181 -------------------------------
182
183 1. Calls from Execution Environment or from/to within script:
184
185 On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
186 Any remaining arguments are passed on the stack.
187
188 For ext_vec_types such as float2, a set of registers will be used. In the case
189 of float2, a register pair will be used. Specifically, if float2 is the first
190 argument in the function prototype, float2.x will go into r0, and float2.y
191 into r1.
192
193 Note: the stack will be aligned to the coarsest-grained argument. In the case of
194 float2 above as an argument, the parameter stack will be aligned to an 8-byte
195 boundary (if the sizes of other arguments are no greater than 8 bytes).
196
197 2. Calls from/to a separate compilation unit: (E.g., calls to Execution
198 Environment if those runtime library callees are not compiled using LLVM.)
199
200 On ARM, we use hardfp. Note that double will be placed in a register pair.
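
The register assignment in case 1 can be modeled with a small sketch. It is a deliberate simplification and not the actual code generator: it assumes each of r0 to r3 holds one 32-bit word, treats a float2 as two consecutive words, and ignores the register-pair and stack alignment rules mentioned above. ``word_slot`` and ``assign_words`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ARG_REGS 4   /* r0..r3 */

/* Where one 32-bit word of an argument ends up. */
struct word_slot {
    int in_register;   /* 1: register r<index>; 0: stack word <index> */
    int index;
};

/* Assign consecutive argument words to r0..r3, then to the stack.
   words_per_arg[i] is the size of argument i in 32-bit words
   (for example, 1 for an int and 2 for a float2).
   Simplified model: no register-pair or stack alignment is applied.
   Returns the number of slots written to out. */
static size_t assign_words(const int *words_per_arg, size_t num_args,
                           struct word_slot *out, size_t out_cap) {
    size_t n = 0;
    int next_reg = 0;
    int next_stack = 0;
    for (size_t i = 0; i < num_args; i++) {
        assert(words_per_arg[i] > 0);
        for (int w = 0; w < words_per_arg[i]; w++) {
            assert(n < out_cap);
            if (next_reg < NUM_ARG_REGS) {
                out[n].in_register = 1;
                out[n].index = next_reg++;
            } else {
                out[n].in_register = 0;
                out[n].index = next_stack++;
            }
            n++;
        }
    }
    return n;
}
```

With a float2 as the first argument followed by three ints, the model puts float2.x in r0 and float2.y in r1, the next two ints in r2 and r3, and the last int in the first stack word.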
201