      1 page.title=Designing for Performance
      2 @jd:body
      3 
      4 <p>An Android application will run on a mobile device with limited computing
      5 power and storage, and constrained battery life. Because of
      6 this, it should be <em>efficient</em>. Battery life is one reason you might
      7 want to optimize your app even if it already seems to run "fast enough".
      8 Battery life is important to users, and Android's battery usage breakdown
means users will know if your app is responsible for draining their battery.</p>
     10 
     11 <p>This document covers these topics: </p>
     12 <ul>
     13     <li><a href="#intro">Introduction</a></li>
     14     <li><a href="#optimize_judiciously">Optimize Judiciously</a></li>
     15     <li><a href="#object_creation">Avoid Creating Objects</a></li>
     16     <li><a href="#myths">Performance Myths</a></li>
     17     <li><a href="#prefer_static">Prefer Static Over Virtual</a></li>
     18     <li><a href="#internal_get_set">Avoid Internal Getters/Setters</a></li>
     19     <li><a href="#use_final">Use Static Final For Constants</a></li>
     20     <li><a href="#foreach">Use Enhanced For Loop Syntax</a></li>
     21     <li><a href="#avoid_enums">Avoid Enums Where You Only Need Ints</a></li>
     22     <li><a href="#package_inner">Use Package Scope with Inner Classes</a></li>
     23     <li><a href="#avoidfloat">Use Floating-Point Judiciously</a> </li>
     24     <li><a href="#library">Know And Use The Libraries</a></li>
     25     <li><a href="#native_methods">Use Native Methods Judiciously</a></li>
     26     <li><a href="#closing_notes">Closing Notes</a></li>
     27 </ul>
     28 
     29 <p>Note that although this document primarily covers micro-optimizations,
     30 these will almost never make or break your software. Choosing the right
     31 algorithms and data structures should always be your priority, but is
     32 outside the scope of this document.</p>
     33 
     34 <a name="intro" id="intro"></a>
     35 <h2>Introduction</h2>
     36 
     37 <p>There are two basic rules for writing efficient code:</p>
     38 <ul>
     39     <li>Don't do work that you don't need to do.</li>
     40     <li>Don't allocate memory if you can avoid it.</li>
     41 </ul>
     42 
     43 <h2 id="optimize_judiciously">Optimize Judiciously</h2>
     44 
     45 <p>As you get started thinking about how to design your application, and as
     46 you write it, consider
     47 the cautionary points about optimization that Josh Bloch makes in his book
     48 <em>Effective Java</em>. Here's "Item 47: Optimize Judiciously", excerpted from
     49 the latest edition of the book with permission. Although Josh didn't have
     50 Android application development in mind when writing this section &mdash; for
     51 example, the <code style="color:black">java.awt.Component</code> class
     52 referenced is not available in Android, and Android uses the
     53 Dalvik VM, rather than a standard JVM &mdash; his points are still valid. </p>
     54 
     55 <blockquote>
     56 
     57 <p>There are three aphorisms concerning optimization that everyone should know.
     58 They are perhaps beginning to suffer from overexposure, but in case you aren't
     59 yet familiar with them, here they are:</p>
     60 
     61 <div style="padding-left:3em;padding-right:4em;">
     62 
     63 <p style="margin-bottom:.5em;">More computing sins are committed in the name of
     64 efficiency (without necessarily achieving it) than for any other single
     65 reason&mdash;including blind stupidity.</p>
     66 <p>&mdash;William A. Wulf <span style="font-size:80%;"><sup>1</sup></span></p>
     67 
     68 <p style="margin-bottom:.5em;">We should forget about small efficiencies, say
     69 about 97% of the time: premature optimization is the root of all evil. </p>
     70 <p>&mdash;Donald E. Knuth <span style="font-size:80%;"><sup>2</sup></span></p>
     71 
     72 
     73 <p style="margin-bottom:.5em;">We follow two rules in the matter of optimization:</p>
     74 <ul style="margin-bottom:0">
     75 <li>Rule 1. Don't do it.</li>
     76 <li>Rule 2 (for experts only). Don't do it yet &mdash; that is, not until you have a
     77 perfectly clear and unoptimized solution. </li>
     78 </ul>
     79 <p>&mdash;M. A. Jackson <span style="font-size:80%;"><sup>3</sup></span></p>
     80 </div>
     81 
     82 <p>All of these aphorisms predate the Java programming language by two decades.
     83 They tell a deep truth about optimization: it is easy to do more harm than good,
     84 especially if you optimize prematurely. In the process, you may produce software
     85 that is neither fast nor correct and cannot easily be fixed.</p>
     86 
     87 <p>Don't sacrifice sound architectural principles for performance.
     88 <strong>Strive to write good programs rather than fast ones.</strong> If a good
     89 program is not fast enough, its architecture will allow it to be optimized. Good
     90 programs embody the principle of <em>information hiding</em>: where possible,
     91 they localize design decisions within individual modules, so individual
     92 decisions can be changed without affecting the remainder of the system (Item
     93 13).</p>
     94 
     95 <p>This does <em>not</em> mean that you can ignore performance concerns until
     96 your program is complete. Implementation problems can be fixed by later
     97 optimization, but pervasive architectural flaws that limit performance can be
     98 impossible to fix without rewriting the system. Changing a fundamental facet of
     99 your design after the fact can result in an ill-structured system that is
    100 difficult to maintain and evolve. Therefore you must think about performance
    101 during the design process.</p>
    102 
    103 <p><strong>Strive to avoid design decisions that limit performance.</strong> The
    104 components of a design that are most difficult to change after the fact are
    105 those specifying interactions between modules and with the outside world. Chief
    106 among these design components are APIs, wire-level protocols, and persistent
    107 data formats. Not only are these design components difficult or impossible to
    108 change after the fact, but all of them can place significant limitations on the
    109 performance that a system can ever achieve.</p>
    110 
    111 <p><strong>Consider the performance consequences of your API design
    112 decisions.</strong> Making a public type mutable may require a lot of needless
    113 defensive copying (Item 39). Similarly, using inheritance in a public class
    114 where composition would have been appropriate ties the class forever to its
    115 superclass, which can place artificial limits on the performance of the subclass
    116 (Item 16). As a final example, using an implementation type rather than an
    117 interface in an API ties you to a specific implementation, even though faster
    118 implementations may be written in the future (Item 52).</p>
    119 
    120 <p>The effects of API design on performance are very real. Consider the <code
    121 style="color:black">getSize</code> method in the <code
    122 style="color:black">java.awt.Component</code> class. The decision that this
    123 performance-critical method was to return a <code
    124 style="color:black">Dimension</code> instance, coupled with the decision that
    125 <code style="color:black">Dimension</code> instances are mutable, forces any
    126 implementation of this method to allocate a new <code
    127 style="color:black">Dimension</code> instance on every invocation. Even though
    128 allocating small objects is inexpensive on a modern VM, allocating millions of
    129 objects needlessly can do real harm to performance.</p>
    130 
    131 <p>In this case, several alternatives existed. Ideally, <code
    132 style="color:black">Dimension</code> should have been immutable (Item 15);
    133 alternatively, the <code style="color:black">getSize</code> method could have
    134 been replaced by two methods returning the individual primitive components of a
    135 <code style="color:black">Dimension</code> object. In fact, two such methods
    136 were added to the Component API in the 1.2 release for performance reasons.
    137 Preexisting client code, however, still uses the <code
    138 style="color:black">getSize</code> method and still suffers the performance
    139 consequences of the original API design decisions.</p>
    140 
    141 <p>Luckily, it is generally the case that good API design is consistent with
    142 good performance. <strong>It is a very bad idea to warp an API to achieve good
    143 performance.</strong> The performance issue that caused you to warp the API may
    144 go away in a future release of the platform or other underlying software, but
    145 the warped API and the support headaches that come with it will be with you for
    146 life.</p>
    147 
    148 <p>Once you've carefully designed your program and produced a clear, concise,
    149 and well-structured implementation, <em>then</em> it may be time to consider
    150 optimization, assuming you're not already satisfied with the performance of the
    151 program.</p>
    152 
    153 <p>Recall that Jackson's two rules of optimization were "Don't do it," and "(for
    154 experts only). Don't do it yet." He could have added one more: <strong>measure
    155 performance before and after each attempted optimization.</strong> You may be
    156 surprised by what you find. Often, attempted optimizations have no measurable
    157 effect on performance; sometimes, they make it worse. The main reason is that
    158 it's difficult to guess where your program is spending its time. The part of the
    159 program that you think is slow may not be at fault, in which case you'd be
    160 wasting your time trying to optimize it. Common wisdom says that programs spend
    161 80 percent of their time in 20 percent of their code.</p>
    162 
    163 <p>Profiling tools can help you decide where to focus your optimization efforts.
    164 Such tools give you runtime information, such as roughly how much time each
    165 method is consuming and how many times it is invoked. In addition to focusing
    166 your tuning efforts, this can alert you to the need for algorithmic changes. If
    167 a quadratic (or worse) algorithm lurks inside your program, no amount of tuning
    168 will fix the problem. You must replace the algorithm with one that is more
    169 efficient. The more code in the system, the more important it is to use a
    170 profiler. It's like looking for a needle in a haystack: the bigger the haystack,
    171 the more useful it is to have a metal detector. The JDK comes with a simple
    172 profiler and modern IDEs provide more sophisticated profiling tools.</p>
    173 
    174 <p>The need to measure the effects of attempted optimization is even greater on
    175 the Java platform than on more traditional platforms, because the Java
    176 programming language does not have a strong <em>performance model</em>. The
    177 relative costs of the various primitive operations are not well defined. The
    178 "semantic gap" between what the programmer writes and what the CPU executes is
    179 far greater than in traditional statically compiled languages, which makes it
    180 very difficult to reliably predict the performance consequences of any
    181 optimization. There are plenty of performance myths floating around that turn
    182 out to be half-truths or outright lies.</p>
    183 
    184 <p>Not only is Java's performance model ill-defined, but it varies from JVM
    185 implementation to JVM implementation, from release to release, and from
    186 processor to processor. If you will be running your program on multiple JVM
    187 implementations or multiple hardware platforms, it is important that you measure
    188 the effects of your optimization on each. Occasionally you may be forced to make
    189 trade-offs between performance on different JVM implementations or hardware
    190 platforms.</p>
    191 
    192 <p>To summarize, do not strive to write fast programs &mdash; strive to write
    193 good ones; speed will follow. Do think about performance issues while you're
    194 designing systems and especially while you're designing APIs, wire-level
    195 protocols, and persistent data formats. When you've finished building the
    196 system, measure its performance. If it's fast enough, you're done. If not,
    197 locate the source of the problems with the aid of a profiler, and go to work
    198 optimizing the relevant parts of the system. The first step is to examine your
    199 choice of algorithms: no amount of low-level optimization can make up for a poor
    200 choice of algorithm. Repeat this process as necessary, measuring the performance
    201 after every change, until you're satisfied.</p>
    202 
<p>&mdash;Excerpted from Josh Bloch's <em>Effective Java</em>, Second Ed.
(Addison-Wesley, 2008).</p>
    205 
<p style="font-size:80%;margin-bottom:0;"><sup>1</sup> Wulf, W. A Case Against
the GOTO. <em>Proceedings of the 25th ACM National
Conference</em> 2 (1972): 791&ndash;797.</p>
<p style="font-size:80%;margin-bottom:0;"><sup>2</sup> Knuth, Donald. Structured
Programming with go to Statements. <em>Computing
Surveys 6</em> (1974): 261&ndash;301.</p>
    212 <p style="font-size:80%"><sup>3</sup> Jackson, M. A. <em>Principles of Program
    213 Design</em>, Academic Press, London, 1975.
    214 ISBN: 0123790506.</p>
    215 
    216 </blockquote>
    217 
    218 <p>One of the trickiest problems you'll face when micro-optimizing Android
    219 apps is that the "if you will be running your program on ... multiple hardware
    220 platforms" clause above is always true. And it's not even generally the case
    221 that you can say "device X is a factor F faster/slower than device Y".
    222 This is especially true if one of the devices is the emulator, or one of the
    223 devices has a JIT. If you want to know how your app performs on a given device,
    224 you need to test it on that device. Drawing conclusions from the emulator is
    225 particularly dangerous, as is attempting to compare JIT versus non-JIT
    226 performance: the performance <em>profiles</em> can differ wildly.</p>
    227 
    228 <a name="object_creation"></a>
    229 <h2>Avoid Creating Objects</h2>
    230 
    231 <p>Object creation is never free. A generational GC with per-thread allocation
    232 pools for temporary objects can make allocation cheaper, but allocating memory
    233 is always more expensive than not allocating memory.</p>
    234 
    235 <p>If you allocate objects in a user interface loop, you will force a periodic
    236 garbage collection, creating little "hiccups" in the user experience.</p>
    237 
    238 <p>Thus, you should avoid creating object instances you don't need to.  Some
    239 examples of things that can help:</p>
    240 
    241 <ul>
    242     <li>When extracting strings from a set of input data, try 
    243     to return a substring of the original data, instead of creating a copy.
    244     You will create a new String object, but it will share the char[]
    245     with the data.</li>
    246     <li>If you have a method returning a string, and you know that its result
    247     will always be appended to a StringBuffer anyway, change your signature
    248     and implementation so that the function does the append directly,
    249     instead of creating a short-lived temporary object.</li>
    250 </ul>
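
<p>The second suggestion above can be sketched as follows. This is a minimal
illustration rather than platform code: the class <code>LabelFormatter</code>
and both method names are hypothetical.</p>

```java
// Hypothetical helper class; the names are illustrative, not from any Android API.
class LabelFormatter {
    // Returns a new String that the caller then appends: one temporary object per call.
    static String describe(int x, int y) {
        return "(" + x + "," + y + ")";
    }

    // Appends directly into the caller's buffer, so no intermediate String is created.
    static void appendDescription(StringBuffer out, int x, int y) {
        out.append('(').append(x).append(',').append(y).append(')');
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("points: ");
        appendDescription(sb, 1, 2);
        appendDescription(sb, 3, 4);
        System.out.println(sb);
    }
}
```

<p>Both methods produce the same text; the difference is only in how many
short-lived objects are created along the way.</p>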
    251 
<p>A somewhat more radical idea is to slice up multidimensional arrays into
parallel one-dimensional arrays:</p>
    254 
    255 <ul>
    <li>An array of ints is much better than an array of Integers,
    257     but this also generalizes to the fact that two parallel arrays of ints
    258     are also a <strong>lot</strong> more efficient than an array of (int,int)
    259     objects.  The same goes for any combination of primitive types.</li>
    260     <li>If you need to implement a container that stores tuples of (Foo,Bar)
    261     objects, try to remember that two parallel Foo[] and Bar[] arrays are
    262     generally much better than a single array of custom (Foo,Bar) objects.
    (The exception to this, of course, is when you're designing an API for
    other code to access; in those cases, it's usually better to accept a
    small hit in speed in exchange for good API design. But in your own
    internal code, you should try to be as efficient as possible.)</li>
    267 </ul>
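
<p>As a rough sketch of the parallel-array idea, the container below stores
(int,int) pairs in two <code>int[]</code> arrays instead of allocating one
object per pair. The class <code>PointList</code> and its methods are
hypothetical names chosen for this example.</p>

```java
import java.util.Arrays;

// Hypothetical container of (x, y) pairs; all names are illustrative.
class PointList {
    // Two parallel int[] arrays instead of one array of (int,int) objects.
    private int[] xs = new int[4];
    private int[] ys = new int[4];
    private int size;

    void add(int x, int y) {
        if (size == xs.length) {
            // Grow both arrays together so index i always refers to the same pair.
            xs = Arrays.copyOf(xs, size * 2);
            ys = Arrays.copyOf(ys, size * 2);
        }
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }

    // Callers are expected to pass 0 <= i < size(); this sketch does not check.
    int getX(int i) { return xs[i]; }
    int getY(int i) { return ys[i]; }

    public static void main(String[] args) {
        PointList points = new PointList();
        for (int i = 0; i < 10; i++) {
            points.add(i, i * i);
        }
        System.out.println(points.getX(9) + "," + points.getY(9));
    }
}
```

<p>Adding a pair allocates nothing except when the arrays grow, whereas an
array of pair objects would allocate once per element.</p>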
    268 
    269 <p>Generally speaking, avoid creating short-term temporary objects if you
    270 can.  Fewer objects created mean less-frequent garbage collection, which has
    271 a direct impact on user experience.</p>
    272 
    273 <a name="myths" id="myths"></a>
    274 <h2>Performance Myths</h2>
    275 
    276 <p>Previous versions of this document made various misleading claims. We
    277 address some of them here.</p>
    278 
    279 <p>On devices without a JIT, it is true that invoking methods via a
    280 variable with an exact type rather than an interface is slightly more
    281 efficient. (So, for example, it was cheaper to invoke methods on a
    282 <code>HashMap map</code> than a <code>Map map</code>, even though in both
    283 cases the map was a <code>HashMap</code>.) It was not the case that this
    284 was 2x slower; the actual difference was more like 6% slower. Furthermore,
    285 the JIT makes the two effectively indistinguishable.</p>
    286 
<p>On devices without a JIT, caching field accesses is about 20% faster than
repeatedly accessing the field. With a JIT, field access costs about the same
as local access, so this isn't a worthwhile optimization unless you feel it
makes your code easier to read. (This is true of final, static, and static
final fields too.)</p>
    292 
    293 <a name="prefer_static" id="prefer_static"></a>
    294 <h2>Prefer Static Over Virtual</h2>
    295 
    296 <p>If you don't need to access an object's fields, make your method static.
    297 Invocations will be about 15%-20% faster.
    298 It's also good practice, because you can tell from the method
    299 signature that calling the method can't alter the object's state.</p>
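
<p>A small sketch of the distinction, using a hypothetical <code>Circle</code>
class: the helper that works only on its arguments is declared static, while
the method that reads a field remains an instance method.</p>

```java
// Hypothetical class; the names are illustrative.
class Circle {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    // Reads the instance's field, so it has to be an instance method.
    double area() {
        return areaOf(radius);
    }

    // Uses only its argument and touches no fields, so it can be static.
    // The signature also tells the reader it cannot change any Circle's state.
    static double areaOf(double r) {
        return Math.PI * r * r;
    }

    public static void main(String[] args) {
        System.out.println(new Circle(2.0).area());
        System.out.println(Circle.areaOf(2.0));
    }
}
```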
    300 
    301 <a name="internal_get_set" id="internal_get_set"></a>
    302 <h2>Avoid Internal Getters/Setters</h2>
    303 
    304 <p>In native languages like C++ it's common practice to use getters (e.g.
    305 <code>i = getCount()</code>) instead of accessing the field directly (<code>i
    306 = mCount</code>). This is an excellent habit for C++, because the compiler can
    307 usually inline the access, and if you need to restrict or debug field access
    308 you can add the code at any time.</p>
    309 
    310 <p>On Android, this is a bad idea.  Virtual method calls are expensive,
    311 much more so than instance field lookups.  It's reasonable to follow
    312 common object-oriented programming practices and have getters and setters
    313 in the public interface, but within a class you should always access
    314 fields directly.</p>
    315 
    316 <p>Without a JIT, direct field access is about 3x faster than invoking a
    317 trivial getter. With the JIT (where direct field access is as cheap as
    318 accessing a local), direct field access is about 7x faster than invoking a
    319 trivial getter. This is true in Froyo, but will improve in the future when
    320 the JIT inlines getter methods.</p>
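
<p>In practice the advice might look like the sketch below, where a
hypothetical <code>Counter</code> class keeps a getter for outside callers but
reads and writes <code>mCount</code> directly in its own methods.</p>

```java
// Hypothetical class; the names are illustrative.
class Counter {
    private int mCount;

    // Public accessor kept for code outside the class.
    public int getCount() {
        return mCount;
    }

    // Inside the class, touch the field directly instead of calling getCount().
    public void increment() {
        mCount++;
    }

    public boolean isEven() {
        return (mCount & 1) == 0;   // direct field read, no method call
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        c.increment();
        System.out.println(c.getCount() + " even=" + c.isEven());
    }
}
```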
    321 
    322 <a name="use_final" id="use_final"></a>
    323 <h2>Use Static Final For Constants</h2>
    324 
    325 <p>Consider the following declaration at the top of a class:</p>
    326 
    327 <pre>static int intVal = 42;
    328 static String strVal = "Hello, world!";</pre>
    329 
    330 <p>The compiler generates a class initializer method, called
    331 <code>&lt;clinit&gt;</code>, that is executed when the class is first used.
    332 The method stores the value 42 into <code>intVal</code>, and extracts a
    333 reference from the classfile string constant table for <code>strVal</code>.
    334 When these values are referenced later on, they are accessed with field
    335 lookups.</p>
    336 
    337 <p>We can improve matters with the "final" keyword:</p>
    338 
    339 <pre>static final int intVal = 42;
    340 static final String strVal = "Hello, world!";</pre>
    341 
    342 <p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
    343 because the constants go into static field initializers in the dex file.
    344 Code that refers to <code>intVal</code> will use
    345 the integer value 42 directly, and accesses to <code>strVal</code> will
    346 use a relatively inexpensive "string constant" instruction instead of a
    347 field lookup. (Note that this optimization only applies to primitive types and
    348 <code>String</code> constants, not arbitrary reference types. Still, it's good
    349 practice to declare constants <code>static final</code> whenever possible.)</p>
    350 
    351 <a name="foreach" id="foreach"></a>
    352 <h2>Use Enhanced For Loop Syntax</h2>
    353 
<p>The enhanced for loop (also known as the "for-each" loop) can be used
    355 for collections that implement the Iterable interface and for arrays.
    356 With collections, an iterator is allocated to make interface calls
    357 to hasNext() and next(). With an ArrayList, a hand-written counted loop is
    358 about 3x faster (with or without JIT), but for other collections the enhanced
    359 for loop syntax will be exactly equivalent to explicit iterator usage.</p>
    360 
    361 <p>There are several alternatives for iterating through an array:</p>
    362 
    363 <pre>    static class Foo {
    364         int mSplat;
    365     }
    366     Foo[] mArray = ...
    367 
    368     public void zero() {
    369         int sum = 0;
    370         for (int i = 0; i &lt; mArray.length; ++i) {
    371             sum += mArray[i].mSplat;
    372         }
    373     }
    374 
    375     public void one() {
    376         int sum = 0;
    377         Foo[] localArray = mArray;
    378         int len = localArray.length;
    379 
    380         for (int i = 0; i &lt; len; ++i) {
    381             sum += localArray[i].mSplat;
    382         }
    383     }
    384 
    385     public void two() {
    386         int sum = 0;
    387         for (Foo a : mArray) {
    388             sum += a.mSplat;
    389         }
    390     }
    391 </pre>
    392 
    393 <p><strong>zero()</strong> is slowest, because the JIT can't yet optimize away
    394 the cost of getting the array length once for every iteration through the
    395 loop.</p>
    396 
<p><strong>one()</strong> is faster. It pulls everything out into local
variables, avoiding the lookups, although only caching the array length
offers a performance benefit.</p>
    400 
    401 <p><strong>two()</strong> is fastest for devices without a JIT, and
    402 indistinguishable from <strong>one()</strong> for devices with a JIT.
    403 It uses the enhanced for loop syntax introduced in version 1.5 of the Java
    404 programming language.</p>
    405 
    406 <p>To summarize: use the enhanced for loop by default, but consider a
    407 hand-written counted loop for performance-critical ArrayList iteration.</p>
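
<p>For the ArrayList case mentioned in the summary, the two styles might look
like this. It is a sketch only: <code>ListSums</code> and its method names are
hypothetical, and any speed difference should be measured on the target device
rather than assumed.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative.
class ListSums {
    // Enhanced for loop: allocates an Iterator and calls hasNext()/next().
    static int sumForEach(List<Integer> values) {
        int sum = 0;
        for (Integer v : values) {
            sum += v;
        }
        return sum;
    }

    // Hand-written counted loop: no iterator, size() is read once up front.
    static int sumCounted(ArrayList<Integer> values) {
        int sum = 0;
        final int n = values.size();
        for (int i = 0; i < n; ++i) {
            sum += values.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<Integer>();
        values.add(1);
        values.add(2);
        values.add(3);
        System.out.println(sumForEach(values) + " " + sumCounted(values));
    }
}
```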
    408 
    409 <p>(See also <em>Effective Java</em> item 46.)</p>
    410 
    411 <a name="avoid_enums" id="avoid_enums"></a>
    412 <h2>Avoid Enums Where You Only Need Ints</h2>
    413 
    414 <p>Enums are very convenient, but unfortunately can be painful when size
    415 and speed matter.  For example, this:</p>
    416 
    417 <pre>public enum Shrubbery { GROUND, CRAWLING, HANGING }</pre>
    418 
    419 <p>adds 740 bytes to your .dex file compared to the equivalent class
    420 with three public static final ints. On first use, the
    421 class initializer invokes the &lt;init&gt; method on objects representing each
    422 of the enumerated values. Each object gets its own static field, and the full
    423 set is stored in an array (a static field called "$VALUES"). That's a lot of
    424 code and data, just for three integers. Additionally, this:</p>
    425 
    426 <pre>Shrubbery shrub = Shrubbery.GROUND;</pre>
    427 
    428 <p>causes a static field lookup.  If "GROUND" were a static final int,
    429 the compiler would treat it as a known constant and inline it.</p>
    430 
    431 <p>The flip side, of course, is that with enums you get nicer APIs and
    432 some compile-time value checking.  So, the usual trade-off applies: you should
    433 by all means use enums for public APIs, but try to avoid them when performance
    434 matters.</p>
    435 
    436 <p>If you're using <code>Enum.ordinal</code>, that's usually a sign that you
    437 should be using ints instead. As a rule of thumb, if an enum doesn't have a
    438 constructor and doesn't define its own methods, and it's used in
    439 performance-critical code, you should consider <code>static final int</code>
    440 constants instead.</p>
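
<p>A sketch of the int-constant alternative to the <code>Shrubbery</code> enum
above. The holder class <code>ShrubberyKind</code> and its
<code>name</code> helper are hypothetical; note that callers lose the
compile-time checking an enum provides, since any int can be passed.</p>

```java
// Hypothetical constants holder standing in for the Shrubbery enum.
final class ShrubberyKind {
    static final int GROUND = 0;
    static final int CRAWLING = 1;
    static final int HANGING = 2;

    private ShrubberyKind() {
        // No instances; this class only holds constants.
    }

    // Optional helper for logging, since ints carry no name of their own.
    static String name(int kind) {
        switch (kind) {
            case GROUND:   return "GROUND";
            case CRAWLING: return "CRAWLING";
            case HANGING:  return "HANGING";
            default:
                throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        int shrub = ShrubberyKind.GROUND;   // compile-time constant, inlined by the compiler
        System.out.println(name(shrub));
    }
}
```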
    441 
    442 <a name="package_inner" id="package_inner"></a>
    443 <h2>Use Package Scope with Inner Classes</h2>
    444 
    445 <p>Consider the following class definition:</p>
    446 
    447 <pre>public class Foo {
    448     private int mValue;
    449 
    450     public void run() {
    451         Inner in = new Inner();
    452         mValue = 27;
    453         in.stuff();
    454     }
    455 
    456     private void doStuff(int value) {
    457         System.out.println("Value is " + value);
    458     }
    459 
    460     private class Inner {
    461         void stuff() {
    462             Foo.this.doStuff(Foo.this.mValue);
    463         }
    464     }
    465 }</pre>
    466 
    467 <p>The key things to note here are that we define an inner class (Foo$Inner)
    468 that directly accesses a private method and a private instance field
    469 in the outer class.  This is legal, and the code prints "Value is 27" as
    470 expected.</p>
    471 
    472 <p>The problem is that the VM considers direct access to Foo's private members
    473 from Foo$Inner to be illegal because Foo and Foo$Inner are different classes,
    474 even though the Java language allows an inner class to access an outer class'
    475 private members. To bridge the gap, the compiler generates a couple of
    476 synthetic methods:</p>
    477 
    478 <pre>/*package*/ static int Foo.access$100(Foo foo) {
    479     return foo.mValue;
    480 }
    481 /*package*/ static void Foo.access$200(Foo foo, int value) {
    482     foo.doStuff(value);
    483 }</pre>
    484 
    485 <p>The inner-class code calls these static methods whenever it needs to
    486 access the "mValue" field or invoke the "doStuff" method in the outer
    487 class. What this means is that the code above really boils down to a case
    488 where you're accessing member fields through accessor methods instead of
    489 directly.  Earlier we talked about how accessors are slower than direct field
    490 accesses, so this is an example of a certain language idiom resulting in an
    491 "invisible" performance hit.</p>
    492 
    493 <p>We can avoid this problem by declaring fields and methods accessed
    494 by inner classes to have package scope, rather than private scope.
    495 This runs faster and removes the overhead of the generated methods.
    496 (Unfortunately it also means the fields could be accessed directly by other
    497 classes in the same package, which runs counter to the standard
    498 practice of making all fields private. Once again, if you're
    499 designing a public API you might want to carefully consider using this
    500 optimization.)</p>
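
<p>Applied to the example class, the change might look like the sketch below.
The class is renamed <code>PackageScopedFoo</code> here only to keep the
example self-contained; the <code>/*package*/</code> comments follow the
convention used in the generated-method listing above.</p>

```java
// Variant of the Foo example with package-scope members; the class name is illustrative.
class PackageScopedFoo {
    /*package*/ int mValue;                    // was: private int mValue

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    /*package*/ void doStuff(int value) {      // was: private void doStuff
        System.out.println("Value is " + value);
    }

    private class Inner {
        void stuff() {
            // Package-scope members can be reached directly from the inner class.
            PackageScopedFoo.this.doStuff(PackageScopedFoo.this.mValue);
        }
    }

    public static void main(String[] args) {
        new PackageScopedFoo().run();
    }
}
```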
    501 
    502 <a name="avoidfloat" id="avoidfloat"></a>
    503 <h2>Use Floating-Point Judiciously</h2>
    504 
<p>As a rule of thumb, floating-point is about 2x slower than integer on
Android devices. This is true on an FPU-less, JIT-less G1 and a Nexus One with
an FPU and the JIT. (Of course, the absolute speed difference between those two
devices is about 10x for arithmetic operations.)</p>
    509 
    510 <p>In speed terms, there's no difference between <code>float</code> and
    511 <code>double</code> on the more modern hardware. Space-wise, <code>double</code>
    512 is 2x larger. As with desktop machines, assuming space isn't an issue, you
    513 should prefer <code>double</code> to <code>float</code>.</p>
    514 
    515 <p>Also, even for integers, some chips have hardware multiply but lack
    516 hardware divide. In such cases, integer division and modulus operations are
    517 performed in software &mdash; something to think about if you're designing a
    518 hash table or doing lots of math.</p>
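
<p>One common way to sidestep the modulus in a hash table is to keep the table
size a power of two and select the bucket with a bit mask. The sketch below
assumes that constraint; <code>BucketIndex</code> and its methods are
hypothetical names.</p>

```java
// Hypothetical helper class; the names are illustrative.
class BucketIndex {
    // General case: works for any positive table size, but needs a remainder operation.
    static int indexByModulus(int hash, int tableSize) {
        return (hash & 0x7fffffff) % tableSize;   // clear the sign bit first
    }

    // Assumes tableSize is a power of two: the mask keeps the low bits, no division needed.
    static int indexByMask(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }

    public static void main(String[] args) {
        int hash = "example".hashCode();
        System.out.println(indexByModulus(hash, 16) == indexByMask(hash, 16));
    }
}
```

<p>For power-of-two sizes both methods pick the same bucket, so the choice
affects only the cost of the operation, not the distribution of keys.</p>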
    519 
    520 <a name="library" id="library"></a>
    521 <h2>Know And Use The Libraries</h2>
    522 
    523 <p>In addition to all the usual reasons to prefer library code over rolling
    524 your own, bear in mind that the system is at liberty to replace calls
    525 to library methods with hand-coded assembler, which may be better than the
    526 best code the JIT can produce for the equivalent Java. The typical example
    527 here is <code>String.indexOf</code> and friends, which Dalvik replaces with
    528 an inlined intrinsic. Similarly, the <code>System.arraycopy</code> method
    529 is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p>
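
<p>To make the <code>System.arraycopy</code> comparison concrete, the sketch
below shows the hand-coded loop next to the library call. The class
<code>CopyDemo</code> and its method names are hypothetical; both methods
return an equal copy, and only the copying mechanism differs.</p>

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative.
class CopyDemo {
    // Hand-coded loop: one element per iteration.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; ++i) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library call: the runtime is free to use an optimized implementation.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = { 3, 1, 4, 1, 5, 9 };
        System.out.println(Arrays.equals(copyWithLoop(data), copyWithArraycopy(data)));
    }
}
```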
    530 
    531 <p>(See also <em>Effective Java</em> item 47.)</p>
    532 
    533 <a name="native_methods" id="native_methods"></a>
    534 <h2>Use Native Methods Judiciously</h2>
    535 
    536 <p>Native code isn't necessarily more efficient than Java. For one thing,
    537 there's a cost associated with the Java-native transition, and the JIT can't
    538 optimize across these boundaries. If you're allocating native resources (memory
    539 on the native heap, file descriptors, or whatever), it can be significantly
    540 more difficult to arrange timely collection of these resources. You also
    541 need to compile your code for each architecture you wish to run on (rather
    542 than rely on it having a JIT). You may even have to compile multiple versions
    543 for what you consider the same architecture: native code compiled for the ARM
    544 processor in the G1 can't take full advantage of the ARM in the Nexus One, and
    545 code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p>
    546 
    547 <p>Native code is primarily useful when you have an existing native codebase
    548 that you want to port to Android, not for "speeding up" parts of a Java app.</p>
    549 
    550 <p>(See also <em>Effective Java</em> item 54.)</p>
    551 
    552 <a name="closing_notes" id="closing_notes"></a>
    553 <h2>Closing Notes</h2>
    554 
    555 <p>One last thing: always measure. Before you start optimizing, make sure you
    556 have a problem. Make sure you can accurately measure your existing performance,
    557 or you won't be able to measure the benefit of the alternatives you try.</p>
    558 
    559 <p>Every claim made in this document is backed up by a benchmark. The source
    560 to these benchmarks can be found in the <a href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com "dalvik" project</a>.</p>
    561 
    562 <p>The benchmarks are built with the
    563 <a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking
    564 framework for Java. Microbenchmarks are hard to get right, so Caliper goes out
    565 of its way to do the hard work for you, and even detect some cases where you're
    566 not measuring what you think you're measuring (because, say, the VM has
    567 managed to optimize all your code away). We highly recommend you use Caliper
    568 to run your own microbenchmarks.</p>
    569