page.title=Performance Tips
page.article=true
@jd:body

<div id="tb-wrapper">
<div id="tb">

<h2>In this document</h2>
<ol class="nolist">
  <li><a href="#ObjectCreation">Avoid Creating Unnecessary Objects</a></li>
  <li><a href="#PreferStatic">Prefer Static Over Virtual</a></li>
  <li><a href="#UseFinal">Use Static Final For Constants</a></li>
  <li><a href="#GettersSetters">Avoid Internal Getters/Setters</a></li>
  <li><a href="#Loops">Use Enhanced For Loop Syntax</a></li>
  <li><a href="#PackageInner">Consider Package Instead of Private Access with Private Inner Classes</a></li>
  <li><a href="#AvoidFloat">Avoid Using Floating-Point</a></li>
  <li><a href="#UseLibraries">Know and Use the Libraries</a></li>
  <li><a href="#NativeMethods">Use Native Methods Carefully</a></li>
  <li><a href="#Myths">Performance Myths</a></li>
  <li><a href="#Measure">Always Measure</a></li>
</ol>

</div>
</div>

<p>This document primarily covers micro-optimizations that can improve overall app performance
when combined, but it's unlikely that these changes will result in dramatic
performance effects. Choosing the right algorithms and data structures should always be your
priority, but is outside the scope of this document. You should use the tips in this document
as general coding practices that you can incorporate into your habits for general code
efficiency.</p>

<p>There are two basic rules for writing efficient code:</p>
<ul>
    <li>Don't do work that you don't need to do.</li>
    <li>Don't allocate memory if you can avoid it.</li>
</ul>

<p>One of the trickiest problems you'll face when micro-optimizing an Android
app is that your app is certain to be running on multiple types of
hardware: different versions of the VM running on different
processors running at different speeds. It's not even generally the case
that you can simply say "device X is a factor F faster/slower than device Y",
and scale your results from one device to others. In particular, measurement
on the emulator tells you very little about performance on any device. There
are also huge differences between devices with and without a
<acronym title="Just In Time compiler">JIT</acronym>: the best
code for a device with a JIT is not always the best code for a device
without.</p>

<p>To ensure your app performs well across a wide variety of devices, make sure
your code is efficient at all levels and aggressively optimize your performance.</p>


<h2 id="ObjectCreation">Avoid Creating Unnecessary Objects</h2>

<p>Object creation is never free. A generational garbage collector with per-thread allocation
pools for temporary objects can make allocation cheaper, but allocating memory
is always more expensive than not allocating memory.</p>

<p>As you allocate more objects in your app, you will force a periodic
garbage collection, creating little "hiccups" in the user experience. The
concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work
should always be avoided.</p>

<p>Thus, you should avoid creating object instances you don't need.  Some
examples of things that can help:</p>

<ul>
    <li>If you have a method returning a string, and you know that its result
    will always be appended to a {@link java.lang.StringBuffer} anyway, change your signature
    and implementation so that the function does the append directly,
    instead of creating a short-lived temporary object.</li>
    <li>When extracting strings from a set of input data, try
    to return a substring of the original data, instead of creating a copy.
    You will create a new {@link java.lang.String} object, but it will share the {@code char[]}
    with the data. (The trade-off being that if you're only using a small
    part of the original input, you'll be keeping it all around in memory
    anyway if you go this route.)</li>
</ul>
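
<p>As a minimal sketch of the first point, compare a method that returns a temporary string
with one that appends into the caller's buffer. The <code>Describer</code> class and its method
names are hypothetical and exist only for illustration. The sketch uses
{@link java.lang.StringBuilder}, but the same pattern applies to a
{@link java.lang.StringBuffer}.</p>

```java
// Hypothetical example; class and method names are illustrative only.
class Describer {
    // Builds and returns a temporary String that the caller would
    // immediately append to its own buffer and then discard.
    static String describe(String name, int count) {
        return name + "=" + count;
    }

    // Appends straight into the caller's buffer, so no intermediate
    // String has to be handed back to the caller.
    static void appendDescription(StringBuilder out, String name, int count) {
        out.append(name).append('=').append(count);
    }
}
```

<p>A caller that already owns a buffer would call
<code>Describer.appendDescription(buffer, "width", 3)</code> instead of
<code>buffer.append(Describer.describe("width", 3))</code>.</p>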

<p>A somewhat more radical idea is to slice up multidimensional arrays into
parallel one-dimensional arrays:</p>

<ul>
    <li>An array of {@code int}s is much better than an array of {@link java.lang.Integer}
    objects,
    but this also generalizes to the fact that two parallel arrays of ints
    are also a <strong>lot</strong> more efficient than an array of {@code (int,int)}
    objects.  The same goes for any combination of primitive types.</li>

    <li>If you need to implement a container that stores tuples of {@code (Foo,Bar)}
    objects, try to remember that two parallel {@code Foo[]} and {@code Bar[]} arrays are
    generally much better than a single array of custom {@code (Foo,Bar)} objects.
    (The exception to this, of course, is when you're designing an API for
    other code to access. In those cases, it's usually better to make a small
    compromise to the speed in order to achieve a good API design. But in your own internal
    code, you should try to be as efficient as possible.)</li>
</ul>
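
<p>The parallel-array idea can be sketched as follows. <code>PointList</code> is a
hypothetical container invented for this illustration, and it uses a fixed capacity to keep
the example short.</p>

```java
// Hypothetical container of (x, y) pairs; names are illustrative only.
class PointList {
    // Two parallel primitive arrays instead of one array of pair objects:
    // element i of the list is the pair (xs[i], ys[i]).
    private final int[] xs;
    private final int[] ys;
    private int size;

    // Fixed capacity for brevity; a real container would grow on demand.
    PointList(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
    }

    // Stores a pair without allocating a per-element object.
    void add(int x, int y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }
    int x(int i) { return xs[i]; }
    int y(int i) { return ys[i]; }
}
```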

<p>Generally speaking, avoid creating short-term temporary objects if you
can.  Fewer objects created mean less-frequent garbage collection, which has
a direct impact on user experience.</p>


<h2 id="PreferStatic">Prefer Static Over Virtual</h2>

<p>If you don't need to access an object's fields, make your method static.
Invocations will be about 15%-20% faster.
It's also good practice, because you can tell from the method
signature that calling the method can't alter the object's state.</p>
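
<p>For illustration, here is a small sketch with a hypothetical <code>Volume</code> class
(not part of any Android API). Only the method that touches a field is an instance method;
the helper that depends solely on its arguments is declared static.</p>

```java
// Hypothetical example; class and method names are illustrative only.
class Volume {
    private int level;

    // Writes the 'level' field, so it has to be an instance method.
    void set(int requested) {
        level = clamp(requested, 0, 100);
    }

    int level() {
        return level;
    }

    // Depends only on its arguments, so it is declared static. The
    // signature also shows that it cannot alter the object's state.
    static int clamp(int value, int min, int max) {
        return value < min ? min : (value > max ? max : value);
    }
}
```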


<h2 id="UseFinal">Use Static Final For Constants</h2>

<p>Consider the following declaration at the top of a class:</p>

<pre>
static int intVal = 42;
static String strVal = "Hello, world!";
</pre>

<p>The compiler generates a class initializer method, called
<code>&lt;clinit&gt;</code>, that is executed when the class is first used.
The method stores the value 42 into <code>intVal</code>, and extracts a
reference from the classfile string constant table for <code>strVal</code>.
When these values are referenced later on, they are accessed with field
lookups.</p>

<p>We can improve matters with the "final" keyword:</p>

<pre>
static final int intVal = 42;
static final String strVal = "Hello, world!";
</pre>

<p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
because the constants go into static field initializers in the dex file.
Code that refers to <code>intVal</code> will use
the integer value 42 directly, and accesses to <code>strVal</code> will
use a relatively inexpensive "string constant" instruction instead of a
field lookup.</p>

<p class="note"><strong>Note:</strong> This optimization applies only to primitive types and
{@link java.lang.String} constants, not arbitrary reference types. Still, it's good
practice to declare constants <code>static final</code> whenever possible.</p>


<h2 id="GettersSetters">Avoid Internal Getters/Setters</h2>

<p>In native languages like C++ it's common practice to use getters
(<code>i = getCount()</code>) instead of accessing the field directly (<code>i
= mCount</code>). This is an excellent habit for C++ and is often practiced in other
object-oriented languages like C# and Java, because the compiler can
usually inline the access, and if you need to restrict or debug field access
you can add the code at any time.</p>

<p>However, this is a bad idea on Android.  Virtual method calls are expensive,
much more so than instance field lookups.  It's reasonable to follow
common object-oriented programming practices and have getters and setters
in the public interface, but within a class you should always access
fields directly.</p>

<p>Without a <acronym title="Just In Time compiler">JIT</acronym>,
direct field access is about 3x faster than invoking a
trivial getter. With the JIT (where direct field access is as cheap as
accessing a local), direct field access is about 7x faster than invoking a
trivial getter.</p>

<p>Note that if you're using <a href="{@docRoot}tools/help/proguard.html">ProGuard</a>,
you can have the best of both worlds because ProGuard can inline accessors for you.</p>
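
<p>A minimal sketch of this guideline, using a hypothetical <code>Counter</code> class:
the getter stays in the public interface, while code inside the class reads the field
directly.</p>

```java
// Hypothetical example; class and method names are illustrative only.
class Counter {
    private int mCount;

    // Public accessor, kept for callers outside this class.
    public int getCount() {
        return mCount;
    }

    public void increment() {
        mCount++;
    }

    // Inside the class, read the field directly instead of calling getCount().
    public boolean isEmpty() {
        return mCount == 0;
    }
}
```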



<h2 id="Loops">Use Enhanced For Loop Syntax</h2>

<p>The enhanced <code>for</code> loop (also sometimes known as "for-each" loop) can be used
for collections that implement the {@link java.lang.Iterable} interface and for arrays.
With collections, an iterator is allocated to make interface calls
to {@code hasNext()} and {@code next()}. With an {@link java.util.ArrayList},
a hand-written counted loop is
about 3x faster (with or without JIT), but for other collections the enhanced
for loop syntax will be exactly equivalent to explicit iterator usage.</p>

<p>There are several alternatives for iterating through an array:</p>

<pre>
static class Foo {
    int mSplat;
}

Foo[] mArray = ...

public void zero() {
    int sum = 0;
    for (int i = 0; i &lt; mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i &lt; len; ++i) {
        sum += localArray[i].mSplat;
    }
}

public void two() {
    int sum = 0;
    for (Foo a : mArray) {
        sum += a.mSplat;
    }
}
</pre>

<p><code>zero()</code> is slowest, because the JIT can't yet optimize away
the cost of getting the array length once for every iteration through the
loop.</p>

<p><code>one()</code> is faster. It pulls everything out into local
variables, avoiding the lookups. Only the array length offers a performance
benefit.</p>

<p><code>two()</code> is fastest for devices without a JIT, and
indistinguishable from <code>one()</code> for devices with a JIT.
It uses the enhanced for loop syntax introduced in version 1.5 of the Java
programming language.</p>

<p>So, you should use the enhanced <code>for</code> loop by default, but consider a
hand-written counted loop for performance-critical {@link java.util.ArrayList} iteration.</p>
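
<p>The array examples above do not show the {@link java.util.ArrayList} case, so here is a
hedged sketch of both loop styles over a list. <code>LengthSum</code> and its methods are
hypothetical names chosen for this illustration.</p>

```java
import java.util.ArrayList;

// Hypothetical helper; class and method names are illustrative only.
class LengthSum {
    // Enhanced for loop: an Iterator is obtained from the list and
    // hasNext()/next() are called on it for each element.
    static int sumForEach(ArrayList<String> items) {
        int sum = 0;
        for (String s : items) {
            sum += s.length();
        }
        return sum;
    }

    // Hand-written counted loop: indexes the list directly, so no
    // Iterator is needed.
    static int sumCounted(ArrayList<String> items) {
        int sum = 0;
        int size = items.size(); // read the size once, outside the loop
        for (int i = 0; i < size; ++i) {
            sum += items.get(i).length();
        }
        return sum;
    }
}
```

<p>Both methods return the same result; which one is faster on a given device is something
to measure rather than assume.</p>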

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 46.</p>


<h2 id="PackageInner">Consider Package Instead of Private Access with Private Inner Classes</h2>

<p>Consider the following class definition:</p>

<pre>
public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}</pre>

<p>What's important here is that we define a private inner class
(<code>Foo$Inner</code>) that directly accesses a private method and a private
instance field in the outer class. This is legal, and the code prints "Value is
27" as expected.</p>

<p>The problem is that the VM considers direct access to <code>Foo</code>'s
private members from <code>Foo$Inner</code> to be illegal because
<code>Foo</code> and <code>Foo$Inner</code> are different classes, even though
the Java language allows an inner class to access an outer class' private
members. To bridge the gap, the compiler generates a couple of synthetic
methods:</p>

<pre>
/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}</pre>

<p>The inner class code calls these static methods whenever it needs to
access the <code>mValue</code> field or invoke the <code>doStuff()</code> method
in the outer class. What this means is that the code above really boils down to
a case where you're accessing member fields through accessor methods.
Earlier we talked about how accessors are slower than direct field
accesses, so this is an example of a certain language idiom resulting in an
"invisible" performance hit.</p>

<p>If you're using code like this in a performance hotspot, you can avoid the
overhead by declaring fields and methods accessed by inner classes to have
package access, rather than private access. Unfortunately this means the fields
can be accessed directly by other classes in the same package, so you shouldn't
use this in public API.</p>
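
<p>A sketch of the package-access variant of the <code>Foo</code> class from this section
might look like the following. The class is declared without <code>public</code> only so that
the snippet can live in any file, and the <code>/* package */</code> comments are a
readability convention, not syntax.</p>

```java
// Variant of the Foo example with package access instead of private
// access on the members that the inner class touches.
class Foo {
    private class Inner {
        void stuff() {
            // mValue and doStuff() are package-private, so the inner class
            // can reach them without going through an accessor method.
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    /* package */ int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    /* package */ void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}
```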



<h2 id="AvoidFloat">Avoid Using Floating-Point</h2>

<p>As a rule of thumb, floating-point is about 2x slower than integer on
Android-powered devices.</p>

<p>In speed terms, there's no difference between <code>float</code> and
<code>double</code> on the more modern hardware. Space-wise, <code>double</code>
is 2x larger. As with desktop machines, assuming space isn't an issue, you
should prefer <code>double</code> to <code>float</code>.</p>

<p>Also, even for integers, some processors have hardware multiply but lack
hardware divide. In such cases, integer division and modulus operations are
performed in software&mdash;something to think about if you're designing a
hash table or doing lots of math.</p>
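
<p>One common way to sidestep the modulus in a hash table, shown here only as a sketch, is
to keep the table size a power of two and select the bucket with a bit mask. The
<code>Buckets</code> class is hypothetical, and the power-of-two table size is an assumption
of this example rather than something the text above prescribes.</p>

```java
// Hypothetical example; class and method names are illustrative only.
class Buckets {
    // Assumes tableSize is a power of two (for example 16, 32 or 64).
    // Under that assumption, (tableSize - 1) has all low bits set, so the
    // mask keeps the low bits of the hash and always yields an index in
    // the range 0 to tableSize - 1 without a divide or modulus.
    static int indexFor(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }
}
```

<p>For non-negative hashes this gives the same bucket as <code>hash % tableSize</code>, and
unlike the modulus it never produces a negative index.</p>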



<h2 id="UseLibraries">Know and Use the Libraries</h2>

<p>In addition to all the usual reasons to prefer library code over rolling
your own, bear in mind that the system is at liberty to replace calls
to library methods with hand-coded assembler, which may be better than the
best code the JIT can produce for the equivalent Java. The typical example
here is {@link java.lang.String#indexOf String.indexOf()} and
related APIs, which Dalvik replaces with
an inlined intrinsic. Similarly, the {@link java.lang.System#arraycopy
System.arraycopy()} method
is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p>
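
<p>To make the comparison concrete, here is a sketch of the two approaches to copying an
array. The <code>Copy</code> class is a hypothetical helper; only
<code>System.arraycopy()</code> is a real library method.</p>

```java
// Hypothetical helper; class and method names are illustrative only.
class Copy {
    // Hand-coded loop: copies one element per iteration.
    static void copyLoop(int[] src, int[] dst) {
        for (int i = 0; i < src.length; ++i) {
            dst[i] = src[i];
        }
    }

    // Library call: same result, but the runtime is free to substitute
    // an optimized implementation.
    static void copyLibrary(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }
}
```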


<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 47.</p>


<h2 id="NativeMethods">Use Native Methods Carefully</h2>

<p>Developing your app with native code using the
<a href="{@docRoot}tools/sdk/ndk/index.html">Android NDK</a>
isn't necessarily more efficient than programming with the
Java language. For one thing,
there's a cost associated with the Java-native transition, and the JIT can't
optimize across these boundaries. If you're allocating native resources (memory
on the native heap, file descriptors, or whatever), it can be significantly
more difficult to arrange timely collection of these resources. You also
need to compile your code for each architecture you wish to run on (rather
than rely on it having a JIT). You may even have to compile multiple versions
for what you consider the same architecture: native code compiled for the ARM
processor in the G1 can't take full advantage of the ARM in the Nexus One, and
code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p>

<p>Native code is primarily useful when you have an existing native codebase
that you want to port to Android, not for "speeding up" parts of your Android app
written with the Java language.</p>

<p>If you do need to use native code, you should read our
<a href="{@docRoot}guide/practices/jni.html">JNI Tips</a>.</p>

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 54.</p>


<h2 id="Myths">Performance Myths</h2>

<p>On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it is cheaper to invoke methods on a
<code>HashMap map</code> than a <code>Map map</code>, even though in both
cases the map is a <code>HashMap</code>.) However, the interface call is not
2x slower; the actual difference is more like 6%. Furthermore,
the JIT makes the two effectively indistinguishable.</p>

<p>On devices without a JIT, caching field accesses is about 20% faster than
repeatedly accessing the field. With a JIT, field access costs about the same
as local access, so this isn't a worthwhile optimization unless you feel it
makes your code easier to read. (This is true of final, static, and static
final fields too.)</p>


<h2 id="Measure">Always Measure</h2>

<p>Before you start optimizing, make sure you have a problem that you
need to solve. Make sure you can accurately measure your existing performance,
or you won't be able to measure the benefit of the alternatives you try.</p>

<p>Every claim made in this document is backed up by a benchmark. The source
to these benchmarks can be found in the <a
href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com
"dalvik" project</a>.</p>

<p>The benchmarks are built with the
<a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking
framework for Java. Microbenchmarks are hard to get right, so Caliper goes out
of its way to do the hard work for you, and even detect some cases where you're
not measuring what you think you're measuring (because, say, the VM has
managed to optimize all your code away). We highly recommend you use Caliper
to run your own microbenchmarks.</p>

<p>You may also find
<a href="{@docRoot}tools/debugging/debugging-tracing.html">Traceview</a> useful
for profiling, but it's important to realize that it currently disables the JIT,
which may cause it to misattribute time to code that the JIT may be able to win
back. It's especially important after making changes suggested by Traceview
data to ensure that the resulting code actually runs faster when run without
Traceview.</p>

<p>For more help profiling and debugging your apps, see the following documents:</p>

<ul>
  <li><a href="{@docRoot}tools/debugging/debugging-tracing.html">Profiling with
    Traceview and dmtracedump</a></li>
  <li><a href="{@docRoot}tools/debugging/systrace.html">Analysing Display and Performance
    with Systrace</a></li>
</ul>