page.title=Performance Tips
page.article=true
@jd:body
      4 
      5 <div id="tb-wrapper">
      6 <div id="tb">
      7 
      8 <h2>In this document</h2>
<ol class="nolist">
  <li><a href="#ObjectCreation">Avoid Creating Unnecessary Objects</a></li>
  <li><a href="#PreferStatic">Prefer Static Over Virtual</a></li>
  <li><a href="#UseFinal">Use Static Final For Constants</a></li>
  <li><a href="#GettersSetters">Avoid Internal Getters/Setters</a></li>
  <li><a href="#Loops">Use Enhanced For Loop Syntax</a></li>
  <li><a href="#PackageInner">Consider Package Instead of Private Access with Private Inner Classes</a></li>
  <li><a href="#AvoidFloat">Avoid Using Floating-Point</a></li>
  <li><a href="#UseLibraries">Know and Use the Libraries</a></li>
  <li><a href="#NativeMethods">Use Native Methods Carefully</a></li>
  <li><a href="#Myths">Performance Myths</a></li>
  <li><a href="#Measure">Always Measure</a></li>
</ol>
     23 
     24 </div>
     25 </div>
     26 
     27 <p>This document primarily covers micro-optimizations that can improve overall app performance
     28 when combined, but it's unlikely that these changes will result in dramatic
     29 performance effects. Choosing the right algorithms and data structures should always be your
     30 priority, but is outside the scope of this document. You should use the tips in this document
     31 as general coding practices that you can incorporate into your habits for general code
     32 efficiency.</p>
     33 
     34 <p>There are two basic rules for writing efficient code:</p>
     35 <ul>
     36     <li>Don't do work that you don't need to do.</li>
     37     <li>Don't allocate memory if you can avoid it.</li>
     38 </ul>
     39 
<p>One of the trickiest problems you'll face when micro-optimizing an Android
app is that your app is certain to be running on multiple types of
hardware: different versions of the VM running on different
processors at different speeds. It's not even generally the case
that you can simply say "device X is a factor F faster/slower than device Y"
and scale your results from one device to others. In particular, measurement
on the emulator tells you very little about performance on any device. There
are also huge differences between devices with and without a
<acronym title="Just In Time compiler">JIT</acronym>: the best
code for a device with a JIT is not always the best code for a device
without.</p>
     51 
<p>To ensure your app performs well across a wide variety of devices, make sure
your code is efficient at all levels and aggressively optimize its performance.</p>
     54 
     55 
     56 <h2 id="ObjectCreation">Avoid Creating Unnecessary Objects</h2>
     57 
     58 <p>Object creation is never free. A generational garbage collector with per-thread allocation
     59 pools for temporary objects can make allocation cheaper, but allocating memory
     60 is always more expensive than not allocating memory.</p>
     61 
     62 <p>As you allocate more objects in your app, you will force a periodic
     63 garbage collection, creating little "hiccups" in the user experience. The
     64 concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work
     65 should always be avoided.</p>
     66 
     67 <p>Thus, you should avoid creating object instances you don't need to.  Some
     68 examples of things that can help:</p>
     69 
     70 <ul>
     71     <li>If you have a method returning a string, and you know that its result
     72     will always be appended to a {@link java.lang.StringBuffer} anyway, change your signature
     73     and implementation so that the function does the append directly,
     74     instead of creating a short-lived temporary object.</li>
     75     <li>When extracting strings from a set of input data, try
     76     to return a substring of the original data, instead of creating a copy.
     77     You will create a new {@link java.lang.String} object, but it will share the {@code char[]}
     78     with the data. (The trade-off being that if you're only using a small
     79     part of the original input, you'll be keeping it all around in memory
     80     anyway if you go this route.)</li>
     81 </ul>
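
<p>The first suggestion in the list above can be sketched as follows. This is a
minimal illustration, not code from the platform: <code>NameFormatter</code>,
<code>formatName()</code>, and <code>appendName()</code> are hypothetical names, and the
sketch uses {@link java.lang.StringBuilder} (the unsynchronized counterpart of
<code>StringBuffer</code>); the same idea applies to either class.</p>

```java
final class NameFormatter {
    // Builds and returns a temporary String, which the caller then
    // copies into its own builder.
    static String formatName(String first, String last) {
        return last + ", " + first;
    }

    // Appends the pieces straight into the caller's builder, so no
    // intermediate result String is created.
    static void appendName(StringBuilder out, String first, String last) {
        out.append(last).append(", ").append(first);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Author: ");
        appendName(sb, "Ada", "Lovelace");
        System.out.println(sb); // prints "Author: Lovelace, Ada"
    }
}
```

<p>The second form changes the method signature, which is the trade-off the list
item describes: the caller must already own a builder.</p>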
     82 
<p>A somewhat more radical idea is to slice up multidimensional arrays into
parallel one-dimensional arrays:</p>
     85 
     86 <ul>
    <li>An array of {@code int}s is much better than an array of {@link java.lang.Integer}
    objects, and this generalizes: two parallel arrays of ints
    are also a <strong>lot</strong> more efficient than an array of {@code (int,int)}
    objects. The same goes for any combination of primitive types.</li>
     92     
    <li>If you need to implement a container that stores tuples of {@code (Foo,Bar)}
    objects, remember that two parallel {@code Foo[]} and {@code Bar[]} arrays are
    generally much better than a single array of custom {@code (Foo,Bar)} objects.
    (The exception to this, of course, is when you're designing an API for
    other code to access. In those cases, it's usually better to make a small
    compromise to the speed in order to achieve a good API design. But in your own internal
    code, you should try to be as efficient as possible.)</li>
    100 </ul>
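
<p>As a rough sketch of the parallel-array idea, the hypothetical
<code>PointStore</code> class below keeps <code>(int,int)</code> pairs in two
<code>int[]</code> arrays instead of an array of pair objects. The class name, its
methods, and the fixed capacity are assumptions made for brevity.</p>

```java
final class PointStore {
    // Element i of xs pairs with element i of ys.
    private final int[] xs;
    private final int[] ys;
    private int size;

    // Fixed capacity keeps the sketch short; a real container would grow.
    PointStore(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
    }

    // Stores one (x, y) pair without allocating a wrapper object.
    void add(int x, int y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }
    int x(int i) { return xs[i]; }
    int y(int i) { return ys[i]; }
}
```

<p>Adding a pair touches only the two primitive arrays, so storing N pairs costs two
array allocations in total rather than one object per pair.</p>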
    101 
    102 <p>Generally speaking, avoid creating short-term temporary objects if you
    103 can.  Fewer objects created mean less-frequent garbage collection, which has
    104 a direct impact on user experience.</p>
    105 
    106 
    107 
    108 
    109 <h2 id="PreferStatic">Prefer Static Over Virtual</h2>
    110 
    111 <p>If you don't need to access an object's fields, make your method static.
    112 Invocations will be about 15%-20% faster.
    113 It's also good practice, because you can tell from the method
    114 signature that calling the method can't alter the object's state.</p>
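
<p>For illustration, here is a small sketch with hypothetical names
(<code>Volume</code>, <code>clamp()</code>). The helper depends only on its arguments,
so it is declared static, while the method that writes a field stays an instance
method.</p>

```java
final class Volume {
    private int level;

    // Writes the 'level' field, so it must remain an instance method.
    void set(int requested) {
        level = clamp(requested, 0, 100);
    }

    int get() { return level; }

    // Depends only on its arguments and touches no instance state,
    // so it is declared static.
    static int clamp(int value, int min, int max) {
        if (value < min) return min;
        if (value > max) return max;
        return value;
    }
}
```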
    115 
    116 
    117 
    118 
    119 
    120 <h2 id="UseFinal">Use Static Final For Constants</h2>
    121 
    122 <p>Consider the following declaration at the top of a class:</p>
    123 
<pre>
static int intVal = 42;
static String strVal = "Hello, world!";
</pre>
    128 
    129 <p>The compiler generates a class initializer method, called
    130 <code>&lt;clinit&gt;</code>, that is executed when the class is first used.
    131 The method stores the value 42 into <code>intVal</code>, and extracts a
    132 reference from the classfile string constant table for <code>strVal</code>.
    133 When these values are referenced later on, they are accessed with field
    134 lookups.</p>
    135 
    136 <p>We can improve matters with the "final" keyword:</p>
    137 
<pre>
static final int intVal = 42;
static final String strVal = "Hello, world!";
</pre>
    142 
    143 <p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
    144 because the constants go into static field initializers in the dex file.
    145 Code that refers to <code>intVal</code> will use
    146 the integer value 42 directly, and accesses to <code>strVal</code> will
    147 use a relatively inexpensive "string constant" instruction instead of a
    148 field lookup.</p>
    149 
    150 <p class="note"><strong>Note:</strong> This optimization applies only to primitive types and
    151 {@link java.lang.String} constants, not arbitrary reference types. Still, it's good
    152 practice to declare constants <code>static final</code> whenever possible.</p>
    153 
    154 
    155 
    156 
    157 
    158 <h2 id="GettersSetters">Avoid Internal Getters/Setters</h2>
    159 
<p>In natively compiled languages like C++, it's common practice to use getters
(<code>i = getCount()</code>) instead of accessing the field directly (<code>i
= mCount</code>). This is an excellent habit for C++, and is often practiced in other
object-oriented languages like C# and Java, because the compiler can
usually inline the access, and if you need to restrict or debug field access
you can add the code at any time.</p>
    166 
    167 <p>However, this is a bad idea on Android.  Virtual method calls are expensive,
    168 much more so than instance field lookups.  It's reasonable to follow
    169 common object-oriented programming practices and have getters and setters
    170 in the public interface, but within a class you should always access
    171 fields directly.</p>
    172 
    173 <p>Without a <acronym title="Just In Time compiler">JIT</acronym>,
    174 direct field access is about 3x faster than invoking a
    175 trivial getter. With the JIT (where direct field access is as cheap as
    176 accessing a local), direct field access is about 7x faster than invoking a
    177 trivial getter.</p>
    178 
    179 <p>Note that if you're using <a href="{@docRoot}tools/help/proguard.html">ProGuard</a>,
    180 you can have the best of both worlds because ProGuard can inline accessors for you.</p>
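
<p>A minimal sketch of this advice, using a hypothetical <code>Counter</code> class:
the getter stays in the public interface, but code inside the class reads
<code>mCount</code> directly.</p>

```java
final class Counter {
    private int mCount;

    // Public accessor for code outside the class.
    public int getCount() {
        return mCount;
    }

    public void increment() {
        mCount++;
    }

    // Inside the class, read the field directly instead of calling getCount().
    public boolean isEmpty() {
        return mCount == 0;
    }
}
```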
    181 
    182 
    183 
    184 
    185 
    186 <h2 id="Loops">Use Enhanced For Loop Syntax</h2>
    187 
    188 <p>The enhanced <code>for</code> loop (also sometimes known as "for-each" loop) can be used
    189 for collections that implement the {@link java.lang.Iterable} interface and for arrays.
    190 With collections, an iterator is allocated to make interface calls
    191 to {@code hasNext()} and {@code next()}. With an {@link java.util.ArrayList},
    192 a hand-written counted loop is
    193 about 3x faster (with or without JIT), but for other collections the enhanced
    194 for loop syntax will be exactly equivalent to explicit iterator usage.</p>
    195 
    196 <p>There are several alternatives for iterating through an array:</p>
    197 
<pre>
static class Foo {
    int mSplat;
}

Foo[] mArray = ...

// Reads the mArray field and its length on every iteration.
public void zero() {
    int sum = 0;
    for (int i = 0; i &lt; mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

// Copies the array reference and its length into locals before looping.
public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i &lt; len; ++i) {
        sum += localArray[i].mSplat;
    }
}

// Uses the enhanced for loop.
public void two() {
    int sum = 0;
    for (Foo a : mArray) {
        sum += a.mSplat;
    }
}
</pre>
    229 
    230 <p><code>zero()</code> is slowest, because the JIT can't yet optimize away
    231 the cost of getting the array length once for every iteration through the
    232 loop.</p>
    233 
    234 <p><code>one()</code> is faster. It pulls everything out into local
    235 variables, avoiding the lookups. Only the array length offers a performance
    236 benefit.</p>
    237 
<p><code>two()</code> is fastest for devices without a JIT, and
indistinguishable from <code>one()</code> for devices with a JIT.
It uses the enhanced for loop syntax introduced in version 1.5 of the Java
programming language.</p>
    242 
    243 <p>So, you should use the enhanced <code>for</code> loop by default, but consider a
    244 hand-written counted loop for performance-critical {@link java.util.ArrayList} iteration.</p>
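
<p>The array examples above do not show the {@link java.util.ArrayList} case, so here
is a sketch of both styles side by side. The <code>Lengths</code> class and its method
names are hypothetical.</p>

```java
import java.util.ArrayList;

final class Lengths {
    // Enhanced for loop: an Iterator is allocated and the loop calls
    // hasNext()/next() on it.
    static int totalForEach(ArrayList<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    // Hand-written counted loop: indexes the list directly,
    // so no Iterator is allocated.
    static int totalCounted(ArrayList<String> items) {
        int total = 0;
        int size = items.size(); // read the size once, outside the loop
        for (int i = 0; i < size; ++i) {
            total += items.get(i).length();
        }
        return total;
    }
}
```

<p>Both methods return the same result; only the iteration mechanism differs, so the
counted form is worth considering only where profiling shows the loop matters.</p>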
    245 
    246 <p class="note"><strong>Tip:</strong>
    247 Also see Josh Bloch's <em>Effective Java</em>, item 46.</p>
    248 
    249 
    250 
    251 <h2 id="PackageInner">Consider Package Instead of Private Access with Private Inner Classes</h2>
    252 
    253 <p>Consider the following class definition:</p>
    254 
<pre>
public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}</pre>
    275 
    276 <p>What's important here is that we define a private inner class
    277 (<code>Foo$Inner</code>) that directly accesses a private method and a private
    278 instance field in the outer class. This is legal, and the code prints "Value is
    279 27" as expected.</p>
    280 
    281 <p>The problem is that the VM considers direct access to <code>Foo</code>'s
    282 private members from <code>Foo$Inner</code> to be illegal because
    283 <code>Foo</code> and <code>Foo$Inner</code> are different classes, even though
    284 the Java language allows an inner class to access an outer class' private
    285 members. To bridge the gap, the compiler generates a couple of synthetic
    286 methods:</p>
    287 
<pre>
/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}</pre>
    295 
    296 <p>The inner class code calls these static methods whenever it needs to
    297 access the <code>mValue</code> field or invoke the <code>doStuff()</code> method
    298 in the outer class. What this means is that the code above really boils down to
    299 a case where you're accessing member fields through accessor methods.
    300 Earlier we talked about how accessors are slower than direct field
    301 accesses, so this is an example of a certain language idiom resulting in an
    302 "invisible" performance hit.</p>
    303 
    304 <p>If you're using code like this in a performance hotspot, you can avoid the
    305 overhead by declaring fields and methods accessed by inner classes to have
    306 package access, rather than private access. Unfortunately this means the fields
    307 can be accessed directly by other classes in the same package, so you shouldn't
    308 use this in public API.</p>
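
<p>Applied to the <code>Foo</code> example, the change might look like the sketch
below. Only the two access modifiers differ from the original class; the
<code>public</code> modifier on <code>Foo</code> itself is dropped here purely so the
snippet compiles under any file name.</p>

```java
class Foo {
    private class Inner {
        void stuff() {
            // Both members are package-visible, so the inner class can
            // reach them without synthetic access$NNN() methods.
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    /* package */ int mValue; // was private

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    /* package */ void doStuff(int value) { // was private
        System.out.println("Value is " + value);
    }
}
```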
    309 
    310 
    311 
    312 
    313 <h2 id="AvoidFloat">Avoid Using Floating-Point</h2>
    314 
    315 <p>As a rule of thumb, floating-point is about 2x slower than integer on
    316 Android-powered devices.</p>
    317 
    318 <p>In speed terms, there's no difference between <code>float</code> and
    319 <code>double</code> on the more modern hardware. Space-wise, <code>double</code>
    320 is 2x larger. As with desktop machines, assuming space isn't an issue, you
    321 should prefer <code>double</code> to <code>float</code>.</p>
    322 
    323 <p>Also, even for integers, some processors have hardware multiply but lack
    324 hardware divide. In such cases, integer division and modulus operations are
    325 performed in software&mdash;something to think about if you're designing a
    326 hash table or doing lots of math.</p>
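
<p>One common way to sidestep the modulus in a hash table, sketched here as an
illustration rather than a recommendation from this document's benchmarks, is to keep
the table size a power of two and select the bucket with a bit mask. The
<code>Buckets</code> class and its method names are hypothetical.</p>

```java
final class Buckets {
    // Bucket index via remainder. On a processor without hardware divide,
    // the % operation is performed in software.
    static int indexByModulus(int hash, int tableSize) {
        return (hash & 0x7fffffff) % tableSize; // clear the sign bit first
    }

    // Bucket index via bit mask. Valid only when tableSize is a power of two.
    static int indexByMask(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }
}
```

<p>For a power-of-two <code>tableSize</code>, both methods select the same bucket;
the mask version simply avoids the division.</p>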
    327 
    328 
    329 
    330 
    331 <h2 id="UseLibraries">Know and Use the Libraries</h2>
    332 
    333 <p>In addition to all the usual reasons to prefer library code over rolling
    334 your own, bear in mind that the system is at liberty to replace calls
    335 to library methods with hand-coded assembler, which may be better than the
    336 best code the JIT can produce for the equivalent Java. The typical example
    337 here is {@link java.lang.String#indexOf String.indexOf()} and
    338 related APIs, which Dalvik replaces with
    339 an inlined intrinsic. Similarly, the {@link java.lang.System#arraycopy
    340 System.arraycopy()} method
    341 is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p>
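
<p>For reference, the two ways of copying an array compared above can be sketched as
follows. The <code>Copies</code> class and its method names are hypothetical; only
<code>System.arraycopy()</code> is a library call.</p>

```java
final class Copies {
    // Hand-coded loop: copies one element per iteration.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; ++i) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library call: arguments are source, source offset,
    // destination, destination offset, and element count.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}
```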
    342 
    343 
    344 <p class="note"><strong>Tip:</strong>
    345 Also see Josh Bloch's <em>Effective Java</em>, item 47.</p>
    346 
    347 
    348 
    349 
    350 <h2 id="NativeMethods">Use Native Methods Carefully</h2>
    351 
    352 <p>Developing your app with native code using the
    353 <a href="{@docRoot}tools/sdk/ndk/index.html">Android NDK</a>
    354 isn't necessarily more efficient than programming with the
    355 Java language. For one thing,
    356 there's a cost associated with the Java-native transition, and the JIT can't
    357 optimize across these boundaries. If you're allocating native resources (memory
    358 on the native heap, file descriptors, or whatever), it can be significantly
    359 more difficult to arrange timely collection of these resources. You also
    360 need to compile your code for each architecture you wish to run on (rather
    361 than rely on it having a JIT). You may even have to compile multiple versions
    362 for what you consider the same architecture: native code compiled for the ARM
    363 processor in the G1 can't take full advantage of the ARM in the Nexus One, and
    364 code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p>
    365 
    366 <p>Native code is primarily useful when you have an existing native codebase
    367 that you want to port to Android, not for "speeding up" parts of your Android app
    368 written with the Java language.</p>
    369 
    370 <p>If you do need to use native code, you should read our
    371 <a href="{@docRoot}guide/practices/jni.html">JNI Tips</a>.</p>
    372 
    373 <p class="note"><strong>Tip:</strong>
    374 Also see Josh Bloch's <em>Effective Java</em>, item 54.</p>
    375 
    376 
    377 
    378 
    379 
    380 <h2 id="Myths">Performance Myths</h2>
    381 
    382 
<p>On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it is cheaper to invoke methods on a
<code>HashMap map</code> than a <code>Map map</code>, even though in both
cases the map is a <code>HashMap</code>.) It is not the case that the
interface call is 2x slower; the actual difference is more like 6%. Furthermore,
the JIT makes the two effectively indistinguishable.</p>
    390 
<p>On devices without a JIT, caching field accesses is about 20% faster than
repeatedly accessing the field. With a JIT, field access costs about the same
as local access, so this isn't a worthwhile optimization unless you feel it
makes your code easier to read. (This is true of final, static, and static
final fields too.)</p>
    396 
    397 
    398 
    399 <h2 id="Measure">Always Measure</h2>
    400 
    401 <p>Before you start optimizing, make sure you have a problem that you
    402 need to solve. Make sure you can accurately measure your existing performance,
    403 or you won't be able to measure the benefit of the alternatives you try.</p>
    404 
    405 <p>Every claim made in this document is backed up by a benchmark. The source
    406 to these benchmarks can be found in the <a
    407 href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com
    408 "dalvik" project</a>.</p>
    409 
    410 <p>The benchmarks are built with the
    411 <a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking
    412 framework for Java. Microbenchmarks are hard to get right, so Caliper goes out
    413 of its way to do the hard work for you, and even detect some cases where you're
    414 not measuring what you think you're measuring (because, say, the VM has
    415 managed to optimize all your code away). We highly recommend you use Caliper
    416 to run your own microbenchmarks.</p>
    417 
    418 <p>You may also find
    419 <a href="{@docRoot}tools/debugging/debugging-tracing.html">Traceview</a> useful
    420 for profiling, but it's important to realize that it currently disables the JIT,
    421 which may cause it to misattribute time to code that the JIT may be able to win
    422 back. It's especially important after making changes suggested by Traceview
    423 data to ensure that the resulting code actually runs faster when run without
    424 Traceview.</p>
    425 
    426 <p>For more help profiling and debugging your apps, see the following documents:</p>
    427 
    428 <ul>
    429   <li><a href="{@docRoot}tools/debugging/debugging-tracing.html">Profiling with
    430     Traceview and dmtracedump</a></li>
  <li><a href="{@docRoot}tools/debugging/systrace.html">Analyzing Display and Performance
    with Systrace</a></li>
    433 </ul>
    434 
    435