      1 page.title=x86 Support
      2 @jd:body
      3 
      4 <div id="qv-wrapper">
      5     <div id="qv">
      6       <h2>On this page</h2>
      7 
      8       <ol>
      9         <li><a href="#over">Overview</a></li>
     10          <li><a href="#an">ARM NEON Intrinsics Support</a></li>
     11          <li><a href="#st">Standalone Toolchain</a></li>
     12          <li><a href="#comp">Compatibility</a></li>
      </ol>
     16     </div>
     17   </div>
     18 
     19 <p>The NDK includes support for the {@code x86} ABI, which allows native code to run on
     20 Android-based devices running on CPUs supporting the IA-32 instruction set.</p>
     21 
     22 <h2 id="over">Overview</h2>
     23 <p>To generate x86 machine code, add {@code x86} to the {@code APP_ABI} definition in your
     24 <a href="{@docRoot}ndk/guides/application_mk.html">{@code Application.mk}</a> file. For example:</p>
     25 
     26 <pre class="no-pretty-print">
     27 APP_ABI := armeabi armeabi-v7a x86
</pre>
     29 
     30 <p>For more information about defining the {@code APP_ABI} variable, see
     31 <a href="{@docRoot}ndk/guides/application_mk.html">{@code Application.mk}</a>.</p>
     32 
<p>The build system places generated libraries into {@code $PROJECT/libs/x86/}, where
{@code $PROJECT} represents your project's root directory, and embeds them in your APK under
{@code /lib/x86/}.</p>
     36 
<p>The Android package manager extracts these libraries when installing your APK on a compatible
x86-based device, placing them under your app's private data directory.</p>
     39 
<p>In the Google Play store, the server filters applications so that a consumer sees only the native
libraries that run on the CPU powering their device.</p>
     42 
     43 <h2 id="an">x86 Support for ARM NEON Intrinsics</h2>
     44 <p>Support for ARM NEON intrinsics is provided in the form of C/C++ language headers with the same
     45 name as the standard ARM NEON intrinsics header, {@code arm_neon.h}. These headers are available for
     46 all NDK x86 toolchains. They translate NEON intrinsics to native x86 SSE ones.</p>
     47 
     48 <p>Characteristics of this solution include the following:</p>
     49 <ul>
     50 <li>Default use of SSE through SSSE3 for porting ARM NEON to Intel SSE, covering ~93%
     51 (1869 of total 2009) of all NEON functions.</li>
<li>Redefinition of ARM NEON 128-bit vectors into the equivalent x86 SIMD data.</li>
     53 <li>Redefinition of some functions from ARM NEON to Intel SSE if a 1:1 correspondence exists.</li>
     54 <li>Implementation of some ARM NEON functions using Intel SIMD if it will yield a performant result.
     55 </li>
     56 <li>Implementation of some of the remaining NEON functions using the serial solution, and issuing
     57 the corresponding "low performance" compiler warning.</li>
     58 </ul>
     59 
     60 
     61 <h3>Performance</h3>
     62 <p>In most cases, you should be able to attain performance similar to what you would get from ARM
     63 NEON code. Recommendations for best results include:</p>
     64 
     65 <ul>
     66 <li>Use 16-byte data alignment for faster load and store.</li>
<li>Avoid using constants with NEON functions, because loading them carries a performance penalty.
If you must use constants, try to initialize them outside of hotspot loops. If possible, replace
them with logical and compare operations.</li>
<li>Try to avoid functions marked as "serially implemented" because they need to store data from
registers to memory, process it serially, and reload it. You may be able to change the data type or
algorithm so that the whole port is vectorized instead of leaving part of it serial.</li>
     73 </ul>
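
<p>The first two recommendations can be sketched as follows. This is a minimal illustration, not
code from the NDK: {@code HAVE_NEON} is an assumed macro that your own build would define for ABIs
where {@code arm_neon.h} is available, and {@code samples}, {@code is_aligned16}, and
{@code scale_in_place} are hypothetical names. The buffer is declared with 16-byte alignment, and
the constant vector is built once before the loop rather than on every iteration.</p>

```c
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_NEON /* assumed macro: define it in your build for NEON-capable ABIs */
#include <arm_neon.h>
#endif

/* Hypothetical sample buffer, placed on a 16-byte boundary (C11 _Alignas). */
_Alignas(16) float samples[8] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f};

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Multiplies n floats in place by factor. */
void scale_in_place(float *data, size_t n, float factor) {
    size_t i = 0;
#ifdef HAVE_NEON
    /* Build the constant vector once, outside the hot loop. */
    const float32x4_t vfactor = vdupq_n_f32(factor);
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(data + i);
        vst1q_f32(data + i, vmulq_f32(v, vfactor));
    }
#endif
    /* Scalar loop: handles the tail, or every element when HAVE_NEON is not defined. */
    for (; i < n; ++i) {
        data[i] *= factor;
    }
}
```

<p>Without {@code HAVE_NEON} the function falls back to the scalar loop, so the same source also
builds where no NEON header is present. A caller can use {@code is_aligned16(samples)} to confirm
the alignment of a buffer before passing it in.</p>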
     74 
     75 <p>For more information on this topic, see
     76 <a href="http://software.intel.com/en-us/blogs/2012/12/12/from-arm-neon-to-intel-mmxsse-automatic-porting-solution-tips-and-tricks">
From ARM NEON to Intel SSE &ndash; the automatic porting solution, tips and tricks</a>.</p>
     78 
     79 <h3>Known differences from ARM version</h3>
     80 <p>In the great majority of cases, x86 implementations produce the same results as ARM
     81 implementations for NEON. x86 implementations pass
     82 <a href="https://gitorious.org/arm-neon-tests/arm-neon-tests">NEON tests</a> nearly 100% of the
     83 time. Still, there are several corner cases in which an x86 implementation produces results
     84 different from its ARM counterpart. Known incompatibilities are as follows:</p>
     85 
     86 <ul>
     87 <li>{@code VRECPS/VRECPSQ}<br/>
     88   If one of the operands is +/- infinity and the second is +/- 0.0:
     89   <ul>
     90     <li>On ARM CPUs, these instructions
     91     <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489h/CIHDIACI.html">
     92     return a result element equal to 2.0</a>.</li>
     93 
     94     <li>x86 CPUs return {@code QNaN Indefinite}. For more information about the QNaN floating-point
     95     indefinite, see "4.2.2 Floating-Point Data Types" and "4.8.3.7 QNaN Floating-Point Indefinite,"
     96     in the
     97     <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers Manual</a>.
     98     </li>
     99 
    100   </ul>
    101 </li>
    102 <li>{@code VRSQRTS/VRSQRTSQ}<br/>
    103   If one of the operands is +/- infinity and the second is +/- 0.0:
    104   <ul>
    105     <li>On ARM CPUs, these instructions
    106     <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489h/CIHDIACI.html">
    107     return a result element equal to 1.5</a>.</li>
    108 
    109     <li>x86 CPUs return {@code QNaN Indefinite}. For more information about the QNaN floating-point
    110     indefinite, see "4.2.2 Floating-Point Data Types" and "4.8.3.7 QNaN Floating-Point Indefinite,"
    111     in the
    112     <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers Manual</a>.
    113     </li>
    114   </ul>
    115 </li>
    116 
    117 <li>{@code VMAX/VMAXQ}<br/>
    118   If one of the operands is NaN, or both operands are +/- 0.0:
    119   <ul>
    120     <li>On ARM CPUs, floating-point maximum works as follows:
    121       <ul>
    122         <li>max(+0.0, -0.0) = +0.0.</li>
    123         <li>If any input is a NaN, the corresponding result element is the default NaN.</li>
    124       </ul>
    125       To learn more about this condition and result, see the
    126       <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489h/CIHDEEBE.html">
    127       ARM Compiler toolchain Assembler Reference</a>, ignoring the "Superseded" watermark.
    128     </li>
    129 
    130     <li>On x86 CPUs, floating-point maximum works as follows:
    131       <ul>
    132         <li>If one of the source operands is NaN, then return the second source operand.</li>
    133         <li>If both source operands are equal to 0, then return the second source operand.</li>
    134       </ul>
    135       For more information about these conditions and results, see Volume 1 Appendix E chapter
    136       E.4.2.3 and Volume 2, p 3-488, of the
    137       <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers
    138       Manual</a>.
    139     </li>
    140   </ul>
    141 </li>
    142 
    143 <li>{@code VMIN/VMINQ}<br/>
    144   If one of the operands is NaN or both are +/- 0.0:
    145   <ul>
    <li>On ARM CPUs, floating-point minimum works as follows:
    147       <ul>
    148         <li>min(+0.0, -0.0) = -0.0.</li>
    149         <li>If any input is a NaN, the corresponding result element is the default NaN.</li>
    150       </ul>
    151       To learn more about this condition and result, see the
    152       <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489h/CIHDEEBE.html">
    153       ARM Compiler toolchain Assembler Reference</a>, ignoring the "Superseded" watermark.
    154     </li>
    <li>On x86 CPUs, floating-point minimum works as follows:
      <ul>
        <li>If one of the source operands is NaN, then return the second source operand.</li>
        <li>If both source operands are equal to 0, then return the second source operand.</li>
    159       </ul>
    160       For more information about these conditions and results, see Volume 1 Appendix E chapter
    161       E.4.2.3 and Volume 2, p 3-497, of the
    162       <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers
    163       Manual</a>.
    164     </li>
    165   </ul>
    166 </li>
    167 
    168 <li>{@code VRECPE/VRECPEQ}<br/>
    169   These instructions provide different levels of accuracy on ARM and x86 CPUs. For more information
    170   about these differences, see
    171   <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14282.html">
    172   How do I use VRECPE/VRECPEQ for reciprocal estimate?</a> on the ARM website, and Volume 2, p.
    173   4-281 of the
    174   <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers Manual</a>.
    175 </li>
    176 
    177 <li>{@code VRSQRTE/VRSQRTEQ}<br/>
    178   <ul>
    179     <li>These instructions provide different levels of accuracy on ARM and x86 CPUs. For more
    180     information about these differences, see the
    181     <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/CIHCHECJ.html">
    182     RealView Compilation Tools Assembler Guide</a>, and Volume 2, p. 4-325 of the
    183     <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers Manual</a>.
    184     </li>
    185 
    <li>If one of the operands is negative or -infinity:
    187       <ul>
    188         <li>On ARM CPUs, these instructions by default return a (positive) NaN. For more information
    189         about this result, see the
    190         <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/CIHIICBB.html">
    191         ARM Compiler toolchain Assembler Reference</a>.</li>
    192         <li>On x86 CPUs, these instructions return a (negative) QNaN floating-point Indefinite. For
    193         more information about this result, see Volume 1, Appendix E, E.4.2.3, of the
    194       <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf">Intel 64 and IA-32 Architectures Software Developers
    195       Manual</a>.</li>
    196       </ul>
    197     </li>
    198   </ul>
    199 </li>
    200 </ul>
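
<p>Some of the corner cases listed above can be modeled per lane in plain C. The functions below are
illustrative models with hypothetical names; they are not the code in {@code arm_neon.h}. The
ARM-style models follow the results described above, {@code vrecps_plain} shows what an ordinary
IEEE 754 evaluation of the step produces, and {@code max_x86_model} follows the x86 rule of
returning the second operand. The step formula {@code 2.0 - a * b} is taken from the ARM definition
of {@code VRECPS}.</p>

```c
#include <math.h>

/* ARM-style VRECPS lane: 2.0 - a * b, except that an infinity paired with
 * a zero (in either order) yields 2.0. */
float vrecps_arm_model(float a, float b) {
    if ((isinf(a) && b == 0.0f) || (isinf(b) && a == 0.0f)) {
        return 2.0f;
    }
    return 2.0f - a * b;
}

/* The same step with no special case: infinity * 0 is NaN, so the result is NaN. */
float vrecps_plain(float a, float b) {
    return 2.0f - a * b;
}

/* ARM-style maximum: any NaN input gives NaN, and max(+0.0, -0.0) is +0.0. */
float max_arm_model(float a, float b) {
    if (isnan(a) || isnan(b)) {
        return NAN;
    }
    if (a == 0.0f && b == 0.0f) {
        /* The result is -0.0 only when both zeros are negative. */
        return (signbit(a) && signbit(b)) ? -0.0f : 0.0f;
    }
    return (a > b) ? a : b;
}

/* x86-style maximum: the second operand is returned whenever a > b is false,
 * which covers a NaN operand and a pair of zeros. */
float max_x86_model(float a, float b) {
    return (a > b) ? a : b;
}
```

<p>For example, {@code vrecps_arm_model(INFINITY, 0.0f)} gives 2.0, while
{@code vrecps_plain(INFINITY, 0.0f)} gives NaN. If your algorithm can reach these inputs, consider
handling them explicitly rather than relying on identical results across ABIs.</p>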
    201 
    202 <h3>Sample code</h3>
<p>In your project, make sure to include the {@code arm_neon.h} header, and include
{@code x86} in your definition of {@code APP_ABI}. The build system then ports your code to x86.</p>
    205 
    206 <p>For an example of how porting ARM NEON to x86 SSE works, see the hello-neon sample.</p>
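
<p>A minimal sketch of a source file written this way follows. It is not the hello-neon sample
itself: {@code add_arrays} is a hypothetical function, and {@code HAVE_NEON} is an assumed macro
that your build defines for the ABIs where you want the NEON path. The intent is that the same
intrinsics compile for ARM and, through the translation header, for x86.</p>

```c
#include <stddef.h>

#ifdef HAVE_NEON /* assumed macro, defined by your own build flags */
#include <arm_neon.h>
#endif

/* Adds two float arrays element by element: out[i] = a[i] + b[i]. */
void add_arrays(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#ifdef HAVE_NEON
    /* Process four lanes at a time with NEON intrinsics. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or every element when HAVE_NEON is not defined. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

<p>Keeping a scalar fallback in the same function lets the file build for ABIs where the NEON path
is disabled.</p>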
    207 
    208 <h2 id="st">Standalone Toolchain</h2>
    209 <p>You can incorporate the {@code x86} ABI into your own toolchain. For more information, see
    210 <a href="{@docRoot}ndk/guides/standalone_toolchain.html">Standalone Toolchain</a>.</p>
    211 
    212 <h2 id="comp">Compatibility</h2>
    213 <p>x86 support requires, at minimum, Android 2.3 (Android API level 9). If your project files
    214 target an older API level, but include x86 as a targeted platform, the NDK build script
    215 automatically selects the right set of native platform headers/libraries for you.</p>