Android NDK & ARM NEON instruction set extension support
--------------------------------------------------------

Introduction:
-------------

Android NDK r3 added support for the new 'armeabi-v7a' ARM-based ABI
that allows native code to use two useful instruction set extensions:

- Thumb-2, which provides performance comparable to 32-bit ARM
  instructions with similar compactness to Thumb-1.

- VFPv3, which provides hardware FPU registers and computations,
  to boost floating point performance significantly.

  More specifically, by default 'armeabi-v7a' only supports
  VFPv3-D16 which only uses/requires 16 hardware FPU 64-bit registers.

More information about this can be found in docs/CPU-ARCH-ABIS.TXT

The ARMv7 Architecture Reference Manual also defines another optional
instruction set extension known as "ARM Advanced SIMD", nicknamed
"NEON". It provides:

- A set of interesting scalar/vector instructions and registers
  (the latter are mapped to the same chip area as the FPU ones),
  comparable to MMX/SSE/3DNow! in the x86 world.

- VFPv3-D32 as a requirement (i.e. 32 hardware FPU 64-bit registers,
  instead of the minimum of 16).

Not all ARMv7-based Android devices will support NEON, but those that
do may benefit in significant ways from the scalar/vector instructions.

The NDK supports the compilation of modules or even specific source
files with support for NEON. What this means is that a specific compiler
flag will be used to enable the use of GCC ARM NEON intrinsics and
VFPv3-D32 at the same time. The intrinsics are described here:

    http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
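As a rough illustration of what these intrinsics look like in practice, here
is a minimal sketch that adds two float arrays four lanes at a time. The
names add_floats and HAVE_NEON_BUILD are hypothetical, and the guard on
__ARM_NEON__ / __ARM_NEON assumes the compiler predefines one of these macros
when NEON code generation is enabled. When it is not, the same file still
compiles and only the scalar loop is used.

```c
#include <stddef.h>

/* Assumption: the compiler predefines one of these macros when the file is
 * built with NEON enabled (e.g. through LOCAL_ARM_NEON or the .neon suffix). */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON_BUILD 1
#else
#define HAVE_NEON_BUILD 0
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for every i in [0, n). */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if HAVE_NEON_BUILD
    /* Process four 32-bit floats per iteration in one 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add lanes and store  */
    }
#endif

    /* Scalar loop: handles the remaining tail, or the whole array when the
     * file was built without NEON. */
    for (; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

Note that the guard only reflects how the file was compiled. It says nothing
about whether the device running the code actually has NEON.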


LOCAL_ARM_NEON:
---------------

Define LOCAL_ARM_NEON to 'true' in your module definition, and the NDK
will build all its source files with NEON support. This can be useful if
you want to build a static or shared library that specifically contains
NEON code paths.


Using the .neon suffix:
-----------------------

When listing source files in your LOCAL_SRC_FILES variable, you now have
the option of using the .neon suffix to indicate that you want the
corresponding source(s) to be built with NEON support. For example:

  LOCAL_SRC_FILES := foo.c.neon bar.c

builds only 'foo.c' with NEON support; 'bar.c' is built without it.

Note that the .neon suffix can be used with the .arm suffix too (used to
specify the 32-bit ARM instruction set for non-NEON instructions), but must
appear after it.

In other words, 'foo.c.arm.neon' works, but 'foo.c.neon.arm' does NOT.


Build Requirements:
-------------------

NEON support only works when targeting the 'armeabi-v7a' ABI; otherwise the
NDK build scripts will complain and abort. It is important to use checks like
the following in your Android.mk:

   # define a static library containing our NEON code
   ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
      include $(CLEAR_VARS)
      LOCAL_MODULE    := mylib-neon
      LOCAL_SRC_FILES := mylib-neon.c
      LOCAL_ARM_NEON  := true
      include $(BUILD_STATIC_LIBRARY)
   endif # TARGET_ARCH_ABI == armeabi-v7a


Runtime Detection:
------------------

As said previously, NOT ALL ARMv7-BASED ANDROID DEVICES WILL SUPPORT NEON!
It is thus crucial to perform runtime detection to know if the NEON-capable
machine code can be run on the target device.

To do that, use the 'cpufeatures' library that comes with this NDK. To learn
more about it, see docs/CPU-FEATURES.TXT.

You should explicitly check that android_getCpuFamily() returns
ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value
that has the ANDROID_CPU_ARM_FEATURE_NEON flag set, as in:

    #include <cpu-features.h>

    ...
    ...

    if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
        (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0)
    {
        // use NEON-optimized routines
        ...
    }
    else
    {
        // use non-NEON fallback routines instead
        ...
    }

    ...
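One possible way to organize the two branches is to run the check once and
keep the result in a function pointer, so that hot loops do not repeat the
test. The following is only a sketch under stated assumptions: sum_fn, sum_c,
sum_neon, select_sum and BUILT_WITH_NEON are hypothetical names, and the
have_neon argument stands for the result of the runtime check described
above. That check is passed in as a plain integer so that the sketch does not
depend on <cpu-features.h> and can also be compiled off-device.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one of these macros is predefined when NEON is enabled. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define BUILT_WITH_NEON 1
#else
#define BUILT_WITH_NEON 0
#endif

/* Hypothetical routine type: returns the sum of n 32-bit integers. */
typedef int32_t (*sum_fn)(const int32_t *data, size_t n);

/* Portable fallback, safe to call on every device. */
static int32_t sum_c(const int32_t *data, size_t n)
{
    int32_t total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += data[i];
    return total;
}

#if BUILT_WITH_NEON
/* NEON variant: accumulates four lanes at a time, then folds the lanes. */
static int32_t sum_neon(const int32_t *data, size_t n)
{
    int32x4_t acc = vdupq_n_s32(0);
    int32_t lanes[4];
    int32_t total;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = vaddq_s32(acc, vld1q_s32(data + i));
    vst1q_s32(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; i++)          /* leftover elements */
        total += data[i];
    return total;
}
#endif

/* have_neon: non-zero when the runtime check found a NEON-capable ARM CPU.
 * The NEON routine is returned only if it was compiled in AND the device
 * supports it; otherwise the C fallback is returned. */
sum_fn select_sum(int have_neon)
{
#if BUILT_WITH_NEON
    if (have_neon)
        return sum_neon;
#endif
    (void)have_neon;
    return sum_c;
}
```

On a device, the caller would evaluate the android_getCpuFamily() and
android_getCpuFeatures() condition once at startup, pass its result to
select_sum(), and store the returned pointer for later calls.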

Sample code:
------------

Look at the source code for the "hello-neon" sample in this NDK for an example
of how to use the 'cpufeatures' library and NEON intrinsics at the same time.

The sample implements a tiny benchmark for a FIR filter loop using a C version,
and a NEON-optimized one for devices that support it.
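For readers unfamiliar with the term, a FIR filter computes each output as a
weighted sum of neighbouring inputs: out[i] = kernel[0]*in[i] +
kernel[1]*in[i+1] + ... over all taps. The sketch below is NOT the
hello-neon source. It is an illustrative pair of routines with hypothetical
names (fir_c, fir_neon) and an assumed data layout (16-bit samples, 32-bit
results), meant only to show how a C loop and a NEON loop for the same
computation can sit side by side.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one of these macros is predefined when NEON is enabled. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define FIR_HAVE_NEON 1
#else
#define FIR_HAVE_NEON 0
#endif

/* Plain C version.
 * out[i] = sum over k in [0, taps) of kernel[k] * in[i + k], for i in [0, n).
 * 'in' must therefore hold at least n + taps - 1 samples. */
void fir_c(int32_t *out, const int16_t *in, const int16_t *kernel,
           size_t n, size_t taps)
{
    size_t i, k;
    for (i = 0; i < n; i++) {
        int32_t acc = 0;
        for (k = 0; k < taps; k++)
            acc += (int32_t)kernel[k] * in[i + k];
        out[i] = acc;
    }
}

#if FIR_HAVE_NEON
/* NEON version with the same contract as fir_c. */
void fir_neon(int32_t *out, const int16_t *in, const int16_t *kernel,
              size_t n, size_t taps)
{
    size_t i, k;
    for (i = 0; i < n; i++) {
        int32x4_t vacc = vdupq_n_s32(0);
        int32_t lanes[4];
        int32_t acc;
        /* Four taps per step: widening multiply-accumulate, 16x16 -> 32 bits. */
        for (k = 0; k + 4 <= taps; k += 4)
            vacc = vmlal_s16(vacc, vld1_s16(kernel + k), vld1_s16(in + i + k));
        vst1q_s32(lanes, vacc);
        acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
        /* Remaining taps when 'taps' is not a multiple of four. */
        for (; k < taps; k++)
            acc += (int32_t)kernel[k] * in[i + k];
        out[i] = acc;
    }
}
#endif
```

For example, with in = {1, 2, 3, 4, 5}, kernel = {1, 1, 1}, n = 3 and
taps = 3, fir_c writes {6, 9, 12} to out.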
    128