# TensorFlow Lite guide

TensorFlow Lite is TensorFlow's lightweight solution for mobile and embedded
devices. It enables on-device machine learning inference with low latency and a
small binary size. TensorFlow Lite also supports hardware acceleration with the
[Android Neural Networks
API](https://developer.android.com/ndk/guides/neuralnetworks/index.html).

TensorFlow Lite uses many techniques to achieve low latency, such as optimizing
kernels for mobile apps, pre-fusing activations, and using quantized kernels
that allow smaller and faster (fixed-point math) models.

Most of our TensorFlow Lite documentation is [on
GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite)
for the time being.

## What does TensorFlow Lite contain?

TensorFlow Lite supports a set of core operators, both quantized and
float, which have been tuned for mobile platforms. They incorporate pre-fused
activations and biases to further enhance performance and quantized
accuracy. TensorFlow Lite also supports using custom operations in
models.

TensorFlow Lite defines a new model file format, based on
[FlatBuffers](https://google.github.io/flatbuffers/). FlatBuffers is an
efficient open-source cross-platform serialization library. It is similar to
[protocol buffers](https://developers.google.com/protocol-buffers/?hl=en), but
the primary difference is that FlatBuffers does not need a parsing/unpacking
step to a secondary representation before you can access data, a step that is
often coupled with per-object memory allocation. The code footprint of
FlatBuffers is also an order of magnitude smaller than that of protocol
buffers.

TensorFlow Lite has a new mobile-optimized interpreter, which has the key goals
of keeping apps lean and fast. The interpreter uses a static graph ordering and
a custom (less-dynamic) memory allocator to ensure minimal load, initialization,
and execution latency.
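
To make the interpreter's API concrete, here is a minimal Python sketch of a
load/allocate/invoke cycle. The `model.tflite` path is a hypothetical
placeholder for any converted model; the Java and C++ APIs follow the same
pattern.

```python
import numpy as np
import tensorflow as tf

# Load the flat buffer model; allocate_tensors() pre-plans the
# interpreter's static tensor memory.
interpreter = tf.lite.Interpreter(model_path="model.tflite")  # hypothetical path
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one dummy input, run the graph, and read the result.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
```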

TensorFlow Lite provides an interface to leverage hardware acceleration, if
available on the device. It does so via the
[Android Neural Networks API](https://developer.android.com/ndk/guides/neuralnetworks/index.html),
available on Android 8.1 (API level 27) and higher.

## Why do we need a new mobile-specific library?

Machine learning is changing the computing paradigm, and we see an emerging
trend of new use cases on mobile and embedded devices. Consumer expectations are
also trending toward natural, human-like interactions with their devices, driven
by camera and voice interaction models.

There are several factors fueling interest in this domain:

- Innovation at the silicon layer is enabling new possibilities for hardware
  acceleration, and frameworks such as the Android Neural Networks API make it
  easy to leverage these.

- Recent advances in real-time computer vision and spoken language understanding
  have led to mobile-optimized benchmark models being open sourced
  (e.g. MobileNets, SqueezeNet).

- Widely available smart appliances create new possibilities for
  on-device intelligence.

- Interest in stronger user data privacy paradigms, where user data does not
  need to leave the mobile device.

- The ability to serve offline use cases, where the device does not need to be
  connected to a network.

We believe the next wave of machine learning applications will have significant
processing on mobile and embedded devices.

## TensorFlow Lite highlights

TensorFlow Lite provides:

- A set of core operators, both quantized and float, many of which have been
  tuned for mobile platforms. These can be used to create and run custom
  models. Developers can also write their own custom operators and use them in
  models.

- A new [FlatBuffers](https://google.github.io/flatbuffers/)-based
  model file format.

- An on-device interpreter with kernels optimized for faster execution on
  mobile.

- A TensorFlow converter that converts TensorFlow-trained models to the
  TensorFlow Lite format.

- Small binary size: TensorFlow Lite is smaller than 300KB when all supported
  operators are linked, and less than 200KB when using only the operators needed
  to support Inception V3 and MobileNet.

- **Pre-tested models:**

    All of the following models are guaranteed to work out of the box:

    - Inception V3, a popular model for detecting the dominant objects
      present in an image.

    - [MobileNets](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md),
      a family of mobile-first computer vision models designed to effectively
      maximize accuracy while being mindful of the restricted resources of an
      on-device or embedded application. They are small, low-latency, low-power
      models parameterized to meet the resource constraints of a variety of use
      cases. They can be built upon for classification, detection, embeddings,
      and segmentation. MobileNet models are smaller but [lower in
      accuracy](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
      than Inception V3.

    - On Device Smart Reply, an on-device model that provides one-touch
      replies to incoming text messages by suggesting contextually relevant
      messages. The model was built specifically for memory-constrained devices
      such as watches and phones, and it has been successfully used to surface
      [Smart Replies on Android
      Wear](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html)
      to all first-party and third-party apps.

    Also see the complete list of
    [TensorFlow Lite's supported models](hosted_models.md),
    including the model sizes, performance numbers, and downloadable model files.

- Quantized versions of the MobileNet model, which run faster than the
  non-quantized (float) version on CPU (see the conversion sketch after this
  list).

- A new Android demo app that illustrates the use of TensorFlow Lite with a
  quantized MobileNet model for object classification.

- Java and C++ API support.
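
One way to produce a quantized model like those above is post-training
quantization at conversion time. A minimal sketch, assuming a TensorFlow
2.x-style converter API and hypothetical file paths:

```python
import tensorflow as tf

# Convert a SavedModel, asking the converter to quantize weights so the
# resulting model is smaller and runs faster on CPU.
converter = tf.lite.TFLiteConverter.from_saved_model("mobilenet_saved_model")  # hypothetical
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

with open("mobilenet_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)
```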

## Getting Started

We recommend you try out TensorFlow Lite with the pre-tested models indicated
above. If you have an existing model, you will need to test whether your model
is compatible with both the converter and the supported operator set. To test
your model, see the
[documentation on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite).
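
A quick local smoke test is to attempt the conversion and load the result; a
minimal sketch, assuming a SavedModel in the hypothetical directory `my_model`:

```python
import tensorflow as tf

# Unsupported operators typically surface as errors during conversion.
converter = tf.lite.TFLiteConverter.from_saved_model("my_model")  # hypothetical path
try:
    tflite_model = converter.convert()
except Exception as e:
    print("Model is not convertible yet:", e)
else:
    # Loading the flat buffer verifies the interpreter has kernels
    # for every operator the converted model uses.
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    print("Conversion and kernel check succeeded.")
```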

### Retrain Inception V3 or MobileNet for a custom data set

The pre-trained models mentioned above were trained on the ImageNet data set,
which consists of 1000 predefined classes. If those classes are not relevant or
useful for your use case, you will need to retrain those models. This technique
is called transfer learning: it starts with a model that has already been
trained on one problem and retrains it on a similar problem. Training from
scratch can take days, but transfer learning can be done fairly quickly. In
order to do this, you'll need to generate a custom data set labeled with the
relevant classes.

The [TensorFlow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/)
codelab walks through this process step by step. The retraining code supports
retraining for both floating point and quantized inference.
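
For a code-level view of the same idea, here is a minimal Keras-style transfer
learning sketch (not the codelab's own retraining scripts); the five-class head
and the training data set are hypothetical placeholders:

```python
import tensorflow as tf

# Start from MobileNetV2 pre-trained on ImageNet, dropping its classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # Freeze the pre-trained feature extractor.

# Attach a small classification head for the custom classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # hypothetical: 5 classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_dataset, epochs=10)  # train_dataset: the custom labeled data
```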

## TensorFlow Lite Architecture

The following diagram shows the architectural design of TensorFlow Lite:

<img src="https://www.tensorflow.org/images/tflite-architecture.jpg"
     alt="TensorFlow Lite architecture diagram"
     style="max-width:600px;">

Starting with a trained TensorFlow model on disk, you'll convert that model to
the TensorFlow Lite file format (`.tflite`) using the TensorFlow Lite
Converter. Then you can use that converted file in your mobile application.
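
As a concrete sketch of this conversion step, the snippet below converts a
trained Keras model, assuming a TensorFlow 2.x-style API; the model and output
file names are hypothetical:

```python
import tensorflow as tf

# Load a trained model and convert it to the TensorFlow Lite flat buffer format.
model = tf.keras.models.load_model("retrained_model.h5")  # hypothetical file
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Write the .tflite file that ships inside the mobile app.
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)
```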

Deploying the TensorFlow Lite model file uses:

- Java API: A convenience wrapper around the C++ API on Android.

- C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The
  same library is available on both Android and iOS.

- Interpreter: Executes the model using a set of kernels. The Interpreter
  supports selective kernel loading; without kernels it is only 100KB, and 300KB
  with all the kernels loaded. This is a significant reduction from the 1.5MB
  required by TensorFlow Mobile.

- On select Android devices, the Interpreter will use the Android Neural
  Networks API for hardware acceleration, and default to CPU execution when it
  is not available.

You can also implement custom kernels with the C++ API for the Interpreter to
use.

## Future Work

In future releases, TensorFlow Lite will support more models and built-in
operators, bring performance improvements for both fixed-point and
floating-point models, improve the tools for an easier developer workflow,
support additional small devices, and more. As we continue development, we hope
that TensorFlow Lite will greatly simplify the developer experience of targeting
a model for small devices.

Future plans include using specialized machine learning hardware to get the best
possible performance for a particular model on a particular device.

## Next Steps

The TensorFlow Lite
[GitHub repository](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite)
contains additional docs, code samples, and demo applications.