# TFLite Model Benchmark Tool

## Description

A simple C++ binary to benchmark a TFLite model and its individual operators,
both on desktop machines and on Android. The binary takes a TFLite model,
generates random inputs, and then repeatedly runs the model for a specified
number of runs. Aggregate latency statistics are reported after running the
benchmark.

The instructions below are for running the binary on desktop and Android;
for iOS, please use the
[iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios).

An experimental Android APK wrapper for the benchmark model utility offers more
faithful execution behavior on Android (via a foreground Activity). It is
located
[here](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/android).

## Parameters

The binary takes the following required parameters:

*   `graph`: `string` \
    The path to the TFLite model file.

and the following optional parameters:

*   `num_threads`: `int` (default=1) \
    The number of threads to use for running the TFLite interpreter.
*   `warmup_runs`: `int` (default=1) \
    The number of warmup runs to do before starting the benchmark.
*   `num_runs`: `int` (default=50) \
    The number of runs. Increase this to reduce variance.
*   `run_delay`: `float` (default=-1.0) \
    The delay in seconds between subsequent benchmark runs. Non-positive values
    mean no delay is used.
*   `use_nnapi`: `bool` (default=false) \
    Whether to use the [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
    This API is available on recent Android devices.

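Taken together, an invocation that exercises each optional flag might look like
the following sketch. The flag values here are placeholders for illustration,
not recommendations, and the binary/model paths must be adjusted to your setup:

```shell
# Hypothetical invocation combining the optional flags described above.
# Paths and values are examples only.
./benchmark_model \
  --graph=mobilenet_quant_v1_224.tflite \
  --num_threads=2 \
  --warmup_runs=2 \
  --num_runs=200 \
  --run_delay=0.5 \
  --use_nnapi=false
```
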
## To build/install/run

### On Android:

(0) Refer to https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android for instructions on editing the `WORKSPACE` to configure the Android NDK/SDK.

(1) Build for your specific platform, e.g.:

```
bazel build -c opt \
  --config=android_arm \
  --cxxopt='--std=c++11' \
  tensorflow/lite/tools/benchmark:benchmark_model
```

(2) Connect your phone, then push the binary to it with `adb push`
(creating the target directory first if required):

```
adb push bazel-bin/tensorflow/lite/tools/benchmark/benchmark_model /data/local/tmp
```

(3) Make the binary executable.

```
adb shell chmod +x /data/local/tmp/benchmark_model
```

(4) Push the compute graph that you need to test. For example:

```
adb push mobilenet_quant_v1_224.tflite /data/local/tmp
```

(5) Run the benchmark. For example:

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/mobilenet_quant_v1_224.tflite \
  --num_threads=4
```

### On desktop:

(1) Build the binary:

```
bazel build -c opt tensorflow/lite/tools/benchmark:benchmark_model
```

(2) Run on your compute graph, similar to the Android case but without the need
for `adb shell`. For example:

```
bazel-bin/tensorflow/lite/tools/benchmark/benchmark_model \
  --graph=mobilenet_quant_v1_224.tflite \
  --num_threads=4
```

The MobileNet graph used as an example here may be downloaded from [here](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip).
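
To fetch that archive from the command line, something like the following
should work, assuming `curl` and `unzip` are installed and network access is
available:

```shell
# Download and unpack the example quantized MobileNet model
# (URL taken from the link above).
curl -LO https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip
unzip -o mobilenet_v1_224_android_quant_2017_11_08.zip
```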
    100 
    101 
    102 ## Reducing variance between runs on Android.
    103 
    104 Most modern Android phones use [ARM big.LITTLE](https://en.wikipedia.org/wiki/ARM_big.LITTLE)
    105 architecture where some cores are more power hungry but faster than other cores.
    106 When running benchmarks on these phones there can be significant variance
    107 between different runs of the benchmark. One way to reduce variance between runs
    108 is to set the [CPU affinity](https://en.wikipedia.org/wiki/Processor_affinity)
    109 before running the benchmark. On Android this can be done using the `taskset`
    110 command.
    111 E.g. for running the benchmark on big cores on Pixel 2 with a single thread one
    112 can use the following command:
    113 
    114 ```
    115 adb shell taskset f0 /data/local/tmp/benchmark_model \
    116   --graph=/data/local/tmp/mobilenet_quant_v1_224.tflite \
    117   --num_threads=1
    118 ```
    119 
    120 where `f0` is the affinity mask for big cores on Pixel 2.
    121 Note: The affinity mask varies with the device.
    122 
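The mask is simply a hexadecimal bitmask over CPU IDs: bit N set means the
process may run on core N. A small shell sketch to build such a mask, assuming
(as on the Pixel 2) that the big cores are CPUs 4-7:

```shell
# Compute a taskset affinity mask from a list of allowed core IDs.
# Bit N of the mask corresponds to CPU N; core IDs are an assumption
# here -- check which CPUs are the big cluster on your device.
mask=0
for core in 4 5 6 7; do
  mask=$(( mask | (1 << core) ))
done
printf '%x\n' "$mask"   # prints f0
```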
## Profiling model operators

The benchmark model binary also allows you to profile operators and report the
execution time of each operator. To do this, build the binary with a compiler
flag that compiles profiling support in: pass `--copt=-DTFLITE_PROFILING_ENABLED`
to the build.
For example, to compile with profiling support on Android, add this flag to the previous build command:

```
bazel build -c opt \
  --config=android_arm \
  --cxxopt='--std=c++11' \
  --copt=-DTFLITE_PROFILING_ENABLED \
  tensorflow/lite/tools/benchmark:benchmark_model
```

This compiles TFLite with profiling enabled; you can then run the benchmark binary as before. The binary will produce detailed statistics for each operation, similar to those shown below:

```
============================== Run Order ==============================
	             [node type]	  [start]	  [first]	 [avg ms]	     [%]	  [cdf%]	  [mem KB]	[times called]	[Name]
	                 CONV_2D	    0.000	    4.269	    4.269	  0.107%	  0.107%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_0/Relu6]
	       DEPTHWISE_CONV_2D	    4.270	    2.150	    2.150	  0.054%	  0.161%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6]
	                 CONV_2D	    6.421	    6.107	    6.107	  0.153%	  0.314%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   12.528	    1.366	    1.366	  0.034%	  0.348%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6]
	                 CONV_2D	   13.895	    4.195	    4.195	  0.105%	  0.454%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   18.091	    1.260	    1.260	  0.032%	  0.485%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6]
	                 CONV_2D	   19.352	    6.652	    6.652	  0.167%	  0.652%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   26.005	    0.698	    0.698	  0.018%	  0.670%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6]
	                 CONV_2D	   26.703	    3.344	    3.344	  0.084%	  0.754%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   30.047	    0.646	    0.646	  0.016%	  0.770%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6]
	                 CONV_2D	   30.694	    5.800	    5.800	  0.145%	  0.915%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   36.495	    0.331	    0.331	  0.008%	  0.924%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6]
	                 CONV_2D	   36.826	    2.838	    2.838	  0.071%	  0.995%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   39.665	    0.439	    0.439	  0.011%	  1.006%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6]
	                 CONV_2D	   40.105	    5.293	    5.293	  0.133%	  1.139%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   45.399	    0.352	    0.352	  0.009%	  1.147%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6]
	                 CONV_2D	   45.752	    5.322	    5.322	  0.133%	  1.281%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   51.075	    0.357	    0.357	  0.009%	  1.290%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6]
	                 CONV_2D	   51.432	    5.693	    5.693	  0.143%	  1.433%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   57.126	    0.366	    0.366	  0.009%	  1.442%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6]
	                 CONV_2D	   57.493	    5.472	    5.472	  0.137%	  1.579%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   62.966	    0.364	    0.364	  0.009%	  1.588%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6]
	                 CONV_2D	   63.330	    5.404	    5.404	  0.136%	  1.724%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   68.735	    0.155	    0.155	  0.004%	  1.728%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6]
	                 CONV_2D	   68.891	    2.970	    2.970	  0.074%	  1.802%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6]
	       DEPTHWISE_CONV_2D	   71.862	    0.206	    0.206	  0.005%	  1.807%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6]
	                 CONV_2D	   72.069	    5.888	    5.888	  0.148%	  1.955%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6]
	         AVERAGE_POOL_2D	   77.958	    0.036	    0.036	  0.001%	  1.956%	     0.000	        0	[MobilenetV1/Logits/AvgPool_1a/AvgPool]
	                 CONV_2D	   77.994	    1.445	    1.445	  0.036%	  1.992%	     0.000	        0	[MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd]
	                 RESHAPE	   79.440	    0.002	    0.002	  0.000%	  1.992%	     0.000	        0	[MobilenetV1/Predictions/Reshape]
	                 SOFTMAX	   79.443	    0.029	    0.029	  0.001%	  1.993%	     0.000	        0	[MobilenetV1/Predictions/Softmax]

============================== Top by Computation Time ==============================
	             [node type]	  [start]	  [first]	 [avg ms]	     [%]	  [cdf%]	  [mem KB]	[times called]	[Name]
	                 CONV_2D	   19.352	    6.652	    6.652	  0.167%	  0.167%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6]
	                 CONV_2D	    6.421	    6.107	    6.107	  0.153%	  0.320%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6]
	                 CONV_2D	   72.069	    5.888	    5.888	  0.148%	  0.468%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6]
	                 CONV_2D	   30.694	    5.800	    5.800	  0.145%	  0.613%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6]
	                 CONV_2D	   51.432	    5.693	    5.693	  0.143%	  0.756%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6]
	                 CONV_2D	   57.493	    5.472	    5.472	  0.137%	  0.893%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6]
	                 CONV_2D	   63.330	    5.404	    5.404	  0.136%	  1.029%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6]
	                 CONV_2D	   45.752	    5.322	    5.322	  0.133%	  1.162%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6]
	                 CONV_2D	   40.105	    5.293	    5.293	  0.133%	  1.295%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6]
	                 CONV_2D	    0.000	    4.269	    4.269	  0.107%	  1.402%	     0.000	        0	[MobilenetV1/MobilenetV1/Conv2d_0/Relu6]

Number of nodes executed: 31
============================== Summary by node type ==============================
	             [Node type]	  [count]	  [avg ms]	    [avg %]	    [cdf %]	  [mem KB]	[times called]
	                 CONV_2D	       15	     1.406	    89.270%	    89.270%	     0.000	        0
	       DEPTHWISE_CONV_2D	       13	     0.169	    10.730%	   100.000%	     0.000	        0
	                 SOFTMAX	        1	     0.000	     0.000%	   100.000%	     0.000	        0
	                 RESHAPE	        1	     0.000	     0.000%	   100.000%	     0.000	        0
	         AVERAGE_POOL_2D	        1	     0.000	     0.000%	   100.000%	     0.000	        0

Timings (microseconds): count=50 first=79449 curr=81350 min=77385 max=88213 avg=79732 std=1929
Memory (bytes): count=0
31 nodes observed

Average inference timings in us: Warmup: 83235, Init: 38467, no stats: 79760.9
```