### TensorFlow Makefile

The recommended way to build TensorFlow from source is the Bazel
open-source build system. Sometimes that isn't possible; for example, if you
are building for iOS, you currently need to use the Makefile. Other reasons
include:

 - The build system may not have the RAM or processing power to support Bazel.
 - Bazel or its dependencies may not be available.
 - You may want to cross-compile for an unsupported target system.

This experimental project supplies a Makefile, automatically derived from the
dependencies listed in the Bazel project, that can be used with GNU's make tool.
With it, you can compile the core C++ runtime into a static library.

This static library will not contain:

 - Python or other language bindings
 - GPU support

You can target:
- iOS
- OS X (macOS)
- Android
- Raspberry Pi

You will compile tensorflow and protobuf libraries that you can link into other
applications. You will also compile the [benchmark](../../tools/benchmark/)
application, which you can use to test your build.

## Before you start (all platforms)

First, clone this TensorFlow repository.

You will need to download all dependencies as well. We have provided a script
that does so, to be run (as with all commands) **at the root of the repository**:

```bash
tensorflow/contrib/makefile/download_dependencies.sh
```

You should only need to do this step once. It downloads the required libraries,
such as Eigen, into the `tensorflow/contrib/makefile/downloads/` folder.
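If you wrap this step in your own build script, you may want to make it idempotent. A small guard like the following is our own suggestion, not part of the TensorFlow scripts (here we fake the `downloads` folder in a scratch directory so the snippet runs standalone):

```bash
cd "$(mktemp -d)"

# Simulate a tree where the dependencies were already fetched; in the real
# repository, download_dependencies.sh creates this folder on its first run.
mkdir -p tensorflow/contrib/makefile/downloads

DOWNLOADS_DIR=tensorflow/contrib/makefile/downloads
if [ -d "$DOWNLOADS_DIR" ]; then
  echo "dependencies already downloaded, skipping"
else
  tensorflow/contrib/makefile/download_dependencies.sh
fi
```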

You should download the example graph from [https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip](https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip).

## Building on Linux

_Note: This has only been tested on Ubuntu._

As a first step, you need to make sure the required packages are installed:

```bash
sudo apt-get install autoconf automake libtool curl make g++ unzip zlib1g-dev \
git python
```

You should then be able to run the `build_all_linux.sh` script to compile:

```bash
tensorflow/contrib/makefile/build_all_linux.sh
```

This should compile a static library in
`tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a`,
and create an example executable at `tensorflow/contrib/makefile/gen/bin/benchmark`.

Get the graph file, if you have not already:

```bash
mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception
```

To run the executable, use:

```bash
tensorflow/contrib/makefile/gen/bin/benchmark \
 --graph=$HOME/graphs/inception/tensorflow_inception_graph.pb
```

## Android

First, you will need to download and unzip the
[Native Development Kit (NDK)](https://developer.android.com/ndk/). You will not
need to install the standalone toolchain, however.

Assign your NDK location to `$NDK_ROOT`:

```bash
export NDK_ROOT=/absolute/path/to/NDK/android-ndk-rxxx/
```
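Builds fail with confusing errors when `$NDK_ROOT` points to the wrong place, so it can be worth sanity-checking it first. The check below is our own suggestion, not part of the TensorFlow scripts (we create a fake NDK directory so the snippet is self-contained; a real unpacked NDK ships an `ndk-build` script at its top level):

```bash
cd "$(mktemp -d)"

# Stand-in for an unpacked NDK.
mkdir -p fake-ndk && touch fake-ndk/ndk-build
export NDK_ROOT="$PWD/fake-ndk"

if [ -e "$NDK_ROOT/ndk-build" ]; then
  echo "NDK_ROOT looks valid: $NDK_ROOT"
else
  echo "NDK_ROOT does not look like an NDK: $NDK_ROOT" >&2
  exit 1
fi
```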

Download the graph if you haven't already:

```bash
mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception
```

Then, execute the following:

```bash
tensorflow/contrib/makefile/download_dependencies.sh
tensorflow/contrib/makefile/compile_android_protobuf.sh -c
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB=`CC_PREFIX="${CC_PREFIX}" NDK_ROOT="${NDK_ROOT}" \
	tensorflow/contrib/makefile/compile_nsync.sh -t android -a armeabi-v7a`
make -f tensorflow/contrib/makefile/Makefile TARGET=ANDROID
```
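`TARGET=ANDROID` works because the Makefile branches on the `TARGET` variable to pick toolchains and flags. A stripped-down sketch of that mechanism (the toolchain name below is illustrative, not what the real Makefile sets):

```bash
cd "$(mktemp -d)"

# A miniature Makefile that switches the compiler on TARGET, the same
# pattern the TensorFlow Makefile uses to select cross-compilation settings.
cat > Makefile <<'EOF'
TARGET ?= LINUX

ifeq ($(TARGET),ANDROID)
  # Illustrative cross-compiler name only.
  CXX := aarch64-linux-android-clang++
else
  CXX := g++
endif

show:
	@echo "building for $(TARGET) with $(CXX)"
EOF

make show                  # host build configuration
make show TARGET=ANDROID   # cross-compile configuration selected
```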

At this point, you will have compiled libraries in `gen/lib/*` and the
[benchmark app](../../tools/benchmark) compiled for Android.

Run the benchmark by pushing both the benchmark and the graph file to your
attached Android device:

```bash
adb push ~/graphs/inception/tensorflow_inception_graph.pb /data/local/tmp/
adb push tensorflow/contrib/makefile/gen/bin/benchmark /data/local/tmp/
adb shell '/data/local/tmp/benchmark \
 --graph=/data/local/tmp/tensorflow_inception_graph.pb \
 --input_layer="input:0" \
 --input_layer_shape="1,224,224,3" \
 --input_layer_type="float" \
 --output_layer="output:0"
'
```

For more details, see the [benchmark documentation](../../tools/benchmark).

## CUDA support for Tegra devices running Android (Nvidia Shield TV, etc.)

With the release of TF 1.6 and JetPack for Android 3.2 (currently pending), you
can now build a version of TensorFlow for compatible devices that receives the
full benefits of GPU acceleration, according to the following instructions.

#### Environment setup:

First, download and install JetPack for Android version 3.2 or greater from
[Nvidia](https://developers.nvidia.com). Note that as of the TF 1.6 release the
JetPack for Android 3.2 release is still pending, and regular JetPack for L4T
will not work.

```bash
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
JETPACK=$HOME/JetPack_Android_3.2
TEGRA_LIBS="$JETPACK/cuDNN/aarch64/cuda/lib64/libcudnn.so $JETPACK/cuda-9.0/extras/CUPTI/lib64/libcupti.so $JETPACK/cuda/targets/aarch64-linux-androideabi/lib64/libcufft.so"
```

#### Building all CUDA-enabled native binaries:

This will build CUDA-enabled versions of libtensorflow_inference.so and the
benchmark binary. (libtensorflow_demo.so will also be built incidentally, but
it does not support CUDA.)

```bash
NDK_ROOT=$JETPACK/android-ndk-r13b
CC_PREFIX=ccache tensorflow/contrib/makefile/build_all_android.sh -s tensorflow/contrib/makefile/sub_makefiles/android/Makefile.in -t "libtensorflow_inference.so libtensorflow_demo.so all" -a tegra
```

(Add -T on subsequent builds to skip protobuf downloading/building.)

#### Testing the CUDA-enabled benchmark via adb:

Build binaries first as above, then run:

```bash
adb shell mkdir -p /data/local/tmp/lib64
adb push $TEGRA_LIBS /data/local/tmp/lib64
adb push tensorflow/contrib/makefile/gen/bin/android_arm64-v8a/benchmark /data/local/tmp
wget https://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/tensorflow_demo.apk
unzip tensorflow_demo.apk -d /tmp/tensorflow_demo
adb push /tmp/tensorflow_demo/assets/*.pb /data/local/tmp
adb shell "LD_LIBRARY_PATH=/data/local/tmp/lib64 /data/local/tmp/benchmark --graph=/data/local/tmp/tensorflow_inception_graph.pb"
```

#### Building the CUDA-enabled TensorFlow AAR with Bazel:

Build the native binaries first as above. Then build the aar and package the
native libs by executing the following:

```bash
mkdir -p /tmp/tf/jni/arm64-v8a
cp tensorflow/contrib/makefile/gen/lib/android_tegra/libtensorflow_*.so /tmp/tf/jni/arm64-v8a/
cp $TEGRA_LIBS /tmp/tf/jni/arm64-v8a
bazel build //tensorflow/contrib/android:android_tensorflow_inference_java.aar
cp bazel-bin/tensorflow/contrib/android/android_tensorflow_inference_java.aar /tmp/tf/tensorflow.aar
cd /tmp/tf
chmod +w tensorflow.aar
zip -ur tensorflow.aar $(find jni -name '*.so')
```

#### Building the CUDA-enabled TensorFlow Android demo with Bazel:

Build binaries first as above, then edit tensorflow/examples/android/BUILD and replace:

```
    srcs = [
       ":libtensorflow_demo.so",
       "//tensorflow/contrib/android:libtensorflow_inference.so",
    ],
```

with:

```
srcs = glob(["libs/arm64-v8a/*.so"]),
```

Then run:

```bash
# Create dir for native libs
mkdir -p tensorflow/examples/android/libs/arm64-v8a

# Copy JetPack libs
cp $TEGRA_LIBS tensorflow/examples/android/libs/arm64-v8a

# Copy native TensorFlow libraries
cp tensorflow/contrib/makefile/gen/lib/android_arm64-v8a/libtensorflow_*.so tensorflow/examples/android/libs/arm64-v8a/

# Build APK
bazel build -c opt --fat_apk_cpu=arm64-v8a tensorflow/examples/android:tensorflow_demo

# Install
adb install -r -f bazel-bin/tensorflow/examples/android/tensorflow_demo.apk
```

#### Building the CUDA-enabled Android demo with gradle/Android Studio:

Add tensorflow/examples/android as an Android project in Android Studio as normal.

Edit build.gradle and:
* set nativeBuildSystem = 'makefile'
* set cpuType = 'arm64-v8a'
* in "buildNativeMake", replace cpuType with 'tegra' (optional speedups like -T and ccache also work)
* set the environment variable "NDK_ROOT" to $JETPACK/android-ndk-r13b

Click "build apk" to build.

Install:
```bash
adb install -r -f tensorflow/examples/android/gradleBuild/outputs/apk/debug/android-debug.apk
```

## iOS

_Note: To use this library in an iOS application, see related instructions in
the [iOS examples](../../examples/ios/) directory._

Install Xcode 7.3 or more recent. If you have not already, you will need to
install the command-line tools using `xcode-select`:

```bash
xcode-select --install
```

If this is a new install, you will need to run Xcode once to agree to the
license before continuing.

(You will also need to have [Homebrew](http://brew.sh/) installed.)

Then install [automake](https://en.wikipedia.org/wiki/Automake)/[libtool](https://en.wikipedia.org/wiki/GNU_Libtool):

```bash
brew install automake
brew install libtool
```

Also, download the graph if you haven't already:

```bash
mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception
```

### Building all at once

If you just want to get the libraries compiled in a hurry, you can run this
from the root of your TensorFlow source folder:

```bash
tensorflow/contrib/makefile/build_all_ios.sh
```

This process will take around twenty minutes on a modern MacBook Pro.

When it completes, you will have a unified library for all architectures
(i386sim, x86_64sim, armv7, armv7s and arm64) and the benchmark program.
Although successfully compiling the benchmark program is a
sign of success, the program is not a complete iOS app.

If you would like to build only one architecture to save time (iOS 11+ only
supports 64-bit, so you can get away with arm64):

```bash
tensorflow/contrib/makefile/build_all_ios.sh -a arm64
```

After the first build, if you only want to rebuild the tensorflow library, you
can pass the -T flag to avoid a clean and full rebuild. If you modified a
single file, this should take just a few seconds to regenerate the library.

```bash
tensorflow/contrib/makefile/build_all_ios.sh -a arm64 -T
```

To see TensorFlow running on iOS, the example Xcode project in
[tensorflow/examples/ios](../../examples/ios/) shows how to use the static
library in a simple app.

### Building by hand

This section covers each step of building. For all the code in one place, see
[build_all_ios.sh](build_all_ios.sh).

If you have not already, you will need to download dependencies:

```bash
tensorflow/contrib/makefile/download_dependencies.sh
```

Next, you will need to compile protobufs for iOS (optionally takes the -a $ARCH flag):

```bash
tensorflow/contrib/makefile/compile_ios_protobuf.sh
```

Then, you will need to compile the nsync library for iOS (optionally takes the -a $ARCH flag):

```bash
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh -t ios`
```

Then, you can run the makefile specifying iOS as the target, along with the
architecture you want to build for:

```bash
make -f tensorflow/contrib/makefile/Makefile \
 TARGET=IOS \
 IOS_ARCH=ARM64
```

This creates a library in
`tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a` that you can link any
Xcode project against.

To see TensorFlow running on iOS, the example Xcode project in
[tensorflow/examples/ios](../../examples/ios/) shows how to use the static
library in a simple app.

#### Universal binaries

In some situations, you will need a universal library. In that case, you will
still need to run `compile_ios_protobuf.sh` and `compile_nsync.sh`, but this
time follow them with:

```bash
compile_ios_tensorflow.sh
```

`compile_ios_tensorflow.sh` takes the -a flag to build for only one architecture.
If you run into issues with unresolved nsync symbols, you can also pass
-h ${HOST_NSYNC_LIB} and -n ${TARGET_NSYNC_LIB}, so the invocation would look like:

```bash
tensorflow/contrib/makefile/compile_ios_tensorflow.sh -f "-O3" -h tensorflow/contrib/makefile/downloads/nsync/builds/default.macos.c++11/nsync.a -n tensorflow/contrib/makefile/downloads/nsync/builds/lipo.ios.c++11/nsync.a -a arm64
```

In Xcode, you will need to use -force_load in the linker flags
section of the build settings to pull in the global constructors that are used
to register ops and kernels.

#### Optimization

The `build_all_ios.sh` script can take optional command-line arguments to
selectively register only the operators used in your graph:

```bash
tensorflow/contrib/makefile/build_all_ios.sh -a arm64 -g $HOME/graphs/inception/tensorflow_inception_graph.pb
```

Please note that this is an aggressive optimization: the resulting library may
not work with other graphs, but it will reduce the size of the final library.

The `compile_ios_tensorflow.sh` script can take optional command-line arguments.
The first argument will be passed as a C++ optimization flag and defaults to
debug mode. If you are concerned about performance or are working on a release
build, you would likely want a higher optimization setting, like so:

```bash
compile_ios_tensorflow.sh -f "-Os"
```

For other variations of valid optimization flags, see [clang optimization levels](http://stackoverflow.com/questions/15548023/clang-optimization-levels).

## Raspberry Pi

Building on the Raspberry Pi is similar to a normal Linux system. First
download the dependencies, install the required packages, and build protobuf:

```bash
tensorflow/contrib/makefile/download_dependencies.sh
sudo apt-get install -y autoconf automake libtool gcc-4.8 g++-4.8
cd tensorflow/contrib/makefile/downloads/protobuf/
./autogen.sh
./configure
make
sudo make install
sudo ldconfig  # refresh shared library cache
cd ../../../../..
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB="$HOST_NSYNC_LIB"
```

Once that's done, you can use make to build the library and example:

```bash
make -f tensorflow/contrib/makefile/Makefile HOST_OS=PI TARGET=PI OPTFLAGS="-Os" CXX=g++-4.8
```

If you're only interested in building for the Raspberry Pi 2 and 3, you can
supply some extra optimization flags to give you code that will run faster:

```bash
make -f tensorflow/contrib/makefile/Makefile HOST_OS=PI TARGET=PI \
 OPTFLAGS="-Os -mfpu=neon-vfpv4 -funsafe-math-optimizations -ftree-vectorize" CXX=g++-4.8
```

One thing to be careful of is that the gcc version 4.9 installed by default on
Jessie will hit an error mentioning `__atomic_compare_exchange`. This
is why the examples above specify `CXX=g++-4.8` explicitly, and why we install
it using apt-get. If you have partially built using the default gcc 4.9, hit the
error, and switched to 4.8, you need to run
`make -f tensorflow/contrib/makefile/Makefile clean` before you build. If you
don't, the build will appear to succeed but you'll encounter [malloc(): memory corruption errors](https://github.com/tensorflow/tensorflow/issues/3442)
when you try to run any programs using the library.

For more examples, look at the tensorflow/contrib/pi_examples folder in the
source tree, which contains code samples aimed at the Raspberry Pi.

# Other notes

## Supported Systems

The Make script has been tested on Ubuntu and OS X. If you look in the Makefile
itself, you'll see it's broken up into host and target sections. If you are
cross-compiling, you should look at customizing the target settings to match
what you need for your desired system.

## Dependency Management

The Makefile loads in a list of dependencies stored in text files. These files
are generated from the main Bazel build by running
`tensorflow/contrib/makefile/gen_file_lists.sh`. You'll need to re-run this if
you make changes to the files that are included in the build.

Header dependencies are not automatically tracked by the Makefile, so if you
make header changes you will need to run this command to recompile cleanly:

```bash
make -f tensorflow/contrib/makefile/Makefile clean
```

### Cleaning up

In some situations, you may want to completely clean up. The dependencies,
intermediate stages, and generated files are stored in:

```bash
tensorflow/contrib/makefile/downloads
tensorflow/contrib/makefile/gen
```

Those directories can safely be removed, but you will have to start over with
`download_dependencies.sh` once you delete them.

### Fixing Makefile Issues

Because the main development of TensorFlow is done using Bazel, changes to the
codebase can sometimes break the makefile build process. If you find that tests
relying on this makefile are failing with a change you're involved in, here are
some troubleshooting steps:

 - Try to reproduce the issue on your platform. If you're on Linux, running
 `make -f tensorflow/contrib/makefile/Makefile` should be enough to recreate
  most issues. For other platforms, see the sections earlier in this document.

 - The most common cause of breakages is files that have been added to the
  Bazel build scripts, but that the makefile isn't aware of. Typical symptoms
  of this include linker errors mentioning missing symbols or protobuf headers
  that aren't found. To address these problems, take a look at the *.txt files
  in `tensorflow/contrib/makefile`. If you have a new operator, you may need to
  add it to `tf_op_files.txt`, or for a new proto to `tf_proto_files.txt`.

 - There's also a wildcard system in `Makefile` that defines what core C++ files
  are included in the library. This is designed to match the equivalent rule in
  `tensorflow/core/BUILD`, so if you change the wildcards there to include new
  files you'll need to also update `CORE_CC_ALL_SRCS` and `CORE_CC_EXCLUDE_SRCS`
  in the makefile.

 - Some of the supported platforms use clang instead of gcc as their compiler,
  so if you're hitting compile errors you may need to tweak your code to be more
  friendly to different compilers by avoiding gcc extensions or idioms.
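For the missing-file case above, a quick grep against `tf_op_files.txt` tells you whether a kernel's source file is already listed. A sketch of that check (we fabricate a tiny list file here so the snippet runs standalone; in the real tree you would grep `tensorflow/contrib/makefile/tf_op_files.txt`):

```bash
cd "$(mktemp -d)"

# Stand-in for tensorflow/contrib/makefile/tf_op_files.txt.
cat > tf_op_files.txt <<'EOF'
tensorflow/core/kernels/cast_op.cc
tensorflow/core/kernels/concat_op.cc
EOF

check_listed() {
  if grep -qx "$1" tf_op_files.txt; then
    echo "listed: $1"
  else
    echo "MISSING from tf_op_files.txt: $1"
  fi
}

check_listed tensorflow/core/kernels/cast_op.cc
check_listed tensorflow/core/kernels/my_new_op.cc   # hypothetical new kernel
```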

These are the most common reasons for makefile breakages, but it's also
possible you may hit something unusual, like a platform incompatibility. For
those, you'll need to see if you can reproduce the issue on that particular
platform and debug it there. You can also reach out to the broader TensorFlow
team by [filing a GitHub issue](https://github.com/tensorflow/tensorflow/issues)
to ask for help.
    498