JNI is the Java Native Interface. It defines a way for code written in the Java programming language to interact with native code, e.g. functions written in C/C++. It's VM-neutral, has support for loading code from dynamic shared libraries, and while cumbersome at times is reasonably efficient.
You should read through the JNI spec for J2SE 1.6 to get a sense of how JNI works and what features are available. Some aspects of the interface aren't immediately obvious on first reading, so you may find the next few sections handy. For more detail, see the JNI Programmer's Guide and Specification.
JNI defines two key data structures, "JavaVM" and "JNIEnv". Both of these are essentially pointers to pointers to function tables. (In the C++ version, each is a class whose sole member is a pointer to a function table.) The JavaVM provides the "invocation interface" functions, which allow you to create and destroy the VM. In theory you can have multiple VMs per process, but Android's VM only allows one.
The JNIEnv provides most of the JNI functions. Your native functions all receive a JNIEnv as the first argument.
On some VMs, the JNIEnv is used for thread-local storage. For this reason, you cannot share a JNIEnv between threads. If a piece of code has no other way to get its JNIEnv, you should share the JavaVM, and use JavaVM->GetEnv to discover the thread's JNIEnv.
The C declarations of JNIEnv and JavaVM are different from the C++ declarations. "jni.h" provides different typedefs depending on whether it's included into ".c" or ".cpp". For this reason it's a bad idea to include JNIEnv arguments in header files included by both languages. (Put another way: if your header file requires "#ifdef __cplusplus", you may have to do some extra work if anything in that header refers to JNIEnv.)
All VM threads are Linux threads, scheduled by the kernel. They're usually started using Java language features (notably Thread.start()), but they can also be created elsewhere and then attached to the VM. For example, a thread started with pthread_create can be attached with the JNI AttachCurrentThread or AttachCurrentThreadAsDaemon functions. Until a thread is attached to the VM, it has no JNIEnv, and cannot make JNI calls.
Attaching a natively-created thread causes the VM to allocate and initialize a Thread object, add it to the "main" ThreadGroup, and add the thread to the set that is visible to the debugger. Calling AttachCurrentThread on an already-attached thread is a no-op.
The Dalvik VM does not suspend threads executing native code. If garbage collection is in progress, or the debugger has issued a suspend request, the VM will pause the thread the next time it makes a JNI call.
Threads attached through JNI must call DetachCurrentThread before they exit. If coding this directly is awkward, in Android >= 2.0 you can use pthread_key_create to define a destructor function that will be called before the thread exits, and call DetachCurrentThread from there. (Use that key with pthread_setspecific to store the JNIEnv in thread-local storage; that way it'll be passed into your destructor as the argument.)
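If you want to access an object's field from native code, you would do the following:
- Get the class object reference for the class with FindClass
- Get the field ID for the field with GetFieldID
- Get the contents of the field with something appropriate, e.g. GetIntField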
Similarly, to call a method, you'd first get a class object reference and then a method ID. The IDs are often just pointers to internal VM data structures. Looking them up may require several string comparisons, but once you have them the actual call to get the field or invoke the method is very quick.
If performance is important, it's useful to look the values up once and cache the results in your native code. Because we are limiting ourselves to one VM per process, it's reasonable to store this data in a static local structure.
The class references, field IDs, and method IDs are guaranteed valid until the class is unloaded. Classes are only unloaded if all classes associated with a ClassLoader can be garbage collected, which is rare but not impossible in our system. Note however that the jclass is a class reference and must be protected with a call to NewGlobalRef (see the next section).
If you would like to cache the IDs when a class is loaded, and automatically re-cache them if the class is ever unloaded and reloaded, the correct way to initialize the IDs is to add a piece of code that looks like this to the appropriate class:
    /*
     * We use a class initializer to allow the native code to cache some
     * field offsets.
     */

    /*
     * A native function that looks up and caches interesting
     * class/field/method IDs for this class.  Returns false on failure.
     */
    native private static boolean nativeClassInit();

    /*
     * Invoke the native initializer when the class is loaded.
     */
    static {
        if (!nativeClassInit())
            throw new RuntimeException("native init failed");
    }
Create a nativeClassInit method in your C/C++ code that performs the ID lookups. The code will be executed once, when the class is initialized. If the class is ever unloaded and then reloaded, it will be executed again. (See the implementation of java.io.FileDescriptor for an example in our source tree.)
Every object that JNI returns is a "local reference". This means that it's valid for the duration of the current native method in the current thread. Even if the object itself continues to live on after the native method returns, the reference is not valid. This applies to all subclasses of jobject, including jclass, jstring, and jarray. (The Dalvik VM will warn you about most reference misuses when extended JNI checks are enabled.)
If you want to hold on to a reference for a longer period, you must use a "global" reference. The NewGlobalRef function takes the local reference as an argument and returns a global one. The global reference is guaranteed to be valid until you call DeleteGlobalRef.
This pattern is commonly used when caching copies of class objects obtained from FindClass, e.g.:
    jclass localClass = env->FindClass("MyClass");
    jclass globalClass = reinterpret_cast<jclass>(env->NewGlobalRef(localClass));
All JNI methods accept both local and global references as arguments. It's possible for references to the same object to have different values; for example, the return values from consecutive calls to NewGlobalRef on the same object may be different. To see if two references refer to the same object, you must use the IsSameObject function. Never compare references with "==" in native code.
One consequence of this is that you must not assume object references are constant or unique in native code. The 32-bit value representing an object may be different from one invocation of a method to the next, and it's possible that two different objects could have the same 32-bit value on consecutive calls. Do not use jobject values as keys.
Programmers are required to "not excessively allocate" local references. In practical terms this means that if you're creating large numbers of local references, perhaps while running through an array of Objects, you should free them manually with DeleteLocalRef instead of letting JNI do it for you. The VM is only required to reserve slots for 16 local references, so if you need more than that you should either delete as you go or use EnsureLocalCapacity to reserve more.
Note: method and field IDs are just 32-bit identifiers, not object references, and should not be passed to NewGlobalRef. The raw data pointers returned by functions like GetStringUTFChars and GetByteArrayElements are also not objects.
One unusual case deserves separate mention. If you attach a native thread to the VM with AttachCurrentThread, the code you are running will never "return" to the VM until the thread detaches from the VM. Any local references you create will have to be deleted manually unless you're going to detach the thread soon.
The Java programming language uses UTF-16. For convenience, JNI provides methods that work with "modified UTF-8" encoding as well. (Some VMs use the modified UTF-8 internally to store strings; ours do not.) The modified encoding uses only the one-, two-, and three-byte forms (a character outside the Basic Multilingual Plane is written as a surrogate pair, three bytes per surrogate), and stores the NUL character as the two-byte sequence 0xC0 0x80 rather than as a single zero byte. The nice thing about it is that you can count on having C-style zero-terminated strings, suitable for use with standard libc string functions. The down side is that you cannot pass arbitrary UTF-8 data into the VM and expect it to work correctly.
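To see how the modified encoding differs from standard UTF-8, you can compare the two from code written in Java. The sketch below is only an illustration: it relies on the fact that DataOutputStream.writeUTF is documented to emit modified UTF-8 (after a two-byte length prefix, which limits it to 65535 encoded bytes), and the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ModifiedUtf8Demo {
    /** Encodes s as modified UTF-8, dropping the 2-byte length prefix that writeUTF adds. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Only expected if the encoded form exceeds 65535 bytes.
            throw new IllegalArgumentException(e);
        }
        byte[] all = bytes.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    /** Formats bytes as lowercase hex separated by spaces. */
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String nul = "\u0000";
        String supplementary = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        // Standard UTF-8 uses a single zero byte; the modified form avoids it.
        System.out.println(hex(nul.getBytes(StandardCharsets.UTF_8))); // 00
        System.out.println(hex(modifiedUtf8(nul)));                    // c0 80

        // Standard UTF-8 uses one 4-byte sequence; the modified form
        // encodes each surrogate separately, 3 bytes apiece.
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(modifiedUtf8(supplementary).length);                    // 6
    }
}
```

Because the encoded form never contains a zero byte, a native function can append its own terminator and treat the result as a C string.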
It's usually best to operate with UTF-16 strings. With our current VMs, the GetStringChars method does not require a copy, whereas GetStringUTFChars requires a malloc and a UTF conversion. Note that UTF-16 strings are not zero-terminated, and \u0000 is allowed, so you need to hang on to the string length as well as the string pointer.
Don't forget to Release the strings you Get. The string functions return jchar* or jbyte*, which are C-style pointers to primitive data rather than local references. They are guaranteed valid until Release is called, which means they are not released when the native method returns.
Data passed to NewStringUTF must be in "modified" UTF-8 format. A common mistake is reading character data from a file or network stream and handing it to NewStringUTF without filtering it. Unless you know the data is 7-bit ASCII, you need to strip out high-ASCII characters or convert them to proper "modified" UTF-8 form. If you don't, the UTF-16 conversion will likely not be what you expect. The extended JNI checks will scan strings and warn you about invalid data, but they won't catch everything.
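One way to get a feel for which byte sequences are a problem is to run them through a decoder that follows the modified UTF-8 rules. The sketch below uses DataInputStream.readUTF as a stand-in for such a decoder; that is an assumption made for illustration, not a description of what NewStringUTF does internally, and the helper name is hypothetical. Note that readUTF is lenient about a raw zero byte, so this is not a complete validator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;

final class ModifiedUtf8Check {
    /** Returns true if readUTF accepts data as modified UTF-8. */
    static boolean decodesAsModifiedUtf8(byte[] data) {
        if (data.length > 0xFFFF) {
            throw new IllegalArgumentException("readUTF handles at most 65535 bytes");
        }
        // readUTF expects a 2-byte big-endian length prefix before the payload.
        ByteArrayOutputStream framed = new ByteArrayOutputStream();
        framed.write(data.length >>> 8);
        framed.write(data.length & 0xff);
        framed.write(data, 0, data.length);

        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(framed.toByteArray()))) {
            in.readUTF();
            return true;
        } catch (UTFDataFormatException e) {
            return false; // malformed under the modified UTF-8 rules
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
    }

    public static void main(String[] args) {
        byte[] ascii = "hello".getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};              // "café" in ISO-8859-1
        byte[] fourByte = "\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);

        System.out.println(decodesAsModifiedUtf8(ascii));    // true
        System.out.println(decodesAsModifiedUtf8(latin1));   // false
        System.out.println(decodesAsModifiedUtf8(fourByte)); // false
    }
}
```

If the data arrives in code written in Java first, decoding it there with the correct charset and passing the resulting String to native code sidesteps the issue entirely.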
JNI provides functions for accessing the contents of array objects. While arrays of objects must be accessed one entry at a time, arrays of primitives can be read and written directly as if they were declared in C.
To make the interface as efficient as possible without constraining the VM implementation, the Get<PrimitiveType>ArrayElements family of calls allows the VM to either return a pointer to the actual elements, or allocate some memory and make a copy. Either way, the raw pointer returned is guaranteed to be valid until the corresponding Release call is issued (which implies that, if the data wasn't copied, the array object will be pinned down and can't be relocated as part of compacting the heap).
You must Release every array you Get. Also, if the Get call fails, you must ensure that your code doesn't try to Release a NULL pointer later.
You can determine whether or not the data was copied by passing in a non-NULL pointer for the isCopy argument. This is rarely useful.
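The Release call takes a mode argument that can have one of three values. The actions performed by the VM depend upon whether it returned a pointer to the actual data or a copy of it:
- 0: if the pointer refers to the actual data, the array object is un-pinned; if it refers to a copy, the data is copied back and the buffer holding the copy is freed.
- JNI_COMMIT: if the pointer refers to the actual data, nothing happens; if it refers to a copy, the data is copied back but the buffer holding the copy is not freed.
- JNI_ABORT: if the pointer refers to the actual data, the array object is un-pinned and earlier writes are not undone; if it refers to a copy, the buffer is freed and any changes to it are lost.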
One reason for checking the isCopy flag is to know if you need to call Release with JNI_COMMIT after making changes to an array: if you're alternating between making changes and executing code that uses the contents of the array, you may be able to skip the no-op commit. Another possible reason for checking the flag is for efficient handling of JNI_ABORT. For example, you might want to get an array, modify it in place, pass pieces to other functions, and then discard the changes. If you know that JNI is making a new copy for you, there's no need to create another "editable" copy. If JNI is passing you the original, then you do need to make your own copy.
Some have asserted that you can skip the Release call if *isCopy is false. This is not the case. If no copy buffer was allocated, then the original memory must be pinned down and can't be moved by the garbage collector.
Also note that the JNI_COMMIT flag does NOT release the array, and you will need to call Release again with a different flag eventually.
There is an alternative to calls like Get<Type>ArrayElements and GetStringChars that may be very helpful when all you want to do is copy data in or out. Consider the following:
    jbyte* data = env->GetByteArrayElements(array, NULL);
    if (data != NULL) {
        memcpy(buffer, data, len);
        env->ReleaseByteArrayElements(array, data, JNI_ABORT);
    }
This grabs the array, copies the first len byte elements out of it, and then releases the array. Depending upon the VM policies the Get call will either pin or copy the array contents. We copy the data (for perhaps a second time), then call Release; in this case we use JNI_ABORT so there's no chance of a third copy.
We can accomplish the same thing with this:
env->GetByteArrayRegion(array, 0, len, buffer);
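This has several advantages:
- It requires one JNI call instead of two, reducing overhead.
- It doesn't require pinning or extra data copies.
- It reduces the risk of programmer error: there is no risk of forgetting to call Release after something fails.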
Similarly, you can use the Set<Type>ArrayRegion call to copy data into an array, and GetStringRegion or GetStringUTFRegion to copy characters out of a String.
You may not call most JNI functions while an exception is pending. Your code is expected to notice the exception (via the function's return value, ExceptionCheck(), or ExceptionOccurred()) and return, or clear the exception and handle it.
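The only JNI functions that you are allowed to call while an exception is pending are:
- DeleteGlobalRef
- DeleteLocalRef
- DeleteWeakGlobalRef
- ExceptionCheck
- ExceptionClear
- ExceptionDescribe
- ExceptionOccurred
- MonitorExit
- PopLocalFrame
- PushLocalFrame
- Release<PrimitiveType>ArrayElements
- ReleasePrimitiveArrayCritical
- ReleaseStringChars
- ReleaseStringCritical
- ReleaseStringUTFChars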
Many JNI calls can throw an exception, but often provide a simpler way of checking for failure. For example, if NewString returns a non-NULL value, you don't need to check for an exception. However, if you call a method (using a function like CallObjectMethod), you must always check for an exception, because the return value is not going to be valid if an exception was thrown.
Note that exceptions thrown by interpreted code do not "leap over" native code, and C++ exceptions thrown by native code are not handled by Dalvik. The JNI Throw and ThrowNew functions just set an exception pointer in the current thread. Upon returning to the VM from native code, the exception will be noted and handled appropriately.
Native code can "catch" an exception by calling ExceptionCheck or ExceptionOccurred, and clear it with ExceptionClear. As usual, discarding exceptions without handling them can lead to problems.
There are no built-in functions for manipulating the Throwable object itself, so if you want to (say) get the exception string you will need to find the Throwable class, look up the method ID for getMessage "()Ljava/lang/String;", invoke it, and if the result is non-NULL use GetStringUTFChars to get something you can hand to printf or a LOG macro.
JNI does very little error checking. Calling SetIntField on an Object field will succeed, even if the field is marked private and final. The goal is to minimize the overhead on the assumption that, if you've written it in native code, you probably did it for performance reasons.
In Dalvik, you can enable additional checks by setting the "-Xcheck:jni" flag. If the flag is set, the VM directs the JavaVM and JNIEnv pointers to a different table of functions. These functions perform an extended series of checks before calling the standard implementation.
The additional tests include:
Accessibility of methods and fields (i.e. public vs. private) is not checked.
For a description of how to enable CheckJNI for Android apps, see Controlling the Embedded VM. It's currently enabled by default in the Android emulator and on "engineering" device builds.
JNI checks can be modified with the -Xjniopts command-line flag. Currently supported values include:
- forcecopy
- When set, any function that can return a copy of the original data (array of primitive values, UTF-16 chars) will always do so. The buffers are over-allocated and surrounded with a guard pattern to help identify code writing outside the buffer, and the contents are erased before the storage is freed to trip up code that uses the data after calling Release. This will have a noticeable performance impact on some applications.
- warnonly
- By default, JNI "warnings" cause the VM to abort. With this flag it continues on.
You can load native code from shared libraries with the standard System.loadLibrary() call. The preferred way to get at your native code is:
- Call System.loadLibrary() from a static class initializer. (See the earlier example, where one is used to call nativeClassInit().) The argument is the "undecorated" library name, e.g. to load "libfubar.so" you would pass in "fubar".
- Provide a native function: jint JNI_OnLoad(JavaVM* vm, void* reserved)
- In JNI_OnLoad, register all of your native methods. You should declare the methods "static" so the names don't take up space in the symbol table on the device.
The JNI_OnLoad function should look something like this if written in C:
    jint JNI_OnLoad(JavaVM* vm, void* reserved)
    {
        JNIEnv* env;
        if ((*vm)->GetEnv(vm, (void**) &env, JNI_VERSION_1_6) != JNI_OK)
            return -1;

        /* get class with (*env)->FindClass */
        /* register methods with (*env)->RegisterNatives */

        return JNI_VERSION_1_6;
    }
You can also call System.load() with the full path name of the shared library. For Android apps, you may find it useful to get the full path to the application's private data storage area from the context object.
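The relationship between the "undecorated" name and the file on disk can be checked from code written in Java. The sketch below uses System.mapLibraryName, which returns the platform-specific file name for a short library name, and builds the kind of absolute path that System.load() requires. The class name, helper names, and the "lib" directory are hypothetical; on Android you would substitute a directory obtained from the context object.

```java
import java.io.File;

final class LibraryNameDemo {
    /** Returns the platform file name that corresponds to an undecorated library name. */
    static String fileNameFor(String shortName) {
        return System.mapLibraryName(shortName);
    }

    /** Builds an absolute path suitable for System.load() from a directory and short name. */
    static String absolutePathFor(File directory, String shortName) {
        return new File(directory, fileNameFor(shortName)).getAbsolutePath();
    }

    public static void main(String[] args) {
        // On Linux and Android this prints "libfubar.so"; other platforms
        // use their own prefix and suffix.
        System.out.println(fileNameFor("fubar"));

        // System.load() rejects relative paths, so resolve against a directory first.
        // "lib" is a placeholder directory for this sketch.
        System.out.println(absolutePathFor(new File("lib"), "fubar"));
    }
}
```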
Explicit registration from JNI_OnLoad is the recommended approach, but not the only approach. The VM does not require explicit registration, nor that you provide a JNI_OnLoad function. You can instead use "discovery" of native methods that are named in a specific way (see the JNI spec for details), though this is less desirable. It requires more space in the shared object symbol table, loading is slower because it requires string searches through all of the loaded shared libraries, and if a method signature is wrong you won't know about it until the first time the method is actually used.
One other note about JNI_OnLoad: any FindClass calls you make from there will happen in the context of the class loader that was used to load the shared library. Normally FindClass uses the loader associated with the method at the top of the interpreted stack, or if there isn't one (because the thread was just attached to the VM) it uses the "system" class loader. This makes JNI_OnLoad a convenient place to look up and cache class object references.
Android is currently expected to run on 32-bit platforms. In theory it could be built for a 64-bit system, but that is not a goal at this time. For the most part this isn't something that you will need to worry about when interacting with native code, but it becomes significant if you plan to store pointers to native structures in integer fields in an object. To support architectures that use 64-bit pointers, you need to stash your native pointers in a long field rather than an int.
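A small illustration of why the field width matters, seen from the Java side. The Peer class and its nativeHandle field are hypothetical names for the object that would hold a native pointer; the helper simply shows which pointer-sized values survive being squeezed into an int.

```java
final class NativeHandleDemo {
    /** Hypothetical peer object: native code would store its struct pointer here. */
    static final class Peer {
        private long nativeHandle; // wide enough for a 64-bit pointer

        void setHandle(long handle) { nativeHandle = handle; }
        long getHandle() { return nativeHandle; }
    }

    /** Returns true if the value is unchanged after being narrowed to an int and widened back. */
    static boolean survivesIntRoundTrip(long pointerValue) {
        int narrowed = (int) pointerValue;              // what an int field would keep
        return (narrowed & 0xFFFFFFFFL) == pointerValue;
    }

    public static void main(String[] args) {
        Peer peer = new Peer();
        peer.setHandle(0x7f1234567890L);                // an address above 4 GiB
        System.out.println(Long.toHexString(peer.getHandle()));

        System.out.println(survivesIntRoundTrip(0x12345678L));     // true: fits in 32 bits
        System.out.println(survivesIntRoundTrip(0x7f1234567890L)); // false: high bits lost
    }
}
```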
All JNI 1.6 features are supported, with the following exceptions:
- DefineClass is not implemented. Dalvik does not use Java bytecodes or class files, so passing in binary class data doesn't work. Translation facilities may be added in a future version of the VM.
- Weak global references may only be passed to NewLocalRef, NewGlobalRef, and DeleteWeakGlobalRef. (The spec strongly encourages programmers to create hard references to weak globals before doing anything with them, so this should not be at all limiting.)
- GetObjectRefType (new in 1.6) is implemented but not fully functional; it can't always tell the difference between "local" and "global" references.
For backward compatibility, you may need to be aware of:
- Before Android 2.0, it was not possible to use a pthread_key_create destructor function to avoid the VM's "thread must be detached before exit" check. (The VM also uses a pthread key destructor function, so it'd be a race to see which gets called first.)
When working on native code it's not uncommon to see a failure like this:
java.lang.UnsatisfiedLinkError: Library foo not found
In some cases it means what it says — the library wasn't found. In other cases the library exists but couldn't be opened by dlopen(), and the details of the failure can be found in the exception's detail message.
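When tracking down a failure like this, it can help to catch the error at the load site and log its detail message before rethrowing or giving up. The following is a minimal sketch; the class name, helper name, and library name are hypothetical, and a real app would log to logcat rather than standard output.

```java
final class LoadFailureDemo {
    /**
     * Tries to load a library by its undecorated name.
     * Returns null on success, or the error's detail message on failure.
     */
    static String tryLoad(String shortName) {
        try {
            System.loadLibrary(shortName);
            return null;
        } catch (UnsatisfiedLinkError e) {
            // The detail message distinguishes "not found" from dlopen() failures.
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Placeholder name chosen so that the load is expected to fail.
        String detail = tryLoad("no_such_library_for_demo");
        System.out.println(detail == null ? "loaded" : "load failed: " + detail);
    }
}
```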
Common reasons why you might encounter "library not found" exceptions:
- The library doesn't exist or isn't accessible to the app. Use adb shell ls -l <path> to check its presence and permissions.
Another class of UnsatisfiedLinkError failures looks like:
    java.lang.UnsatisfiedLinkError: myfunc
        at Foo.myfunc(Native Method)
        at Foo.main(Foo.java:10)
In logcat, you'll see:
W/dalvikvm( 880): No implementation found for native LFoo;.myfunc ()V
This means that the VM tried to find a matching method but was unsuccessful. Some common reasons for this are:
- C++ functions aren't declared with extern "C". You can use arm-eabi-nm to see the symbols as they appear in the library; if they look mangled (e.g. _Z15Java_Foo_myfuncP7_JNIEnvP7_jclass rather than Java_Foo_myfunc) then you need to adjust the declaration.
- The method signature passed to explicit registration doesn't match the one in the log. Remember that 'B' is byte and 'Z' is boolean. Class name components in signatures start with 'L', end with ';', use '/' to separate package/class names, and use '$' to separate inner-class names (e.g. Ljava/util/Map$Entry;).
Using javah to automatically generate JNI headers may help avoid some problems.
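If you want to double-check a signature string by hand, the rules above are simple enough to reproduce with reflection. The helper below is a sketch written for this purpose (the class and method names are invented here); it derives the JNI signature of a method from its java.lang.reflect.Method so you can compare it with what you pass at registration time.

```java
import java.lang.reflect.Method;

final class JniSignatures {
    /** Returns the JNI type descriptor for a class, e.g. "I" or "Ljava/lang/String;". */
    static String descriptor(Class<?> c) {
        if (c == void.class) return "V";
        if (c == boolean.class) return "Z";
        if (c == byte.class) return "B";
        if (c == char.class) return "C";
        if (c == short.class) return "S";
        if (c == int.class) return "I";
        if (c == long.class) return "J";
        if (c == float.class) return "F";
        if (c == double.class) return "D";
        // getName() already uses '$' for inner classes and "[L...;" for arrays.
        String name = c.getName().replace('.', '/');
        return c.isArray() ? name : "L" + name + ";";
    }

    /** Returns the JNI method signature, e.g. "(I)Ljava/lang/String;". */
    static String signature(Method m) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : m.getParameterTypes()) {
            sb.append(descriptor(p));
        }
        return sb.append(')').append(descriptor(m.getReturnType())).toString();
    }

    /** Convenience wrapper that looks the method up and hides the checked exception. */
    static String signatureOf(Class<?> owner, String name, Class<?>... params) {
        try {
            return signature(owner.getMethod(name, params));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(signatureOf(String.class, "length"));           // ()I
        System.out.println(signatureOf(String.class, "charAt", int.class)); // (I)C
        System.out.println(descriptor(java.util.Map.Entry.class));         // Ljava/util/Map$Entry;
    }
}
```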
Make sure that the class name string has the correct format. JNI class names start with the package name and are separated with slashes, e.g. java/lang/String. If you're looking up an array class, you need to start with the appropriate number of square brackets and must also wrap the class with 'L' and ';', so a one-dimensional array of String would be [Ljava/lang/String;.
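The name format FindClass expects happens to be what Class.getName() returns once dots are replaced with slashes, so you can print the expected string from code written in Java and compare. This is only a sketch with an invented helper name; note that, unlike in method signatures, a plain (non-array) class name is not wrapped in 'L' and ';'.

```java
final class JniClassNames {
    /** Converts a Class to the name form FindClass expects (slashes instead of dots). */
    static String findClassName(Class<?> c) {
        if (c.isPrimitive()) {
            throw new IllegalArgumentException("primitive types have no FindClass name: " + c);
        }
        return c.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(findClassName(String.class));   // java/lang/String
        System.out.println(findClassName(String[].class)); // [Ljava/lang/String;
        System.out.println(findClassName(int[][].class));  // [[I
    }
}
```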
If the class name looks right, you could be running into a class loader issue. FindClass wants to start the class search in the class loader associated with your code. It examines the VM call stack, which will look something like:
    Foo.myfunc(Native Method)
    Foo.main(Foo.java:10)
    dalvik.system.NativeStart.main(Native Method)
The topmost method is Foo.myfunc. FindClass finds the ClassLoader object associated with the Foo class and uses that.
This usually does what you want. You can get into trouble if you create a thread outside the VM (perhaps by calling pthread_create and then attaching it to the VM with AttachCurrentThread). Now the stack trace looks like this:
dalvik.system.NativeStart.run(Native Method)
The topmost method is NativeStart.run, which isn't part of your application. If you call FindClass from this thread, the VM will start in the "system" class loader instead of the one associated with your application, so attempts to find app-specific classes will fail.
There are a few ways to work around this:
- Do your FindClass lookups once, in JNI_OnLoad, and cache the class references for later use. Any FindClass calls made as part of executing JNI_OnLoad will use the class loader associated with the function that called System.loadLibrary (this is a special rule, provided to make library initialization more convenient). If your app code is loading the library, FindClass will use the correct class loader.
- Pass an instance of the class into the functions that need it, for example by declaring your native method to take a Class argument and then passing Foo.class in.
- Cache a reference to the ClassLoader object somewhere handy, and issue loadClass calls directly. This requires some effort.
You may find yourself in a situation where you need to access a large buffer of raw data from code written in Java and C/C++. Common examples include manipulation of bitmaps or sound samples. There are two basic approaches.
You can store the data in a byte[]. This allows very fast access from code written in Java. On the native side, however, you're not guaranteed to be able to access the data without having to copy it. In some implementations, GetByteArrayElements and GetPrimitiveArrayCritical will return actual pointers to the raw data in the managed heap, but in others it will allocate a buffer on the native heap and copy the data over.
The alternative is to store the data in a direct byte buffer. These can be created with java.nio.ByteBuffer.allocateDirect, or the JNI NewDirectByteBuffer function. Unlike regular byte buffers, the storage is not allocated on the managed heap, and can always be accessed directly from native code (get the address with GetDirectBufferAddress). Depending on how direct byte buffer access is implemented in the VM, accessing the data from code written in Java can be very slow.
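On the Java side the two options look like this. The sketch is illustrative only: the class and method names are made up, and setting the buffer to native byte order is an assumption that suits the common case where native code reads multi-byte values straight out of the memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BufferChoiceDemo {
    /** Allocates storage outside the managed heap, addressable from native code. */
    static ByteBuffer newDirectBuffer(int size) {
        // Native order means ints and shorts are laid out the way C code expects.
        return ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder());
    }

    /** Wraps an ordinary byte[] on the managed heap. */
    static ByteBuffer newHeapBuffer(int size) {
        return ByteBuffer.wrap(new byte[size]);
    }

    public static void main(String[] args) {
        ByteBuffer direct = newDirectBuffer(16);
        ByteBuffer heap = newHeapBuffer(16);

        System.out.println(direct.isDirect()); // true
        System.out.println(heap.isDirect());   // false
        System.out.println(heap.hasArray());   // true: backed by a byte[]

        // Both kinds support the same absolute get/put calls from Java code.
        direct.putInt(0, 42);
        heap.putInt(0, 42);
        System.out.println(direct.getInt(0) + " " + heap.getInt(0)); // 42 42
    }
}
```

A native method that receives the direct buffer as a jobject can call GetDirectBufferAddress on it, whereas the heap-backed one has to go through the array functions described earlier.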
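The choice of which to use depends on two factors:
- Will most of the data accesses happen from code written in Java or in C/C++?
- If the data is eventually being passed to a system API, what form must it be in? (For example, if the data is eventually passed to a function that takes a byte[], doing processing in a direct ByteBuffer might be unwise.)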