# heapprofd - Android Heap Profiler

Googlers, for design doc see: http://go/heapprofd-design

**heapprofd requires Android Q.**

heapprofd is a tool that tracks native heap allocations and deallocations of an
Android process within a given time period. The resulting profile can be used
to attribute memory usage to particular function callstacks, supporting a mix
of both native and Java code. The tool should be useful to Android platform
developers and to app developers investigating memory issues.

On debug Android builds, you can profile all apps and most system services.
On "user" builds, you can only use it on apps with the debuggable or
profileable manifest flag.

## Quickstart

<!-- This uses github because gitiles does not allow fetching the raw file. -->

Use the `tools/heap_profile` script to heap profile a process. If you are
having trouble, make sure you are using the [latest version](
https://raw.githubusercontent.com/catapult-project/perfetto/master/tools/heap_profile).

See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```
$ tools/heap_profile --name system_server
Profiling active. Press Ctrl+C to terminate.
^CWrote profiles to /tmp/heap_profile-XSKcZ3i (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

This will create a pprof-compatible heap dump when Ctrl+C is pressed.

## Viewing the data

The resulting profile proto contains four views on the data:

* space: how many bytes were allocated but not freed at this callstack at the
  moment the dump was created.
* alloc\_space: how many bytes were allocated at this callstack, including
  those already freed at the moment of the dump.
* objects: how many allocations without matching frees were done at this
  callstack.
* alloc\_objects: how many allocations (including ones with matching frees) were
  done at this callstack.

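To make the relationship between the four views concrete, here is a minimal Python sketch that derives them from a list of allocation records. The record layout `(callstack, size_bytes, freed)` and the function name `summarize_views` are assumptions made for illustration; they are not part of the profile proto or of heapprofd.

```python
from collections import defaultdict


def summarize_views(allocations):
    """Aggregate (callstack, size_bytes, freed) records into the four views.

    The record layout is hypothetical; it only mirrors the definitions above.
    """
    views = defaultdict(
        lambda: {"space": 0, "alloc_space": 0, "objects": 0, "alloc_objects": 0}
    )
    for callstack, size, freed in allocations:
        entry = views[callstack]
        # alloc_* views count every allocation, freed or not.
        entry["alloc_space"] += size
        entry["alloc_objects"] += 1
        if not freed:
            # space/objects only count allocations still live at dump time.
            entry["space"] += size
            entry["objects"] += 1
    return dict(views)


example = [
    ("main;parse", 100, False),
    ("main;parse", 50, True),
    ("main;render", 200, True),
]
views = summarize_views(example)
```

In this example, `main;render` has a non-zero `alloc_space` but zero `space`, because its only allocation was freed before the dump.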
**Googlers:** Head to http://pprof/ and upload the gzipped protos to get a
visualization. *Tip: you might want to put `libart.so` as a "Hide regex" when
profiling apps.*

[Speedscope](https://speedscope.app) can also be used to visualize the heap
dump, but will only show the space view. *Tip: Click Left Heavy on the top
left for a good visualisation.*

## Sampling interval
heapprofd samples heap allocations. Given a sampling interval of n bytes,
one allocation is sampled, on average, every n bytes allocated. This reduces
the performance impact on the target process. The default sampling interval
is 4096 bytes.

The easiest way to reason about this is to imagine the memory allocations as a
steady stream of one-byte allocations. From this stream, every n-th byte is
selected as a sample, and the corresponding allocation gets attributed the
complete n bytes. As an optimization, we sample allocations larger than the
sampling interval with their true size.

To make this statistically more meaningful, Poisson sampling is employed.
Instead of a static parameter of n bytes, the user can only choose the mean
value around which the interval is distributed. This makes sure frequent small
allocations get sampled as well as infrequent large ones.

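The byte-stream model above can be simulated in a few lines of Python. This is only a sketch of the model as described, not heapprofd's actual sampler: the function name `sample_allocations` is hypothetical, the gaps between sampled bytes are drawn from an exponential distribution with mean `interval`, and the boundary case of an allocation exactly equal to the interval is assumed to count as "large".

```python
import random


def sample_allocations(sizes, interval, seed=0):
    """Return (index, attributed_bytes) for every sampled allocation."""
    rng = random.Random(seed)
    rate = 1.0 / interval
    # Distance in bytes from the current stream position to the next sampled byte.
    until_next = rng.expovariate(rate)
    samples = []
    for index, size in enumerate(sizes):
        if size >= interval:
            # Large allocations are recorded with their true size (assumed >=).
            samples.append((index, size))
            continue
        remaining = size
        hits = 0
        # Count how many sampled bytes fall inside this allocation.
        while until_next <= remaining:
            remaining -= until_next
            hits += 1
            until_next = rng.expovariate(rate)
        until_next -= remaining
        if hits:
            # Each sampled byte stands in for `interval` bytes on average.
            samples.append((index, hits * interval))
    return samples


sizes = [64] * 100_000
estimate = sum(weight for _, weight in sample_allocations(sizes, interval=4096))
```

With many small allocations, `estimate` should land close to `sum(sizes)` even though only a small fraction of the allocations is recorded, which is the trade-off the sampling interval controls.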
## Startup profiling
When a profiling session specifies processes by name and a matching process is
started, it gets profiled from the beginning. The resulting profile will
contain all allocations done between the start of the process and the end
of the profiling session.

On Android, Java apps are usually not started directly. Instead, the zygote
forks and then specializes into the desired app. If the app's name matches a
name specified in the profiling session, profiling will be enabled as part of
the zygote specialization. The resulting profile contains all allocations done
between that point in zygote specialization and the end of the profiling
session. Some allocations done early in the specialization process are not
accounted for.

The resulting `ProfileProto` will have `from_startup` set to true in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Runtime profiling
When a profiling session is started, all matching processes (by name or PID)
are enumerated and profiling is enabled. The resulting profile will contain
all allocations done between the beginning and the end of the profiling
session.

The resulting `ProfileProto` will have `from_startup` set to false in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Concurrent profiling sessions
If multiple sessions name the same target process (either by name or PID),
only the first relevant session will profile the process. The other sessions
will report that the process had already been profiled when converting to
the pprof-compatible proto.

If you see this message but do not expect any other sessions, run
```
adb shell killall -KILL perfetto
```
to stop any concurrent sessions that may be running.

The resulting `ProfileProto` will have `rejected_concurrent` set to true in
the otherwise empty corresponding `ProcessHeapSamples` message. This does not
get surfaced in the converted pprof-compatible proto.

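The `from_startup` and `rejected_concurrent` flags described in these sections can be read as a small decision table. The sketch below uses a plain Python dict as a stand-in for a parsed `ProcessHeapSamples` message; the dict representation and the helper name `describe_process_samples` are assumptions for illustration, not an API provided by Perfetto.

```python
def describe_process_samples(samples):
    """Summarize how a process was (or was not) profiled.

    `samples` is a hypothetical dict holding the two boolean flags named
    in the text; missing flags are treated as false.
    """
    if samples.get("rejected_concurrent", False):
        # Another session was already profiling this process.
        return "rejected: process already profiled by another session"
    if samples.get("from_startup", False):
        return "profiled from process startup"
    return "profiled at runtime (attached to a running process)"


status = describe_process_samples({"rejected_concurrent": True})
```

Checking `rejected_concurrent` first reflects that such a message is otherwise empty, so its other fields carry no information.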
## Target processes
Depending on the build of Android that heapprofd is run on, some processes
are not eligible to be profiled.

On user builds, only Java applications with either the profileable or the
debuggable manifest flag set can be profiled. Profiling requests for other
processes will result in an empty profile.

On userdebug builds, all processes except for a small blacklist of critical
services can be profiled. This restriction can be lifted by disabling
SELinux by running `adb shell su root setenforce 0` or by passing
`--disable-selinux` to the `heap_profile` script.

|                         | userdebug setenforce 0 | userdebug | user |
|-------------------------|------------------------|-----------|------|
| critical native service |            y           |     n     |  n   |
| native service          |            y           |     y     |  n   |
| app                     |            y           |     y     |  n   |
| profileable app         |            y           |     y     |  y   |
| debuggable app          |            y           |     y     |  y   |

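The eligibility table above can be encoded as a lookup, which may be handy in a script that decides whether a profiling request is likely to yield an empty profile. The names `PROFILEABLE_ON` and `can_profile` are hypothetical, and the string keys simply reuse the row and column labels of the table.

```python
# Maps each process kind to the build configurations on which it can be profiled.
ALL_BUILDS = {"userdebug setenforce 0", "userdebug", "user"}
PROFILEABLE_ON = {
    "critical native service": {"userdebug setenforce 0"},
    "native service": {"userdebug setenforce 0", "userdebug"},
    "app": {"userdebug setenforce 0", "userdebug"},
    "profileable app": ALL_BUILDS,
    "debuggable app": ALL_BUILDS,
}


def can_profile(process_kind, build):
    """Return True if the table marks this combination with 'y'."""
    return build in PROFILEABLE_ON[process_kind]


print(can_profile("native service", "user"))
print(can_profile("profileable app", "user"))
```

An unknown process kind raises `KeyError` here, which seems preferable to silently reporting a process as profileable.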
## Troubleshooting

### Buffer overrun
If the rate of allocations is too high for heapprofd to keep up, the profiling
session will end early due to a buffer overrun. If the buffer overrun is
caused by a transient spike in allocations, increasing the shared memory buffer
size (passing `--shmem-size` to `heap_profile`) can resolve the issue.
Otherwise the sampling interval can be increased (at the expense of lower
accuracy in the resulting profile) by passing `--interval` to `heap_profile`.

### Profile is empty
Check whether your target process is eligible to be profiled by consulting
[Target processes](#target-processes) above.

## Known Issues

* Does not work on x86 platforms (including the Android cuttlefish emulator).

## Ways to count memory

When using heapprofd and interpreting results, it is important to know the
precise meaning of the different memory metrics that can be obtained from the
operating system.

**heapprofd** gives you the number of bytes the target program
requested from the allocator. If you are profiling a Java app from startup,
allocations that happen early in the application's initialization will not be
visible to heapprofd. Native services that do not fork from the Zygote
are not affected by this.

**malloc\_info** is a libc function that gives you information about the
allocator. This can be triggered on userdebug builds by using
`am dumpheap -m <PID> /data/local/tmp/heap.txt`. This will in general be more
than the memory seen by heapprofd because, depending on the allocator, not all
memory is immediately freed. In particular, jemalloc retains some freed memory
in thread caches.

**Heap RSS** is the amount of memory requested from the operating system by the
allocator. This is larger than the previous two numbers because memory can only
be obtained in page size chunks, and fragmentation causes some of that memory to
be wasted. This can be obtained by running `adb shell dumpsys meminfo <PID>` and
looking at the "Private Dirty" column.

|                     | heapprofd         | malloc\_info | RSS |
|---------------------|-------------------|--------------|-----|
| from native startup |          x        |      x       |  x  |
| after zygote init   |          x        |      x       |  x  |
| before zygote init  |                   |      x       |  x  |
| thread caches       |                   |      x       |  x  |
| fragmentation       |                   |              |  x  |

If you observe high RSS or malloc\_info metrics but heapprofd does not match,
there might be a problem with fragmentation or the allocator.

## Manual instructions
*It is not recommended to use these instructions unless you have advanced
requirements or are developing heapprofd. Proceed with caution.*

### Download trace\_to\_text
Download the latest trace\_to\_text for [Linux](
https://storage.googleapis.com/perfetto/trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919)
or [MacOS](https://storage.googleapis.com/perfetto/trace_to_text-mac-2ba325f95c08e8cd5a78e04fa85ee7f2a97c847e).
This is needed to convert the Perfetto trace to a pprof-compatible file.

Compare the `sha1sum` of this file to the one contained in the file name.

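The checksum comparison can also be scripted. The following Python sketch assumes, as the download links suggest, that the file name ends with the hex SHA-1 digest of the file's contents; the helper name `sha1_matches_name` is hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_matches_name(path):
    """Return True if the file's SHA-1 hex digest is the suffix of its name."""
    path = Path(path)
    digest = hashlib.sha1(path.read_bytes()).hexdigest()
    return path.name.endswith(digest)
```

For example, `sha1_matches_name("trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919")` would hash the downloaded file and compare the result with the 40 hex characters at the end of its name.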
### Start profiling
To start profiling the process `${PID}`, run the following sequence of commands.
Adjust the `INTERVAL` to trade off runtime impact against the accuracy of the
results. If `INTERVAL=1`, every allocation is sampled for maximum accuracy.
Otherwise, a sample is taken every `INTERVAL` bytes on average.

```bash
# Assumes PID is already set to the ID of the process to profile.
INTERVAL=4096

# Write the trace config to perfetto's stdin; the trace is stored on the device.
echo '
buffers {
  size_kb: 100024
}

data_sources {
  config {
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {
      sampling_interval_bytes: '${INTERVAL}'
      pid: '${PID}'
    }
  }
}

duration_ms: 20000
' | adb shell perfetto --txt -c - -o /data/misc/perfetto-traces/profile

# Copy the finished trace from the device to the host.
adb pull /data/misc/perfetto-traces/profile /tmp/profile
```

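If you prefer to generate the config programmatically rather than splicing shell variables into a quoted string, a sketch along the following lines may help. It only reproduces the fields shown in the config above; the function name `heapprofd_config` and its parameter names are hypothetical.

```python
def heapprofd_config(pid, interval=4096, duration_ms=20000, buffer_kb=100024):
    """Render the text-format trace config shown above for a single PID."""
    # Doubled braces are literal braces inside an f-string.
    return f"""buffers {{
  size_kb: {buffer_kb}
}}

data_sources {{
  config {{
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {{
      sampling_interval_bytes: {interval}
      pid: {pid}
    }}
  }}
}}

duration_ms: {duration_ms}
"""


config = heapprofd_config(pid=1234, interval=1)
print(config)
```

The printed text can be piped into the same `adb shell perfetto --txt -c - -o ...` command used above. The PID `1234` is a placeholder.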
### Convert to pprof compatible file

While we work on UI support, you can convert the trace into pprof-compatible
heap dumps.

Use the trace\_to\_text file downloaded above, with XXXXXXX replaced with the
`sha1sum` of the file.

```
trace_to_text-linux-XXXXXXX profile /tmp/profile
```

This will create a directory in `/tmp/` containing the heap dumps. Run

```
gzip /tmp/heap_profile-XXXXXX/*.pb
```

to get gzipped protos, which tools handling pprof profile protos expect.

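On hosts without the `gzip` command, the compression step can be approximated with the Python standard library. This is a sketch under the assumption that the directory contains `*.pb` files as described above; the helper name `gzip_profiles` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_profiles(directory):
    """Compress every *.pb file in `directory` to *.pb.gz and remove the original."""
    compressed = []
    for pb in sorted(Path(directory).glob("*.pb")):
        target = pb.with_name(pb.name + ".gz")
        with open(pb, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # Mirror the gzip command, which replaces the input file by default.
        pb.unlink()
        compressed.append(target)
    return compressed
```

Calling `gzip_profiles("/tmp/heap_profile-latest")` would compress the dumps behind the symlink that the `heap_profile` script creates.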
Follow the instructions in [Viewing the Data](#viewing-the-data) to visualise
the results.