# heapprofd - Android Heap Profiler

Googlers, for design doc see: http://go/heapprofd-design

**heapprofd requires Android Q.**

heapprofd is a tool that tracks native heap allocations and deallocations of an
Android process within a given time period. The resulting profile can be used
to attribute memory usage to particular function callstacks, supporting a mix
of both native and Java code. The tool should be useful to Android platform
developers and to app developers investigating memory issues.

On debug Android builds, you can profile all apps and most system services.
On "user" builds, you can only use it on apps with the debuggable or
profileable manifest flag.

## Quickstart

<!-- This uses github because gitiles does not allow to get the raw file. -->

Use the `tools/heap_profile` script to heap profile a process. If you are
having trouble, make sure you are using the [latest version](
https://raw.githubusercontent.com/catapult-project/perfetto/master/tools/heap_profile).

See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```
$ tools/heap_profile --name system_server
Profiling active. Press Ctrl+C to terminate.
^CWrote profiles to /tmp/heap_profile-XSKcZ3i (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

This will create a pprof-compatible heap dump when Ctrl+C is pressed.

## Viewing the data

The resulting profile proto contains four views on the data:

* space: how many bytes were allocated but not freed at this callstack at the
  moment the dump was created.
* alloc\_space: how many bytes were allocated (including ones freed at the
  moment of the dump) at this callstack.
* objects: how many allocations without matching frees were done at this
  callstack.
* alloc\_objects: how many allocations (including ones with matching frees) were
  done at this callstack.

**Googlers:** Head to http://pprof/ and upload the gzipped protos to get a
visualization. *Tip: you might want to put `libart.so` as a "Hide regex" when
profiling apps.*

[Speedscope](https://speedscope.app) can also be used to visualize the heap
dump, but will only show the space view. *Tip: Click Left Heavy on the top
left for a good visualization.*

## Sampling interval
heapprofd samples heap allocations. Given a sampling interval of n bytes,
one allocation is sampled, on average, every n bytes allocated. This reduces
the performance impact on the target process. The default sampling interval
is 4096 bytes.

The easiest way to reason about this is to imagine the memory allocations as a
steady stream of one-byte allocations. From this stream, every n-th byte is
selected as a sample, and the corresponding allocation gets attributed the
complete n bytes. As an optimization, we sample allocations larger than the
sampling interval with their true size.

To make this statistically more meaningful, Poisson sampling is employed.
Instead of a static parameter of n bytes, the user can only choose the mean
value around which the interval is distributed. This makes sure frequent small
allocations get sampled as well as infrequent large ones.

## Startup profiling
When a profiling session specifies processes by name and a matching process is
started, it gets profiled from the beginning. The resulting profile will
contain all allocations done between the start of the process and the end
of the profiling session.

On Android, Java apps are usually not started as fresh processes; instead, the
zygote forks and then specializes into the desired app. If the app's name
matches a name specified in the profiling session, profiling will be enabled
as part of the zygote specialization.
The resulting profile contains all allocations done between
that point in zygote specialization and the end of the profiling session.
Some allocations done early in the specialization process are not accounted
for.

The resulting `ProfileProto` will have `from_startup` set to true in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Runtime profiling
When a profiling session is started, all matching processes (by name or PID)
are enumerated and profiling is enabled. The resulting profile will contain
all allocations done between the beginning and the end of the profiling
session.

The resulting `ProfileProto` will have `from_startup` set to false in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Concurrent profiling sessions
If multiple sessions name the same target process (either by name or PID),
only the first relevant session will profile the process. The other sessions
will report that the process had already been profiled when converting to
the pprof-compatible proto.

If you see this message but do not expect any other sessions, run
```
adb shell killall -KILL perfetto
```
to stop any concurrent sessions that may be running.

The resulting `ProfileProto` will have `rejected_concurrent` set to true in
an otherwise empty corresponding `ProcessHeapSamples` message. This does not
get surfaced in the converted pprof-compatible proto.

## Target processes
Depending on the build of Android that heapprofd is run on, some processes
are not eligible to be profiled.

On user builds, only Java applications with either the profileable or the
debuggable manifest flag set can be profiled. Profiling requests for other
processes will result in an empty profile.
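
To see which of the rules in this section applies to a connected device, the
build type can be read from the standard `ro.build.type` system property. The
following is only a sketch: the helper names `device_build_type` and
`describe_profiling_scope` are made up for this example and are not part of
heapprofd or `tools/heap_profile`, and the messages simply restate the rules
given in this section.

```bash
#!/bin/sh
# Hypothetical helpers for illustration; not part of heapprofd.

# Print the build type of the connected device (e.g. "user" or "userdebug").
# adb may emit CRLF line endings, so carriage returns are stripped.
device_build_type() {
  adb shell getprop ro.build.type | tr -d '\r'
}

# Summarize, per this document, what heapprofd may profile on a build type.
describe_profiling_scope() {
  case "$1" in
    user)
      echo "only apps with the profileable or debuggable manifest flag" ;;
    userdebug)
      echo "all processes except a small set of critical services" ;;
    *)
      echo "build type '$1' is not covered by this document" ;;
  esac
}

# Usage (requires a connected device):
#   describe_profiling_scope "$(device_build_type)"
```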

On userdebug builds, all processes except for a small blacklist of critical
services can be profiled. This restriction can be lifted by disabling
SELinux by running `adb shell su root setenforce 0` or by passing
`--disable-selinux` to the `heap_profile` script.

|                         | userdebug setenforce 0 | userdebug | user |
|-------------------------|------------------------|-----------|------|
| critical native service | y                      | n         | n    |
| native service          | y                      | y         | n    |
| app                     | y                      | y         | n    |
| profileable app         | y                      | y         | y    |
| debuggable app          | y                      | y         | y    |

## Troubleshooting

### Buffer overrun
If the rate of allocations is too high for heapprofd to keep up, the profiling
session will end early due to a buffer overrun. If the buffer overrun is
caused by a transient spike in allocations, increasing the shared memory buffer
size (passing `--shmem-size` to heap\_profile) can resolve the issue.
Otherwise the sampling interval can be increased (at the expense of lower
accuracy in the resulting profile) by passing `--interval` to heap\_profile.

### Profile is empty
Check whether your target process is eligible to be profiled by consulting
[Target processes](#target-processes) above.

## Known Issues

* Does not work on x86 platforms (including the Android cuttlefish emulator).

## Ways to count memory

When using heapprofd and interpreting results, it is important to know the
precise meaning of the different memory metrics that can be obtained from the
operating system.

**heapprofd** gives you the number of bytes the target program
requested from the allocator. If you are profiling a Java app from startup,
allocations that happen early in the application's initialization will not be
visible to heapprofd. Native services that do not fork from the Zygote
are not affected by this.
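
Because these byte counts are built from samples, it can help to see how
sampled bytes map back onto allocations. The sketch below models only the
fixed-interval mental picture from [Sampling interval](#sampling-interval):
every `INTERVAL`-th byte of the allocation stream is a sample, and each sample
attributes one full interval to the allocation containing it. It is not
heapprofd's implementation; it leaves out both the Poisson randomization and
the true-size optimization for large allocations, and the function name
`sample_allocations` is made up for this example.

```bash
#!/bin/bash
# Illustrative model only; this is not heapprofd's implementation.
# Usage: sample_allocations INTERVAL SIZE [SIZE...]
# Prints "<size> <attributed_bytes>" for each allocation, in order.
sample_allocations() {
  local interval=$1
  shift
  local until_next=$interval   # bytes left until the next sampled byte
  local size remaining hits
  for size in "$@"; do
    remaining=$size
    hits=0
    # Count how many sampled bytes fall inside this allocation.
    while [ "$remaining" -ge "$until_next" ]; do
      remaining=$((remaining - until_next))
      until_next=$interval
      hits=$((hits + 1))
    done
    # Carry the unsampled tail of this allocation over to the next one.
    until_next=$((until_next - remaining))
    # Each sampled byte attributes one full interval to the allocation.
    echo "$size $((hits * interval))"
  done
}
```

For example, `sample_allocations 4096 1000 1000 1000 1000 1000 10000`
attributes 0 bytes to the first four allocations, 4096 bytes to the fifth and
8192 bytes to the last one: 12288 attributed bytes for 15000 allocated. With a
fixed interval, which allocation is hit depends entirely on its position in
the stream.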

**malloc\_info** is a libc function that gives you information about the
allocator. This can be triggered on userdebug builds by using
`am dumpheap -m <PID> /data/local/tmp/heap.txt`. This will in general be more
than the memory seen by heapprofd because, depending on the allocator, not all
memory is immediately freed. In particular, jemalloc retains some freed memory
in thread caches.

**Heap RSS** is the amount of memory requested from the operating system by the
allocator. This is larger than the previous two numbers because memory can only
be obtained in page-sized chunks, and fragmentation causes some of that memory
to be wasted. This can be obtained by running `adb shell dumpsys meminfo <PID>`
and looking at the "Private Dirty" column.

|                     | heapprofd | malloc\_info | RSS |
|---------------------|-----------|--------------|-----|
| from native startup | x         | x            | x   |
| after zygote init   | x         | x            | x   |
| before zygote init  |           | x            | x   |
| thread caches       |           | x            | x   |
| fragmentation       |           |              | x   |

If you observe high RSS or malloc\_info metrics but heapprofd does not match,
there might be a problem with fragmentation or the allocator.

## Manual instructions
*It is not recommended to use these instructions unless you have advanced
requirements or are developing heapprofd. Proceed with caution.*

### Download trace\_to\_text
Download the latest trace\_to\_text for [Linux](
https://storage.googleapis.com/perfetto/trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919)
or [MacOS](https://storage.googleapis.com/perfetto/trace_to_text-mac-2ba325f95c08e8cd5a78e04fa85ee7f2a97c847e).
This is needed to convert the Perfetto trace to a pprof-compatible file.

Compare the `sha1sum` of this file to the one contained in the file name.

### Start profiling
To start profiling the process `${PID}`, run the following sequence of commands.
Adjust the `INTERVAL` to trade off runtime impact against the accuracy of the
results. If `INTERVAL=1`, every allocation is sampled for maximum accuracy.
Otherwise, a sample is taken every `INTERVAL` bytes on average.

```bash
# PID must already be set to the process ID of the target process.
# Mean sampling interval in bytes.
INTERVAL=4096

# Pipe a text-format trace config to perfetto on the device.
echo '
buffers {
  size_kb: 100024
}

data_sources {
  config {
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {
      sampling_interval_bytes: '${INTERVAL}'
      pid: '${PID}'
    }
  }
}

duration_ms: 20000
' | adb shell perfetto --txt -c - -o /data/misc/perfetto-traces/profile

# Copy the recorded trace from the device to the host.
adb pull /data/misc/perfetto-traces/profile /tmp/profile
```

### Convert to pprof compatible file

While we work on UI support, you can convert the trace into pprof-compatible
heap dumps.

Use the trace\_to\_text file downloaded above, with XXXXXXX replaced with the
`sha1sum` of the file.

```
trace_to_text-linux-XXXXXXX profile /tmp/profile
```

This will create a directory in `/tmp/` containing the heap dumps. Run

```
gzip /tmp/heap_profile-XXXXXX/*.pb
```

to get gzipped protos, which tools handling pprof profile protos expect.

Follow the instructions in [Viewing the data](#viewing-the-data) to visualize
the results.
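
The checksum comparison suggested under "Download trace\_to\_text" can also be
scripted. The helper below is a sketch under a few assumptions: the name
`verify_sha1_suffix` is made up for this example, the expected hash is taken to
be everything after the last `-` in the file name, and `sha1sum` is assumed to
be on the host's `PATH` (on hosts without it, a different SHA-1 tool would be
needed).

```bash
#!/bin/bash
# Hypothetical helper; not part of the heapprofd tooling.
# Checks that a file's SHA-1 matches the hash at the end of its file name.
verify_sha1_suffix() {
  local file=$1
  local name expected actual
  name=$(basename "$file")
  expected=${name##*-}          # text after the last "-" in the file name
  actual=$(sha1sum "$file" | cut -d ' ' -f 1)
  if [ "$actual" = "$expected" ]; then
    echo "OK: $name"
  else
    echo "MISMATCH: $name has sha1 $actual" >&2
    return 1
  fi
}

# Usage:
#   verify_sha1_suffix path/to/downloaded/trace_to_text-file
```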