page.title=Analyzing with Profile GPU Rendering
page.metaDescription=Use the Profile GPU Rendering tool to help you optimize your app's rendering performance.

meta.tags="power"
page.tags="power"

@jd:body

<div id="qv-wrapper">
<div id="qv">

<h2>In this document</h2>
<ol>
  <li>
    <a href="#visrep">Visual Representation</a>
  </li>

  <li>
    <a href="#sam">Stages and Their Meanings</a>

    <ul>
      <li>
        <a href="#ih">Input Handling</a>
      </li>
      <li>
        <a href="#at">Animation</a>
      </li>
      <li>
        <a href="#ml">Measurement/Layout</a>
      </li>
      <li>
        <a href="#draw">Drawing</a>
      </li>
      <li>
        <a href="#su">Sync/Upload</a>
      </li>
      <li>
        <a href="#ic">Issuing Commands</a>
      </li>
      <li>
        <a href="#psb">Processing/Swapping Buffers</a>
      </li>
      <li>
        <a href="#mt">Miscellaneous</a>
      </li>
    </ul>
  </li>
</ol>
</div>
</div>

<p>
The <a href="/studio/profile/dev-options-rendering.html">
Profile GPU Rendering</a> tool indicates the relative time that each stage of
the rendering pipeline takes to render the previous frame. This knowledge
can help you identify bottlenecks in the pipeline, so that you
know what to optimize to improve your app's rendering performance.
</p>

<p>
This page briefly explains what happens during each pipeline stage, and
discusses issues that can cause bottlenecks there. Before reading
this page, you should be familiar with the information presented in the
<a href="/studio/profile/dev-options-rendering.html">Profile GPU
Rendering Walkthrough</a>.
In addition, to understand how all of the
stages fit together, it may be helpful to review
<a href="https://www.youtube.com/watch?v=we6poP0kw6E&index=64&list=PLWz5rJ2EKKc9CBxr3BVjPTPoDPLdPIFCE">
how the rendering pipeline works</a>.
</p>

<h2 id="visrep">Visual Representation</h2>

<p>
The Profile GPU Rendering tool displays stages and their relative times in the
form of a graph: a color-coded histogram. Figure 1 shows an example of
such a display.
</p>

<img src="{@docRoot}topic/performance/images/bars.png">
<p class="img-caption">
<strong>Figure 1.</strong> Profile GPU Rendering Graph
</p>

<p>
Each segment of each vertical bar in the Profile GPU Rendering
graph represents a stage of the pipeline and is highlighted in a specific
color. Figure 2 shows a key to the meaning of each displayed color.
</p>

<img src="{@docRoot}topic/performance/images/s-profiler-legend.png">
<p class="img-caption">
<strong>Figure 2.</strong> Profile GPU Rendering Graph Legend
</p>

<p>
Once you understand what each color signifies,
you can target specific aspects of your
app to try to optimize its rendering performance.
</p>

<h2 id="sam">Stages and Their Meanings</h2>

<p>
This section explains what happens during each stage corresponding
to a color in Figure 2, as well as bottleneck causes to look out for.
</p>


<h3 id="ih">Input Handling</h3>

<p>
The input handling stage of the pipeline measures how long the app
spent handling input events. This metric indicates how long the app
spent executing code called as a result of input event callbacks.
</p>

<h4>When this segment is large</h4>

<p>
High values in this area are typically a result of too much work, or
too-complex work, occurring inside the input-handler event callbacks.
Since these callbacks always occur on the main thread, solutions to this
problem focus on optimizing the work directly, or offloading the work to a
different thread.
</p>

<p>
It's also worth noting that {@link android.support.v7.widget.RecyclerView}
scrolling can appear in this phase.
{@link android.support.v7.widget.RecyclerView} scrolls immediately when it
consumes the touch event. As a result,
it can inflate or populate new item views. For this reason, it's important to
make this operation as fast as possible. Profiling tools like Traceview or
Systrace can help you investigate further.
</p>

<h3 id="at">Animation</h3>

<p>
The Animation stage shows how long it took to evaluate all the
animators that were running in that frame. The most common animators are
{@link android.animation.ObjectAnimator},
{@link android.view.ViewPropertyAnimator}, and
<a href="/training/transitions/overview.html">Transitions</a>.
</p>

<h4>When this segment is large</h4>

<p>
High values in this area are typically a result of work that's executing due
to some property change of the animation. For example, a fling animation,
which scrolls your {@link android.widget.ListView} or
{@link android.support.v7.widget.RecyclerView}, causes large amounts of view
inflation and population.
</p>

<h3 id="ml">Measurement/Layout</h3>

<p>
In order for Android to draw your view items on the screen, it executes
two specific operations across layouts and views in your view hierarchy.
</p>

<p>
First, the system measures the view items. Every view and layout has
specific data that describes the size of the object on the screen. Some views
can have a specific size; others have a size that adapts to the size
of the parent layout container.
</p>

<p>
Second, the system lays out the view items.
Once the system calculates
the sizes of child views, the system can proceed with layout, sizing
and positioning the views on the screen.
</p>

<p>
The system performs measurement and layout not only for the views to be drawn,
but also for the parent hierarchies of those views, all the way up to the root
view.
</p>

<h4>When this segment is large</h4>

<p>
If your app spends a lot of time per frame in this area, it is
usually either because of the sheer volume of views that need to be
laid out, or because of problems such as
<a href="/topic/performance/optimizing-view-hierarchies.html#double">
double taxation</a> at the wrong spot in your
hierarchy. In either of these cases, addressing performance involves
<a href="/topic/performance/optimizing-view-hierarchies.html">improving
the performance of your view hierarchies</a>.
</p>

<p>
Code that you've added to
{@link android.view.View#onLayout(boolean, int, int, int, int)} or
{@link android.view.View#onMeasure(int, int)}
can also cause performance
issues. <a href="/studio/profile/traceview.html">Traceview</a> and
<a href="/studio/profile/systrace.html">Systrace</a> can help you examine
the call stacks to identify problems your code may have.
</p>

<h3 id="draw">Drawing</h3>

<p>
The draw stage translates a view's rendering operations, such as drawing
a background or drawing text, into a sequence of native drawing commands.
The system captures these commands into a display list.
</p>

<p>
The Draw bar records how much time it takes to complete capturing the commands
into the display list, for all the views that needed to be updated on the screen
this frame. The measured time applies to any code that you have added to the UI
objects in your app.
Examples of such code include the
{@link android.view.View#onDraw(android.graphics.Canvas) onDraw()} and
{@link android.view.View#dispatchDraw(android.graphics.Canvas) dispatchDraw()}
methods, and the various <code>draw()</code> methods belonging to the subclasses of the
{@link android.graphics.drawable.Drawable} class.
</p>

<h4>When this segment is large</h4>

<p>
In simplified terms, you can understand this metric as showing how long it took
to run all of the calls to
{@link android.view.View#onDraw(android.graphics.Canvas) onDraw()}
for each invalidated view. This
measurement includes any time spent dispatching draw commands to children and
drawables that may be present. For this reason, when you see this bar spike, the
cause could be that many views suddenly became invalidated. Invalidation
makes it necessary to regenerate the views' display lists. Alternatively, a
lengthy time may be the result of a few custom views that have some extremely
complex logic in their
{@link android.view.View#onDraw(android.graphics.Canvas) onDraw()} methods.
</p>

<h3 id="su">Sync/Upload</h3>

<p>
The Sync &amp; Upload metric represents the time it takes to transfer
bitmap objects from CPU memory to GPU memory during the current frame.
</p>

<p>
As different processors, the CPU and the GPU have different RAM areas
dedicated to processing. When you draw a bitmap on Android, the system
transfers the bitmap to GPU memory before the GPU can render it to the
screen. Then, the GPU caches the bitmap so that the system doesn't need to
transfer the data again unless the texture gets evicted from the GPU texture
cache.
</p>

<p class="note"><strong>Note:</strong> On Lollipop devices, this stage is
purple.
</p>

<h4>When this segment is large</h4>

<p>
All resources for a frame need to reside in GPU memory before they can be
used to draw a frame. This means that a high value for this metric could mean
either a large number of small resource loads or a small number of very large
resources. A common case is when an app displays a single bitmap that's
close to the size of the screen. Another case is when an app displays a
large number of thumbnails.
</p>

<p>
To shrink this bar, you can employ techniques such as:
</p>

<ul>
  <li>
    Ensuring your bitmap resolutions are not much larger than the size at which they
    will be displayed. For example, your app should avoid displaying a 1024x1024
    image as a 48x48 image.
  </li>

  <li>
    Taking advantage of {@link android.graphics.Bitmap#prepareToDraw()}
    to asynchronously pre-upload a bitmap before the next sync phase.
  </li>
</ul>

<h3 id="ic">Issuing Commands</h3>

<p>
The <em>Issue Commands</em> segment represents the time it takes to issue all
of the commands necessary for drawing display lists to the screen.
</p>

<p>
For the system to draw display lists to the screen, it sends the
necessary commands to the GPU. Typically, it performs this action through the
<a href="/guide/topics/graphics/opengl.html">OpenGL ES</a> API.
</p>

<p>
This process takes some time, as the system performs final transformation
and clipping for each command before sending the command to the GPU. Additional
overhead then arises on the GPU side, which computes the final commands. These
commands include final transformations and additional clipping.
</p>

<h4>When this segment is large</h4>

<p>
The time spent in this stage is a direct measure of the complexity and
quantity of display lists that the system renders in a given
frame.
Having many draw operations, especially in cases where
there's a small inherent cost to each draw primitive, could inflate this time.
For example:
</p>

<pre>
// mThousandPointArray holds 1,000 points as [x0, y0, x1, y1, ...]; mPaint is a preallocated Paint.
for (int i = 0; i &lt; 1000; i++) {
    canvas.drawPoint(mThousandPointArray[2 * i], mThousandPointArray[2 * i + 1], mPaint);
}
</pre>

<p>
is a lot more expensive to issue than:
</p>

<pre>
canvas.drawPoints(mThousandPointArray, mPaint);
</pre>

<p>
There isn't always a 1:1 correlation between issuing commands and
actually drawing display lists. Unlike <em>Issue Commands</em>,
which captures the time it takes to send drawing commands to the GPU,
the <em>Draw</em> metric represents the time that it took to capture the drawing
commands into the display list.
</p>

<p>
This difference arises because the system caches display lists
wherever possible. As a result, there are situations where a
scroll, transform, or animation requires the system to re-send a display
list, but not to rebuild it (that is, recapture the drawing
commands) from scratch. In these situations, you can see a high
<em>Issue Commands</em> bar without seeing a high <em>Draw</em> bar.
</p>

<h3 id="psb">Processing/Swapping Buffers</h3>

<p>
Once Android finishes submitting all of its display lists to the GPU,
the system issues one final command to tell the graphics driver that it's
done with the current frame. At this point, the driver can finally present
the updated image to the screen.
</p>

<h4>When this segment is large</h4>

<p>
It's important to understand that the GPU executes work in parallel with the
CPU. The Android system issues draw commands to the GPU, and then moves on to
the next task. The GPU reads those draw commands from a queue and processes
them.
</p>

<p>
In situations where the CPU issues commands faster than the GPU
consumes them, the communications queue between the processors can become
full. When this occurs, the CPU blocks, and waits until there is space in the
queue to place the next command. This full-queue state arises often during the
<em>Swap Buffers</em> stage, because at that point, a whole frame's worth of
commands has been submitted.
</p>

<p>
The key to mitigating this problem is to reduce the complexity of work occurring
on the GPU, in similar fashion to what you would do for the Issue Commands
phase.
</p>


<h3 id="mt">Miscellaneous</h3>

<p>
In addition to the time it takes the rendering system to perform its work,
there's an additional set of work that occurs on the main thread and has
nothing to do with rendering. Time that this work consumes is reported as
<em>misc time</em>. Misc time generally represents work that might be occurring
on the UI thread between two consecutive frames of rendering.
</p>

<h4>When this segment is large</h4>

<p>
If this value is high, it is likely that your app has callbacks, intents, or
other work that should be happening on another thread. Tools such as
<a href="/studio/profile/traceview.html">Method
Tracing</a> or <a href="/studio/profile/systrace.html">Systrace</a> can provide
visibility into the tasks that are running on
the main thread. This information can help you target performance improvements.
</p>
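
<p>
As a rough illustration of moving such work to another thread, the following
plain-Java sketch submits a task to a background executor and lets the caller
continue. The class <code>OffloadSketch</code>, the method
<code>sumAsync()</code>, and the summing task itself are hypothetical stand-ins
for whatever expensive work your app performs; they are not part of any Android
API. In a real app, you would also need to deliver the result back to the main
thread before touching any views.
</p>

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: names here are illustrative, not part of any Android API.
final class OffloadSketch {
    // One background worker thread, marked as a daemon so that it does not
    // keep the process alive after the main thread finishes.
    private static final ExecutorService WORKER =
            Executors.newSingleThreadExecutor(runnable -> {
                Thread thread = new Thread(runnable, "offload-worker");
                thread.setDaemon(true);
                return thread;
            });

    // Stand-in for expensive work: sums the values on the worker thread and
    // returns immediately with a future that completes when the sum is ready.
    static CompletableFuture<Long> sumAsync(int[] values) {
        return CompletableFuture.supplyAsync(() -> {
            long total = 0;
            for (int value : values) {
                total += value;
            }
            return total;
        }, WORKER);
    }

    public static void main(String[] args) {
        CompletableFuture<Long> pending = sumAsync(new int[] {1, 2, 3});
        // The calling thread is free to do other work here. join() blocks,
        // so it is used only to show the result in this standalone sketch.
        System.out.println(pending.join());
    }
}
```

<p>
The point of the sketch is that the calling thread only pays the cost of
submitting the task, not of running it. Blocking on the result from the main
thread, as <code>main()</code> does here for demonstration, would defeat the
purpose in a real app.
</p>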