page.title=Graphics architecture
@jd:body

<!--
    Copyright 2014 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>


<p><em>What every developer should know about Surface, SurfaceHolder, EGLSurface,
SurfaceView, GLSurfaceView, SurfaceTexture, TextureView, and SurfaceFlinger</em>
</p>
<p>This document describes the essential elements of Android's "system-level"
graphics architecture, and how it is used by the application framework and
multimedia system. The focus is on how buffers of graphical data move through
the system. If you've ever wondered why SurfaceView and TextureView behave the
way they do, or how Surface and EGLSurface interact, you've come to the right
place.</p>

<p>Some familiarity with Android devices and application development is assumed.
You don't need detailed knowledge of the app framework, and very few API calls
will be mentioned, but the material herein doesn't overlap much with other
public documentation. The goal here is to provide a sense for the significant
events involved in rendering a frame for output, so that you can make informed
choices when designing an application.
To achieve this, we work from the bottom
up, describing how the UI classes work rather than how they can be used.</p>

<p>Early sections contain background material used in later sections, so it's a
good idea to read straight through rather than skipping to a section that sounds
interesting. We start with an explanation of Android's graphics buffers,
describe the composition and display mechanism, and then proceed to the
higher-level mechanisms that supply the compositor with data.</p>

<p>This document is chiefly concerned with the system as it exists in Android 4.4
("KitKat"). Earlier versions of the system worked differently, and future
versions will likely be different as well. Version-specific features are called
out in a few places.</p>

<p>At various points I will refer to source code from the AOSP sources or from
Grafika. Grafika is a Google open source project for testing; it can be found at
<a href="https://github.com/google/grafika">https://github.com/google/grafika</a>.
It's more "quick hack" than solid example code, but it will suffice.</p>
<h2 id="BufferQueue">BufferQueue and gralloc</h2>

<p>To understand how Android's graphics system works, we have to start behind the
scenes. At the heart of everything graphical in Android is a class called
BufferQueue. Its role is simple enough: connect something that generates
buffers of graphical data (the "producer") to something that accepts the data
for display or further processing (the "consumer"). The producer and consumer
can live in different processes. Nearly everything that moves buffers of
graphical data through the system relies on BufferQueue.</p>

<p>The basic usage is straightforward. The producer requests a free buffer
(<code>dequeueBuffer()</code>), specifying a set of characteristics including width,
height, pixel format, and usage flags.
The producer populates the buffer and
returns it to the queue (<code>queueBuffer()</code>). Some time later, the consumer
acquires the buffer (<code>acquireBuffer()</code>) and makes use of the buffer contents.
When the consumer is done, it returns the buffer to the queue
(<code>releaseBuffer()</code>).</p>

<p>Most recent Android devices support the "sync framework". This allows the
system to do some nifty things when combined with hardware components that can
manipulate graphics data asynchronously. For example, a producer can submit a
series of OpenGL ES drawing commands and then enqueue the output buffer before
rendering completes. The buffer is accompanied by a fence that signals when the
contents are ready. A second fence accompanies the buffer when it is returned
to the free list, so that the consumer can release the buffer while the contents
are still in use. This approach improves latency and throughput as the buffers
move through the system.</p>

<p>Some characteristics of the queue, such as the maximum number of buffers it can
hold, are determined jointly by the producer and the consumer.</p>

<p>The BufferQueue is responsible for allocating buffers as it needs them. Buffers
are retained unless the characteristics change; for example, if the producer
starts requesting buffers with a different size, the old buffers will be freed
and new buffers will be allocated on demand.</p>

<p>The data structure is currently always created and "owned" by the consumer. In
Android 4.3 only the producer side was "binderized", i.e. the producer could be
in a remote process but the consumer had to live in the process where the queue
was created. This evolved a bit in 4.4, moving toward a more general
implementation.</p>

<p>Buffer contents are never copied by BufferQueue. Moving that much data around
would be very inefficient.
Instead, buffers are always passed by handle.</p>

<h3 id="gralloc_HAL">gralloc HAL</h3>

<p>The actual buffer allocations are performed through a memory allocator called
"gralloc", which is implemented through a vendor-specific HAL interface (see
<a href="https://android.googlesource.com/platform/hardware/libhardware/+/kitkat-release/include/hardware/gralloc.h">hardware/libhardware/include/hardware/gralloc.h</a>).
The <code>alloc()</code> function takes the arguments you'd expect -- width,
height, pixel format -- as well as a set of usage flags. Those flags merit
closer attention.</p>

<p>The gralloc allocator is not just another way to allocate memory on the native
heap. In some situations, the allocated memory may not be cache-coherent, or
could be totally inaccessible from user space. The nature of the allocation is
determined by the usage flags, which include attributes like:</p>

<ul>
<li>how often the memory will be accessed from software (CPU)</li>
<li>how often the memory will be accessed from hardware (GPU)</li>
<li>whether the memory will be used as an OpenGL ES ("GLES") texture</li>
<li>whether the memory will be used by a video encoder</li>
</ul>

<p>For example, if your format specifies RGBA 8888 pixels, and you indicate
the buffer will be accessed from software -- meaning your application will touch
pixels directly -- then the allocator needs to create a buffer with 4 bytes per
pixel in R-G-B-A order. If instead you say the buffer will only be
accessed from hardware and as a GLES texture, the allocator can do anything the
GLES driver wants -- BGRA ordering, non-linear "swizzled" layouts, alternative
color formats, etc. Allowing the hardware to use its preferred format can
improve performance.</p>

<p>Some values cannot be combined on certain platforms.
For example, the "video
encoder" flag may require YUV pixels, so adding "software access" and specifying
RGBA 8888 would fail.</p>

<p>The handle returned by the gralloc allocator can be passed between processes
through Binder.</p>

<h2 id="SurfaceFlinger">SurfaceFlinger and Hardware Composer</h2>

<p>Having buffers of graphical data is wonderful, but life is even better when you
get to see them on your device's screen. That's where SurfaceFlinger and the
Hardware Composer HAL come in.</p>

<p>SurfaceFlinger's role is to accept buffers of data from multiple sources,
composite them, and send them to the display. Once upon a time this was done
with software blitting to a hardware framebuffer (e.g.
<code>/dev/graphics/fb0</code>), but those days are long gone.</p>

<p>When an app comes to the foreground, the WindowManager service asks
SurfaceFlinger for a drawing surface. SurfaceFlinger creates a "layer" - the
primary component of which is a BufferQueue - for which SurfaceFlinger acts as
the consumer. A Binder object for the producer side is passed through the
WindowManager to the app, which can then start sending frames directly to
SurfaceFlinger.</p>

<p class="note"><strong>Note:</strong> The WindowManager uses the term "window" instead of
"layer" for this and uses "layer" to mean something else. We're going to use the
SurfaceFlinger terminology. It can be argued that SurfaceFlinger should really
be called LayerFlinger.</p>

<p>For most apps, there will be three layers on screen at any time: the "status
bar" at the top of the screen, the "navigation bar" at the bottom or side, and
the application's UI. Some apps will have more or fewer, e.g. the default home app has a
separate layer for the wallpaper, while a full-screen game might hide the status
bar. Each layer can be updated independently.
The status and navigation bars
are rendered by a system process, while the app layers are rendered by the app,
with no coordination between the two.</p>

<p>Device displays refresh at a certain rate, typically 60 frames per second on
phones and tablets. If the display contents are updated mid-refresh, "tearing"
will be visible; so it's important to update the contents only between cycles.
The system receives a signal from the display when it's safe to update the
contents. For historical reasons we'll call this the VSYNC signal.</p>

<p>The refresh rate may vary over time, e.g. some mobile devices will range from 58
to 62fps depending on current conditions. For an HDMI-attached television, this
could theoretically dip to 24 or 48Hz to match a video. Because we can update
the screen only once per refresh cycle, submitting buffers for display at
200fps would be a waste of effort as most of the frames would never be seen.
Instead of taking action whenever an app submits a buffer, SurfaceFlinger wakes
up when the display is ready for something new.</p>

<p>When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers
looking for new buffers. If it finds a new one, it acquires it; if not, it
continues to use the previously-acquired buffer. SurfaceFlinger always wants to
have something to display, so it will hang on to one buffer. If no buffers have
ever been submitted on a layer, the layer is ignored.</p>

<p>Once SurfaceFlinger has collected all of the buffers for visible layers, it
asks the Hardware Composer how composition should be performed.</p>

<h3 id="hwcomposer">Hardware Composer</h3>

<p>The Hardware Composer HAL ("HWC") was first introduced in Android 3.0
("Honeycomb") and has evolved steadily over the years. Its primary purpose is
to determine the most efficient way to composite buffers with the available
hardware.
As a HAL, it is device-specific and usually
implemented by the display hardware OEM.</p>

<p>The value of this approach is easy to recognize when you consider "overlay
planes." The purpose of overlay planes is to composite multiple buffers
together, but in the display hardware rather than the GPU. For example, suppose
you have a typical Android phone in portrait orientation, with the status bar on
top and navigation bar at the bottom, and app content everywhere else. The contents
for each layer are in separate buffers. You could handle composition by
rendering the app content into a scratch buffer, then rendering the status bar
over it, then rendering the navigation bar on top of that, and finally passing the
scratch buffer to the display hardware. Or, you could pass all three buffers to
the display hardware, and tell it to read data from different buffers for
different parts of the screen. The latter approach can be significantly more
efficient.</p>

<p>As you might expect, the capabilities of different display processors vary
significantly. The number of overlays, whether layers can be rotated or
blended, and restrictions on positioning and overlap can be difficult to express
through an API. So, the HWC works like this:</p>

<ol>
<li>SurfaceFlinger provides the HWC with a full list of layers, and asks, "how do
you want to handle this?"</li>
<li>The HWC responds by marking each layer as "overlay" or "GLES composition."</li>
<li>SurfaceFlinger takes care of any GLES composition, passing the output buffer
to HWC, and lets HWC handle the rest.</li>
</ol>

<p>Since the decision-making code can be custom tailored by the hardware vendor,
it's possible to get the best performance out of every device.</p>

<p>Overlay planes may be less efficient than GLES composition when nothing on the
screen is changing.
This is particularly true when the overlay contents have
transparent pixels, and overlapping layers are being blended together. In such
cases, the HWC can choose to request GLES composition for some or all layers
and retain the composited buffer. If SurfaceFlinger comes back again asking to
composite the same set of buffers, the HWC can just continue to show the
previously-composited scratch buffer. This can improve the battery life of an
idle device.</p>

<p>Devices shipping with Android 4.4 ("KitKat") typically support four overlay
planes. Attempting to composite more layers than there are overlays will cause
the system to use GLES composition for some of them; so the number of layers
used by an application can have a measurable impact on power consumption and
performance.</p>

<p>You can see exactly what SurfaceFlinger is up to with the command <code>adb shell
dumpsys SurfaceFlinger</code>. The output is verbose. The part most relevant to our
current discussion is the HWC summary that appears near the bottom of the
output:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>This tells you what layers are on screen, whether they're being handled with
overlays ("HWC") or OpenGL ES composition ("GLES"), and gives you a bunch of
other facts you probably won't care about ("handle" and "hints" and "flags" and
other stuff that we've trimmed out of the snippet
above). The "source crop" and
"frame" values will be examined more closely later on.</p>

<p>The FB_TARGET layer is where GLES composition output goes. Since all layers
shown above are using overlays, FB_TARGET isn't being used for this frame. The
layer's name is indicative of its original role: On a device with
<code>/dev/graphics/fb0</code> and no overlays, all composition would be done
with GLES, and the output would be written to the framebuffer. On recent devices there
generally is no simple framebuffer, so the FB_TARGET layer is a scratch buffer.</p>

<p class="note"><strong>Note:</strong> This is why screen grabbers written for old versions of Android no
longer work: They're trying to read from the framebuffer, but there is no such
thing.</p>

<p>The overlay planes have another important role: they're the only way to display
DRM content. DRM-protected buffers cannot be accessed by SurfaceFlinger or the
GLES driver, which means that your video will disappear if HWC switches to GLES
composition.</p>

<h3 id="triple-buffering">The Need for Triple-Buffering</h3>

<p>To avoid tearing on the display, the system needs to be double-buffered: the
front buffer is displayed while the back buffer is being prepared. At VSYNC, if
the back buffer is ready, you quickly switch them. This works reasonably well
in a system where you're drawing directly into the framebuffer, but there's a
hitch in the flow when a composition step is added. Because of the way
SurfaceFlinger is triggered, our double-buffered pipeline will have a bubble.</p>

<p>Suppose frame N is being displayed, and frame N+1 has been acquired by
SurfaceFlinger for display on the next VSYNC. (Assume frame N is composited
with an overlay, so we can't alter the buffer contents until the display is done
with it.) When VSYNC arrives, HWC flips the buffers.
While the app is starting
to render frame N+2 into the buffer that used to hold frame N, SurfaceFlinger is
scanning the layer list, looking for updates. SurfaceFlinger won't find any new
buffers, so it prepares to show frame N+1 again after the next VSYNC. A little
while later, the app finishes rendering frame N+2 and queues it for
SurfaceFlinger, but it's too late. This has effectively cut our maximum frame
rate in half.</p>

<p>We can fix this with triple-buffering. Just before VSYNC, frame N is being
displayed, frame N+1 has been composited (or scheduled for an overlay) and is
ready to be displayed, and frame N+2 is queued up and ready to be acquired by
SurfaceFlinger. When the screen flips, the buffers rotate through the stages
with no bubble. The app has just less than a full VSYNC period (16.7ms at 60fps) to
do its rendering and queue the buffer. And SurfaceFlinger / HWC has a full VSYNC
period to figure out the composition before the next flip. The downside is
that it takes at least two VSYNC periods for anything that the app does to
appear on the screen. As the latency increases, the device feels less
responsive to touch input.</p>

<img src="images/surfaceflinger_bufferqueue.png" alt="SurfaceFlinger with BufferQueue" />

<p class="img-caption">
  <strong>Figure 1.</strong> SurfaceFlinger + BufferQueue
</p>

<p>The diagram above depicts the flow of SurfaceFlinger and BufferQueue. During
frame:</p>

<ol>
<li>red buffer fills up, then slides into BufferQueue</li>
<li>after red buffer leaves app, blue buffer slides in, replacing it</li>
<li>green buffer and systemUI* shadow-slide into HWC (showing that SurfaceFlinger
still has the buffers, but now HWC has prepared them for display via overlay on
the next VSYNC).</li>
</ol>

<p>The blue buffer is referenced by both the display and the BufferQueue.
The
app is not allowed to render to it until the associated sync fence signals.</p>

<p>On VSYNC, all of these happen at once:</p>

<ul>
<li>red buffer leaps into SurfaceFlinger, replacing green buffer</li>
<li>green buffer leaps into Display, replacing blue buffer, and a dotted-line
green twin appears in the BufferQueue</li>
<li>the blue buffer's fence is signaled, and the blue buffer in App empties**</li>
<li>display rect changes from &lt;blue + SystemUI&gt; to &lt;green +
SystemUI&gt;</li>
</ul>

<p><strong>*</strong> - The System UI process is providing the status and nav
bars, which for our purposes here aren't changing, so SurfaceFlinger keeps using
the previously-acquired buffer. In practice there would be two separate
buffers, one for the status bar at the top, one for the navigation bar at the
bottom, and they would be sized to fit their contents. Each would arrive on its
own BufferQueue.</p>

<p><strong>**</strong> - The buffer doesn't actually empty; if you submit it
without drawing on it you'll get that same blue again. The emptying is the
result of clearing the buffer contents, which the app should do before it starts
drawing.</p>

<p>We can reduce the latency by noting that layer composition should not require a
full VSYNC period. If composition is performed by overlays, it takes essentially
zero CPU and GPU time. But we can't count on that, so we need to allow a little
time. If the app starts rendering halfway between VSYNC signals, and
SurfaceFlinger defers the HWC setup until a few milliseconds before the signal
is due to arrive, we can cut the latency from 2 frames to perhaps 1.5. In
theory you could render and composite in a single period, allowing a return to
double-buffering; but getting it down that far is difficult on current devices.
Minor fluctuations in rendering and composition time, and switching from
overlays to GLES composition, can cause us to miss a swap deadline and repeat
the previous frame.</p>

<p>SurfaceFlinger's buffer handling demonstrates the fence-based buffer
management mentioned earlier. If we're animating at full speed, we need to
have an acquired buffer for the display ("front") and an acquired buffer for
the next flip ("back"). If we're showing the buffer on an overlay, the
contents are being accessed directly by the display and must not be touched.
But if you look at an active layer's BufferQueue state in the <code>dumpsys
SurfaceFlinger</code> output, you'll see one acquired buffer, one queued buffer, and
one free buffer. That's because, when SurfaceFlinger acquires the new "back"
buffer, it releases the current "front" buffer to the queue. The "front"
buffer is still in use by the display, so anything that dequeues it must wait
for the fence to signal before drawing on it. So long as everybody follows
the fencing rules, all of the queue-management IPC requests can happen in
parallel with the display.</p>

<h3 id="virtual-displays">Virtual Displays</h3>

<p>SurfaceFlinger supports a "primary" display, i.e. what's built into your phone
or tablet, and an "external" display, such as a television connected through
HDMI. It also supports a number of "virtual" displays, which make composited
output available within the system. Virtual displays can be used to record the
screen or send it over a network.</p>

<p>Virtual displays may share the same set of layers as the main display
(the "layer stack") or have their own set. There is no VSYNC for a virtual
display, so the VSYNC for the primary display is used to trigger composition for
all displays.</p>

<p>In the past, virtual displays were always composited with GLES.
The Hardware
Composer managed composition for only the primary display. In Android 4.4, the
Hardware Composer gained the ability to participate in virtual display
composition.</p>

<p>As you might expect, the frames generated for a virtual display are written to a
BufferQueue.</p>

<h3 id="screenrecord">Case study: screenrecord</h3>

<p>Now that we've established some background on BufferQueue and SurfaceFlinger,
it's useful to examine a practical use case.</p>

<p>The <a href="https://android.googlesource.com/platform/frameworks/av/+/kitkat-release/cmds/screenrecord/">screenrecord
command</a>,
introduced in Android 4.4, allows you to record everything that appears on the
screen as an .mp4 file on disk. To implement this, we have to receive composited
frames from SurfaceFlinger, write them to the video encoder, and then write the
encoded video data to a file. The video codecs are managed by a separate
process - called "mediaserver" - so we have to move large graphics buffers around
the system. To make it more challenging, we're trying to record 60fps video at
full resolution. The key to making this work efficiently is BufferQueue.</p>

<p>The MediaCodec class allows an app to provide data as raw bytes in buffers, or
through a Surface. We'll discuss Surface in more detail later, but for now just
think of it as a wrapper around the producer end of a BufferQueue. When
screenrecord requests access to a video encoder, mediaserver creates a
BufferQueue and connects itself to the consumer side, and then passes the
producer side back to screenrecord as a Surface.</p>

<p>The screenrecord command then asks SurfaceFlinger to create a virtual display
that mirrors the main display (i.e. it has all of the same layers), and directs
it to send output to the Surface that came from mediaserver.
Note that, in this
case, SurfaceFlinger is the producer of buffers rather than the consumer.</p>

<p>Once the configuration is complete, screenrecord can just sit and wait for
encoded data to appear. As apps draw, their buffers travel to SurfaceFlinger,
which composites them into a single buffer that gets sent directly to the video
encoder in mediaserver. The full frames are never even seen by the screenrecord
process. Internally, mediaserver has its own way of moving buffers around that
also passes data by handle, minimizing overhead.</p>

<h3 id="simulate-secondary">Case study: Simulate Secondary Displays</h3>

<p>The WindowManager can ask SurfaceFlinger to create a visible layer for which
SurfaceFlinger will act as the BufferQueue consumer. It's also possible to ask
SurfaceFlinger to create a virtual display, for which SurfaceFlinger will act as
the BufferQueue producer. What happens if you connect them, configuring a
virtual display that renders to a visible layer?</p>

<p>You create a closed loop, where the composited screen appears in a window. Of
course, that window is now part of the composited output, so on the next refresh
the composited image inside the window will show the window contents as well.
It's turtles all the way down. You can see this in action by enabling
"<a href="http://developer.android.com/tools/index.html">Developer options</a>" in
settings, selecting "Simulate secondary displays", and enabling a window. For
bonus points, use screenrecord to capture the act of enabling the display, then
play it back frame-by-frame.</p>

<h2 id="surface">Surface and SurfaceHolder</h2>

<p>The <a href="http://developer.android.com/reference/android/view/Surface.html">Surface</a>
class has been part of the public API since 1.0. Its description simply says,
"Handle onto a raw buffer that is being managed by the screen compositor."
The
statement was accurate when initially written but falls well short of the mark
on a modern system.</p>

<p>The Surface represents the producer side of a buffer queue that is often (but
not always!) consumed by SurfaceFlinger. When you render onto a Surface, the
result ends up in a buffer that gets shipped to the consumer. A Surface is not
simply a raw chunk of memory you can scribble on.</p>

<p>The BufferQueue for a display Surface is typically configured for
triple-buffering; but buffers are allocated on demand. So if the producer
generates buffers slowly enough -- maybe it's animating at 30fps on a 60fps
display -- there might only be two allocated buffers in the queue. This helps
minimize memory consumption. You can see a summary of the buffers associated
with every layer in the <code>dumpsys SurfaceFlinger</code> output.</p>

<h3 id="canvas">Canvas Rendering</h3>

<p>Once upon a time, all rendering was done in software, and you can still do this
today. The low-level implementation is provided by the Skia graphics library.
If you want to draw a rectangle, you make a library call, and it sets bytes in a
buffer appropriately. To ensure that a buffer isn't updated by two clients at
once, or written to while being displayed, you have to lock the buffer to access
it. <code>lockCanvas()</code> locks the buffer and returns a Canvas to use for drawing,
and <code>unlockCanvasAndPost()</code> unlocks the buffer and sends it to the compositor.</p>

<p>As time went on, and devices with general-purpose 3D engines appeared, Android
reoriented itself around OpenGL ES. However, it was important to keep the old
API working, for apps as well as app framework code, so an effort was made to
hardware-accelerate the Canvas API.
As you can see from the charts on the
<a href="http://developer.android.com/guide/topics/graphics/hardware-accel.html">Hardware
Acceleration</a>
page, this was a bit of a bumpy ride. Note in particular that while the Canvas
provided to a View's <code>onDraw()</code> method may be hardware-accelerated, the Canvas
obtained when an app locks a Surface directly with <code>lockCanvas()</code> never is.</p>

<p>When you lock a Surface for Canvas access, the "CPU renderer" connects to the
producer side of the BufferQueue and does not disconnect until the Surface is
destroyed. Most other producers (like GLES) can be disconnected and reconnected
to a Surface, but the Canvas-based "CPU renderer" cannot. This means you can't
draw on a surface with GLES or send it frames from a video decoder if you've
ever locked it for a Canvas.</p>

<p>The first time the producer requests a buffer from a BufferQueue, it is
allocated and initialized to zeroes. Initialization is necessary to avoid
inadvertently sharing data between processes. When you re-use a buffer,
however, the previous contents will still be present. If you repeatedly call
<code>lockCanvas()</code> and <code>unlockCanvasAndPost()</code> without
drawing anything, you'll cycle between previously-rendered frames.</p>

<p>The Surface lock/unlock code keeps a reference to the previously-rendered
buffer. If you specify a dirty region when locking the Surface, it will copy
the non-dirty pixels from the previous buffer. There's a fair chance the buffer
will be handled by SurfaceFlinger or HWC; but since we only need to read from
it, there's no need to wait for exclusive access.</p>

<p>The main non-Canvas way for an application to draw directly on a Surface is
through OpenGL ES.
That's described in the <a href="#eglsurface">EGLSurface and
OpenGL ES</a> section.</p>

<h3 id="surfaceholder">SurfaceHolder</h3>

<p>Some things that work with Surfaces want a SurfaceHolder, notably SurfaceView.
The original idea was that Surface represented the raw compositor-managed
buffer, while SurfaceHolder was managed by the app and kept track of
higher-level information like the dimensions and format. The Java-language
definition mirrors the underlying native implementation. It's arguably no
longer useful to split it this way, but it has long been part of the public API.</p>

<p>Generally speaking, anything having to do with a View will involve a
SurfaceHolder. Some other APIs, such as MediaCodec, will operate on the Surface
itself. You can easily get the Surface from the SurfaceHolder, so hang on to
the latter when you have it.</p>

<p>APIs to get and set Surface parameters, such as the size and format, are
implemented through SurfaceHolder.</p>

<h2 id="eglsurface">EGLSurface and OpenGL ES</h2>

<p>OpenGL ES defines an API for rendering graphics. It does not define a windowing
system. To allow GLES to work on a variety of platforms, it is designed to be
combined with a library that knows how to create and access windows through the
operating system. The library used for Android is called EGL. If you want to
draw textured polygons, you use GLES calls; if you want to put your rendering on
the screen, you use EGL calls.</p>

<p>Before you can do anything with GLES, you need to create a GL context. In EGL,
this means creating an EGLContext and an EGLSurface. GLES operations apply to
the current context, which is accessed through thread-local storage rather than
passed around as an argument.
This means you have to be careful about which
thread your rendering code executes on, and which context is current on that
thread.</p>

<p>The EGLSurface can be an off-screen buffer allocated by EGL (called a "pbuffer")
or a window allocated by the operating system. EGL window surfaces are created
with the <code>eglCreateWindowSurface()</code> call. It takes a "window object" as an
argument, which on Android can be a SurfaceView, a SurfaceTexture, a
SurfaceHolder, or a Surface -- all of which have a BufferQueue underneath. When
you make this call, EGL creates a new EGLSurface object, and connects it to the
producer interface of the window object's BufferQueue. From that point onward,
rendering to that EGLSurface results in a buffer being dequeued, rendered into,
and queued for use by the consumer. (The term "window" is indicative of the
expected use, but bear in mind the output might not be destined to appear
on the display.)</p>

<p>EGL does not provide lock/unlock calls. Instead, you issue drawing commands and
then call <code>eglSwapBuffers()</code> to submit the current frame. The
method name comes from the traditional swap of front and back buffers, but the actual
implementation may be very different.</p>

<p>Only one EGLSurface can be associated with a Surface at a time -- you can have
only one producer connected to a BufferQueue -- but if you destroy the
EGLSurface it will disconnect from the BufferQueue and allow something else to
connect.</p>

<p>A given thread can switch between multiple EGLSurfaces by changing what's
"current." An EGLSurface must be current on only one thread at a time.</p>

<p>The most common mistake when thinking about EGLSurface is assuming that it is
just another aspect of Surface (like SurfaceHolder). It's a related but
independent concept.
You can draw on an EGLSurface that isn't backed by a
Surface, and you can use a Surface without EGL. EGLSurface just gives GLES a
place to draw.</p>

<h3 id="anativewindow">ANativeWindow</h3>

<p>The public Surface class is implemented in the Java programming language. The
equivalent in C/C++ is the ANativeWindow class, semi-exposed by the <a
href="https://developer.android.com/tools/sdk/ndk/index.html">Android NDK</a>. You
can get the ANativeWindow from a Surface with the <code>ANativeWindow_fromSurface()</code>
call. Just like its Java-language cousin, you can lock it, render in software,
and unlock-and-post.</p>

<p>To create an EGL window surface from native code, you pass an instance of
EGLNativeWindowType to <code>eglCreateWindowSurface()</code>. EGLNativeWindowType is just
a synonym for ANativeWindow, so you can freely cast one to the other.</p>

<p>The fact that the basic "native window" type just wraps the producer side of a
BufferQueue should not come as a surprise.</p>

<h2 id="surfaceview">SurfaceView and GLSurfaceView</h2>

<p>Now that we've explored the lower-level components, it's time to see how they
fit into the higher-level components that apps are built from.</p>

<p>The Android app framework UI is based on a hierarchy of objects that start with
View. Most of the details don't matter for this discussion, but it's helpful to
understand that UI elements go through a complicated measurement and layout
process that fits them into a rectangular area. All visible View objects are
rendered to a SurfaceFlinger-created Surface that was set up by the
WindowManager when the app was brought to the foreground. The layout and
rendering is performed on the app's UI thread.</p>

<p>Regardless of how many Layouts and Views you have, everything gets rendered into
a single buffer.
This is true whether or not the Views are hardware-accelerated.</p>

<p>A SurfaceView takes the same sorts of parameters as other views, so you can give
it a position and size, and fit other elements around it. When it comes time to
render, however, the contents are completely transparent. The View part of a
SurfaceView is just a see-through placeholder.</p>

<p>When the SurfaceView's View component is about to become visible, the framework
asks the WindowManager to ask SurfaceFlinger to create a new Surface. (This
doesn't happen synchronously, which is why you should provide a callback that
notifies you when the Surface creation finishes.) By default, the new Surface
is placed behind the app UI Surface, but the default "Z-ordering" can be
overridden to put the Surface on top.</p>

<p>Whatever you render onto this Surface will be composited by SurfaceFlinger, not
by the app. This is the real power of SurfaceView: the Surface you get can be
rendered by a separate thread or a separate process, isolated from any rendering
performed by the app UI, and the buffers go directly to SurfaceFlinger. You
can't totally ignore the UI thread -- you still have to coordinate with the
Activity lifecycle, and you may need to adjust something if the size or position
of the View changes -- but you have a whole Surface all to yourself, and
blending with the app UI and other layers is handled by the Hardware Composer.</p>

<p>It's worth taking a moment to note that this new Surface is the producer side of
a BufferQueue whose consumer is a SurfaceFlinger layer. You can update the
Surface with any mechanism that can feed a BufferQueue.
You can use the
Surface-supplied Canvas functions, attach an EGLSurface and draw on it
with GLES, or configure a MediaCodec video decoder to write to it.</p>

<h3 id="composition">Composition and the Hardware Scaler</h3>

<p>Now that we have a bit more context, it's useful to go back and look at a couple
of fields from <code>dumpsys SurfaceFlinger</code> that we skipped over earlier
on. Back in the <a href="#hwcomposer">Hardware Composer</a> discussion, we
looked at some output like this:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>This was taken while playing a movie in Grafika's "Play video (SurfaceView)"
activity, on a Nexus 5 in portrait orientation. Note that the list is ordered
from back to front: the SurfaceView's Surface is in the back, the app UI layer
sits on top of that, followed by the status and navigation bars that are above
everything else. The video is QVGA (320x240).</p>

<p>The "source crop" indicates the portion of the Surface's buffer that
SurfaceFlinger is going to display. The app UI was given a Surface equal to the
full size of the display (1080x1920), but there's no point rendering and
compositing pixels that will be obscured by the status and navigation bars, so
the source is cropped to a rectangle that starts 75 pixels from the top, and
ends 144 pixels from the bottom.
The status and navigation bars have smaller
Surfaces, and the source crop describes a rectangle that begins at the top
left (0,0) and spans their content.</p>

<p>The "frame" is the rectangle where the pixels end up on the display. For the
app UI layer, the frame matches the source crop, because we're copying (or
overlaying) a portion of a display-sized layer to the same location in another
display-sized layer. For the status and navigation bars, the size of the frame
rectangle is the same, but the position is adjusted so that the navigation bar
appears at the bottom of the screen.</p>

<p>Now consider the layer labeled "SurfaceView", which holds our video content.
The source crop matches the video size, which SurfaceFlinger knows because the
MediaCodec decoder (the buffer producer) is dequeuing buffers that size. The
frame rectangle has a completely different size -- 984x738.</p>

<p>SurfaceFlinger handles size differences by scaling the buffer contents to fill
the frame rectangle, upscaling or downscaling as needed. This particular size
was chosen because it has the same aspect ratio as the video (4:3), and is as
wide as possible given the constraints of the View layout (which includes some
padding at the edges of the screen for aesthetic reasons).</p>

<p>If you started playing a different video on the same Surface, the underlying
BufferQueue would reallocate buffers to the new size automatically, and
SurfaceFlinger would adjust the source crop. If the aspect ratio of the new
video is different, the app would need to force a re-layout of the View to match
it, which causes the WindowManager to tell SurfaceFlinger to update the frame
rectangle.</p>

<p>If you're rendering on the Surface through some other means, perhaps GLES, you
can set the Surface size using the <code>SurfaceHolder#setFixedSize()</code>
call.
You could, for example, configure a game to always render at 1280x720,
which would significantly reduce the number of pixels that must be touched to
fill the screen on a 2560x1440 tablet or 4K television. The display processor
handles the scaling. If you don't want to letter- or pillar-box your game, you
could adjust the game's aspect ratio by setting the size so that the narrow
dimension is 720 pixels, but the long dimension is set to maintain the aspect
ratio of the physical display (e.g. 1152x720 to match a 2560x1600 display).
You can see an example of this approach in Grafika's "Hardware scaler
exerciser" activity.</p>

<h3 id="glsurfaceview">GLSurfaceView</h3>

<p>The GLSurfaceView class provides some helper classes that help manage EGL
contexts, inter-thread communication, and interaction with the Activity
lifecycle. That's it. You do not need to use a GLSurfaceView to use GLES.</p>

<p>For example, GLSurfaceView creates a thread for rendering and configures an EGL
context there. The state is cleaned up automatically when the activity pauses.
Most apps won't need to know anything about EGL to use GLES with GLSurfaceView.</p>

<p>In most cases, GLSurfaceView is very helpful and can make working with GLES
easier. In some situations, it can get in the way. Use it if it helps, don't
if it doesn't.</p>

<h2 id="surfacetexture">SurfaceTexture</h2>

<p>The SurfaceTexture class is a relative newcomer, added in Android 3.0
("Honeycomb"). Just as SurfaceView is the combination of a Surface and a View,
SurfaceTexture is the combination of a Surface and a GLES texture. Sort of.</p>

<p>When you create a SurfaceTexture, you are creating a BufferQueue for which your
app is the consumer. When a new buffer is queued by the producer, your app is
notified via callback (<code>onFrameAvailable()</code>).
Your app calls
<code>updateTexImage()</code>, which releases the previously-held buffer,
acquires the new buffer from the queue, and makes some EGL calls to make the
buffer available to GLES as an "external" texture.</p>

<p>External textures (<code>GL_TEXTURE_EXTERNAL_OES</code>) are not quite the
same as textures created by GLES (<code>GL_TEXTURE_2D</code>). You have to
configure your renderer a bit differently, and there are things you can't do
with them. But the key point is this: You can render textured polygons directly
from the data received by your BufferQueue.</p>

<p>You may be wondering how we can guarantee the format of the data in the
buffer is something GLES can recognize -- gralloc supports a wide variety
of formats. When SurfaceTexture created the BufferQueue, it set the consumer's
usage flags to <code>GRALLOC_USAGE_HW_TEXTURE</code>, ensuring that any buffer
created by gralloc would be usable by GLES.</p>

<p>Because SurfaceTexture interacts with an EGL context, you have to be careful to
call its methods from the correct thread. This is spelled out in the class
documentation.</p>

<p>If you look deeper into the class documentation, you will see a couple of odd
calls. One retrieves a timestamp, the other a transformation matrix, the value
of each having been set by the previous call to <code>updateTexImage()</code>.
It turns out that BufferQueue passes more than just a buffer handle to the consumer.
Each buffer is accompanied by a timestamp and transformation parameters.</p>

<p>The transformation is provided for efficiency. In some cases, the source data
might be in the "wrong" orientation for the consumer; but instead of rotating
the data before sending it, we can send the data in its current orientation with
a transform that corrects it.
The transformation matrix can be merged with
other transformations at the point the data is used, minimizing overhead.</p>

<p>The timestamp is useful for certain buffer sources. For example, suppose you
connect the producer interface to the output of the camera (with
<code>setPreviewTexture()</code>). If you want to create a video, you need to
set the presentation time stamp for each frame; but you want to base that on the time
when the frame was captured, not the time when the buffer was received by your
app. The timestamp provided with the buffer is set by the camera code,
resulting in a more consistent series of timestamps.</p>

<h3 id="surfacet">SurfaceTexture and Surface</h3>

<p>If you look closely at the API you'll see the only way for an application
to create a plain Surface is through a constructor that takes a SurfaceTexture
as the sole argument. (Prior to API 11, there was no public constructor for
Surface at all.) This might seem a bit backward if you view SurfaceTexture as a
combination of a Surface and a texture.</p>

<p>Under the hood, SurfaceTexture is called GLConsumer, which more accurately
reflects its role as the owner and consumer of a BufferQueue. When you create a
Surface from a SurfaceTexture, what you're doing is creating an object that
represents the producer side of the SurfaceTexture's BufferQueue.</p>

<h3 id="continuous-capture">Case Study: Grafika's "Continuous Capture" Activity</h3>

<p>The camera can provide a stream of frames suitable for recording as a movie. If
you want to display it on screen, you create a SurfaceView, pass the Surface to
<code>setPreviewDisplay()</code>, and let the producer (camera) and consumer
(SurfaceFlinger) do all the work. If you want to record the video, you create a
Surface with MediaCodec's <code>createInputSurface()</code>, pass that to the
camera, and again you sit back and relax.
If you want to show the video and
record it at the same time, you have to get more involved.</p>

<p>The "Continuous capture" activity displays video from the camera as it's being
recorded. In this case, encoded video is written to a circular buffer in memory
that can be saved to disk at any time. It's straightforward to implement so
long as you keep track of where everything is.</p>

<p>There are three BufferQueues involved. The app uses a SurfaceTexture to receive
frames from Camera, converting them to an external GLES texture. The app
declares a SurfaceView, which we use to display the frames, and we configure a
MediaCodec encoder with an input Surface to create the video. So one
BufferQueue is created by the app, one by SurfaceFlinger, and one by
mediaserver.</p>

<img src="images/continuous_capture_activity.png" alt="Grafika continuous
capture activity" />

<p class="img-caption">
  <strong>Figure 2.</strong> Grafika's continuous capture activity
</p>

<p>In the diagram above, the arrows show the propagation of the data from the
camera. BufferQueues are in color (purple producer, cyan consumer). Note that
Camera actually lives in the mediaserver process.</p>

<p>Encoded H.264 video goes to a circular buffer in RAM in the app process, and is
written to an MP4 file on disk using the MediaMuxer class when the capture
button is hit.</p>

<p>All three of the BufferQueues are handled with a single EGL context in the
app, and the GLES operations are performed on the UI thread. Doing the
SurfaceView rendering on the UI thread is generally discouraged, but since we're
doing simple operations that are handled asynchronously by the GLES driver we
should be fine. (If the video encoder locks up and we block trying to dequeue a
buffer, the app will become unresponsive. But at that point, we're probably
failing anyway.)
The handling of the encoded data -- managing the circular
buffer and writing it to disk -- is performed on a separate thread.</p>

<p>The bulk of the configuration happens in the SurfaceView's <code>surfaceCreated()</code>
callback. The EGLContext is created, and EGLSurfaces are created for the
display and for the video encoder. When a new frame arrives, we tell
SurfaceTexture to acquire it and make it available as a GLES texture, then
render it with GLES commands on each EGLSurface (forwarding the transform and
timestamp from SurfaceTexture). The encoder thread pulls the encoded output
from MediaCodec and stashes it in memory.</p>

<h2 id="texture">TextureView</h2>

<p>The TextureView class was
<a href="http://android-developers.blogspot.com/2011/11/android-40-graphics-and-animations.html">introduced</a>
in Android 4.0 ("Ice Cream Sandwich"). It's the most complex of the View
objects discussed here, combining a View with a SurfaceTexture.</p>

<p>Recall that the SurfaceTexture is a "GL consumer", consuming buffers of graphics
data and making them available as textures. TextureView wraps a SurfaceTexture,
taking over the responsibility of responding to the callbacks and acquiring new
buffers. The arrival of new buffers causes TextureView to issue a View
invalidate request. When asked to draw, the TextureView uses the contents of
the most recently received buffer as its data source, rendering wherever and
however the View state indicates it should.</p>

<p>You can render on a TextureView with GLES just as you would SurfaceView. Just
pass the SurfaceTexture to the EGL window creation call. However, doing so
exposes a potential problem.</p>

<p>In most of what we've looked at, the BufferQueues have passed buffers between
different processes.
When rendering to a TextureView with GLES, both producer
and consumer are in the same process, and they might even be handled on a single
thread. Suppose we submit several buffers in quick succession from the UI
thread. The EGL buffer swap call will need to dequeue a buffer from the
BufferQueue, and it will stall until one is available. There won't be any
available until the consumer acquires one for rendering, but that also happens
on the UI thread so we're stuck.</p>

<p>The solution is to have BufferQueue ensure there is always a buffer
available to be dequeued, so the buffer swap never stalls. One way to guarantee
this is to have BufferQueue discard the contents of the previously-queued buffer
when a new buffer is queued, and to place restrictions on minimum buffer counts
and maximum acquired buffer counts. (If your queue has three buffers, and all
three buffers are acquired by the consumer, then there's nothing to dequeue and
the buffer swap call must hang or fail. So we need to prevent the consumer from
acquiring more than two buffers at once.) Dropping buffers is usually
undesirable, so it's only enabled in specific situations, such as when the
producer and consumer are in the same process.</p>

<h3 id="surface-or-texture">SurfaceView or TextureView?</h3>
<p>SurfaceView and TextureView fill similar roles, but have very different
implementations. To decide which is best requires an understanding of the
trade-offs.</p>

<p>Because TextureView is a proper citizen of the View hierarchy, it behaves like
any other View, and can overlap or be overlapped by other elements. You can
perform arbitrary transformations and retrieve the contents as a bitmap with
simple API calls.</p>

<p>The main strike against TextureView is the performance of the composition step.
With SurfaceView, the content is written to a separate layer that SurfaceFlinger
composites, ideally with an overlay. With TextureView, the View composition is
always performed with GLES, and updates to its contents may cause other View
elements to redraw as well (e.g. if they're positioned on top of the
TextureView). After the View rendering completes, the app UI layer must then be
composited with other layers by SurfaceFlinger, so you're effectively
compositing every visible pixel twice. For a full-screen video player, or any
other application that is effectively just UI elements layered on top of video,
SurfaceView offers much better performance.</p>

<p>As noted earlier, DRM-protected video can be presented only on an overlay plane.
Video players that support protected content must be implemented with
SurfaceView.</p>

<h3 id="grafika">Case Study: Grafika's Play Video (TextureView)</h3>

<p>Grafika includes a pair of video players, one implemented with TextureView, the
other with SurfaceView. The video decoding portion, which just sends frames
from MediaCodec to a Surface, is the same for both. The most interesting
differences between the implementations are the steps required to present the
correct aspect ratio.</p>

<p>While SurfaceView requires a custom implementation of FrameLayout, resizing
SurfaceTexture is a simple matter of configuring a transformation matrix with
<code>TextureView#setTransform()</code>. For the former, you're sending new
window position and size values to SurfaceFlinger through WindowManager; for
the latter, you're just rendering it differently.</p>

<p>Otherwise, both implementations follow the same pattern. Once the Surface has
been created, playback is enabled. When "play" is hit, a video decoding thread
is started, with the Surface as the output target.
After that, the app code
doesn't have to do anything -- composition and display will either be handled by
SurfaceFlinger (for the SurfaceView) or by TextureView.</p>

<h3 id="decode">Case Study: Grafika's Double Decode</h3>

<p>This activity demonstrates manipulation of the SurfaceTexture inside a
TextureView.</p>

<p>The basic structure of this activity is a pair of TextureViews that show two
different videos playing side-by-side. To simulate the needs of a
videoconferencing app, we want to keep the MediaCodec decoders alive when the
activity is paused and resumed for an orientation change. The trick is that you
can't change the Surface that a MediaCodec decoder uses without fully
reconfiguring it, which is a fairly expensive operation; so we want to keep the
Surface alive. The Surface is just a handle to the producer interface in the
SurfaceTexture's BufferQueue, and the SurfaceTexture is managed by the
TextureView, so we also need to keep the SurfaceTexture alive. So how do we deal
with the TextureView getting torn down?</p>

<p>It just so happens TextureView provides a <code>setSurfaceTexture()</code> call
that does exactly what we want. We obtain references to the SurfaceTextures
from the TextureViews and save them in a static field. When the activity is
shut down, we return "false" from the <code>onSurfaceTextureDestroyed()</code>
callback to prevent destruction of the SurfaceTexture. When the activity is
restarted, we stuff the old SurfaceTexture into the new TextureView. The
TextureView class takes care of creating and destroying the EGL contexts.</p>

<p>Each video decoder is driven from a separate thread. At first glance it might
seem like we need EGL contexts local to each thread; but remember the buffers
with decoded output are actually being sent from mediaserver to our
BufferQueue consumers (the SurfaceTextures).
The TextureViews take care of the
rendering for us, and they execute on the UI thread.</p>

<p>Implementing this activity with SurfaceView would be a bit harder. We can't
just create a pair of SurfaceViews and direct the output to them, because the
Surfaces would be destroyed during an orientation change. Besides, that would
add two layers, and limitations on the number of available overlays strongly
motivate us to keep the number of layers to a minimum. Instead, we'd want to
create a pair of SurfaceTextures to receive the output from the video decoders,
and then perform the rendering in the app, using GLES to render two textured
quads onto the SurfaceView's Surface.</p>

<h2 id="notes">Conclusion</h2>

<p>We hope this page has provided useful insights into the way Android handles
graphics at the system level.</p>

<p>Some information and advice on related topics can be found in the appendices
that follow.</p>

<h2 id="loops">Appendix A: Game Loops</h2>

<p>A very popular way to implement a game loop looks like this:</p>

<pre>
while (playing) {
    advance state by one frame
    render the new frame
    sleep until it's time to do the next frame
}
</pre>

<p>There are a few problems with this, the most fundamental being the idea that the
game can define what a "frame" is. Different displays will refresh at different
rates, and that rate may vary over time. If you generate frames faster than the
display can show them, you will have to drop one occasionally. If you generate
them too slowly, SurfaceFlinger will periodically fail to find a new buffer to
acquire and will re-show the previous frame. Both of these situations can
cause visible glitches.</p>

<p>What you need to do is match the display's frame rate, and advance game state
according to how much time has elapsed since the previous frame.
There are two
ways to go about this: (1) stuff the BufferQueue full and rely on the "swap
buffers" back-pressure; (2) use Choreographer (API 16+).</p>

<h3 id="stuffing">Queue Stuffing</h3>

<p>This is very easy to implement: just swap buffers as fast as you can. In early
versions of Android this could actually result in a penalty where
<code>SurfaceView#lockCanvas()</code> would put you to sleep for 100ms. Now
it's paced by the BufferQueue, and the BufferQueue is emptied as quickly as
SurfaceFlinger is able.</p>

<p>One example of this approach can be seen in <a
href="https://code.google.com/p/android-breakout/">Android Breakout</a>. It
uses GLSurfaceView, which runs in a loop that calls the application's
onDrawFrame() callback and then swaps the buffer. If the BufferQueue is full,
the <code>eglSwapBuffers()</code> call will wait until a buffer is available.
Buffers become available when SurfaceFlinger releases them, which it does after
acquiring a new one for display. Because this happens on VSYNC, your draw loop
timing will match the refresh rate. Mostly.</p>

<p>There are a couple of problems with this approach. First, the app is tied to
SurfaceFlinger activity, which is going to take different amounts of time
depending on how much work there is to do and whether it's fighting for CPU time
with other processes. Since your game state advances according to the time
between buffer swaps, your animation won't update at a consistent rate. When
running at 60fps with the inconsistencies averaged out over time, though, you
probably won't notice the bumps.</p>

<p>Second, the first couple of buffer swaps are going to happen very quickly
because the BufferQueue isn't full yet. The computed time between frames will
be near zero, so the game will generate a few frames in which nothing happens.
In a game like Breakout, which updates the screen on every refresh, the queue is
always full except when a game is first starting (or un-paused), so the effect
isn't noticeable. A game that pauses animation occasionally and then returns to
as-fast-as-possible mode might see odd hiccups.</p>

<h3 id="choreographer">Choreographer</h3>

<p>Choreographer allows you to set a callback that fires on the next VSYNC. The
actual VSYNC time is passed in as an argument. So even if your app doesn't wake
up right away, you still have an accurate picture of when the display refresh
period began. Using this value, rather than the current time, yields a
consistent time source for your game state update logic.</p>

<p>Unfortunately, the fact that you get a callback after every VSYNC does not
guarantee that your callback will be executed in a timely fashion or that you
will be able to act upon it sufficiently swiftly. Your app will need to detect
situations where it's falling behind and drop frames manually.</p>

<p>The "Record GL app" activity in Grafika provides an example of this. On some
devices (e.g. Nexus 4 and Nexus 5), the activity will start dropping frames if
you just sit and watch. The GL rendering is trivial, but occasionally the View
elements get redrawn, and the measure/layout pass can take a very long time if
the device has dropped into a reduced-power mode. (According to systrace, it
takes 28ms instead of 6ms after the clocks slow on Android 4.4. If you drag
your finger around the screen, it thinks you're interacting with the activity,
so the clock speeds stay high and you'll never drop a frame.)</p>

<p>The simple fix was to drop a frame in the Choreographer callback if the current
time is more than N milliseconds after the VSYNC time. Ideally the value of N
is determined based on previously observed VSYNC intervals.
For example, if the
refresh period is 16.7ms (60fps), you might drop a frame if you're running more
than 15ms late.</p>

<p>If you watch "Record GL app" run, you will see the dropped-frame counter
increase, and even see a flash of red in the border when frames drop. Unless
your eyes are very good, though, you won't see the animation stutter. At 60fps,
the app can drop the occasional frame without anyone noticing so long as the
animation continues to advance at a constant rate. How much you can get away
with depends to some extent on what you're drawing, the characteristics of the
display, and how good the person using the app is at detecting jank.</p>

<h3 id="thread">Thread Management</h3>

<p>Generally speaking, if you're rendering onto a SurfaceView, GLSurfaceView, or
TextureView, you want to do that rendering in a dedicated thread. Never do any
"heavy lifting" or anything that takes an indeterminate amount of time on the
UI thread.</p>

<p>Breakout and "Record GL app" use dedicated renderer threads, and they also
update animation state on that thread. This is a reasonable approach so long as
game state can be updated quickly.</p>

<p>Other games separate the game logic and rendering completely. If you had a
simple game that did nothing but move a block every 100ms, you could have a
dedicated thread that just did this:</p>

<pre>
public void run() {
    try {
        while (true) {
            Thread.sleep(100);        // wait for the next tick
            synchronized (mLock) {
                moveBlock();          // update state under the shared lock
            }
        }
    } catch (InterruptedException ie) {
        // interrupted: fall out of the loop and let the thread finish
    }
}
</pre>

<p>(You may want to base the sleep time off of a fixed clock to prevent drift --
sleep() isn't perfectly consistent, and moveBlock() takes a nonzero amount of
time -- but you get the idea.)</p>

<p>When the draw code wakes up, it just grabs the lock, gets the current position
of the block, releases the lock, and draws.
Instead of doing fractional
movement based on inter-frame delta times, you just have one thread that moves
things along and another thread that draws things wherever they happen to be
when the drawing starts.</p>

<p>For a scene with any complexity you'd want to create a list of upcoming events
sorted by wake time, and sleep until the next event is due, but it's the same
idea.</p>

<h2 id="activity">Appendix B: SurfaceView and the Activity Lifecycle</h2>

<p>When using a SurfaceView, it's considered good practice to render to the Surface
from a thread other than the main UI thread. This raises some questions about
the interaction between that thread and the Activity lifecycle.</p>

<p>First, a little background. For an Activity with a SurfaceView, there are two
separate but interdependent state machines:</p>

<ol>
<li>Application onCreate / onResume / onPause</li>
<li>Surface created / changed / destroyed</li>
</ol>

<p>When the Activity starts, you get callbacks in this order:</p>

<ul>
<li>onCreate</li>
<li>onResume</li>
<li>surfaceCreated</li>
<li>surfaceChanged</li>
</ul>

<p>If you hit "back" you get:</p>

<ul>
<li>onPause</li>
<li>surfaceDestroyed (called just before the Surface goes away)</li>
</ul>

<p>If you rotate the screen, the Activity is torn down and recreated, so you
get the full cycle. If it matters, you can tell that it's a "quick" restart by
checking <code>isFinishing()</code>. (It might be possible to start / stop an
Activity so quickly that <code>surfaceCreated()</code> actually happens after
<code>onPause()</code>.)</p>

<p>If you tap the power button to blank the screen, you only get
<code>onPause()</code> -- no <code>surfaceDestroyed()</code>. The Surface
remains alive, and rendering can continue.
You can even keep getting
Choreographer events if you continue to request them. If you have a lock
screen that forces a different orientation, your Activity may be restarted when
the device is unblanked; but if not, you can come out of screen-blank with the
same Surface you had before.</p>

<p>This raises a fundamental question when using a separate renderer thread with
SurfaceView: Should the lifespan of the thread be tied to that of the Surface or
the Activity? The answer depends on what you want to have happen when the
screen goes blank. There are two basic approaches: (1) start/stop the thread on
Activity start/stop; (2) start/stop the thread on Surface create/destroy.</p>

<p>#1 interacts well with the app lifecycle. We start the renderer thread in
<code>onResume()</code> and stop it in <code>onPause()</code>. It gets a bit
awkward when creating and configuring the thread because sometimes the Surface
will already exist and sometimes it won't (e.g. it's still alive after toggling
the screen with the power button). We have to wait for the surface to be
created before we do some initialization in the thread, but we can't simply do
it in the <code>surfaceCreated()</code> callback because that won't fire again
if the Surface didn't get recreated. So we need to query or cache the Surface
state, and forward it to the renderer thread. Note we have to be a little
careful here passing objects between threads -- it is best to pass the Surface or
SurfaceHolder through a Handler message, rather than just stuffing it into the
thread, to avoid issues on multi-core systems (cf. the <a
href="http://developer.android.com/training/articles/smp.html">Android SMP
Primer</a>).</p>

<p>#2 has a certain appeal because the Surface and the renderer are logically
intertwined.
We start the thread after the Surface has been created, which
avoids some inter-thread communication concerns. Surface created / changed
messages are simply forwarded. We need to make sure rendering stops when the
screen goes blank, and resumes when it un-blanks; this could be a simple matter
of telling Choreographer to stop invoking the frame draw callback. Our
<code>onResume()</code> will need to resume the callbacks if and only if the
renderer thread is running. It may not be so trivial though -- if we animate
based on elapsed time between frames, we could have a very large gap when the
next event arrives; so an explicit pause/resume message may be desirable.</p>

<p>The above is primarily concerned with how the renderer thread is configured and
whether it's executing. A related concern is extracting state from the thread
when the Activity is killed (in <code>onPause()</code> or <code>onSaveInstanceState()</code>).
Approach #1 will work best for that, because once the renderer thread has been
joined its state can be accessed without synchronization primitives.</p>

<p>You can see an example of approach #2 in Grafika's "Hardware scaler exerciser."</p>

<h2 id="tracking">Appendix C: Tracking BufferQueue with systrace</h2>

<p>If you really want to understand how graphics buffers move around, you need to
use systrace. The system-level graphics code is well instrumented, as is much
of the relevant app framework code. Enable the "gfx" and "view" tags, and
generally "sched" as well.</p>

<p>A full description of how to use systrace effectively would fill a rather long
document. One noteworthy item is the presence of BufferQueues in the trace. If
you've used systrace before, you've probably seen them, but maybe weren't sure
what they were.
As an example, if you grab a trace while Grafika's "Play video
(SurfaceView)" is running, you will see a row labeled "SurfaceView". This row
tells you how many buffers were queued up at any given time.</p>

<p>You'll notice the value increments while the app is active -- triggering
the rendering of frames by the MediaCodec decoder -- and decrements while
SurfaceFlinger is doing work, consuming buffers. If you're showing video at
30fps, the queue's value will vary from 0 to 1, because the ~60fps display can
easily keep up with the source. (You'll also notice that SurfaceFlinger is only
waking up when there's work to be done, not 60 times per second. The system tries
very hard to avoid work and will disable VSYNC entirely if nothing is updating
the screen.)</p>

<p>If you switch to "Play video (TextureView)" and grab a new trace, you'll see a
row with a much longer name
("com.android.grafika/com.android.grafika.PlayMovieActivity"). This is the
main UI layer, which is of course just another BufferQueue. Because TextureView
renders into the UI layer, rather than a separate layer, you'll see all of the
video-driven updates here.</p>

<p>For more information about systrace, see the <a
href="http://developer.android.com/tools/help/systrace.html">Android
documentation</a> for the tool.</p>
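<p>To build some intuition for the 0-to-1 queue depth described above before
opening a real trace, a toy simulation can help. The sketch below is not
systrace output and does not use any Android API. It is a hypothetical model
(the class <code>QueueDepthSketch</code> and its <code>simulate()</code> method
are invented for illustration) that assumes a producer queueing one buffer per
30fps frame and a consumer that takes at most one buffer per 60Hz VSYNC, with
time measured in ticks of 1/120 second:</p>

```java
// Toy model only: names and timing assumptions are illustrative, not Android APIs.
class QueueDepthSketch {

    /**
     * Simulates a producer and a VSYNC-driven consumer sharing one queue.
     * Time advances in abstract ticks. Returns {minDepth, maxDepth, consumed}.
     */
    static int[] simulate(int totalTicks, int producerPeriod,
                          int producerOffset, int vsyncPeriod) {
        int depth = 0;      // buffers currently queued
        int min = 0;        // smallest depth observed
        int max = 0;        // largest depth observed
        int consumed = 0;   // VSYNCs on which the consumer had work to do

        for (int t = 0; t < totalTicks; t++) {
            // Producer (e.g. a video decoder) queues a finished frame.
            if (t % producerPeriod == producerOffset) {
                depth++;
            }
            // Consumer wakes on VSYNC and takes one buffer if any is waiting.
            if (t % vsyncPeriod == 0 && depth > 0) {
                depth--;
                consumed++;
            }
            min = Math.min(min, depth);
            max = Math.max(max, depth);
        }
        return new int[] { min, max, consumed };
    }

    public static void main(String[] args) {
        // One second at 1/120s per tick: 30fps producer, 60Hz consumer.
        int[] r = simulate(120, 4, 1, 2);
        System.out.println("min=" + r[0] + " max=" + r[1] + " consumed=" + r[2]);
    }
}
```

<p>In this model the consumer has work on only half of the VSYNCs, which
mirrors the observation that SurfaceFlinger wakes up only when there is
something to do. Making the producer faster than the consumer (for example, a
producer period of 1 tick) shows the depth growing instead; the real
BufferQueue has a bounded number of buffers, which this sketch does not model.</p>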