Home | History | Annotate | Download | only in audio
      1 <html devsite>
      2   <head>
      3     <title>Avoiding Priority Inversion</title>
      4     <meta name="project_path" value="/_project.yaml" />
      5     <meta name="book_path" value="/_book.yaml" />
      6   </head>
      7   <body>
      8   <!--
      9       Copyright 2017 The Android Open Source Project
     10 
     11       Licensed under the Apache License, Version 2.0 (the "License");
     12       you may not use this file except in compliance with the License.
     13       You may obtain a copy of the License at
     14 
     15           http://www.apache.org/licenses/LICENSE-2.0
     16 
     17       Unless required by applicable law or agreed to in writing, software
     18       distributed under the License is distributed on an "AS IS" BASIS,
     19       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     20       See the License for the specific language governing permissions and
     21       limitations under the License.
     22   -->
     23 
     24 
     25 
     26 <p>
     27 This article explains how the Android's audio system attempts to avoid
     28 priority inversion,
     29 and highlights techniques that you can use too.
     30 </p>
     31 
     32 <p>
     33 These techniques may be useful to developers of high-performance
     34 audio apps, OEMs, and SoC providers who are implementing an audio
     35 HAL. Please note implementing these techniques is not
     36 guaranteed to prevent glitches or other failures, particularly if
     37 used outside of the audio context.
     38 Your results may vary, and you should conduct your own
     39 evaluation and testing.
     40 </p>
     41 
     42 <h2 id="background">Background</h2>
     43 
     44 <p>
     45 The Android AudioFlinger audio server and AudioTrack/AudioRecord
     46 client implementation are being re-architected to reduce latency.
     47 This work started in Android 4.1, and continued with further improvements
     48 in 4.2, 4.3, 4.4, and 5.0.
     49 </p>
     50 
     51 <p>
     52 To achieve this lower latency, many changes were needed throughout the system. One
     53 important change is to assign CPU resources to time-critical
     54 threads with a more predictable scheduling policy. Reliable scheduling
     55 allows the audio buffer sizes and counts to be reduced while still
     56 avoiding underruns and overruns.
     57 </p>
     58 
     59 <h2 id="priorityInversion">Priority inversion</h2>
     60 
     61 <p>
     62 <a href="http://en.wikipedia.org/wiki/Priority_inversion">Priority inversion</a>
     63 is a classic failure mode of real-time systems,
     64 where a higher-priority task is blocked for an unbounded time waiting
     65 for a lower-priority task to release a resource such as (shared
     66 state protected by) a
     67 <a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a>.
     68 </p>
     69 
     70 <p>
     71 In an audio system, priority inversion typically manifests as a
     72 <a href="http://en.wikipedia.org/wiki/Glitch">glitch</a>
     73 (click, pop, dropout),
     74 <a href="http://en.wikipedia.org/wiki/Max_Headroom_(character)">repeated audio</a>
     75 when circular buffers
     76 are used, or delay in responding to a command.
     77 </p>
     78 
     79 <p>
     80 A common workaround for priority inversion is to increase audio buffer sizes.
     81 However, this method increases latency and merely hides the problem
     82 instead of solving it. It is better to understand and prevent priority
     83 inversion, as seen below.
     84 </p>
     85 
     86 <p>
     87 In the Android audio implementation, priority inversion is most
     88 likely to occur in these places. And so you should focus your attention here:
     89 </p>
     90 
     91 <ul>
     92 
     93 <li>
     94 between normal mixer thread and fast mixer thread in AudioFlinger
     95 </li>
     96 
     97 <li>
     98 between application callback thread for a fast AudioTrack and
     99 fast mixer thread (they both have elevated priority, but slightly
    100 different priorities)
    101 </li>
    102 
    103 <li>
    104 between application callback thread for a fast AudioRecord and
    105 fast capture thread (similar to previous)
    106 </li>
    107 
    108 <li>
    109 within the audio Hardware Abstraction Layer (HAL) implementation, e.g. for telephony or echo cancellation
    110 </li>
    111 
    112 <li>
    113 within the audio driver in kernel
    114 </li>
    115 
    116 <li>
    117 between AudioTrack or AudioRecord callback thread and other app threads (this is out of our control)
    118 </li>
    119 
    120 </ul>
    121 
    122 <h2 id="commonSolutions">Common solutions</h2>
    123 
    124 <p>
    125 The typical solutions include:
    126 </p>
    127 
    128 <ul>
    129 
    130 <li>
    131 disabling interrupts
    132 </li>
    133 
    134 <li>
    135 priority inheritance mutexes
    136 </li>
    137 
    138 </ul>
    139 
    140 <p>
    141 Disabling interrupts is not feasible in Linux user space, and does
    142 not work for Symmetric Multi-Processors (SMP).
    143 </p>
    144 
    145 
    146 <p>
    147 Priority inheritance
    148 <a href="http://en.wikipedia.org/wiki/Futex">futexes</a>
    149 (fast user-space mutexes) are available
    150 in Linux kernel, but are not currently exposed by the Android C
    151 runtime library
    152 <a href="http://en.wikipedia.org/wiki/Bionic_(software)">Bionic</a>.
    153 They are not used in the audio system because they are relatively heavyweight,
    154 and because they rely on a trusted client.
    155 </p>
    156 
    157 <h2 id="androidTechniques">Techniques used by Android</h2>
    158 
    159 <p>
    160 Experiments started with "try lock" and lock with timeout. These are
    161 non-blocking and bounded blocking variants of the mutex lock
    162 operation. Try lock and lock with timeout worked fairly well but were
    163 susceptible to a couple of obscure failure modes: the
    164 server was not guaranteed to be able to access the shared state if
    165 the client happened to be busy, and the cumulative timeout could
    166 be too long if there was a long sequence of unrelated locks that
    167 all timed out.
    168 </p>
    169 
    170 
    171 <p>
    172 We also use
    173 <a href="http://en.wikipedia.org/wiki/Linearizability">atomic operations</a>
    174 such as:
    175 </p>
    176 
    177 <ul>
    178 <li>increment</li>
    179 <li>bitwise "or"</li>
    180 <li>bitwise "and"</li>
    181 </ul>
    182 
    183 <p>
    184 All of these return the previous value and include the necessary
    185 SMP barriers. The disadvantage is they can require unbounded retries.
    186 In practice, we've found that the retries are not a problem.
    187 </p>
    188 
    189 <p class="note"><strong>Note:</strong> Atomic operations and their interactions with memory barriers
    190 are notoriously badly misunderstood and used incorrectly. We include these methods
    191 here for completeness but recommend you also read the article
    192 <a href="https://developer.android.com/training/articles/smp.html">
    193 SMP Primer for Android</a>
    194 for further information.
    195 </p>
    196 
    197 <p>
    198 We still have and use most of the above tools, and have recently
    199 added these techniques:
    200 </p>
    201 
    202 <ul>
    203 
    204 <li>
    205 Use non-blocking single-reader single-writer
    206 <a href="http://en.wikipedia.org/wiki/Circular_buffer">FIFO queues</a>
    207 for data.
    208 </li>
    209 
    210 <li>
    211 Try to
    212 <i>copy</i>
    213 state rather than
    214 <i>share</i>
    215 state between high- and
    216 low-priority modules.
    217 </li>
    218 
    219 <li>
    220 When state does need to be shared, limit the state to the
    221 maximum-size
    222 <a href="http://en.wikipedia.org/wiki/Word_(computer_architecture)">word</a>
    223 that can be accessed atomically in one-bus operation
    224 without retries.
    225 </li>
    226 
    227 <li>
    228 For complex multi-word state, use a state queue. A state queue
    229 is basically just a non-blocking single-reader single-writer FIFO
    230 queue used for state rather than data, except the writer collapses
    231 adjacent pushes into a single push.
    232 </li>
    233 
    234 <li>
    235 Pay attention to
    236 <a href="http://en.wikipedia.org/wiki/Memory_barrier">memory barriers</a>
    237 for SMP correctness.
    238 </li>
    239 
    240 <li>
    241 <a href="http://en.wikipedia.org/wiki/Trust,_but_verify">Trust, but verify</a>.
    242 When sharing
    243 <i>state</i>
    244 between processes, don't
    245 assume that the state is well-formed. For example, check that indices
    246 are within bounds. This verification isn't needed between threads
    247 in the same process, between mutual trusting processes (which
    248 typically have the same UID). It's also unnecessary for shared
    249 <i>data</i>
    250 such as PCM audio where a corruption is inconsequential.
    251 </li>
    252 
    253 </ul>
    254 
    255 <h2 id="nonBlockingAlgorithms">Non-blocking algorithms</h2>
    256 
    257 <p>
    258 <a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm">Non-blocking algorithms</a>
    259 have been a subject of much recent study.
    260 But with the exception of single-reader single-writer FIFO queues,
    261 we've found them to be complex and error-prone.
    262 </p>
    263 
    264 <p>
    265 Starting in Android 4.2, you can find our non-blocking,
    266 single-reader/writer classes in these locations:
    267 </p>
    268 
    269 <ul>
    270 
    271 <li>
    272 frameworks/av/include/media/nbaio/
    273 </li>
    274 
    275 <li>
    276 frameworks/av/media/libnbaio/
    277 </li>
    278 
    279 <li>
    280 frameworks/av/services/audioflinger/StateQueue*
    281 </li>
    282 
    283 </ul>
    284 
    285 <p>
    286 These were designed specifically for AudioFlinger and are not
    287 general-purpose. Non-blocking algorithms are notorious for being
    288 difficult to debug. You can look at this code as a model. But be
    289 aware there may be bugs, and the classes are not guaranteed to be
    290 suitable for other purposes.
    291 </p>
    292 
    293 <p>
    294 For developers, some of the sample OpenSL ES application code should be updated to
    295 use non-blocking algorithms or reference a non-Android open source library.
    296 </p>
    297 
    298 <p>
    299 We have published an example non-blocking FIFO implementation that is specifically designed for
    300 application code.  See these files located in the platform source directory
    301 <code>frameworks/av/audio_utils</code>:
    302 </p>
    303 <ul>
    304   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/fifo.h">include/audio_utils/fifo.h</a></li>
    305   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/fifo.c">fifo.c</a></li>
    306   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/roundup.h">include/audio_utils/roundup.h</a></li>
    307   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/roundup.c">roundup.c</a></li>
    308 </ul>
    309 
    310 <h2 id="tools">Tools</h2>
    311 
    312 <p>
    313 To the best of our knowledge, there are no automatic tools for
    314 finding priority inversion, especially before it happens. Some
    315 research static code analysis tools are capable of finding priority
    316 inversions if able to access the entire codebase. Of course, if
    317 arbitrary user code is involved (as it is here for the application)
    318 or is a large codebase (as for the Linux kernel and device drivers),
    319 static analysis may be impractical. The most important thing is to
    320 read the code very carefully and get a good grasp on the entire
    321 system and the interactions. Tools such as
    322 <a href="http://developer.android.com/tools/help/systrace.html">systrace</a>
    323 and
    324 <code>ps -t -p</code>
    325 are useful for seeing priority inversion after it occurs, but do
    326 not tell you in advance.
    327 </p>
    328 
    329 <h2 id="aFinalWord">A final word</h2>
    330 
    331 <p>
    332 After all of this discussion, don't be afraid of mutexes. Mutexes
    333 are your friend for ordinary use, when used and implemented correctly
    334 in ordinary non-time-critical use cases. But between high- and
    335 low-priority tasks and in time-sensitive systems mutexes are more
    336 likely to cause trouble.
    337 </p>
    338 
    339   </body>
    340 </html>
    341