Home | History | Annotate | Download | only in audio
      1 page.title=Avoiding Priority Inversion
      2 @jd:body
      3 
      4 <!--
      5     Copyright 2013 The Android Open Source Project
      6 
      7     Licensed under the Apache License, Version 2.0 (the "License");
      8     you may not use this file except in compliance with the License.
      9     You may obtain a copy of the License at
     10 
     11         http://www.apache.org/licenses/LICENSE-2.0
     12 
     13     Unless required by applicable law or agreed to in writing, software
     14     distributed under the License is distributed on an "AS IS" BASIS,
     15     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     16     See the License for the specific language governing permissions and
     17     limitations under the License.
     18 -->
     19 <div id="qv-wrapper">
     20   <div id="qv">
     21     <h2>In this document</h2>
     22     <ol id="auto-toc">
     23     </ol>
     24   </div>
     25 </div>
     26 
     27 <p>
     28 This article explains how the Android's audio system attempts to avoid
     29 priority inversion,
     30 and highlights techniques that you can use too.
     31 </p>
     32 
     33 <p>
     34 These techniques may be useful to developers of high-performance
     35 audio apps, OEMs, and SoC providers who are implementing an audio
     36 HAL. Please note implementing these techniques is not
     37 guaranteed to prevent glitches or other failures, particularly if
     38 used outside of the audio context.
     39 Your results may vary, and you should conduct your own
     40 evaluation and testing.
     41 </p>
     42 
     43 <h2 id="background">Background</h2>
     44 
     45 <p>
     46 The Android AudioFlinger audio server and AudioTrack/AudioRecord
     47 client implementation are being re-architected to reduce latency.
     48 This work started in Android 4.1, and continued with further improvements
     49 in 4.2, 4.3, 4.4, and 5.0.
     50 </p>
     51 
     52 <p>
     53 To achieve this lower latency, many changes were needed throughout the system. One
     54 important change is to assign CPU resources to time-critical
     55 threads with a more predictable scheduling policy. Reliable scheduling
     56 allows the audio buffer sizes and counts to be reduced while still
     57 avoiding underruns and overruns.
     58 </p>
     59 
     60 <h2 id="priorityInversion">Priority inversion</h2>
     61 
     62 <p>
     63 <a href="http://en.wikipedia.org/wiki/Priority_inversion">Priority inversion</a>
     64 is a classic failure mode of real-time systems,
     65 where a higher-priority task is blocked for an unbounded time waiting
     66 for a lower-priority task to release a resource such as (shared
     67 state protected by) a
     68 <a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a>.
     69 </p>
     70 
     71 <p>
     72 In an audio system, priority inversion typically manifests as a
     73 <a href="http://en.wikipedia.org/wiki/Glitch">glitch</a>
     74 (click, pop, dropout),
     75 <a href="http://en.wikipedia.org/wiki/Max_Headroom_(character)">repeated audio</a>
     76 when circular buffers
     77 are used, or delay in responding to a command.
     78 </p>
     79 
     80 <p>
     81 A common workaround for priority inversion is to increase audio buffer sizes.
     82 However, this method increases latency and merely hides the problem
     83 instead of solving it. It is better to understand and prevent priority
     84 inversion, as seen below.
     85 </p>
     86 
     87 <p>
     88 In the Android audio implementation, priority inversion is most
     89 likely to occur in these places. And so you should focus your attention here:
     90 </p>
     91 
     92 <ul>
     93 
     94 <li>
     95 between normal mixer thread and fast mixer thread in AudioFlinger
     96 </li>
     97 
     98 <li>
     99 between application callback thread for a fast AudioTrack and
    100 fast mixer thread (they both have elevated priority, but slightly
    101 different priorities)
    102 </li>
    103 
    104 <li>
    105 between application callback thread for a fast AudioRecord and
    106 fast capture thread (similar to previous)
    107 </li>
    108 
    109 <li>
    110 within the audio Hardware Abstraction Layer (HAL) implementation, e.g. for telephony or echo cancellation
    111 </li>
    112 
    113 <li>
    114 within the audio driver in kernel
    115 </li>
    116 
    117 <li>
    118 between AudioTrack or AudioRecord callback thread and other app threads (this is out of our control)
    119 </li>
    120 
    121 </ul>
    122 
    123 <h2 id="commonSolutions">Common solutions</h2>
    124 
    125 <p>
    126 The typical solutions include:
    127 </p>
    128 
    129 <ul>
    130 
    131 <li>
    132 disabling interrupts
    133 </li>
    134 
    135 <li>
    136 priority inheritance mutexes
    137 </li>
    138 
    139 </ul>
    140 
    141 <p>
    142 Disabling interrupts is not feasible in Linux user space, and does
    143 not work for Symmetric Multi-Processors (SMP).
    144 </p>
    145 
    146 
    147 <p>
    148 Priority inheritance
    149 <a href="http://en.wikipedia.org/wiki/Futex">futexes</a>
    150 (fast user-space mutexes) are available
    151 in Linux kernel, but are not currently exposed by the Android C
    152 runtime library
    153 <a href="http://en.wikipedia.org/wiki/Bionic_(software)">Bionic</a>.
    154 They are not used in the audio system because they are relatively heavyweight,
    155 and because they rely on a trusted client.
    156 </p>
    157 
    158 <h2 id="androidTechniques">Techniques used by Android</h2>
    159 
    160 <p>
    161 Experiments started with "try lock" and lock with timeout. These are
    162 non-blocking and bounded blocking variants of the mutex lock
    163 operation. Try lock and lock with timeout worked fairly well but were
    164 susceptible to a couple of obscure failure modes: the
    165 server was not guaranteed to be able to access the shared state if
    166 the client happened to be busy, and the cumulative timeout could
    167 be too long if there was a long sequence of unrelated locks that
    168 all timed out.
    169 </p>
    170 
    171 
    172 <p>
    173 We also use
    174 <a href="http://en.wikipedia.org/wiki/Linearizability">atomic operations</a>
    175 such as:
    176 </p>
    177 
    178 <ul>
    179 <li>increment</li>
    180 <li>bitwise "or"</li>
    181 <li>bitwise "and"</li>
    182 </ul>
    183 
    184 <p>
    185 All of these return the previous value and include the necessary
    186 SMP barriers. The disadvantage is they can require unbounded retries.
    187 In practice, we've found that the retries are not a problem.
    188 </p>
    189 
    190 <p class="note"><strong>Note:</strong> Atomic operations and their interactions with memory barriers
    191 are notoriously badly misunderstood and used incorrectly. We include these methods
    192 here for completeness but recommend you also read the article
    193 <a href="https://developer.android.com/training/articles/smp.html">
    194 SMP Primer for Android</a>
    195 for further information.
    196 </p>
    197 
    198 <p>
    199 We still have and use most of the above tools, and have recently
    200 added these techniques:
    201 </p>
    202 
    203 <ul>
    204 
    205 <li>
    206 Use non-blocking single-reader single-writer
    207 <a href="http://en.wikipedia.org/wiki/Circular_buffer">FIFO queues</a>
    208 for data.
    209 </li>
    210 
    211 <li>
    212 Try to
    213 <i>copy</i>
    214 state rather than
    215 <i>share</i>
    216 state between high- and
    217 low-priority modules.
    218 </li>
    219 
    220 <li>
    221 When state does need to be shared, limit the state to the
    222 maximum-size
    223 <a href="http://en.wikipedia.org/wiki/Word_(computer_architecture)">word</a>
    224 that can be accessed atomically in one-bus operation
    225 without retries.
    226 </li>
    227 
    228 <li>
    229 For complex multi-word state, use a state queue. A state queue
    230 is basically just a non-blocking single-reader single-writer FIFO
    231 queue used for state rather than data, except the writer collapses
    232 adjacent pushes into a single push.
    233 </li>
    234 
    235 <li>
    236 Pay attention to
    237 <a href="http://en.wikipedia.org/wiki/Memory_barrier">memory barriers</a>
    238 for SMP correctness.
    239 </li>
    240 
    241 <li>
    242 <a href="http://en.wikipedia.org/wiki/Trust,_but_verify">Trust, but verify</a>.
    243 When sharing
    244 <i>state</i>
    245 between processes, don't
    246 assume that the state is well-formed. For example, check that indices
    247 are within bounds. This verification isn't needed between threads
    248 in the same process, between mutual trusting processes (which
    249 typically have the same UID). It's also unnecessary for shared
    250 <i>data</i>
    251 such as PCM audio where a corruption is inconsequential.
    252 </li>
    253 
    254 </ul>
    255 
    256 <h2 id="nonBlockingAlgorithms">Non-blocking algorithms</h2>
    257 
    258 <p>
    259 <a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm">Non-blocking algorithms</a>
    260 have been a subject of much recent study.
    261 But with the exception of single-reader single-writer FIFO queues,
    262 we've found them to be complex and error-prone.
    263 </p>
    264 
    265 <p>
    266 Starting in Android 4.2, you can find our non-blocking,
    267 single-reader/writer classes in these locations:
    268 </p>
    269 
    270 <ul>
    271 
    272 <li>
    273 frameworks/av/include/media/nbaio/
    274 </li>
    275 
    276 <li>
    277 frameworks/av/media/libnbaio/
    278 </li>
    279 
    280 <li>
    281 frameworks/av/services/audioflinger/StateQueue*
    282 </li>
    283 
    284 </ul>
    285 
    286 <p>
    287 These were designed specifically for AudioFlinger and are not
    288 general-purpose. Non-blocking algorithms are notorious for being
    289 difficult to debug. You can look at this code as a model. But be
    290 aware there may be bugs, and the classes are not guaranteed to be
    291 suitable for other purposes.
    292 </p>
    293 
    294 <p>
    295 For developers, some of the sample OpenSL ES application code should be updated to
    296 use non-blocking algorithms or reference a non-Android open source library.
    297 </p>
    298 
    299 <p>
    300 We have published an example non-blocking FIFO implementation that is specifically designed for
    301 application code.  See these files located in the platform source directory
    302 <code>frameworks/av/audio_utils</code>:
    303 </p>
    304 <ul>
    305   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/fifo.h">include/audio_utils/fifo.h</a></li>
    306   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/fifo.c">fifo.c</a></li>
    307   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/roundup.h">include/audio_utils/roundup.h</a></li>
    308   <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/roundup.c">roundup.c</a></li>
    309 </ul>
    310 
    311 <h2 id="tools">Tools</h2>
    312 
    313 <p>
    314 To the best of our knowledge, there are no automatic tools for
    315 finding priority inversion, especially before it happens. Some
    316 research static code analysis tools are capable of finding priority
    317 inversions if able to access the entire codebase. Of course, if
    318 arbitrary user code is involved (as it is here for the application)
    319 or is a large codebase (as for the Linux kernel and device drivers),
    320 static analysis may be impractical. The most important thing is to
    321 read the code very carefully and get a good grasp on the entire
    322 system and the interactions. Tools such as
    323 <a href="http://developer.android.com/tools/help/systrace.html">systrace</a>
    324 and
    325 <code>ps -t -p</code>
    326 are useful for seeing priority inversion after it occurs, but do
    327 not tell you in advance.
    328 </p>
    329 
    330 <h2 id="aFinalWord">A final word</h2>
    331 
    332 <p>
    333 After all of this discussion, don't be afraid of mutexes. Mutexes
    334 are your friend for ordinary use, when used and implemented correctly
    335 in ordinary non-time-critical use cases. But between high- and
    336 low-priority tasks and in time-sensitive systems mutexes are more
    337 likely to cause trouble.
    338 </p>
    339