page.title=Audio Latency
@jd:body

<!--
    Copyright 2010 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>

<p>Audio latency is the time delay as an audio signal passes through a system.
  For a complete description of audio latency for the purposes of Android
  compatibility, see <em>Section 5.4 Audio Latency</em>
  in the <a href="http://source.android.com/compatibility/index.html">Android CDD</a>.
</p>

<h2 id="contributors">Contributors to Latency</h2>

<p>
  This section focuses on the contributors to output latency,
  but a similar discussion applies to input latency.
</p>
<p>
  Assuming that the analog circuitry does not contribute significantly,
  the major surface-level contributors to audio latency are the following:
</p>

<ul>
  <li>Application</li>
  <li>Total number of buffers in pipeline</li>
  <li>Size of each buffer, in frames</li>
  <li>Additional latency after the app processor, such as from a DSP</li>
</ul>

<p>
  As accurate as the above list of contributors may be, it is also misleading.
  The reason is that buffer count and buffer size are more of an
  <em>effect</em> than a <em>cause</em>.  What usually happens is that
  a given buffer scheme is implemented and tested, but during testing, an audio
  underrun is heard as a "click" or "pop".  To compensate, the
  system designer then increases buffer sizes or buffer counts.
  This has the desired result of eliminating the underruns, but it also
  has the undesired side effect of increasing latency.
</p>
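
<p>
  The cost of that compensation can be sketched with simple arithmetic.
  The Python sketch below assumes that every buffer in the pipeline has the
  same size and is full, so each one contributes its whole duration. The
  function name and the 48 kHz configurations are illustrative assumptions,
  not values taken from any particular device.
</p>

```python
def buffer_latency_ms(buffer_count, frames_per_buffer, sample_rate_hz):
    """Latency contributed by a chain of equally sized, full buffers (ms)."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    total_frames = buffer_count * frames_per_buffer
    return 1000.0 * total_frames / sample_rate_hz


# Hypothetical configurations at 48 kHz (illustrative values only).
before = buffer_latency_ms(2, 240, 48000)  # original buffer scheme
after = buffer_latency_ms(4, 480, 48000)   # after enlarging to hide underruns
print(before, after)
```

<p>
  Under these assumptions, doubling both the buffer count and the buffer size
  quadruples the buffering latency.
</p>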

<p>
  A better approach is to understand the underlying causes of the
  underruns and then correct those.  This eliminates the
  audible artifacts and may even permit smaller or fewer buffers
  and thus reduce latency.
</p>

<p>
  In our experience, the most common causes of underruns include:
</p>
<ul>
  <li>Linux CFS (Completely Fair Scheduler)</li>
  <li>high-priority threads with SCHED_FIFO scheduling</li>
  <li>long scheduling latency</li>
  <li>long-running interrupt handlers</li>
  <li>long interrupt disable time</li>
</ul>

<h3>Linux CFS and SCHED_FIFO scheduling</h3>
<p>
  The Linux CFS is designed to be fair to competing workloads sharing a common CPU
  resource. This fairness is represented by a per-thread <em>nice</em> parameter.
  The nice value ranges from -20 (least nice, or most CPU time allocated)
  to 19 (nicest, or least CPU time allocated). In general, all threads with a given
  nice value receive approximately equal CPU time, and threads with a
  numerically lower nice value should expect to
  receive more CPU time. However, CFS is "fair" only over relatively long
  periods of observation. Over short-term observation windows,
  CFS may allocate the CPU resource in unexpected ways. For example, it
  may take the CPU away from a thread with numerically low niceness
  and give it to a thread with numerically high niceness.  In the case of audio,
  this can result in an underrun.
</p>

<p>
  The obvious solution is to avoid CFS for high-performance audio
  threads. Beginning with Android 4.1 (Jelly Bean), such threads now use the
  <code>SCHED_FIFO</code> scheduling policy rather than the <code>SCHED_NORMAL</code> (also called
  <code>SCHED_OTHER</code>) scheduling policy implemented by CFS.
</p>

<p>
  Though the high-performance audio threads now use <code>SCHED_FIFO</code>, they
  are still susceptible to other higher priority <code>SCHED_FIFO</code> threads.
  These are typically kernel worker threads, but there may also be a few
  non-audio user threads with policy <code>SCHED_FIFO</code>. The available <code>SCHED_FIFO</code>
  priorities range from 1 to 99.  The audio threads run at priority
  2 or 3.  This leaves priority 1 available for lower priority threads,
  and priorities 4 to 99 for higher priority threads.  We recommend that
  you use priority 1 whenever possible, and reserve priorities 4 to 99 for
  those threads that are guaranteed to complete within a bounded amount
  of time, and are known to not interfere with scheduling of audio threads.
</p>
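
<p>
  The priority layout described above can be sketched as follows. The
  Python example uses the range (1 to 99) and the audio priorities (2 and 3)
  stated in the text. The helper names are hypothetical. The
  <code>try_set_fifo</code> function relies on <code>os.sched_setscheduler</code>,
  which exists only on platforms that support it and normally requires
  elevated privileges, so the sketch reports failure instead of raising.
</p>

```python
import os

FIFO_MIN, FIFO_MAX = 1, 99   # SCHED_FIFO range given in the text
AUDIO_PRIORITIES = (2, 3)    # audio thread priorities given in the text


def relation_to_audio(priority):
    """Classify a SCHED_FIFO priority relative to the audio threads."""
    if not FIFO_MIN <= priority <= FIFO_MAX:
        raise ValueError("SCHED_FIFO priority must be between 1 and 99")
    if priority < min(AUDIO_PRIORITIES):
        return "below audio"
    if priority > max(AUDIO_PRIORITIES):
        return "above audio"
    return "same as audio"


def try_set_fifo(priority):
    """Try to switch the caller to SCHED_FIFO; return True on success."""
    if not hasattr(os, "sched_setscheduler"):
        return False  # platform without the POSIX scheduling calls
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:
        return False  # typically a permission error without privileges


# Priority 1 cannot preempt the audio threads; 4 to 99 can.
print(relation_to_audio(1), "|", relation_to_audio(50))
```

<p>
  A non-audio thread that needs <code>SCHED_FIFO</code> would, following the
  recommendation above, request priority 1 unless its run time is known to
  be bounded.
</p>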

<h3>Scheduling latency</h3>
<p>
  Scheduling latency is the time between when a thread becomes
  ready to run, and when the resulting context switch completes so that the
  thread actually runs on a CPU. The shorter the latency the better, and
  anything over two milliseconds causes problems for audio. Long scheduling
  latency is most likely to occur during mode transitions, such as
  bringing up or shutting down a CPU, switching between a security kernel
  and the normal kernel, switching from full power to low-power mode,
  or adjusting the CPU clock frequency and voltage.
</p>
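
<p>
  A rough user-space proxy for scheduling latency is to request a short
  sleep and record how late the thread actually wakes up. The Python sketch
  below is only an approximation under that assumption: the overshoot also
  includes timer granularity and interpreter overhead, and it runs as an
  ordinary thread rather than a <code>SCHED_FIFO</code> audio thread. The
  function name and default parameters are illustrative.
</p>

```python
import time


def wakeup_overshoot_ms(period_ms=1.0, iterations=20):
    """Return how late each timed sleep woke up, in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.monotonic_ns()
        time.sleep(period_ms / 1000.0)
        elapsed_ms = (time.monotonic_ns() - start) / 1e6
        samples.append(elapsed_ms - period_ms)
    return samples


samples = wakeup_overshoot_ms()
worst = max(samples)
# The text treats anything over two milliseconds as a problem for audio.
print(f"worst overshoot: {worst:.3f} ms, over 2 ms: {worst > 2.0}")
```

<p>
  Running the loop while triggering the mode transitions listed above, such
  as CPU frequency changes, may show whether the worst case grows.
</p>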

<h3>Interrupts</h3>
<p>
  In many designs, CPU 0 services all external interrupts.  So a
  long-running interrupt handler may delay other interrupts, in particular
  audio DMA completion interrupts. Design interrupt handlers
  to finish quickly and defer any lengthy work to a thread (preferably
  a CFS thread or <code>SCHED_FIFO</code> thread of priority 1).
</p>

<p>
  Equivalently, disabling interrupts on CPU 0 for a long period
  has the same result of delaying the servicing of audio interrupts.
  Long interrupt disable times typically happen while waiting for a kernel
  <i>spin lock</i>.  Review these spin locks to ensure that
  they are bounded.
</p>

<h2 id="measuringOutput">Measuring Output Latency</h2>

<p>
  There are several techniques available to measure output latency,
  with varying degrees of accuracy and ease of running.
</p>

<h3>LED and oscilloscope test</h3>
<p>
  This test measures latency in relation to the device's LED indicator.
  If your production device does not have an LED, you can install the
  LED on a prototype form factor device. For even better accuracy
  on prototype devices with exposed circuitry, connect one
  oscilloscope probe to the LED directly to bypass the light
  sensor latency.
</p>

<p>
  If you cannot install an LED on either your production or prototype device,
  try the following workarounds:
</p>

<ul>
  <li>Use a General Purpose Input/Output (GPIO) pin for the same purpose.</li>
  <li>Use JTAG or another debugging port.</li>
  <li>Use the screen backlight. This might be risky as the
  backlight may have a non-negligible latency, and can contribute to
  an inaccurate latency reading.
  </li>
</ul>

<p>To conduct this test:</p>

<ol>
  <li>Run an app that periodically pulses the LED at
  the same time it outputs audio.

  <p class="note"><b>Note:</b> To get useful results, it is crucial to use the correct
  APIs in the test app so that you're exercising the fast audio output path.
  See the separate document "Application developer guidelines for reduced
  audio latency".
  </p>
  </li>
  <li>Place a light sensor next to the LED.</li>
  <li>Connect the probes of a dual-channel oscilloscope to both the wired headphone
  jack (line output) and light sensor.</li>
  <li>Use the oscilloscope to measure
  the time difference between observing the line output signal versus the light
  sensor signal.</li>
</ol>

  <p>The difference in time is the approximate audio output latency,
  assuming that the LED latency and light sensor latency are both zero.
  Typically, the LED and light sensor each have a relatively low latency
  on the order of 1 millisecond or less, which is low enough
  to ignore.</p>
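
<p>
  The calculation, including an optional correction for the LED and light
  sensor delays, can be sketched as follows. This Python example assumes the
  two edge times have been read off the oscilloscope by hand. The function
  name and all numeric values are hypothetical.
</p>

```python
def output_latency_ms(light_edge_ms, audio_edge_ms, led_ms=0.0, sensor_ms=0.0):
    """Approximate output latency from two oscilloscope edge times (ms)."""
    # The light channel lags the real LED pulse by the LED plus sensor
    # delay, so the pulse started that much earlier than it was observed.
    true_pulse_ms = light_edge_ms - (led_ms + sensor_ms)
    return audio_edge_ms - true_pulse_ms


# Hypothetical readings: light edge at 2 ms, line output edge at 47 ms.
uncorrected = output_latency_ms(2.0, 47.0)
corrected = output_latency_ms(2.0, 47.0, led_ms=0.5, sensor_ms=0.5)
print(uncorrected, corrected)
```

<p>
  With delays of about 1 millisecond or less, the correction changes the
  result only slightly, which is why the text treats it as negligible.
</p>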

<h3>Larsen test</h3>
<p>
  One of the easiest latency tests is an audio feedback
  (Larsen effect) test. This provides a crude measure of combined output
  and input latency by timing an impulse response loop. Because it cannot
  separate the two, this test is not very useful by itself, but it can be
  combined with a separate output latency measurement.</p>

<p>To conduct this test:</p>
<ol>
  <li>Run an app that captures audio from the microphone and immediately plays the
  captured data back over the speaker.</li>
  <li>Create a sound externally,
  such as tapping a pencil by the microphone. This noise generates a feedback loop.</li>
  <li>Measure the time between feedback pulses to get the sum of the output latency,
  input latency, and application overhead.</li>
</ol>

  <p>This method does not break down the
  component times, which is important when the output latency
  and input latency are independent. It is therefore not recommended for
  measuring output latency on its own, but it might be useful to help
  measure input latency when combined with a separate output latency measurement.</p>
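
<p>
  Measuring the time between feedback pulses can be sketched in Python as
  follows, assuming the pulses have been recorded as a list of samples
  normalized to the range -1.0 to 1.0. The threshold, the hold-off window,
  the function names, and the synthetic recording are all illustrative
  assumptions. Real recordings are noisier and may need filtering first.
</p>

```python
def pulse_onsets(samples, threshold, holdoff):
    """Indices where |sample| reaches threshold, skipping holdoff samples."""
    onsets = []
    next_allowed = 0
    for i, s in enumerate(samples):
        if i >= next_allowed and abs(s) >= threshold:
            onsets.append(i)
            next_allowed = i + holdoff  # ignore the rest of this pulse
    return onsets


def round_trip_ms(samples, sample_rate_hz, threshold=0.5, holdoff=100):
    """Mean spacing between successive feedback pulses, in milliseconds."""
    onsets = pulse_onsets(samples, threshold, holdoff)
    if len(onsets) < 2:
        raise ValueError("need at least two feedback pulses")
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    return 1000.0 * (sum(gaps) / len(gaps)) / sample_rate_hz


# Synthetic recording at 48 kHz with a 20-sample pulse every 4800 samples.
rate = 48000
recording = [0.0] * (rate // 2)
for start in (1000, 5800, 10600, 15400):
    for k in range(20):
        recording[start + k] = 0.9
print(round_trip_ms(recording, rate))
```

<p>
  The result is the sum of output latency, input latency, and application
  overhead, not any one of them.
</p>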

<h2 id="measuringInput">Measuring Input Latency</h2>

<p>
  Input latency is more difficult to measure than output latency. The following
  tests might help.
</p>

<p>
  One approach is to first determine the output latency
  using the LED and oscilloscope method and then use
  the audio feedback (Larsen) test to determine the sum of output
  latency and input latency. The difference between these two
  measurements is the input latency.
</p>
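
<p>
  The subtraction can be sketched in Python as follows. The function name
  and the example values are hypothetical. Because the Larsen measurement
  also includes application overhead, the result is better read as an upper
  bound on input latency than as an exact value.
</p>

```python
def input_latency_ms(round_trip_ms, output_ms):
    """Estimate input latency as round trip minus output latency (ms)."""
    if round_trip_ms < output_ms:
        raise ValueError("round trip cannot be shorter than output latency")
    return round_trip_ms - output_ms


# Hypothetical measurements: 100 ms Larsen round trip, 45 ms output latency.
estimate = input_latency_ms(100.0, 45.0)
print(estimate)
```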

<p>
  Another technique is to use a GPIO pin on a prototype device.
  Externally, pulse a GPIO input at the same time that you present
  an audio signal to the device.  Run an app that compares the
  difference in arrival times of the GPIO signal and audio data.
</p>

<h2 id="reducing">Reducing Latency</h2>

<p>To achieve low audio latency, pay special attention throughout the
system to scheduling, interrupt handling, power management, and device
driver design. Your goal is to prevent any part of the platform from
blocking a <code>SCHED_FIFO</code> audio thread for more than a couple
of milliseconds. By adopting such a systematic approach, you can reduce
audio latency and get the side benefit of more predictable performance
overall.
</p>

<p>
  Audio underruns, when they do occur, are often detectable only under certain
  conditions or only at the transitions. Try stressing the system by launching
  new apps and scrolling quickly through various displays. But be aware
  that some test conditions are so stressful as to be beyond the design
  goals. For example, taking a bugreport puts such enormous load on the
  system that it may be acceptable to have an underrun in that case.
</p>

<p>
  When testing for underruns:
</p>
  <ul>
  <li>Configure any DSP after the app processor so that it adds
  minimal latency.</li>
  <li>Run tests under different conditions
  such as having the screen on or off, USB plugged in or unplugged,
  WiFi on or off, Bluetooth on or off, and telephony and data radios
  on or off.</li>
  <li>Select relatively quiet music that you're very familiar with, and in which
  underruns are easy to hear.</li>
  <li>Use wired headphones for extra sensitivity.</li>
  <li>Give yourself breaks so that you don't experience "ear fatigue".</li>
  </ul>
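
<p>
  The test conditions listed above multiply quickly, so it can help to
  enumerate them explicitly. The Python sketch below builds the full matrix
  from the five on/off conditions named in the list. The dictionary keys and
  the function name are illustrative, and it treats telephony and data radios
  as a single switch, which is a simplifying assumption.
</p>

```python
from itertools import product

CONDITIONS = {
    "screen": ("on", "off"),
    "usb": ("plugged", "unplugged"),
    "wifi": ("on", "off"),
    "bluetooth": ("on", "off"),
    "radios": ("on", "off"),  # telephony and data radios, treated together
}


def condition_matrix(conditions=CONDITIONS):
    """Return every combination of the given test conditions as dicts."""
    names = list(conditions)
    return [dict(zip(names, combo)) for combo in product(*conditions.values())]


matrix = condition_matrix()
print(len(matrix), "combinations; first:", matrix[0])
```

<p>
  Five two-state conditions give 32 combinations, so a listening session
  might sample from the matrix rather than cover all of it.
</p>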

<p>
  Once you find and fix the underlying causes of underruns, reduce
  the buffer counts and sizes to take advantage of the improvement.
  The eager approach of reducing buffer counts and sizes <i>before</i>
  analyzing underruns and fixing their causes only
  results in frustration.
</p>

<h3 id="tools">Tools</h3>
<p>
  <code>systrace</code> is an excellent general-purpose tool
  for diagnosing system-level performance glitches.
</p>

<p>
  The output of <code>dumpsys media.audio_flinger</code> also contains a
  useful section called "simple moving statistics". This has a summary
  of the variability of elapsed times for each audio mix and I/O cycle.
  Ideally, all the time measurements should be about equal to the mean or
  nominal cycle time. If you see a very low minimum or high maximum, this is an
  indication of a problem, which is probably a high scheduling latency or interrupt
  disable time. The <i>tail</i> part of the output is especially helpful,
  as it highlights the variability beyond +/- 3 standard deviations.
</p>
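
<p>
  To illustrate what such a summary expresses, the Python sketch below
  computes the mean, standard deviation, minimum, maximum, and the samples
  beyond 3 standard deviations for a list of cycle times. It is not the
  algorithm or the output format used by <code>dumpsys</code>. The function
  name and the cycle times are hypothetical.
</p>

```python
import statistics


def summarize_cycles(cycle_ms):
    """Summarize cycle times and list samples beyond 3 standard deviations."""
    mean = statistics.fmean(cycle_ms)
    stdev = statistics.pstdev(cycle_ms)
    tail = [t for t in cycle_ms if abs(t - mean) > 3 * stdev]
    return {
        "mean": mean,
        "stdev": stdev,
        "min": min(cycle_ms),
        "max": max(cycle_ms),
        "tail": tail,
    }


# Hypothetical data: 99 cycles at the nominal 5 ms and one 25 ms outlier.
cycles = [5.0] * 99 + [25.0]
summary = summarize_cycles(cycles)
print(summary)
```

<p>
  In this example the mean stays close to the nominal cycle time, while the
  maximum and the tail expose the single late cycle.
</p>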