      1 page.title=USB Digital Audio
      2 @jd:body
      3 
      4 <!--
      5     Copyright 2014 The Android Open Source Project
      6 
      7     Licensed under the Apache License, Version 2.0 (the "License");
      8     you may not use this file except in compliance with the License.
      9     You may obtain a copy of the License at
     10 
     11         http://www.apache.org/licenses/LICENSE-2.0
     12 
     13     Unless required by applicable law or agreed to in writing, software
     14     distributed under the License is distributed on an "AS IS" BASIS,
     15     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     16     See the License for the specific language governing permissions and
     17     limitations under the License.
     18 -->
     19 <div id="qv-wrapper">
     20   <div id="qv">
     21     <h2>In this document</h2>
     22     <ol id="auto-toc">
     23     </ol>
     24   </div>
     25 </div>
     26 
     27 <p>
     28 This article reviews Android support for USB digital audio and related
     29 USB-based protocols.
     30 </p>
     31 
     32 <h3 id="audience">Audience</h3>
     33 
     34 <p>
     35 The target audience of this article is Android device OEMs, SoC vendors,
     36 USB audio peripheral suppliers, advanced audio application developers,
     37 and others seeking detailed understanding of USB digital audio internals on Android.
     38 </p>
     39 
     40 <p>
     41 End users should see the <a href="https://support.google.com/android/">Help Center</a> instead.
     42 Though this article is not oriented towards end users,
     43 certain audiophile consumers may find portions of interest.
     44 </p>
     45 
     46 <h2 id="overview">Overview of USB</h2>
     47 
     48 <p>
     49 Universal Serial Bus (USB) is informally described in the Wikipedia article
     50 <a href="http://en.wikipedia.org/wiki/USB">USB</a>,
     51 and is formally defined by the standards published by the
     52 <a href="http://www.usb.org/">USB Implementers Forum, Inc</a>.
     53 For convenience, we summarize the key USB concepts here,
     54 but the standards are the authoritative reference.
     55 </p>
     56 
     57 <h3 id="terminology">Basic concepts and terminology</h3>
     58 
     59 <p>
     60 USB is a <a href="http://en.wikipedia.org/wiki/Bus_(computing)">bus</a>
     61 with a single initiator of data transfer operations, called the <i>host</i>.
     62 The host communicates with
     63 <a href="http://en.wikipedia.org/wiki/Peripheral">peripherals</a> via the bus.
     64 </p>
     65 
     66 <p>
     67 <b>Note:</b> the terms <i>device</i> or <i>accessory</i> are common synonyms for
     68 <i>peripheral</i>.  We avoid those terms here, as they could be confused with
     69 Android <a href="http://en.wikipedia.org/wiki/Mobile_device">device</a>
     70 or the Android-specific concept called
     71 <a href="http://developer.android.com/guide/topics/connectivity/usb/accessory.html">accessory mode</a>.
     72 </p>
     73 
     74 <p>
     75 A critical host role is <i>enumeration</i>:
     76 the process of detecting which peripherals are connected to the bus,
     77 and querying their properties expressed via <i>descriptors</i>.
     78 </p>
     79 
     80 <p>
     81 A peripheral may be one physical object
     82 but actually implement multiple logical <i>functions</i>.
     83 For example, a webcam peripheral could have both a camera function and a
     84 microphone audio function.
     85 </p>
     86 
     87 <p>
     88 Each peripheral function has an <i>interface</i> that
     89 defines the protocol to communicate with that function.
     90 </p>
     91 
     92 <p>
     93 The host communicates with a peripheral over a
     94 <a href="http://en.wikipedia.org/wiki/Stream_(computing)">pipe</a>
     95 to an <a href="http://en.wikipedia.org/wiki/Communication_endpoint">endpoint</a>,
     96 a data source or sink
     97 provided by one of the peripheral's functions.
     98 </p>
     99 
    100 <p>
    101 There are two kinds of pipes: <i>message</i> and <i>stream</i>.
    102 A message pipe is used for bi-directional control and status.
    103 A stream pipe is used for uni-directional data transfer.
    104 </p>
    105 
    106 <p>
    107 The host initiates all data transfers,
    108 hence the terms <i>input</i> and <i>output</i> are expressed relative to the host.
    109 An input operation transfers data from the peripheral to the host,
    110 while an output operation transfers data from the host to the peripheral.
    111 </p>
    112 
    113 <p>
    114 There are three major data transfer modes:
    115 <i>interrupt</i>, <i>bulk</i>, and <i>isochronous</i>.
    116 Isochronous mode will be discussed further in the context of audio.
    117 </p>
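<p>
As a rough illustration of how direction and transfer mode appear during enumeration,
the sketch below decodes two fields of a standard USB 2.0 endpoint descriptor.
The function name <code>describe_endpoint</code> and the returned dictionary layout are
hypothetical and chosen for this example only. The table also lists the control transfer type,
which is not covered in this article. The synchronization field relates to the isochronous
sub-modes covered later under isochronous transfer mode.
</p>

```python
# Bit layouts follow the standard USB 2.0 endpoint descriptor.
# The helper name and return format are assumptions made for this sketch.
TRANSFER_TYPES = {0: "control", 1: "isochronous", 2: "bulk", 3: "interrupt"}
SYNC_TYPES = {0: "none", 1: "asynchronous", 2: "adaptive", 3: "synchronous"}


def describe_endpoint(b_endpoint_address, bm_attributes):
    """Decode endpoint number, direction, and transfer type."""
    info = {
        "number": b_endpoint_address & 0x0F,
        # Bit 7 set means IN: data flows from the peripheral to the host.
        "direction": "input" if b_endpoint_address & 0x80 else "output",
        "transfer": TRANSFER_TYPES[bm_attributes & 0x03],
    }
    if info["transfer"] == "isochronous":
        # Bits 3..2 carry the synchronization type for isochronous endpoints.
        info["sync"] = SYNC_TYPES[(bm_attributes >> 2) & 0x03]
    return info


print(describe_endpoint(0x81, 0x05))
print(describe_endpoint(0x02, 0x02))
```

<p>
Here the first call describes an isochronous input endpoint and the second a bulk output endpoint.
</p>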
    118 
    119 <p>
    120 The peripheral may have <i>terminals</i> that connect to the outside world,
    121 beyond the peripheral itself.  In this way, the peripheral serves
    122 to translate between USB protocol and "real world" signals.
    123 The terminals are logical objects of the function.
    124 </p>
    125 
    126 <h2 id="androidModes">Android USB modes</h2>
    127 
    128 <h3 id="developmentMode">Development mode</h3>
    129 
    130 <p>
    131 <i>Development mode</i> has been present since the initial release of Android.
    132 The Android device appears as a USB peripheral
    133 to a host PC running a desktop operating system such as Linux,
    134 Mac OS X, or Windows.  The only visible peripheral function is either
    135 <a href="http://en.wikipedia.org/wiki/Android_software_development#Fastboot">Android fastboot</a>
    136 or
    137 <a href="http://developer.android.com/tools/help/adb.html">Android Debug Bridge (adb)</a>.
    138 The fastboot and adb protocols are layered over USB bulk data transfer mode.
    139 </p>
    140 
    141 <h3 id="hostMode">Host mode</h3>
    142 
    143 <p>
<i>Host mode</i> was introduced in Android 3.1 (API level 12).
    145 </p>
    146 
    147 <p>
    148 As the Android device must act as host, and most Android devices include
    149 a micro-USB connector that does not directly permit host operation,
    150 an on-the-go (<a href="http://en.wikipedia.org/wiki/USB_On-The-Go">OTG</a>) adapter
    151 such as this is usually required:
    152 </p>
    153 
    154 <img src="audio/images/otg.jpg" style="image-orientation: 90deg;" height="50%" width="50%" alt="OTG">
    155 
    156 <p>
    157 An Android device might not provide sufficient power to operate a
    158 particular peripheral, depending on how much power the peripheral needs,
    159 and how much the Android device is capable of supplying.  Even if
    160 adequate power is available, the Android device battery charge may
    161 be significantly shortened.  For these situations, use a powered
    162 <a href="http://en.wikipedia.org/wiki/USB_hub">hub</a> such as this:
    163 </p>
    164 
    165 <img src="audio/images/hub.jpg" alt="Powered hub">
    166 
    167 <h3 id="accessoryMode">Accessory mode</h3>
    168 
    169 <p>
    170 <i>Accessory mode</i> was introduced in Android 3.1 (API level 12) and back-ported to Android 2.3.4.
    171 In this mode, the Android device operates as a USB peripheral,
    172 under the control of another device such as a dock that serves as host.
    173 The difference between development mode and accessory mode
    174 is that additional USB functions are visible to the host, beyond adb.
    175 The Android device begins in development mode and then
    176 transitions to accessory mode via a re-negotiation process.
    177 </p>
    178 
    179 <p>
Accessory mode was extended with additional features in Android 4.1,
in particular audio, described below.
    182 </p>
    183 
    184 <h2 id="audioClass">USB audio</h2>
    185 
    186 <h3 id="class">USB classes</h3>
    187 
    188 <p>
    189 Each peripheral function has an associated <i>device class</i> document
    190 that specifies the standard protocol for that function.
    191 This enables <i>class compliant</i> hosts and peripheral functions
    192 to inter-operate, without detailed knowledge of each other's workings.
    193 Class compliance is critical if the host and peripheral are provided by
    194 different entities.
    195 </p>
    196 
    197 <p>
    198 The term <i>driverless</i> is a common synonym for <i>class compliant</i>,
    199 indicating that it is possible to use the standard features of such a
    200 peripheral without requiring an operating-system specific
    201 <a href="http://en.wikipedia.org/wiki/Device_driver">driver</a> to be installed.
    202 One can assume that a peripheral advertised as "no driver needed"
    203 for major desktop operating systems
    204 will be class compliant, though there may be exceptions.
    205 </p>
    206 
    207 <h3 id="audioClass">USB audio class</h3>
    208 
    209 <p>
    210 Here we concern ourselves only with peripherals that implement
    211 audio functions, and thus adhere to the audio device class.  There are two
    212 editions of the USB audio class specification: class 1 (UAC1) and 2 (UAC2).
    213 </p>
    214 
    215 <h3 id="otherClasses">Comparison with other classes</h3>
    216 
    217 <p>
    218 USB includes many other device classes, some of which may be confused
    219 with the audio class.  The
    220 <a href="http://en.wikipedia.org/wiki/USB_mass_storage_device_class">mass storage class</a>
    221 (MSC) is used for
    222 sector-oriented access to media, while
    223 <a href="http://en.wikipedia.org/wiki/Media_Transfer_Protocol">Media Transfer Protocol</a>
    224 (MTP) is for full file access to media.
    225 Both MSC and MTP may be used for transferring audio files,
    226 but only USB audio class is suitable for real-time streaming.
    227 </p>
    228 
    229 <h3 id="audioTerminals">Audio terminals</h3>
    230 
    231 <p>
    232 The terminals of an audio peripheral are typically analog.
    233 The analog signal presented at the peripheral's input terminal is converted to digital by an
    234 <a href="http://en.wikipedia.org/wiki/Analog-to-digital_converter">analog-to-digital converter</a>
    235 (ADC),
    236 and is carried over USB protocol to be consumed by
    237 the host.  The ADC is a data <i>source</i>
    238 for the host.  Similarly, the host sends a
    239 digital audio signal over USB protocol to the peripheral, where a
    240 <a href="http://en.wikipedia.org/wiki/Digital-to-analog_converter">digital-to-analog converter</a>
    241 (DAC)
converts it and presents it at an analog output terminal.
    243 The DAC is a <i>sink</i> for the host.
    244 </p>
    245 
    246 <h3 id="channels">Channels</h3>
    247 
    248 <p>
    249 A peripheral with audio function can include a source terminal, sink terminal, or both.
    250 Each direction may have one channel (<i>mono</i>), two channels
    251 (<i>stereo</i>), or more.
    252 Peripherals with more than two channels are called <i>multichannel</i>.
    253 It is common to interpret a stereo stream as consisting of
    254 <i>left</i> and <i>right</i> channels, and by extension to interpret a multichannel stream as having
spatial locations corresponding to each channel.  However, it is also quite appropriate
(more so for USB audio than for
<a href="http://en.wikipedia.org/wiki/HDMI">HDMI</a>)
not to assign any particular
    259 standard spatial meaning to each channel.  In this case, it is up to the
    260 application and user to define how each channel is used.
    261 For example, a four-channel USB input stream might have the first three
    262 channels attached to various microphones within a room, and the final
    263 channel receiving input from an AM radio.
    264 </p>
    265 
    266 <h3 id="isochronous">Isochronous transfer mode</h3>
    267 
    268 <p>
    269 USB audio uses isochronous transfer mode for its real-time characteristics,
    270 at the expense of error recovery.
    271 In isochronous mode, bandwidth is guaranteed, and data transmission
    272 errors are detected using a cyclic redundancy check (CRC).  But there is
    273 no packet acknowledgement or re-transmission in the event of error.
    274 </p>
    275 
    276 <p>
    277 Isochronous transmissions occur each Start Of Frame (SOF) period.
The SOF period is one millisecond for full-speed, and 125 microseconds for
high-speed.  Each full-speed frame carries up to 1023 bytes of payload,
and each high-speed microframe carries up to 1024 bytes.  Putting these together,
we calculate the maximum transfer rate as 1,023,000 bytes per second for
full-speed or 8,192,000 bytes per second for high-speed.  This sets a theoretical upper limit on the combined audio
    283 sample rate, channel count, and bit depth.  The practical limit is lower.
    284 </p>
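<p>
The arithmetic above can be reproduced with a short sketch. It uses only the SOF periods and
payload sizes quoted in this section. The function names are hypothetical, and the comparison
ignores protocol overhead and bus sharing, so it shows only the theoretical ceiling,
not what a real system achieves.
</p>

```python
# (SOF periods per second, maximum payload bytes per period), as quoted above.
ISO_LIMITS = {"full": (1000, 1023), "high": (8000, 1024)}


def max_bytes_per_second(speed):
    """Theoretical isochronous ceiling for the given bus speed."""
    periods_per_second, payload_bytes = ISO_LIMITS[speed]
    return periods_per_second * payload_bytes


def stream_bytes_per_second(sample_rate_hz, channels, bytes_per_sample):
    """Raw PCM data rate of one audio stream, ignoring protocol overhead."""
    return sample_rate_hz * channels * bytes_per_sample


def fits(speed, sample_rate_hz, channels, bytes_per_sample):
    """True if the stream does not exceed the theoretical ceiling."""
    needed = stream_bytes_per_second(sample_rate_hz, channels, bytes_per_sample)
    return needed <= max_bytes_per_second(speed)


print(max_bytes_per_second("full"))   # 1023000
print(max_bytes_per_second("high"))   # 8192000
print(fits("full", 48000, 2, 2))      # True: 192,000 bytes per second
print(fits("full", 192000, 8, 4))     # False: 6,144,000 bytes per second
```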
    285 
    286 <p>
    287 Within isochronous mode, there are three sub-modes:
    288 </p>
    289 
    290 <ul>
    291 <li>Adaptive</li>
    292 <li>Asynchronous</li>
    293 <li>Synchronous</li>
    294 </ul>
    295 
    296 <p>
    297 In adaptive sub-mode, the peripheral sink or source adapts to a potentially varying sample rate
    298 of the host.
    299 </p>
    300 
    301 <p>
    302 In asynchronous (also called implicit feedback) sub-mode,
the sink or source determines the sample rate, and the host accommodates.
    304 The primary theoretical advantage of asynchronous sub-mode is that the source
    305 or sink USB clock is physically and electrically closer to (and indeed may
    306 be the same as, or derived from) the clock that drives the DAC or ADC.
    307 This proximity means that asynchronous sub-mode should be less susceptible
    308 to clock jitter.  In addition, the clock used by the DAC or ADC may be
    309 designed for higher accuracy and lower drift than the host clock.
    310 </p>
    311 
    312 <p>
    313 In synchronous sub-mode, a fixed number of bytes is transferred each SOF period.
    314 The audio sample rate is effectively derived from the USB clock.
    315 Synchronous sub-mode is not commonly used with audio because both
    316 host and peripheral are at the mercy of the USB clock.
    317 </p>
    318 
    319 <p>
    320 The table below summarizes the isochronous sub-modes:
    321 </p>
    322 
    323 <table>
    324 <tr>
    325   <th>Sub-mode</th>
  <th>Byte count<br />per packet</th>
  <th>Sample rate<br />determined by</th>
    328   <th>Used for audio</th>
    329 </tr>
    330 <tr>
    331   <td>adaptive</td>
    332   <td>variable</td>
    333   <td>host</td>
    334   <td>yes</td>
    335 </tr>
    336 <tr>
    337   <td>asynchronous</td>
    338   <td>variable</td>
    339   <td>peripheral</td>
    340   <td>yes</td>
    341 </tr>
    342 <tr>
    343   <td>synchronous</td>
    344   <td>fixed</td>
    345   <td>USB clock</td>
    346   <td>no</td>
    347 </tr>
    348 </table>
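<p>
One reason the byte count per packet can vary is that the sample rate need not be a whole
multiple of the packet rate. The sketch below is an illustration only: it uses a simple integer
accumulator to spread 44.1 kHz audio across one-millisecond full-speed periods.
The function name is hypothetical, and real hosts and peripherals may schedule packets
differently, for example in response to clock feedback.
</p>

```python
def frames_per_packet(sample_rate_hz, packets_per_second, count):
    """Return the number of audio frames carried by each of `count` packets."""
    sizes = []
    sent = 0
    for i in range(1, count + 1):
        # Total frames that should have been delivered after i packets.
        due = (i * sample_rate_hz) // packets_per_second
        sizes.append(due - sent)
        sent = due
    return sizes


# 44.1 kHz does not divide evenly into 1000 packets per second,
# so most packets carry 44 frames and every tenth carries 45.
print(frames_per_packet(44100, 1000, 10))

# 48 kHz divides evenly, so every packet carries 48 frames.
print(frames_per_packet(48000, 1000, 5))
```

<p>
Multiplying each frame count by the channel count and the bytes per sample gives the packet size in bytes.
</p>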
    349 
    350 <p>
    351 In practice, the sub-mode does of course matter, but other factors
    352 should also be considered.
    353 </p>
    354 
    355 <h2 id="androidSupport">Android support for USB audio class</h2>
    356 
    357 <h3 id="developmentAudio">Development mode</h3>
    358 
    359 <p>
    360 USB audio is not supported in development mode.
    361 </p>
    362 
    363 <h3 id="hostAudio">Host mode</h3>
    364 
    365 <p>
    366 Android 5.0 (API level 21) and above supports a subset of USB audio class 1 (UAC1) features:
    367 </p>
    368 
    369 <ul>
    370 <li>The Android device must act as host</li>
    371 <li>The audio format must be PCM (interface type I)</li>
<li>The bit depth must be 16 bits, 24 bits, or 32 bits, where
24 bits of useful audio data are left-justified within the most significant
bits of the 32-bit word</li>
    375 <li>The sample rate must be either 48, 44.1, 32, 24, 22.05, 16, 12, 11.025, or 8 kHz</li>
    376 <li>The channel count must be 1 (mono) or 2 (stereo)</li>
    377 </ul>
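<p>
The constraints listed above can be expressed as a small check. This is a sketch based only on
the list in this section, not code from the Android framework. The names
<code>is_supported</code> and <code>left_justify_24_in_32</code> are hypothetical.
The second helper shows one way to place a signed 24-bit sample in the most significant bits
of a 32-bit word.
</p>

```python
# Values taken from the feature list above; all names are assumptions.
SUPPORTED_RATES_HZ = {48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, 8000}
SUPPORTED_BIT_DEPTHS = {16, 24, 32}
SUPPORTED_CHANNEL_COUNTS = {1, 2}


def is_supported(sample_rate_hz, bit_depth, channels, is_pcm=True):
    """Check a stream format against the host mode subset listed above."""
    return (
        is_pcm
        and sample_rate_hz in SUPPORTED_RATES_HZ
        and bit_depth in SUPPORTED_BIT_DEPTHS
        and channels in SUPPORTED_CHANNEL_COUNTS
    )


def left_justify_24_in_32(sample):
    """Return a signed 24-bit sample as an unsigned 32-bit word, left-justified."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    # Shift into the top 24 bits; the low 8 bits are left as zero padding.
    return (sample << 8) & 0xFFFFFFFF


print(is_supported(44100, 16, 2))            # True
print(is_supported(96000, 24, 2))            # False: 96 kHz is not in the list
print(hex(left_justify_24_in_32(0x7FFFFF)))  # 0x7fffff00
```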
    378 
    379 <p>
    380 Perusal of the Android framework source code may show additional code
    381 beyond the minimum needed to support these features.  But this code
    382 has not been validated, so more advanced features are not yet claimed.
    383 </p>
    384 
    385 <h3 id="accessoryAudio">Accessory mode</h3>
    386 
    387 <p>
    388 Android 4.1 (API level 16) added limited support for audio playback to the host.
    389 While in accessory mode, Android automatically routes its audio output to USB.
    390 That is, the Android device serves as a data source to the host, for example a dock.
    391 </p>
    392 
    393 <p>
    394 Accessory mode audio has these features:
    395 </p>
    396 
    397 <ul>
    398 <li>
    399 The Android device must be controlled by a knowledgeable host that
    400 can first transition the Android device from development mode to accessory mode,
    401 and then the host must transfer audio data from the appropriate endpoint.
    402 Thus the Android device does not appear "driverless" to the host.
    403 </li>
    404 <li>The direction must be <i>input</i>, expressed relative to the host</li>
    405 <li>The audio format must be 16-bit PCM</li>
    406 <li>The sample rate must be 44.1 kHz</li>
    407 <li>The channel count must be 2 (stereo)</li>
    408 </ul>
    409 
    410 <p>
    411 Accessory mode audio has not been widely adopted,
    412 and is not currently recommended for new designs.
    413 </p>
    414 
    415 <h2 id="applications">Applications of USB digital audio</h2>
    416 
    417 <p>
    418 As the name indicates, the USB digital audio signal is represented
    419 by a <a href="http://en.wikipedia.org/wiki/Digital_data">digital</a> data stream
    420 rather than the <a href="http://en.wikipedia.org/wiki/Analog_signal">analog</a>
    421 signal used by the common TRS mini
<a href="http://en.wikipedia.org/wiki/Phone_connector_(audio)">headset connector</a>.
    423 Eventually any digital signal must be converted to analog before it can be heard.
    424 There are tradeoffs in choosing where to place that conversion.
    425 </p>
    426 
    427 <h3 id="comparison">A tale of two DACs</h3>
    428 
    429 <p>
    430 In the example diagram below, we compare two designs.  First we have a
    431 mobile device with Application Processor (AP), on-board DAC, amplifier,
    432 and analog TRS connector attached to headphones.  We also consider a
    433 mobile device with USB connected to external USB DAC and amplifier,
    434 also with headphones.
    435 </p>
    436 
    437 <img src="audio/images/dac.png" alt="DAC comparison">
    438 
    439 <p>
    440 Which design is better?  The answer depends on your needs.
    441 Each has advantages and disadvantages.
    442 <b>Note:</b> this is an artificial comparison, since
    443 a real Android device would probably have both options available.
    444 </p>
    445 
    446 <p>
    447 The first design A is simpler, less expensive, uses less power,
    448 and will be a more reliable design assuming otherwise equally reliable components.
    449 However, there are usually audio quality tradeoffs vs. other requirements.
    450 For example, if this is a mass-market device, it may be designed to fit
    451 the needs of the general consumer, not for the audiophile.
    452 </p>
    453 
    454 <p>
    455 In the second design, the external audio peripheral C can be designed for
    456 higher audio quality and greater power output without impacting the cost of
    457 the basic mass market Android device B.  Yes, it is a more expensive design,
    458 but the cost is absorbed only by those who want it.
    459 </p>
    460 
    461 <p>
    462 Mobile devices are notorious for having high-density
    463 circuit boards, which can result in more opportunities for
    464 <a href="http://en.wikipedia.org/wiki/Crosstalk_(electronics)">crosstalk</a>
    465 that degrades adjacent analog signals.  Digital communication is less susceptible to
    466 <a href="http://en.wikipedia.org/wiki/Noise_(electronics)">noise</a>,
    467 so moving the DAC from the Android device A to an external circuit board
    468 C allows the final analog stages to be physically and electrically
    469 isolated from the dense and noisy circuit board, resulting in higher fidelity audio.
    470 </p>
    471 
    472 <p>
    473 On the other hand,
    474 the second design is more complex, and with added complexity come more
    475 opportunities for things to fail.  There is also additional latency
    476 from the USB controllers.
    477 </p>
    478 
    479 <h3 id="applications">Applications</h3>
    480 
    481 <p>
    482 Typical USB host mode audio applications include:
    483 </p>
    484 
    485 <ul>
    486 <li>music listening</li>
    487 <li>telephony</li>
    488 <li>instant messaging and voice chat</li>
    489 <li>recording</li>
    490 </ul>
    491 
    492 <p>
    493 For all of these applications, Android detects a compatible USB digital
    494 audio peripheral, and automatically routes audio playback and capture
    495 appropriately, based on the audio policy rules.
    496 Stereo content is played on the first two channels of the peripheral.
    497 </p>
    498 
    499 <p>
    500 There are no APIs specific to USB digital audio.
    501 For advanced usage, the automatic routing may interfere with applications
    502 that are USB-aware.  For such applications, disable automatic routing
    503 via the corresponding control in the Media section of
    504 <a href="http://developer.android.com/tools/index.html">Settings / Developer Options</a>.
    505 </p>
    506 
    507 <h2 id="compatibility">Implementing USB audio</h2>
    508 
    509 <h3 id="recommendationsPeripheral">Recommendations for audio peripheral vendors</h3>
    510 
    511 <p>
    512 In order to inter-operate with Android devices, audio peripheral vendors should:
    513 </p>
    514 
    515 <ul>
    516 <li>design for audio class compliance;
    517 currently Android targets class 1, but it is wise to plan for class 2</li>
<li>avoid <a href="http://en.wiktionary.org/wiki/quirk">quirks</a></li>
    519 <li>test for inter-operability with reference and popular Android devices</li>
    520 <li>clearly document supported features, audio class compliance, power requirements, etc.
    521 so that consumers can make informed decisions</li>
    522 </ul>
    523 
    524 <h3 id="recommendationsAndroid">Recommendations for Android device OEMs and SoC vendors</h3>
    525 
    526 <p>
    527 In order to support USB digital audio, device OEMs and SoC vendors should:
    528 </p>
    529 
    530 <ul>
    531 <li>enable all kernel features needed: USB host mode, USB audio, isochronous transfer mode</li>
    532 <li>keep up-to-date with recent kernel releases and patches;
    533 despite the noble goal of class compliance, there are extant audio peripherals
    534 with <a href="http://en.wiktionary.org/wiki/quirk">quirks</a>,
    535 and recent kernels have workarounds for such quirks
    536 </li>
    537 <li>enable USB audio policy as described below</li>
    538 <li>test for inter-operability with common USB audio peripherals</li>
    539 </ul>
    540 
    541 <h3 id="enable">How to enable USB audio policy</h3>
    542 
<p>
To enable USB audio, add an entry to the
audio policy configuration file.  This is typically
located here:
</p>
<pre>device/oem/codename/audio_policy.conf</pre>
<p>
The pathname component "oem" should be replaced by the name
of the OEM that manufactures the Android device,
and "codename" should be replaced by the device code name.
</p>
    552 
    553 <p>
    554 An example entry is shown here:
    555 </p>
    556 
<pre>
audio_hw_modules {
  ...
  usb {
    outputs {
      usb_accessory {
        sampling_rates 44100
        channel_masks AUDIO_CHANNEL_OUT_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_OUT_USB_ACCESSORY
      }
      usb_device {
        sampling_rates dynamic
        channel_masks dynamic
        formats dynamic
        devices AUDIO_DEVICE_OUT_USB_DEVICE
      }
    }
    inputs {
      usb_device {
        sampling_rates dynamic
        channel_masks AUDIO_CHANNEL_IN_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_IN_USB_DEVICE
      }
    }
  }
  ...
}
</pre>
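<p>
When reviewing such an entry, it can help to load it into a nested structure and inspect it.
The sketch below is a simplified reader for the brace-delimited layout shown above. It is not
the parser used by Android, and the function name <code>parse_policy</code> is hypothetical.
It assumes one section header, closing brace, or key and value pair per line, and it skips
the <code>...</code> placeholder lines.
</p>

```python
def parse_policy(text):
    """Parse 'name { ... }' sections and 'key value' lines into nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blank lines and elided content
        if line.endswith("{"):
            section = {}
            stack[-1][line[:-1].strip()] = section
            stack.append(section)
        elif line == "}":
            stack.pop()
        else:
            key, _, value = line.partition(" ")
            stack[-1][key] = value.strip()
    return root


# A reduced copy of the example entry, used only to exercise the sketch.
EXAMPLE = """
audio_hw_modules {
  usb {
    outputs {
      usb_device {
        sampling_rates dynamic
        channel_masks dynamic
        formats dynamic
        devices AUDIO_DEVICE_OUT_USB_DEVICE
      }
    }
  }
}
"""

config = parse_policy(EXAMPLE)
usb_out = config["audio_hw_modules"]["usb"]["outputs"]["usb_device"]
print(usb_out["devices"])         # AUDIO_DEVICE_OUT_USB_DEVICE
print(usb_out["sampling_rates"])  # dynamic
```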
    587 
    588 <h3 id="sourceCode">Source code</h3>
    589 
<p>
The audio Hardware Abstraction Layer (HAL)
implementation for USB audio is located here:
</p>
<pre>hardware/libhardware/modules/usbaudio/</pre>
<p>
The USB audio HAL relies heavily on
<i>tinyalsa</i>, described at <a href="audio_terminology.html">Audio Terminology</a>.
Though USB audio relies on isochronous transfers,
this is abstracted away by the ALSA implementation.
So the USB audio HAL and tinyalsa do not need to concern
themselves with this part of USB protocol.
</p>
    601