<html devsite>
<head>
<title>USB Digital Audio</title>
<meta name="project_path" value="/_project.yaml" />
<meta name="book_path" value="/_book.yaml" />
</head>
<body>
<!--
    Copyright 2017 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->



<p>
This article reviews Android support for USB digital audio and related
USB-based protocols.
</p>

<h3 id="audience">Audience</h3>

<p>
The target audience of this article is Android device OEMs, SoC vendors,
USB audio peripheral suppliers, advanced audio application developers,
and others seeking detailed understanding of USB digital audio internals on Android.
</p>

<p>
End users of Nexus devices should see the article
<a href="https://support.google.com/nexus/answer/6127700">Record and play back audio using USB host mode</a>
at the
<a href="https://support.google.com/nexus/">Nexus Help Center</a> instead.
Though this article is not oriented towards end users,
certain audiophile consumers may find portions of interest.
</p>

<h2 id="overview">Overview of USB</h2>

<p>
Universal Serial Bus (USB) is informally described in the Wikipedia article
<a href="http://en.wikipedia.org/wiki/USB">USB</a>,
and is formally defined by the standards published by the
<a href="http://www.usb.org/">USB Implementers Forum, Inc</a>.
For convenience, we summarize the key USB concepts here,
but the standards are the authoritative reference.
</p>

<h3 id="terminology">Basic concepts and terminology</h3>

<p>
USB is a <a href="http://en.wikipedia.org/wiki/Bus_(computing)">bus</a>
with a single initiator of data transfer operations, called the <i>host</i>.
The host communicates with
<a href="http://en.wikipedia.org/wiki/Peripheral">peripherals</a> via the bus.
</p>

<p class="note"><strong>Note:</strong> The terms <i>device</i> and <i>accessory</i> are common synonyms for
<i>peripheral</i>. We avoid those terms here, as they could be confused with
Android <a href="http://en.wikipedia.org/wiki/Mobile_device">device</a>
or the Android-specific concept called
<a href="http://developer.android.com/guide/topics/connectivity/usb/accessory.html">accessory mode</a>.
</p>

<p>
A critical host role is <i>enumeration</i>:
the process of detecting which peripherals are connected to the bus,
and querying their properties expressed via <i>descriptors</i>.
</p>

<p>
A peripheral may be one physical object
but actually implement multiple logical <i>functions</i>.
For example, a webcam peripheral could have both a camera function and a
microphone audio function.
</p>

<p>
Each peripheral function has an <i>interface</i> that
defines the protocol to communicate with that function.
</p>

<p>
The host communicates with a peripheral over a
<a href="http://en.wikipedia.org/wiki/Stream_(computing)">pipe</a>
to an <a href="http://en.wikipedia.org/wiki/Communication_endpoint">endpoint</a>,
a data source or sink
provided by one of the peripheral's functions.
</p>

<p>
There are two kinds of pipes: <i>message</i> and <i>stream</i>.
A message pipe is used for bi-directional control and status.
A stream pipe is used for uni-directional data transfer.
</p>

<p>
The host initiates all data transfers,
hence the terms <i>input</i> and <i>output</i> are expressed relative to the host.
An input operation transfers data from the peripheral to the host,
while an output operation transfers data from the host to the peripheral.
</p>

<p>
There are three major data transfer modes:
<i>interrupt</i>, <i>bulk</i>, and <i>isochronous</i>.
Isochronous mode will be discussed further in the context of audio.
</p>

<p>
The peripheral may have <i>terminals</i> that connect to the outside world,
beyond the peripheral itself. In this way, the peripheral serves
to translate between USB protocol and "real world" signals.
The terminals are logical objects of the function.
</p>

<h2 id="androidModes">Android USB modes</h2>

<h3 id="developmentMode">Development mode</h3>

<p>
<i>Development mode</i> has been present since the initial release of Android.
The Android device appears as a USB peripheral
to a host PC running a desktop operating system such as Linux,
Mac OS X, or Windows. The only visible peripheral function is either
<a href="http://en.wikipedia.org/wiki/Android_software_development#Fastboot">Android fastboot</a>
or
<a href="http://developer.android.com/tools/help/adb.html">Android Debug Bridge (adb)</a>.
The fastboot and adb protocols are layered over USB bulk data transfer mode.
</p>

<h3 id="hostMode">Host mode</h3>

<p>
<i>Host mode</i> was introduced in Android 3.1 (API level 12).
</p>

<p>
As the Android device must act as host, and most Android devices include
a micro-USB connector that does not directly permit host operation,
an on-the-go (<a href="http://en.wikipedia.org/wiki/USB_On-The-Go">OTG</a>) adapter
such as this is usually required:
</p>

<img src="images/otg.jpg" style="image-orientation: 90deg;" height="50%" width="50%" alt="OTG" id="figure1" />
<p class="img-caption">
  <strong>Figure 1.</strong> On-the-go (OTG) adapter
</p>


<p>
An Android device might not provide sufficient power to operate a
particular peripheral, depending on how much power the peripheral needs,
and how much the Android device is capable of supplying. Even if
adequate power is available, the Android device's battery life may
be significantly shortened. For these situations, use a powered
<a href="http://en.wikipedia.org/wiki/USB_hub">hub</a> such as this:
</p>

<img src="images/hub.jpg" alt="Powered hub" id="figure2" />
<p class="img-caption">
  <strong>Figure 2.</strong> Powered hub
</p>

<h3 id="accessoryMode">Accessory mode</h3>

<p>
<i>Accessory mode</i> was introduced in Android 3.1 (API level 12) and back-ported to Android 2.3.4.
In this mode, the Android device operates as a USB peripheral,
under the control of another device such as a dock that serves as host.
The difference between development mode and accessory mode
is that additional USB functions are visible to the host, beyond adb.
The Android device begins in development mode and then
transitions to accessory mode via a re-negotiation process.
</p>

<p>
Accessory mode was extended with additional features in Android 4.1,
in particular audio, described below.
</p>

<h2 id="usbAudio">USB audio</h2>

<h3 id="class">USB classes</h3>

<p>
Each peripheral function has an associated <i>device class</i> document
that specifies the standard protocol for that function.
This enables <i>class compliant</i> hosts and peripheral functions
to inter-operate, without detailed knowledge of each other's workings.
Class compliance is critical if the host and peripheral are provided by
different entities.
</p>

<p>
The term <i>driverless</i> is a common synonym for <i>class compliant</i>,
indicating that it is possible to use the standard features of such a
peripheral without requiring an operating-system specific
<a href="http://en.wikipedia.org/wiki/Device_driver">driver</a> to be installed.
One can assume that a peripheral advertised as "no driver needed"
for major desktop operating systems
will be class compliant, though there may be exceptions.
</p>

<h3 id="audioClass">USB audio class</h3>

<p>
Here we concern ourselves only with peripherals that implement
audio functions, and thus adhere to the audio device class. There are two
editions of the USB audio class specification: class 1 (UAC1) and 2 (UAC2).
</p>

<h3 id="otherClasses">Comparison with other classes</h3>

<p>
USB includes many other device classes, some of which may be confused
with the audio class. The
<a href="http://en.wikipedia.org/wiki/USB_mass_storage_device_class">mass storage class</a>
(MSC) is used for
sector-oriented access to media, while
<a href="http://en.wikipedia.org/wiki/Media_Transfer_Protocol">Media Transfer Protocol</a>
(MTP) is for full file access to media.
Both MSC and MTP may be used for transferring audio files,
but only USB audio class is suitable for real-time streaming.
</p>

<h3 id="audioTerminals">Audio terminals</h3>

<p>
The terminals of an audio peripheral are typically analog.
The analog signal presented at the peripheral's input terminal is converted to digital by an
<a href="http://en.wikipedia.org/wiki/Analog-to-digital_converter">analog-to-digital converter</a>
(ADC),
and is carried over USB protocol to be consumed by
the host. The ADC is a data <i>source</i>
for the host. Similarly, the host sends a
digital audio signal over USB protocol to the peripheral, where a
<a href="http://en.wikipedia.org/wiki/Digital-to-analog_converter">digital-to-analog converter</a>
(DAC)
converts it to analog and presents the result at an analog output terminal.
The DAC is a <i>sink</i> for the host.
</p>

<h3 id="channels">Channels</h3>

<p>
A peripheral with audio function can include a source terminal, sink terminal, or both.
Each direction may have one channel (<i>mono</i>), two channels
(<i>stereo</i>), or more.
Peripherals with more than two channels are called <i>multichannel</i>.
It is common to interpret a stereo stream as consisting of
<i>left</i> and <i>right</i> channels, and by extension to interpret a multichannel stream as having
spatial locations corresponding to each channel. However, it is also quite appropriate
(more so for USB audio than for
<a href="http://en.wikipedia.org/wiki/HDMI">HDMI</a>)
to not assign any particular
standard spatial meaning to each channel. In this case, it is up to the
application and user to define how each channel is used.
For example, a four-channel USB input stream might have the first three
channels attached to various microphones within a room, and the final
channel receiving input from an AM radio.
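</p>

<p>
To make the idea of application-defined channel usage concrete, the following
minimal Python sketch splits a four-channel capture into one list per channel.
It assumes the samples arrive interleaved frame by frame, which is an
assumption made for illustration only. The function name
<code>split_channels</code> and the sample values are hypothetical and are not
part of any Android API.
</p>

```python
def split_channels(samples, channel_count):
    """Split interleaved PCM samples into one list per channel.

    Assumes (for illustration) that samples are interleaved frame by frame:
    ch0, ch1, ..., ch(N-1), ch0, ch1, ...
    """
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    if len(samples) % channel_count != 0:
        raise ValueError("sample count is not a whole number of frames")
    # Channel k owns every channel_count-th sample, starting at index k.
    return [list(samples[k::channel_count]) for k in range(channel_count)]


# Two frames of a hypothetical four-channel capture:
# channels 0 to 2 are room microphones, channel 3 is the AM radio.
interleaved = [10, 20, 30, 99, 11, 21, 31, 98]
per_channel = split_channels(interleaved, 4)
microphones = per_channel[:3]
radio = per_channel[3]
```

<p>
In this sketch it is the application, not the USB audio class, that decides
the fourth channel carries the radio signal.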
</p>

<h3 id="isochronous">Isochronous transfer mode</h3>

<p>
USB audio uses isochronous transfer mode for its real-time characteristics,
at the expense of error recovery.
In isochronous mode, bandwidth is guaranteed, and data transmission
errors are detected using a cyclic redundancy check (CRC). But there is
no packet acknowledgement or re-transmission in the event of error.
</p>

<p>
Isochronous transmissions occur each Start Of Frame (SOF) period.
The SOF period is one millisecond for full-speed, and 125 microseconds for
high-speed. Each full-speed frame carries up to 1023 bytes of payload,
and a high-speed microframe carries up to 1024 bytes. Putting these together,
we calculate the maximum transfer rate as 1,023,000 or 8,192,000 bytes
per second, respectively. This sets a theoretical upper limit on the combined audio
sample rate, channel count, and bit depth. The practical limit is lower.
</p>

<p>
Within isochronous mode, there are three sub-modes:
</p>

<ul>
<li>Adaptive</li>
<li>Asynchronous</li>
<li>Synchronous</li>
</ul>

<p>
In adaptive sub-mode, the peripheral sink or source adapts to a potentially varying sample rate
of the host.
</p>

<p>
In asynchronous (also called implicit feedback) sub-mode,
the sink or source determines the sample rate, and the host accommodates.
The primary theoretical advantage of asynchronous sub-mode is that the source
or sink USB clock is physically and electrically closer to (and indeed may
be the same as, or derived from) the clock that drives the DAC or ADC.
This proximity means that asynchronous sub-mode should be less susceptible
to clock jitter. In addition, the clock used by the DAC or ADC may be
designed for higher accuracy and lower drift than the host clock.
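</p>

<p>
As a rough check of the isochronous limits quoted earlier in this section,
the following Python sketch multiplies the payload per SOF period by the
number of periods per second, and compares the result with the raw data rate
of a PCM stream. The names <code>ISO_LIMITS</code>, <code>max_bytes_per_second</code>,
<code>stream_bytes_per_second</code>, and <code>fits</code> are hypothetical, and
the sketch ignores protocol overhead, so it only reproduces the theoretical
upper limit, not the lower practical one.
</p>

```python
# Figures quoted in this article: (payload bytes per SOF period, SOF periods per second).
ISO_LIMITS = {
    "full-speed": (1023, 1000),  # one period per millisecond
    "high-speed": (1024, 8000),  # one period per 125 microseconds
}


def max_bytes_per_second(speed):
    """Theoretical isochronous payload rate for the given bus speed."""
    payload_bytes, periods_per_second = ISO_LIMITS[speed]
    return payload_bytes * periods_per_second


def stream_bytes_per_second(sample_rate_hz, channel_count, bytes_per_sample):
    """Raw data rate of an uncompressed PCM stream."""
    return sample_rate_hz * channel_count * bytes_per_sample


def fits(speed, sample_rate_hz, channel_count, bytes_per_sample):
    """True if the stream is within the theoretical limit (overhead ignored)."""
    needed = stream_bytes_per_second(sample_rate_hz, channel_count, bytes_per_sample)
    return needed <= max_bytes_per_second(speed)


# 48 kHz stereo 16-bit PCM needs 192,000 bytes per second.
cd_like = stream_bytes_per_second(48000, 2, 2)
# A hypothetical 192 kHz, 8-channel, 32-bit stream needs 6,144,000 bytes per second.
big = stream_bytes_per_second(192000, 8, 4)
```

<p>
Under these assumptions the 48 kHz stereo stream fits comfortably within the
full-speed limit, while the hypothetical 192 kHz eight-channel stream exceeds
the full-speed limit and fits only within the high-speed limit.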
</p>

<p>
In synchronous sub-mode, a fixed number of bytes is transferred each SOF period.
The audio sample rate is effectively derived from the USB clock.
Synchronous sub-mode is not commonly used with audio because both
host and peripheral are at the mercy of the USB clock.
</p>

<p>
The table below summarizes the isochronous sub-modes:
</p>

<table>
<tr>
  <th>Sub-mode</th>
  <th>Byte count<br />per packet</th>
  <th>Sample rate<br />determined by</th>
  <th>Used for audio</th>
</tr>
<tr>
  <td>adaptive</td>
  <td>variable</td>
  <td>host</td>
  <td>yes</td>
</tr>
<tr>
  <td>asynchronous</td>
  <td>variable</td>
  <td>peripheral</td>
  <td>yes</td>
</tr>
<tr>
  <td>synchronous</td>
  <td>fixed</td>
  <td>USB clock</td>
  <td>no</td>
</tr>
</table>

<p>
In practice, the sub-mode does of course matter, but other factors
should also be considered.
</p>

<h2 id="androidSupport">Android support for USB audio class</h2>

<h3 id="developmentAudio">Development mode</h3>

<p>
USB audio is not supported in development mode.
</p>

<h3 id="hostAudio">Host mode</h3>

<p>
Android 5.0 (API level 21) and above supports a subset of USB audio class 1 (UAC1) features:
</p>

<ul>
<li>The Android device must act as host</li>
<li>The audio format must be PCM (interface type I)</li>
<li>The bit depth must be 16 bits, 24 bits, or 32 bits (where
24 bits of useful audio data are left-justified within the most significant
bits of the 32-bit word)</li>
<li>The sample rate must be one of 48, 44.1, 32, 24, 22.05, 16, 12, 11.025, or 8 kHz</li>
<li>The channel count must be 1 (mono) or 2 (stereo)</li>
</ul>

<p>
Perusal of the Android framework source code may show additional code
beyond the minimum needed to support these features.
But this code
has not been validated, so more advanced features are not yet claimed.
</p>

<h3 id="accessoryAudio">Accessory mode</h3>

<p>
Android 4.1 (API level 16) added limited support for audio playback to the host.
While in accessory mode, Android automatically routes its audio output to USB.
That is, the Android device serves as a data source to the host, for example a dock.
</p>

<p>
Accessory mode audio has these features:
</p>

<ul>
<li>
The Android device must be controlled by a knowledgeable host that
can first transition the Android device from development mode to accessory mode,
and then the host must transfer audio data from the appropriate endpoint.
Thus the Android device does not appear "driverless" to the host.
</li>
<li>The direction must be <i>input</i>, expressed relative to the host</li>
<li>The audio format must be 16-bit PCM</li>
<li>The sample rate must be 44.1 kHz</li>
<li>The channel count must be 2 (stereo)</li>
</ul>

<p>
Accessory mode audio has not been widely adopted,
and is not currently recommended for new designs.
</p>

<h2 id="applications">Applications of USB digital audio</h2>

<p>
As the name indicates, the USB digital audio signal is represented
by a <a href="http://en.wikipedia.org/wiki/Digital_data">digital</a> data stream
rather than the <a href="http://en.wikipedia.org/wiki/Analog_signal">analog</a>
signal used by the common TRS mini
<a href="http://en.wikipedia.org/wiki/Phone_connector_(audio)">headset connector</a>.
Eventually any digital signal must be converted to analog before it can be heard.
There are tradeoffs in choosing where to place that conversion.
</p>

<h3 id="comparison">A tale of two DACs</h3>

<p>
In the example diagram below, we compare two designs.
First we have a
mobile device with Application Processor (AP), on-board DAC, amplifier,
and analog TRS connector attached to headphones. We also consider a
mobile device with USB connected to external USB DAC and amplifier,
also with headphones.
</p>

<img src="images/dac.png" alt="DAC comparison" id="figure3" />
<p class="img-caption">
  <strong>Figure 3.</strong> Comparison of two DACs
</p>

<p>
Which design is better? The answer depends on your needs.
Each has advantages and disadvantages.
</p>
<p class="note"><strong>Note:</strong> This is an artificial comparison, since
a real Android device would probably have both options available.
</p>

<p>
The first design, A, is simpler, less expensive, uses less power,
and is more reliable, assuming otherwise equally reliable components.
However, there are usually audio quality tradeoffs vs. other requirements.
For example, if this is a mass-market device, it may be designed to fit
the needs of the general consumer, not the audiophile.
</p>

<p>
In the second design, the external audio peripheral C can be designed for
higher audio quality and greater power output without impacting the cost of
the basic mass-market Android device B. Yes, it is a more expensive design,
but the cost is absorbed only by those who want it.
</p>

<p>
Mobile devices are notorious for having high-density
circuit boards, which can result in more opportunities for
<a href="http://en.wikipedia.org/wiki/Crosstalk_(electronics)">crosstalk</a>
that degrades adjacent analog signals.
Digital communication is less susceptible to
<a href="http://en.wikipedia.org/wiki/Noise_(electronics)">noise</a>,
so moving the DAC from the Android device A to an external circuit board
C allows the final analog stages to be physically and electrically
isolated from the dense and noisy circuit board, resulting in higher fidelity audio.
</p>

<p>
On the other hand,
the second design is more complex, and with added complexity come more
opportunities for things to fail. There is also additional latency
from the USB controllers.
</p>

<h3 id="hostApplications">Host mode applications</h3>

<p>
Typical USB host mode audio applications include:
</p>

<ul>
<li>music listening</li>
<li>telephony</li>
<li>instant messaging and voice chat</li>
<li>recording</li>
</ul>

<p>
For all of these applications, Android detects a compatible USB digital
audio peripheral, and automatically routes audio playback and capture
appropriately, based on the audio policy rules.
Stereo content is played on the first two channels of the peripheral.
</p>

<p>
There are no APIs specific to USB digital audio.
For advanced usage, the automatic routing may interfere with applications
that are USB-aware. For such applications, disable automatic routing
via the corresponding control in the Media section of
<a href="http://developer.android.com/tools/index.html">Settings / Developer Options</a>.
</p>

<h3 id="hostDebugging">Debugging while in host mode</h3>

<p>
While in USB host mode, adb debugging over USB is unavailable.
See section <a href="http://developer.android.com/tools/help/adb.html#wireless">Wireless usage</a>
of
<a href="http://developer.android.com/tools/help/adb.html">Android Debug Bridge</a>
for an alternative.
</p>

<h2 id="compatibility">Implementing USB audio</h2>

<h3 id="recommendationsPeripheral">Recommendations for audio peripheral vendors</h3>

<p>
In order to inter-operate with Android devices, audio peripheral vendors should:
</p>

<ul>
<li>design for audio class compliance;
currently Android targets class 1, but it is wise to plan for class 2</li>
<li>avoid <a href="http://en.wiktionary.org/wiki/quirk">quirks</a></li>
<li>test for inter-operability with reference and popular Android devices</li>
<li>clearly document supported features, audio class compliance, power requirements, etc.
so that consumers can make informed decisions</li>
</ul>

<h3 id="recommendationsAndroid">Recommendations for Android device OEMs and SoC vendors</h3>

<p>
In order to support USB digital audio, device OEMs and SoC vendors should:
</p>

<ul>
<li>design hardware to support USB host mode</li>
<li>enable generic USB host support at the framework level
via the <code>android.hardware.usb.host.xml</code> feature flag</li>
<li>enable all kernel features needed: USB host mode, USB audio, isochronous transfer mode;
see <a href="/devices/tech/config/kernel.html">Android Kernel Configuration</a></li>
<li>keep up-to-date with recent kernel releases and patches;
despite the noble goal of class compliance, there are extant audio peripherals
with <a href="http://en.wiktionary.org/wiki/quirk">quirks</a>,
and recent kernels have workarounds for such quirks
</li>
<li>enable USB audio policy as described below</li>
<li>add <code>audio.usb.default</code> to <code>PRODUCT_PACKAGES</code> in <code>device.mk</code></li>
<li>test for inter-operability with common USB audio peripherals</li>
</ul>

<h3 id="enable">How to enable USB audio policy</h3>

<p>
To enable USB audio, add an entry to the
audio policy configuration file.
This is typically
located here:
</p>
<pre class="devsite-click-to-copy">
device/oem/codename/audio_policy.conf
</pre>
<p>
The pathname component "oem" should be replaced by the name
of the OEM who manufactures the Android device,
and "codename" should be replaced by the device code name.
</p>

<p>
An example entry is shown here:
</p>

<pre class="devsite-click-to-copy">
audio_hw_modules {
  ...
  usb {
    outputs {
      usb_accessory {
        sampling_rates 44100
        channel_masks AUDIO_CHANNEL_OUT_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_OUT_USB_ACCESSORY
      }
      usb_device {
        sampling_rates dynamic
        channel_masks dynamic
        formats dynamic
        devices AUDIO_DEVICE_OUT_USB_DEVICE
      }
    }
    inputs {
      usb_device {
        sampling_rates dynamic
        channel_masks AUDIO_CHANNEL_IN_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_IN_USB_DEVICE
      }
    }
  }
  ...
}
</pre>

<h3 id="sourceCode">Source code</h3>

<p>
The audio Hardware Abstraction Layer (HAL)
implementation for USB audio is located here:
</p>
<pre class="devsite-click-to-copy">
hardware/libhardware/modules/usbaudio/
</pre>
<p>
The USB audio HAL relies heavily on
<i>tinyalsa</i>, described at <a href="terminology.html">Audio Terminology</a>.
Though USB audio relies on isochronous transfers,
this is abstracted away by the ALSA implementation.
So the USB audio HAL and tinyalsa do not need to concern
themselves with this part of USB protocol.
</p>

</body>
</html>