Home | History | Annotate | Download | only in include
      1 /* -----------------------------------------------------------------------------
      2 Software License for The Fraunhofer FDK AAC Codec Library for Android
      3 
      4  Copyright  1995 - 2018 Fraunhofer-Gesellschaft zur Frderung der angewandten
      5 Forschung e.V. All rights reserved.
      6 
      7  1.    INTRODUCTION
      8 The Fraunhofer FDK AAC Codec Library for Android ("FDK AAC Codec") is software
      9 that implements the MPEG Advanced Audio Coding ("AAC") encoding and decoding
     10 scheme for digital audio. This FDK AAC Codec software is intended to be used on
     11 a wide variety of Android devices.
     12 
     13 AAC's HE-AAC and HE-AAC v2 versions are regarded as today's most efficient
     14 general perceptual audio codecs. AAC-ELD is considered the best-performing
     15 full-bandwidth communications codec by independent studies and is widely
     16 deployed. AAC has been standardized by ISO and IEC as part of the MPEG
     17 specifications.
     18 
     19 Patent licenses for necessary patent claims for the FDK AAC Codec (including
     20 those of Fraunhofer) may be obtained through Via Licensing
     21 (www.vialicensing.com) or through the respective patent owners individually for
     22 the purpose of encoding or decoding bit streams in products that are compliant
     23 with the ISO/IEC MPEG audio standards. Please note that most manufacturers of
     24 Android devices already license these patent claims through Via Licensing or
     25 directly from the patent owners, and therefore FDK AAC Codec software may
     26 already be covered under those patent licenses when it is used for those
     27 licensed purposes only.
     28 
     29 Commercially-licensed AAC software libraries, including floating-point versions
     30 with enhanced sound quality, are also available from Fraunhofer. Users are
     31 encouraged to check the Fraunhofer website for additional applications
     32 information and documentation.
     33 
     34 2.    COPYRIGHT LICENSE
     35 
     36 Redistribution and use in source and binary forms, with or without modification,
     37 are permitted without payment of copyright license fees provided that you
     38 satisfy the following conditions:
     39 
     40 You must retain the complete text of this software license in redistributions of
     41 the FDK AAC Codec or your modifications thereto in source code form.
     42 
     43 You must retain the complete text of this software license in the documentation
     44 and/or other materials provided with redistributions of the FDK AAC Codec or
     45 your modifications thereto in binary form. You must make available free of
     46 charge copies of the complete source code of the FDK AAC Codec and your
     47 modifications thereto to recipients of copies in binary form.
     48 
     49 The name of Fraunhofer may not be used to endorse or promote products derived
     50 from this library without prior written permission.
     51 
     52 You may not charge copyright license fees for anyone to use, copy or distribute
     53 the FDK AAC Codec software or your modifications thereto.
     54 
     55 Your modified versions of the FDK AAC Codec must carry prominent notices stating
     56 that you changed the software and the date of any change. For modified versions
     57 of the FDK AAC Codec, the term "Fraunhofer FDK AAC Codec Library for Android"
     58 must be replaced by the term "Third-Party Modified Version of the Fraunhofer FDK
     59 AAC Codec Library for Android."
     60 
     61 3.    NO PATENT LICENSE
     62 
     63 NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without
     64 limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE.
     65 Fraunhofer provides no warranty of patent non-infringement with respect to this
     66 software.
     67 
     68 You may use this FDK AAC Codec software or modifications thereto only for
     69 purposes that are authorized by appropriate patent licenses.
     70 
     71 4.    DISCLAIMER
     72 
     73 This FDK AAC Codec software is provided by Fraunhofer on behalf of the copyright
     74 holders and contributors "AS IS" and WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES,
     75 including but not limited to the implied warranties of merchantability and
     76 fitness for a particular purpose. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
     77 CONTRIBUTORS BE LIABLE for any direct, indirect, incidental, special, exemplary,
     78 or consequential damages, including but not limited to procurement of substitute
     79 goods or services; loss of use, data, or profits, or business interruption,
     80 however caused and on any theory of liability, whether in contract, strict
     81 liability, or tort (including negligence), arising in any way out of the use of
     82 this software, even if advised of the possibility of such damage.
     83 
     84 5.    CONTACT INFORMATION
     85 
     86 Fraunhofer Institute for Integrated Circuits IIS
     87 Attention: Audio and Multimedia Departments - FDK AAC LL
     88 Am Wolfsmantel 33
     89 91058 Erlangen, Germany
     90 
     91 www.iis.fraunhofer.de/amm
     92 amm-info (at) iis.fraunhofer.de
     93 ----------------------------------------------------------------------------- */
     94 
     95 /**************************** AAC encoder library ******************************
     96 
     97    Author(s):   M. Lohwasser
     98 
     99    Description:
    100 
    101 *******************************************************************************/
    102 
    103 /**
    104  * \file   aacenc_lib.h
    105  * \brief  FDK AAC Encoder library interface header file.
    106  *
    107 \mainpage  Introduction
    108 
    109 \section Scope
    110 
    111 This document describes the high-level interface and usage of the ISO/MPEG-2/4
    112 AAC Encoder library developed by the Fraunhofer Institute for Integrated
    113 Circuits (IIS).
    114 
    115 The library implements encoding on the basis of the MPEG-2 and MPEG-4 AAC
    116 Low-Complexity standard, and depending on the library's configuration, MPEG-4
    117 High-Efficiency AAC v2 and/or AAC-ELD standard.
    118 
    119 All references to SBR (Spectral Band Replication) are only applicable to HE-AAC
    120 or AAC-ELD versions of the library. All references to PS (Parametric Stereo) are
    121 only applicable to HE-AAC v2 versions of the library.
    122 
    123 \section encBasics Encoder Basics
    124 
    125 This document can only give a rough overview about the ISO/MPEG-2 and ISO/MPEG-4
    126 AAC audio coding standard. To understand all the terms in this document, you are
    127 encouraged to read the following documents.
    128 
    129 - ISO/IEC 13818-7 (MPEG-2 AAC), which defines the syntax of MPEG-2 AAC audio
    130 bitstreams.
    131 - ISO/IEC 14496-3 (MPEG-4 AAC, subparts 1 and 4), which defines the syntax of
    132 MPEG-4 AAC audio bitstreams.
    133 - Lutzky, Schuller, Gayer, Krämer, Wabnik, "A guideline to audio codec
    134 delay", 116th AES Convention, May 8, 2004
    135 
    136 MPEG Advanced Audio Coding is based on a time-to-frequency mapping of the
    137 signal. The signal is partitioned into overlapping portions and transformed into
    138 frequency domain. The spectral components are then quantized and coded. \n An
    139 MPEG-2 or MPEG-4 AAC audio bitstream is composed of frames. Contrary to MPEG-1/2
    140 Layer-3 (mp3), the length of individual frames is not restricted to a fixed
    141 number of bytes, but can take on any length between 1 and 768 bytes.
    142 
    143 
    144 \page LIBUSE Library Usage
    145 
    146 \section InterfaceDescription API Files
    147 
    148 All API header files are located in the folder /include of the release package.
    149 All header files are provided for usage in C/C++ programs. The AAC encoder
    150 library API functions are located in aacenc_lib.h.
    151 
    152 In binary releases the encoder core resides in statically linkable libraries
    153 called for example libAACenc.a/libFDK.a (LINUX) or FDK_fastaaclib.lib (MS Visual
    154 C++) for the plain AAC-LC core encoder and libSBRenc.a (LINUX) or
    155 FDK_sbrEncLib.lib (MS Visual C++) for the SBR (Spectral Band Replication) and PS
    156 (Parametric Stereo) modules.
    157 
    158 \section CallingSequence Calling Sequence
    159 
    160 For encoding of ISO/MPEG-2/4 AAC bitstreams the following sequence is mandatory.
    161 Input read and output write functions as well as the corresponding open and
    162 close functions are left out, since they may be implemented differently
    163 according to the user's specific requirements. The example implementation uses
    164 file-based input/output.
    165 
    166 -# Call aacEncOpen() to allocate encoder instance with required \ref encOpen
    167 "configuration". \code HANDLE_AACENCODER hAacEncoder = NULL; if ( (ErrorStatus =
    168 aacEncOpen(&hAacEncoder,0,0)) != AACENC_OK ) { \endcode
    169 -# Call aacEncoder_SetParam() for each parameter to be set. AOT, samplingrate,
    170 channelMode, bitrate and transport type are \ref encParams "mandatory". \code
    171 ErrorStatus = aacEncoder_SetParam(hAacEncoder, parameter, value);
    172 \endcode
    173 -# Call aacEncEncode() with NULL parameters to \ref encReconf "initialize"
    174 encoder instance with present parameter set. \code ErrorStatus =
    175 aacEncEncode(hAacEncoder, NULL, NULL, NULL, NULL); \endcode
    176 -# Call aacEncInfo() to retrieve a configuration data block to be transmitted
    177 out of band. This is required when using RFC3640 or RFC3016 like transport.
    178 \code
    179 AACENC_InfoStruct encInfo;
    180 aacEncInfo(hAacEncoder, &encInfo);
    181 \endcode
    182 -# Encode input audio data in loop.
    183 \code
    184 do
    185 {
    186 \endcode
    187 Feed \ref feedInBuf "input buffer" with new audio data and provide input/output
    188 \ref bufDes "arguments" to aacEncEncode(). \code ErrorStatus =
    189 aacEncEncode(hAacEncoder, &inBufDesc, &outBufDesc, &inargs, &outargs); \endcode
    190 Write \ref writeOutData "output data" to file or audio device.
    191 \code
    192 } while (ErrorStatus==AACENC_OK);
    193 \endcode
    194 -# Call aacEncClose() and destroy encoder instance.
    195 \code
    196 aacEncClose(&hAacEncoder);
    197 \endcode
    198 
    199 
    200 \section encOpen Encoder Instance Allocation
    201 
    202 The assignment of the aacEncOpen() function is very flexible and can be used in
    203 the following way.
    204 - If the amount of memory consumption is not an issue, the encoder instance can
    205 be allocated for the maximum number of possible audio channels (for example 6 or
    206 8) with the full functional range supported by the library. This is the default
    207 open procedure for the AAC encoder if memory consumption does not need to be
    208 minimized. \code aacEncOpen(&hAacEncoder,0,0) \endcode
    209 - If the required MPEG-4 AOTs do not call for the full functional range of the
    210 library, encoder modules can be allocated selectively. \verbatim
    211 ------------------------------------------------------
    212  AAC | SBR |  PS | MD |         FLAGS         | value
    213 -----+-----+-----+----+-----------------------+-------
    214   X  |  -  |  -  |  - | (0x01)                |  0x01
    215   X  |  X  |  -  |  - | (0x01|0x02)           |  0x03
    216   X  |  X  |  X  |  - | (0x01|0x02|0x04)      |  0x07
    217   X  |  -  |  -  |  X | (0x01          |0x10) |  0x11
    218   X  |  X  |  -  |  X | (0x01|0x02     |0x10) |  0x13
    219   X  |  X  |  X  |  X | (0x01|0x02|0x04|0x10) |  0x17
    220 ------------------------------------------------------
    221  - AAC: Allocate AAC Core Encoder module.
    222  - SBR: Allocate Spectral Band Replication module.
    223  - PS: Allocate Parametric Stereo module.
    224  - MD: Allocate Meta Data module within AAC encoder.
    225 \endverbatim
    226 \code aacEncOpen(&hAacEncoder,value,0) \endcode
    227 - Specifying the maximum number of channels to be supported in the encoder
    228 instance can be done as follows.
    229  - For example allocate an encoder instance which supports 2 channels for all
    230 supported AOTs. The library itself may be capable of encoding up to 6 or 8
    231 channels but in this example only 2 channel encoding is required and thus only
    232 buffers for 2 channels are allocated to save data memory. \code
    233 aacEncOpen(&hAacEncoder,0,2) \endcode
    234  - Additionally the maximum number of supported channels in the SBR module can
    235 be denoted separately.\n In this example the encoder instance provides a maximum
    236 of 6 channels out of which up to 2 channels support SBR. This encoder instance
    237 can produce for example 5.1 channel AAC-LC streams or stereo HE-AAC (v2)
    238 streams. HE-AAC 5.1 multi channel is not possible since only 2 out of 6 channels
    239 support SBR, which saves data memory. \code aacEncOpen(&hAacEncoder,0,6|(2<<8))
    240 \endcode \n
    241 
    242 \section bufDes Input/Output Arguments
    243 
    244 \subsection allocIOBufs Provide Buffer Descriptors
    245 In the present encoder API, the input and output buffers are described with \ref
    246 AACENC_BufDesc "buffer descriptors". This mechanism allows a flexible handling
    247 of input and output buffers without impact to the actual encoding call. Optional
    248 buffers are necessary e.g. for ancillary data, meta data input or additional
    249 output buffers describing superframing data in DAB+ or DRM+.\n At least one
    250 input buffer for audio input data and one output buffer for bitstream data must
    251 be allocated. The input buffer size can be a user defined multiple of the number
    252 of input channels. PCM input data will be copied from the user defined PCM
    253 buffer to an internal input buffer and so input data can be less than one AAC
    254 audio frame. The output buffer size should be 6144 bits per channel excluding
    255 the LFE channel. If the output data does not fit into the provided buffer, an
    256 AACENC_ERROR will be returned by aacEncEncode(). \code static INT_PCM
    257 inputBuffer[8*2048]; static UCHAR            ancillaryBuffer[50]; static
    258 AACENC_MetaData  metaDataSetup; static UCHAR            outputBuffer[8192];
    259 \endcode
    260 
    261 All input and output buffer must be clustered in input and output buffer arrays.
    262 \code
    263 static void* inBuffer[]        = { inputBuffer, ancillaryBuffer, &metaDataSetup
    264 }; static INT   inBufferIds[]     = { IN_AUDIO_DATA, IN_ANCILLRY_DATA,
    265 IN_METADATA_SETUP }; static INT   inBufferSize[]    = { sizeof(inputBuffer),
    266 sizeof(ancillaryBuffer), sizeof(metaDataSetup) }; static INT   inBufferElSize[]
    267 = { sizeof(INT_PCM), sizeof(UCHAR), sizeof(AACENC_MetaData) };
    268 
    269 static void* outBuffer[]       = { outputBuffer };
    270 static INT   outBufferIds[]    = { OUT_BITSTREAM_DATA };
    271 static INT   outBufferSize[]   = { sizeof(outputBuffer) };
    272 static INT   outBufferElSize[] = { sizeof(UCHAR) };
    273 \endcode
    274 
    275 Allocate buffer descriptors
    276 \code
    277 AACENC_BufDesc inBufDesc;
    278 AACENC_BufDesc outBufDesc;
    279 \endcode
    280 
    281 Initialize input buffer descriptor
    282 \code
    283 inBufDesc.numBufs            = sizeof(inBuffer)/sizeof(void*);
    284 inBufDesc.bufs              = (void**)&inBuffer;
    285 inBufDesc.bufferIdentifiers = inBufferIds;
    286 inBufDesc.bufSizes          = inBufferSize;
    287 inBufDesc.bufElSizes        = inBufferElSize;
    288 \endcode
    289 
    290 Initialize output buffer descriptor
    291 \code
    292 outBufDesc.numBufs           = sizeof(outBuffer)/sizeof(void*);
    293 outBufDesc.bufs              = (void**)&outBuffer;
    294 outBufDesc.bufferIdentifiers = outBufferIds;
    295 outBufDesc.bufSizes          = outBufferSize;
    296 outBufDesc.bufElSizes        = outBufferElSize;
    297 \endcode
    298 
    299 \subsection argLists Provide Input/Output Argument Lists
    300 The input and output arguments of an aacEncEncode() call are described in
    301 argument structures. \code AACENC_InArgs     inargs; AACENC_OutArgs    outargs;
    302 \endcode
    303 
    304 \section feedInBuf Feed Input Buffer
    305 The input buffer should be handled as a modulo buffer. New audio data in the
    306 form of pulse-code- modulated samples (PCM) must be read from external and be
    307 fed to the input buffer depending on its fill level. The required sample bitrate
    308 (represented by the data type INT_PCM which is 16, 24 or 32 bits wide) is fixed
    309 and depends on library configuration (usually 16 bit). \code inargs.numInSamples
    310 += WAV_InputRead ( wavIn, &inputBuffer[inargs.numInSamples],
    311                                        FDKmin(encInfo.inputChannels*encInfo.frameLength,
    312                                               sizeof(inputBuffer) /
    313                                               sizeof(INT_PCM)-inargs.numInSamples),
    314                                        SAMPLE_BITS
    315                                      );
    316 \endcode
    317 
    318 After the encoder's internal buffer is fed with incoming audio samples, and
    319 aacEncEncode() processed the new input data, update/move remaining samples in
    320 input buffer, simulating a modulo buffer: \code if (outargs.numInSamples>0) {
    321     FDKmemmove( inputBuffer,
    322                 &inputBuffer[outargs.numInSamples],
    323                 sizeof(INT_PCM)*(inargs.numInSamples-outargs.numInSamples) );
    324     inargs.numInSamples -= outargs.numInSamples;
    325 }
    326 \endcode
    327 
    328 \section writeOutData Output Bitstream Data
    329 If any AAC bitstream data is available, write it to output file or device. This
    330 can be done once the following condition is true: \code if
    331 (outargs.numOutBytes>0) {
    332 
    333 }
    334 \endcode
    335 
    336 If you use file I/O then for example call mpegFileWrite_Write() from the library
    337 libMpegFileWrite \code mpegFileWrite_Write(hMpegFile, outputBuffer,
    338 outargs.numOutBytes, aacEncoder_GetParam(hAacEncoder, AACENC_GRANULE_LENGTH));
    339 \endcode
    340 
    341 \section cfgMetaData Meta Data Configuration
    342 
    343 If the present library is configured with Metadata support, it is possible to
    344 insert meta data side info into the generated audio bitstream while encoding.
    345 
    346 To work with meta data the encoder instance has to be \ref encOpen "allocated"
    347 with meta data support. The meta data mode must be be configured with the
    348 ::AACENC_METADATA_MODE parameter and aacEncoder_SetParam() function. \code
    349 aacEncoder_SetParam(hAacEncoder, AACENC_METADATA_MODE, 0-3); \endcode
    350 
    351 This configuration indicates how to embed meta data into bitstrem. Either no
    352 insertion, MPEG or ETSI style. The meta data itself must be specified within the
    353 meta data setup structure AACENC_MetaData.
    354 
    355 Changing one of the AACENC_MetaData setup parameters can be achieved from
    356 outside the library within ::IN_METADATA_SETUP input buffer. There is no need to
    357 supply meta data setup structure every frame. If there is no new meta setup data
    358 available, the encoder uses the previous setup or the default configuration in
    359 initial state.
    360 
    361 In general the audio compressor and limiter within the encoder library can be
    362 configured with the ::AACENC_METADATA_DRC_PROFILE parameter
    363 AACENC_MetaData::drc_profile and and AACENC_MetaData::comp_profile.
    364 \n
    365 
    366 \section encReconf Encoder Reconfiguration
    367 
    368 The encoder library allows reconfiguration of the encoder instance with new
    369 settings continuously between encoding frames. Each parameter to be changed must
    370 be set with a single aacEncoder_SetParam() call. The internal status of each
    371 parameter can be retrieved with an aacEncoder_GetParam() call.\n There is no
    372 stand-alone reconfiguration function available. When parameters were modified
    373 from outside the library, an internal control mechanism triggers the necessary
    374 reconfiguration process which will be applied at the beginning of the following
    375 aacEncEncode() call. This state can be observed from external via the
    376 AACENC_INIT_STATUS and aacEncoder_GetParam() function. The reconfiguration
    377 process can also be applied immediately when all parameters of an aacEncEncode()
    378 call are NULL with a valid encoder handle.\n\n The internal reconfiguration
    379 process can be controlled from extern with the following access. \code
    380 aacEncoder_SetParam(hAacEncoder, AACENC_CONTROL_STATE, AACENC_CTRLFLAGS);
    381 \endcode
    382 
    383 
    384 \section encParams Encoder Parametrization
    385 
    386 All parameteres listed in ::AACENC_PARAM can be modified within an encoder
    387 instance.
    388 
    389 \subsection encMandatory Mandatory Encoder Parameters
    390 The following parameters must be specified when the encoder instance is
    391 initialized. \code aacEncoder_SetParam(hAacEncoder, AACENC_AOT, value);
    392 aacEncoder_SetParam(hAacEncoder, AACENC_BITRATE, value);
    393 aacEncoder_SetParam(hAacEncoder, AACENC_SAMPLERATE, value);
    394 aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value);
    395 \endcode
    396 Beyond that is an internal auto mode which preinitizializes the ::AACENC_BITRATE
    397 parameter if the parameter was not set from extern. The bitrate depends on the
    398 number of effective channels and sampling rate and is determined as follows.
    399 \code
    400 AAC-LC (AOT_AAC_LC): 1.5 bits per sample
    401 HE-AAC (AOT_SBR): 0.625 bits per sample (dualrate sbr)
    402 HE-AAC (AOT_SBR): 1.125 bits per sample (downsampled sbr)
    403 HE-AAC v2 (AOT_PS): 0.5 bits per sample
    404 \endcode
    405 
    406 \subsection channelMode Channel Mode Configuration
    407 The input audio data is described with the ::AACENC_CHANNELMODE parameter in the
    408 aacEncoder_SetParam() call. It is not possible to use the encoder instance with
    409 a 'number of input channels' argument. Instead, the channelMode must be set as
    410 follows. \code aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value);
    411 \endcode The parameter is specified in ::CHANNEL_MODE and can be mapped from the
    412 number of input channels in the following way. \code CHANNEL_MODE chMode =
    413 MODE_INVALID;
    414 
    415 switch (nChannels) {
    416   case 1:  chMode = MODE_1;          break;
    417   case 2:  chMode = MODE_2;          break;
    418   case 3:  chMode = MODE_1_2;        break;
    419   case 4:  chMode = MODE_1_2_1;      break;
    420   case 5:  chMode = MODE_1_2_2;      break;
    421   case 6:  chMode = MODE_1_2_2_1;    break;
    422   case 7:  chMode = MODE_6_1;        break;
    423   case 8:  chMode = MODE_7_1_BACK;   break;
    424   default:
    425     chMode = MODE_INVALID;
    426 }
    427 return chMode;
    428 \endcode
    429 
    430 \subsection bitreservoir Bitreservoir Configuration
    431 In AAC, the default bitreservoir configuration depends on the chosen bitrate per
    432 frame and the number of effective channels. The size can be determined as below.
    433 \f[
    434 bitreservoir = nEffChannels*6144 - (bitrate*framelength/samplerate)
    435 \f]
    436 Due to audio quality concerns it is not recommended to change the bitreservoir
    437 size to a lower value than the default setting! However, for minimizing the
    438 delay for streaming applications or for achieving a constant size of the
    439 bitstream packages in each frame, it may be necessaray to change the
    440 bitreservoir size. This can be done with the ::AACENC_PEAK_BITRATE parameter.
    441 \code
    442 aacEncoder_SetParam(hAacEncoder, AACENC_PEAK_BITRATE, value);
    443 \endcode
    444 By setting ::AACENC_BITRATEMODE to fixed framing, the bitreservoir is disabled.
    445 A disabled bitreservoir results in a constant size for each bitstream package.
    446 Please note that especially at lower bitrates a disabled bitreservoir can
    447 downgrade the audio quality considerably! The default bitreservoir configuration
    448 can be achieved as follows. \code aacEncoder_SetParam(hAacEncoder,
    449 AACENC_BITRESERVOIR, -1); \endcode
    450 
    451 To achieve acceptable audio quality with a reduced bitreservoir size setting at
    452 least 1000 bits per audio channel is recommended. For a multichannel audio file
    453 with 5.1 channels the bitreservoir reduced to 5000 bits results in acceptable
    454 audio quality.
    455 
    456 
    457 \subsection vbrmode Variable Bitrate Mode
    458 The encoder provides various Variable Bitrate Modes that differ in audio quality
    459 and average overall bitrate. The given values are averages over time, different
    460 encoder settings and strongly depend on the type of audio signal. The VBR
    461 configurations can be adjusted via ::AACENC_BITRATEMODE encoder parameter.
    462 \verbatim
    463 --------------------------------------------
    464  VBR_MODE | Approx. Bitrate in kbps/channel
    465           |     AAC-LC    |  AAC-LD/AC_ELD
    466 ----------+---------------+-----------------
    467     VBR_1 |    32 -  48   |      32 -  56
    468     VBR_2 |    40 -  56   |      40 -  64
    469     VBR_3 |    48 -  64   |      48 -  72
    470     VBR_4 |    64 -  80   |      64 -  88
    471     VBR_5 |    96 - 120   |     112 - 144
    472 --------------------------------------------
    473 \endverbatim
    474 The bitrate ranges apply for individual audio channels. In case of multichannel
    475 configurations the average bitrate might be estimated by multiplying with the
    476 number of effective channels. This corresponds to all audio input channels
    477 exclusively the low frequency channel. At configurations which are making use of
    478 downmix modules the AAC core channels respectively downmix channels shall be
    479 considered. For ::AACENC_AOT which are using SBR, the average bitrate can be
    480 estimated by using the ratio of 0.5 for dualrate SBR and 0.75 for downsampled
    481 SBR configurations.
    482 
    483 
    484 \subsection encQual Audio Quality Considerations
    485 The default encoder configuration is suggested to be used. Encoder tools such as
    486 TNS and PNS are activated by default and are internally controlled (see \ref
    487 BEHAVIOUR_TOOLS).
    488 
    489 There is an additional quality parameter called ::AACENC_AFTERBURNER. In the
    490 default configuration this quality switch is deactivated because it would cause
    491 a workload increase which might be significant. If workload is not an issue in
    492 the application we recommended to activate this feature. \code
    493 aacEncoder_SetParam(hAacEncoder, AACENC_AFTERBURNER, 0/1); \endcode
    494 
    495 \subsection encELD ELD Auto Configuration Mode
    496 For ELD configuration a so called auto configurator is available which
    497 configures SBR and the SBR ratio by itself. The configurator is used when the
    498 encoder parameter ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO are not set
    499 explicitly.
    500 
    501 Based on sampling rate and chosen bitrate a reasonable SBR configuration will be
    502 used. \verbatim
    503 ------------------------------------------------------------------
    504  Sampling Rate |   Total Bitrate | No. of | SBR |       SBR Ratio
    505      [kHz]     |      [bit/s]    |  Chan  |     |
    506                |                 |        |     |
    507 ---------------+-----------------+--------+-----+-----------------
    508      ]min, 16[ |    min -    max |      1 | off |             ---
    509 ---------------+-----------------+--------------+-----------------
    510           [16] |    min -  27999 |      1 |  on | downsampled SBR
    511                |  28000 -    max |      1 | off |             ---
    512 ---------------+-----------------+--------------+-----------------
    513      ]16 - 24] |    min -  39999 |      1 |  on | downsampled SBR
    514                |  40000 -    max |      1 | off |             ---
    515 ---------------+-----------------+--------------+-----------------
    516      ]24 - 32] |    min -  27999 |      1 |  on |    dualrate SBR
    517                |  28000 -  55999 |      1 |  on | downsampled SBR
    518                |  56000 -    max |      1 | off |             ---
    519 ---------------+-----------------+--------------+-----------------
    520    ]32 - 44.1] |    min -  63999 |      1 |  on |    dualrate SBR
    521                |  64000 -    max |      1 | off |             ---
    522 ---------------+-----------------+--------------+-----------------
    523    ]44.1 - 48] |    min -  63999 |      1 |  on |    dualrate SBR
    524                |  64000 -  max   |      1 | off |             ---
    525                |                 |        |     |
    526 ---------------+-----------------+--------+-----+-----------------
    527      ]min, 16[ |    min -    max |      2 | off |             ---
    528 ---------------+-----------------+--------------+-----------------
    529           [16] |    min -  31999 |      2 |  on | downsampled SBR
    530                |  32000 -  63999 |      2 |  on | downsampled SBR
    531                |  64000 -    max |      2 | off |             ---
    532 ---------------+-----------------+--------------+-----------------
    533      ]16 - 24] |    min -  47999 |      2 |  on | downsampled SBR
    534                |  48000 -  79999 |      2 |  on | downsampled SBR
    535                |  80000 -    max |      2 | off |             ---
    536 ---------------+-----------------+--------------+-----------------
    537      ]24 - 32] |    min -  31999 |      2 |  on |    dualrate SBR
    538                |  32000 -  67999 |      2 |  on |    dualrate SBR
    539                |  68000 -  95999 |      2 |  on | downsampled SBR
    540                |  96000 -    max |      2 | off |             ---
    541 ---------------+-----------------+--------------+-----------------
    542    ]32 - 44.1] |    min -  43999 |      2 |  on |    dualrate SBR
    543                |  44000 - 127999 |      2 |  on |    dualrate SBR
    544                | 128000 -    max |      2 | off |             ---
    545 ---------------+-----------------+--------------+-----------------
    546    ]44.1 - 48] |    min -  43999 |      2 |  on |    dualrate SBR
    547                |  44000 - 127999 |      2 |  on |    dualrate SBR
    548                | 128000 -  max   |      2 | off |             ---
    549                |                 |              |
    550 ------------------------------------------------------------------
    551 \endverbatim
    552 
    553 \subsection encDsELD Reduced Delay (Downscaled) Mode
    554 The downscaled mode of AAC-ELD reduces the algorithmic delay of AAC-ELD by
    555 virtually increasing the sampling rate. When using the downscaled mode, the
    556 bitrate should be increased for keeping the same audio quality level. For common
    557 signals, the bitrate should be increased by 25% for a downscale factor of 2.
    558 
    559 Currently, downscaling factors 2 and 4 are supported.
    560 To enable the downscaled mode in the encoder, the framelength parameter
    561 AACENC_GRANULE_LENGTH must be set accordingly to 256 or 240 for a downscale
    562 factor of 2 or 128 or 120 for a downscale factor of 4. The default values of 512
    563 or 480 mean that no downscaling is applied. \code
    564 aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 256);
    565 aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 128);
    566 \endcode
    567 
    568 Downscaled bitstreams are fully backwards compatible. However, the legacy
    569 decoder needs to support high sample rate, e.g. 96kHz. The signaled sampling
    570 rate is multiplied by the downscale factor. Although not required, downscaling
    571 should be applied when decoding downscaled bitstreams. It reduces CPU workload
    572 and the output will have the same sampling rate as the input. In an ideal
    573 configuration both encoder and decoder should run with the same downscale
    574 factor.
    575 
    576 The following table shows approximate filter bank delays in ms for common
    577 sampling rates(sr) at framesize(fs), and downscale factor(dsf), based on this
    578 formula: \f[ 1000 * fs / (dsf * sr) \f]
    579 
    580 \verbatim
    581 --------------------------------------
    582       | 512/2 | 512/4 | 480/2 | 480/4
    583 ------+-------+-------+-------+-------
    584 22050 | 17.41 |  8.71 | 16.33 |  8.16
    585 32000 | 12.00 |  6.00 | 11.25 |  5.62
    586 44100 |  8.71 |  4.35 |  8.16 |  4.08
    587 48000 |  8.00 |  4.00 |  7.50 |  3.75
    588 --------------------------------------
    589 \endverbatim
    590 
    591 \section audiochCfg Audio Channel Configuration
    592 The MPEG standard refers often to the so-called Channel Configuration. This
    593 Channel Configuration is used for a fixed Channel Mapping. The configurations
    594 1-7 and 11,12,14 are predefined in MPEG standard and used for implicit
    595 signalling within the encoded bitstream. For user defined Configurations the
    596 Channel Configuration is set to 0 and the Channel Mapping must be explecitly
    597 described with an appropriate Program Config Element. The present Encoder
    598 implementation does not allow the user to configure this Channel Configuration
    599 from extern. The Encoder implementation supports fixed Channel Modes which are
    600 mapped to Channel Configuration as follow. \verbatim
    601 ----------------------------------------------------------------------------------------
    602  ChannelMode           | ChCfg | Height | front_El      | side_El  | back_El  |
    603 lfe_El
    604 -----------------------+-------+--------+---------------+----------+----------+---------
    605 MODE_1                 |     1 | NORM   | SCE           |          |          |
    606 MODE_2                 |     2 | NORM   | CPE           |          |          |
    607 MODE_1_2               |     3 | NORM   | SCE, CPE      |          |          |
    608 MODE_1_2_1             |     4 | NORM   | SCE, CPE      |          | SCE      |
    609 MODE_1_2_2             |     5 | NORM   | SCE, CPE      |          | CPE      |
    610 MODE_1_2_2_1           |     6 | NORM   | SCE, CPE      |          | CPE      |
    611 LFE MODE_1_2_2_2_1         |     7 | NORM   | SCE, CPE, CPE |          | CPE
    612 | LFE MODE_6_1               |    11 | NORM   | SCE, CPE      |          | CPE,
    613 SCE | LFE MODE_7_1_BACK          |    12 | NORM   | SCE, CPE      |          |
    614 CPE, CPE | LFE
    615 -----------------------+-------+--------+---------------+----------+----------+---------
    616 MODE_7_1_TOP_FRONT     |    14 | NORM   | SCE, CPE      |          | CPE      |
    617 LFE |       | TOP    | CPE           |          |          |
    618 -----------------------+-------+--------+---------------+----------+----------+---------
    619 MODE_7_1_REAR_SURROUND |     0 | NORM   | SCE, CPE      |          | CPE, CPE |
    620 LFE MODE_7_1_FRONT_CENTER  |     0 | NORM   | SCE, CPE, CPE |          | CPE
    621 | LFE
    622 ----------------------------------------------------------------------------------------
    623 - NORM: Normal Height Layer.     - TOP: Top Height Layer.  - BTM: Bottom Height
    624 Layer.
    625 - SCE: Single Channel Element.   - CPE: Channel Pair.      - LFE: Low Frequency
    626 Element. \endverbatim
    627 
    628 The Table describes all fixed Channel Elements for each Channel Mode which are
    629 assigned to a speaker arrangement. The arrangement includes front, side, back
    630 and lfe Audio Channel Elements in the normal height layer, possibly followed by
    631 front, side, and back elements in the top and bottom layer (Channel
    632 Configuration 14). \n This mapping of Audio Channel Elements is defined in MPEG
    633 standard for Channel Config 1-7 and 11,12,14.\n In case of Channel Config 0 or
    634 writing matrix mixdown coefficients, the encoder enables the writing of Program
    635 Config Element itself as described in \ref encPCE. The configuration used in
    636 Program Config Element refers to the denoted Table.\n Beside the Channel Element
    637 assignment the Channel Modes are resposible for audio input data channel
    638 mapping. The Channel Mapping of the audio data depends on the selected
    639 ::AACENC_CHANNELORDER which can be MPEG or WAV like order.\n Following table
    640 describes the complete channel mapping for both Channel Order configurations.
    641 \verbatim
    642 ---------------------------------------------------------------------------------------
    643 ChannelMode            |  MPEG-Channelorder            |  WAV-Channelorder
    644 -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
    645 MODE_1                 | 0 |   |   |   |   |   |   |   | 0 |   |   |   |   |   |
    646 | MODE_2                 | 0 | 1 |   |   |   |   |   |   | 0 | 1 |   |   |   |
    647 |   | MODE_1_2               | 0 | 1 | 2 |   |   |   |   |   | 2 | 0 | 1 |   |
    648 |   |   | MODE_1_2_1             | 0 | 1 | 2 | 3 |   |   |   |   | 2 | 0 | 1 | 3
    649 |   |   |   | MODE_1_2_2             | 0 | 1 | 2 | 3 | 4 |   |   |   | 2 | 0 | 1
    650 | 3 | 4 |   |   | MODE_1_2_2_1           | 0 | 1 | 2 | 3 | 4 | 5 |   |   | 2 | 0
    651 | 1 | 4 | 5 | 3 |   | MODE_1_2_2_2_1         | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2
    652 | 6 | 7 | 0 | 1 | 4 | 5 | 3 MODE_6_1               | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
    653 | 2 | 0 | 1 | 4 | 5 | 6 | 3 | MODE_7_1_BACK          | 0 | 1 | 2 | 3 | 4 | 5 | 6
    654 | 7 | 2 | 0 | 1 | 6 | 7 | 4 | 5 | 3 MODE_7_1_TOP_FRONT     | 0 | 1 | 2 | 3 | 4 |
    655 5 | 6 | 7 | 2 | 0 | 1 | 4 | 5 | 3 | 6 | 7
    656 -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
    657 MODE_7_1_REAR_SURROUND | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 0 | 1 | 6 | 7 | 4 |
    658 5 | 3 MODE_7_1_FRONT_CENTER  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 6 | 7 | 0 | 1
    659 | 4 | 5 | 3
    660 ---------------------------------------------------------------------------------------
    661 \endverbatim
    662 
    663 The denoted mapping is important for correct audio channel assignment when using
    664 MPEG or WAV ordering. The incoming audio channels are distributed MPEG like
    665 starting at the front channels and ending at the back channels. The distribution
    666 is used as described in Table concering Channel Config and fix channel elements.
    667 Please see the following example for clarification.
    668 
    669 \verbatim
    670 Example: MODE_1_2_2_1 - WAV-Channelorder 5.1
    671 ------------------------------------------
    672  Input Channel      | Coder Channel
    673 --------------------+---------------------
    674  2 (front center)   | 0 (SCE channel)
    675  0 (left center)    | 1 (1st of 1st CPE)
    676  1 (right center)   | 2 (2nd of 1st CPE)
    677  4 (left surround)  | 3 (1st of 2nd CPE)
    678  5 (right surround) | 4 (2nd of 2nd CPE)
    679  3 (LFE)            | 5 (LFE)
    680 ------------------------------------------
    681 \endverbatim
    682 
    683 
    684 \section suppBitrates Supported Bitrates
    685 
    686 The FDK AAC Encoder provides a wide range of supported bitrates.
    687 The minimum and maximum allowed bitrate depends on the Audio Object Type. For
    688 AAC-LC the minimum bitrate is the bitrate that is required to write the most
    689 basic and minimal valid bitstream. It consists of the bitstream format header
    690 information and other static/mandatory information within the AAC payload. The
    691 maximum AAC framesize allowed by the MPEG-4 standard determines the maximum
    692 allowed bitrate for AAC-LC. For HE-AAC and HE-AAC v2 a library internal look-up
    693 table is used.
    694 
    695 A good working point in terms of audio quality, sampling rate and bitrate, is at
    696 1 to 1.5 bits/audio sample for AAC-LC, 0.625 bits/audio sample for dualrate
    697 HE-AAC, 1.125 bits/audio sample for downsampled HE-AAC and 0.5 bits/audio sample
    698 for HE-AAC v2. For example for one channel with a sampling frequency of 48 kHz,
    699 the range from 48 kbit/s to 72 kbit/s achieves reasonable audio quality for
    700 AAC-LC.
    701 
    702 For HE-AAC and HE-AAC v2 the lowest possible audio input sampling frequency is
    703 16 kHz because then the AAC-LC core encoder operates in dual rate mode at its
    704 lowest possible sampling frequency, which is 8 kHz. HE-AAC v2 requires stereo
    705 input audio data.
    706 
    707 Please note that in HE-AAC or HE-AAC v2 mode the encoder supports much higher
    708 bitrates than are appropriate for HE-AAC or HE-AAC v2. For example, at a bitrate
    709 of more than 64 kbit/s for a stereo audio signal at 44.1 kHz it usually makes
    710 sense to use AAC-LC, which will produce better audio quality at that bitrate
    711 than HE-AAC or HE-AAC v2.
    712 
    713 \section reommendedConfig Recommended Sampling Rate and Bitrate Combinations
    714 
    715 The following table provides an overview of recommended encoder configuration
    716 parameters which we determined by virtue of numerous listening tests.
    717 
    718 \subsection reommendedConfigLC AAC-LC, HE-AAC, HE-AACv2 in Dualrate SBR mode.
    719 \verbatim
    720 -----------------------------------------------------------------------------------
    721 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
    722 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
    723 |                [kHz]  |      Rate  | |                  |
    724 |     [kHz]  |
    725 -------------------+------------------+-----------------------+------------+-------
    726 AAC LC + SBR + PS  |   8000 -  11999  |         22.05, 24.00  |     24.00  | 2
    727 AAC LC + SBR + PS  |  12000 -  17999  |                32.00  |     32.00  | 2
    728 AAC LC + SBR + PS  |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  | 2
    729 AAC LC + SBR + PS  |  40000 -  64000  |  32.00, 44.10, 48.00  |     48.00  | 2
    730 -------------------+------------------+-----------------------+------------+-------
    731 AAC LC + SBR       |   8000 -  11999  |         22.05, 24.00  |     24.00  | 1
    732 AAC LC + SBR       |  12000 -  17999  |                32.00  |     32.00  | 1
    733 AAC LC + SBR       |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  | 1
    734 AAC LC + SBR       |  40000 -  64000  |  32.00, 44.10, 48.00  |     48.00  | 1
    735 -------------------+------------------+-----------------------+------------+-------
    736 AAC LC + SBR       |  16000 -  27999  |  32.00, 44.10, 48.00  |     32.00  | 2
    737 AAC LC + SBR       |  28000 -  63999  |  32.00, 44.10, 48.00  |     44.10  | 2
    738 AAC LC + SBR       |  64000 - 128000  |  32.00, 44.10, 48.00  |     48.00  | 2
    739 -------------------+------------------+-----------------------+------------+-------
    740 AAC LC + SBR       |  64000 -  69999  |  32.00, 44.10, 48.00  |     32.00  |
    741 5, 5.1 AAC LC + SBR       |  70000 - 239999  |  32.00, 44.10, 48.00  |     44.10
    742 | 5, 5.1 AAC LC + SBR       | 240000 - 319999  |  32.00, 44.10, 48.00  |
    743 48.00  | 5, 5.1
    744 -------------------+------------------+-----------------------+------------+-------
    745 AAC LC             |   8000 -  15999  | 11.025, 12.00, 16.00  |     12.00  | 1
    746 AAC LC             |  16000 -  23999  |                16.00  |     16.00  | 1
    747 AAC LC             |  24000 -  31999  |  16.00, 22.05, 24.00  |     24.00  | 1
    748 AAC LC             |  32000 -  55999  |                32.00  |     32.00  | 1
    749 AAC LC             |  56000 - 160000  |  32.00, 44.10, 48.00  |     44.10  | 1
    750 AAC LC             | 160001 - 288000  |                48.00  |     48.00  | 1
    751 -------------------+------------------+-----------------------+------------+-------
    752 AAC LC             |  16000 -  23999  | 11.025, 12.00, 16.00  |     12.00  | 2
    753 AAC LC             |  24000 -  31999  |                16.00  |     16.00  | 2
    754 AAC LC             |  32000 -  39999  |  16.00, 22.05, 24.00  |     22.05  | 2
    755 AAC LC             |  40000 -  95999  |                32.00  |     32.00  | 2
    756 AAC LC             |  96000 - 111999  |  32.00, 44.10, 48.00  |     32.00  | 2
    757 AAC LC             | 112000 - 320001  |  32.00, 44.10, 48.00  |     44.10  | 2
    758 AAC LC             | 320002 - 576000  |                48.00  |     48.00  | 2
    759 -------------------+------------------+-----------------------+------------+-------
    760 AAC LC             | 160000 - 239999  |                32.00  |     32.00  |
    761 5, 5.1 AAC LC             | 240000 - 279999  |  32.00, 44.10, 48.00  |     32.00
    762 | 5, 5.1 AAC LC             | 280000 - 800000  |  32.00, 44.10, 48.00  |
    763 44.10  | 5, 5.1
    764 -----------------------------------------------------------------------------------
    765 \endverbatim \n
    766 
    767 \subsection reommendedConfigLD AAC-LD, AAC-ELD, AAC-ELD with SBR in Dualrate SBR
    768 mode. Unlike to HE-AAC configuration the SBR is not covered by ELD audio object
    769 type and needs to be enabled explicitly. Use ::AACENC_SBR_MODE to configure SBR
    770 and its samplingrate ratio with ::AACENC_SBR_RATIO parameter. \verbatim
    771 -----------------------------------------------------------------------------------
    772 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
    773 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
    774 |                [kHz]  |      Rate  | |                  |
    775 |     [kHz]  |
    776 -------------------+------------------+-----------------------+------------+-------
    777 ELD + SBR          |  18000 -  24999  |        32.00 - 44.10  |     32.00  | 1
    778 ELD + SBR          |  25000 -  31999  |        32.00 - 48.00  |     32.00  | 1
    779 ELD + SBR          |  32000 -  64000  |        32.00 - 48.00  |     48.00  | 1
    780 -------------------+------------------+-----------------------+------------+-------
    781 ELD + SBR          |  32000 -  51999  |        32.00 - 48.00  |     44.10  | 2
    782 ELD + SBR          |  52000 - 128000  |        32.00 - 48.00  |     48.00  | 2
    783 -------------------+------------------+-----------------------+------------+-------
    784 ELD + SBR          |  78000 - 160000  |        32.00 - 48.00  |     48.00  | 3
    785 -------------------+------------------+-----------------------+------------+-------
    786 ELD + SBR          | 104000 - 212000  |        32.00 - 48.00  |     48.00  | 4
    787 -------------------+------------------+-----------------------+------------+-------
    788 ELD + SBR          | 130000 - 246000  |        32.00 - 48.00  |     48.00  |
    789 5, 5.1
    790 -------------------+------------------+-----------------------+------------+-------
    791 LD, ELD            |  16000 -  19999  |        16.00 - 24.00  |     16.00  | 1
    792 LD, ELD            |  20000 -  39999  |        16.00 - 32.00  |     24.00  | 1
    793 LD, ELD            |  40000 -  49999  |        22.05 - 32.00  |     32.00  | 1
    794 LD, ELD            |  50000 -  61999  |        24.00 - 44.10  |     32.00  | 1
    795 LD, ELD            |  62000 -  84999  |        32.00 - 48.00  |     44.10  | 1
    796 LD, ELD            |  85000 - 192000  |        44.10 - 48.00  |     48.00  | 1
    797 -------------------+------------------+-----------------------+------------+-------
    798 LD, ELD            |  64000 -  75999  |        24.00 - 32.00  |     32.00  | 2
    799 LD, ELD            |  76000 -  97999  |        24.00 - 44.10  |     32.00  | 2
    800 LD, ELD            |  98000 - 135999  |        32.00 - 48.00  |     44.10  | 2
    801 LD, ELD            | 136000 - 384000  |        44.10 - 48.00  |     48.00  | 2
    802 -------------------+------------------+-----------------------+------------+-------
    803 LD, ELD            |  96000 - 113999  |        24.00 - 32.00  |     32.00  | 3
    804 LD, ELD            | 114000 - 146999  |        24.00 - 44.10  |     32.00  | 3
    805 LD, ELD            | 147000 - 203999  |        32.00 - 48.00  |     44.10  | 3
    806 LD, ELD            | 204000 - 576000  |        44.10 - 48.00  |     48.00  | 3
    807 -------------------+------------------+-----------------------+------------+-------
    808 LD, ELD            | 128000 - 151999  |        24.00 - 32.00  |     32.00  | 4
    809 LD, ELD            | 152000 - 195999  |        24.00 - 44.10  |     32.00  | 4
    810 LD, ELD            | 196000 - 271999  |        32.00 - 48.00  |     44.10  | 4
    811 LD, ELD            | 272000 - 768000  |        44.10 - 48.00  |     48.00  | 4
    812 -------------------+------------------+-----------------------+------------+-------
    813 LD, ELD            | 160000 - 189999  |        24.00 - 32.00  |     32.00  |
    814 5, 5.1 LD, ELD            | 190000 - 244999  |        24.00 - 44.10  |     32.00
    815 | 5, 5.1 LD, ELD            | 245000 - 339999  |        32.00 - 48.00  |
    816 44.10  | 5, 5.1 LD, ELD            | 340000 - 960000  |        44.10 - 48.00  |
    817 48.00  | 5, 5.1
    818 -----------------------------------------------------------------------------------
    819 \endverbatim \n
    820 
    821 \subsection reommendedConfigELD AAC-ELD with SBR in Downsampled SBR mode.
    822 \verbatim
    823 -----------------------------------------------------------------------------------
    824 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
    825 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
    826 |                [kHz]  |      Rate  | |                  |
    827 |     [kHz]  |
    828 -------------------+------------------+-----------------------+------------+-------
    829 ELD + SBR          |  18000 - 24999   |        16.00 - 22.05  |     22.05  | 1
    830 (downsampled SBR)  |  25000 - 31999   |        16.00 - 24.00  |     24.00  | 1
    831                    |  32000 - 47999   |        22.05 - 32.00  |     32.00  | 1
    832                    |  48000 - 64000   |        22.05 - 48.00  |     32.00  | 1
    833 -------------------+------------------+-----------------------+------------+-------
    834 ELD + SBR          |  32000 - 51999   |        16.00 - 24.00  |     24.00  | 2
    835 (downsampled SBR)  |  52000 - 59999   |        22.05 - 24.00  |     24.00  | 2
    836                    |  60000 - 95999   |        22.05 - 32.00  |     32.00  | 2
    837                    |  96000 - 128000  |        22.05 - 48.00  |     32.00  | 2
    838 -------------------+------------------+-----------------------+------------+-------
    839 ELD + SBR          |  78000 -  99999  |        22.05 - 24.00  |     24.00  | 3
    840 (downsampled SBR)  | 100000 - 143999  |        22.05 - 32.00  |     32.00  | 3
    841                    | 144000 - 159999  |        22.05 - 48.00  |     32.00  | 3
    842                    | 160000 - 192000  |        32.00 - 48.00  |     32.00  | 3
    843 -------------------+------------------+-----------------------+------------+-------
    844 ELD + SBR          | 104000 - 149999  |        22.05 - 24.00  |     24.00  | 4
    845 (downsampled SBR)  | 150000 - 191999  |        22.05 - 32.00  |     32.00  | 4
    846                    | 192000 - 211999  |        22.05 - 48.00  |     32.00  | 4
    847                    | 212000 - 256000  |        32.00 - 48.00  |     32.00  | 4
    848 -------------------+------------------+-----------------------+------------+-------
    849 ELD + SBR          | 130000 - 171999  |        22.05 - 24.00  |     24.00  |
    850 5, 5.1 (downsampled SBR)  | 172000 - 239999  |        22.05 - 32.00  |     32.00
    851 | 5, 5.1 | 240000 - 320000  |        32.00 - 48.00  |     32.00  | 5, 5.1
    852 -----------------------------------------------------------------------------------
    853 \endverbatim \n
    854 
    855 \subsection reommendedConfigELDv2 AAC-ELD v2, AAC-ELD v2 with SBR.
    856 The ELD v2 212 configuration must be configured explicitly with
    857 ::AACENC_CHANNELMODE parameter according MODE_212 value. SBR can be configured
    858 separately through ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO parameter. Following
    859 configurations shall apply to both framelengths 480 and 512. For ELD v2
    860 configuration without SBR and framelength 480 the supported sampling rate is
    861 restricted to the range from 16 kHz up to 24 kHz. \verbatim
    862 -----------------------------------------------------------------------------------
    863 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
    864 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
    865 |                [kHz]  |      Rate  | |                  |
    866 |     [kHz]  |
    867 -------------------+------------------+-----------------------+------------+-------
    868 ELD-212            |  16000 -  19999  |        16.00 - 24.00  |     16.00  | 2
    869 (without SBR)      |  20000 -  39999  |        16.00 - 32.00  |     24.00  | 2
    870                    |  40000 -  49999  |        22.05 - 32.00  |     32.00  | 2
    871                    |  50000 -  61999  |        24.00 - 44.10  |     32.00  | 2
    872                    |  62000 -  84999  |        32.00 - 48.00  |     44.10  | 2
    873                    |  85000 - 192000  |        44.10 - 48.00  |     48.00  | 2
    874 -------------------+------------------+-----------------------+------------+-------
    875 ELD-212 + SBR      |  18000 -  20999  |                32.00  |     32.00  | 2
    876 (dualrate SBR)     |  21000 -  25999  |        32.00 - 44.10  |     32.00  | 2
    877                    |  26000 -  31999  |        32.00 - 48.00  |     44.10  | 2
    878                    |  32000 -  64000  |        32.00 - 48.00  |     48.00  | 2
    879 -------------------+------------------+-----------------------+------------+-------
    880 ELD-212 + SBR      |  18000 -  19999  |        16.00 - 22.05  |     22.05  | 2
    881 (downsampled SBR)  |  20000 -  24999  |        16.00 - 24.00  |     22.05  | 2
    882                    |  25000 -  31999  |        16.00 - 24.00  |     24.00  | 2
    883                    |  32000 -  64000  |        24.00 - 24.00  |     24.00  | 2
    884 -------------------+------------------+-----------------------+------------+-------
    885 \endverbatim \n
    886 
    887 \page ENCODERBEHAVIOUR Encoder Behaviour
    888 
    889 \section BEHAVIOUR_BANDWIDTH Bandwidth
    890 
    891 The FDK AAC encoder usually does not use the full frequency range of the input
    892 signal, but restricts the bandwidth according to certain library-internal
    893 settings. They can be changed in the table "bandWidthTable" in the file
    894 bandwidth.cpp (if available).
    895 
    896 The encoder API provides the ::AACENC_BANDWIDTH parameter to adjust the
    897 bandwidth explicitly. \code aacEncoder_SetParam(hAacEncoder, AACENC_BANDWIDTH,
    898 value); \endcode
    899 
    900 However it is not recommended to change these settings, because they are based
    901 on numerous listening tests and careful tweaks to ensure the best overall
    902 encoding quality. Also, the maximum bandwidth that can be set manually by the
    903 user is 20kHz or fs/2, whichever value is smaller.
    904 
    905 Theoretically a signal of for example 48 kHz can contain frequencies up to 24
    906 kHz, but to use this full range in an audio encoder usually does not make sense.
    907 Usually the encoder has a very limited amount of bits to spend (typically 128
    908 kbit/s for stereo 48 kHz content) and to allow full range bandwidth would waste
    909 a lot of these bits for frequencies the human ear is hardly able to perceive
    910 anyway, if at all. Hence it is wise to use the available bits for the really
    911 important frequency range and just skip the rest. At lower bitrates (e. g. <= 80
    912 kbit/s for stereo 48 kHz content) the encoder will choose an even smaller
    913 bandwidth, because an encoded signal with smaller bandwidth and hence less
    914 artifacts sounds better than a signal with higher bandwidth but then more coding
    915 artefacts across all frequencies. These artefacts would occur if small bitrates
    916 and high bandwidths are chosen because the available bits are just not enough to
    917 encode all frequencies well.
    918 
    919 Unfortunately some people evaluate encoding quality based on possible bandwidth
    920 as well, but it is a double-edged sword considering the trade-off described
    921 above.
    922 
    923 Another aspect is workload consumption. The higher the allowed bandwidth, the
    924 more frequency lines have to be processed, which in turn increases the workload.
    925 
    926 \section FRAMESIZES_AND_BIT_RESERVOIR Frame Sizes & Bit Reservoir
    927 
    928 For AAC there is a difference between constant bit rate and constant frame
    929 length due to the so-called bit reservoir technique, which allows the encoder to
    930 use less bits in an AAC frame for those audio signal sections which are easy to
    931 encode, and then spend them at a later point in time for more complex audio
    932 sections. The extent to which this "bit exchange" is done is limited to allow
    933 for reliable and relatively low delay real time streaming. Therefore, for
    934 AAC-ELD, the bitreservoir is limited. It varies between 500 and 4000 bits/frame,
    935 depending on the bitrate/channel.
    936 - For a bitrate of 12kbps/channel and below, the AAC-ELD bitreservoir is 500
    937 bits/frame.
    938 - For a bitrate of 70kbps/channel and above, the AAC-ELD bitreservoir is 4000
    939 bits/frame.
    940 - Between 12kbps/channel and 70kbps/channel, the AAC-ELD bitrervoir is increased
    941 linearly.
    942 - For AAC-LC, the bitrate is only limited by the maximum AAC frame length. It
    943 is, regardless of the available bit reservoir, defined as 6144 bits per channel.
    944 
    945 Over a longer period in time the bitrate will be constant in the AAC constant
    946 bitrate mode, e.g. for ISDN transmission. This means that in AAC each bitstream
    947 frame will in general have a different length in bytes but over time it
    948 will reach the target bitrate.
    949 
    950 
    951 One could also make an MPEG compliant
    952 AAC encoder which always produces constant length packages for each AAC frame,
    953 but the audio quality would be considerably worse since the bit reservoir
    954 technique would have to be switched off completely. A higher bit rate would have
    955 to be used to get the same audio quality as with an enabled bit reservoir.
    956 
    957 For mp3 by the way, the same bit reservoir technique exists, but there each bit
    958 stream frame has a constant length for a given bit rate (ignoring the
    959 padding byte). In mp3 there is a so-called "back pointer" which tells
    960 the decoder which bits belong to the current mp3 frame - and in general some or
    961 many bits have been transmitted in an earlier mp3 frame. Basically this leads to
    962 the same "bit exchange between mp3 frames" as in AAC but with virtually constant
    963 length frames.
    964 
    965 This variable frame length at "constant bit rate" is not something special
    966 in this Fraunhofer IIS AAC encoder. AAC has been designed in that way.
    967 
    968 \subsection BEHAVIOUR_ESTIM_AVG_FRAMESIZES Estimating Average Frame Sizes
    969 
    970 A HE-AAC v1 or v2 audio frame contains 2048 PCM samples per channel (there is
    971 also one mode with 1920 samples per channel but this is only for special
    972 purposes such as DAB+ digital radio).
    973 
    974 The number of HE-AAC frames \f$N\_FRAMES\f$ per second at 44.1 kHz is:
    975 
    976 \f[
    977 N\_FRAMES = 44100 / 2048 = 21.5332
    978 \f]
    979 
    980 At a bit rate of 8 kbps the average number of bits per frame
    981 \f$N\_BITS\_PER\_FRAME\f$ is:
    982 
    983 \f[
    984 N\_BITS\_PER\_FRAME = 8000 / 21.5332 = 371.52
    985 \f]
    986 
    987 which is about 46.44 bytes per encoded frame.
    988 
    989 At a bit rate of 32 kbps, which is quite high for single channel HE-AAC v1, it
    990 is:
    991 
    992 \f[
    993 N\_BITS\_PER\_FRAME = 32000 / 21.5332 = 1486
    994 \f]
    995 
    996 which is about 185.76 bytes per encoded frame.
    997 
    998 These bits/frame figures are average figures where each AAC frame generally has
    999 a different size in bytes. To calculate the same for AAC-LC just use 1024
   1000 instead of 2048 PCM samples per frame and channel. For AAC-LD/ELD it is either
   1001 480 or 512 PCM samples per frame and channel.
   1002 
   1003 
   1004 \section BEHAVIOUR_TOOLS Encoder Tools
   1005 
   1006 The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools
   1007 depending on the audio signal and the encoder configuration (i.e. bitrate or
   1008 AOT). It is not required to configure these tools manually.
   1009 
   1010 PNS improves encoding quality only for certain bitrates. Therefore it makes
   1011 sense to activate PNS only for these bitrates and save the processing power
   1012 required for PNS (about 10 % of the encoder) when using other bitrates. This is
   1013 done automatically inside the encoder library. PNS is disabled inside the
   1014 encoder library if an MPEG-2 AOT is choosen since PNS is an MPEG-4 AAC feature.
   1015 
   1016 If SBR is activated, the encoder automatically deactivates PNS internally. If
   1017 TNS is disabled but PNS is allowed, the encoder deactivates PNS calculation
   1018 internally.
   1019 
   1020 */
   1021 
   1022 #ifndef AACENC_LIB_H
   1023 #define AACENC_LIB_H
   1024 
   1025 #include "machine_type.h"
   1026 #include "FDK_audio.h"
   1027 
   1028 /**
   1029  *  AAC encoder error codes.
   1030  */
   1031 typedef enum {
   1032   AACENC_OK = 0x0000, /*!< No error happened. All fine. */
   1033 
   1034   AACENC_INVALID_HANDLE =
   1035       0x0020, /*!< Handle passed to function call was invalid. */
   1036   AACENC_MEMORY_ERROR = 0x0021,          /*!< Memory allocation failed. */
   1037   AACENC_UNSUPPORTED_PARAMETER = 0x0022, /*!< Parameter not available. */
   1038   AACENC_INVALID_CONFIG = 0x0023,        /*!< Configuration not provided. */
   1039 
   1040   AACENC_INIT_ERROR = 0x0040,     /*!< General initialization error. */
   1041   AACENC_INIT_AAC_ERROR = 0x0041, /*!< AAC library initialization error. */
   1042   AACENC_INIT_SBR_ERROR = 0x0042, /*!< SBR library initialization error. */
   1043   AACENC_INIT_TP_ERROR = 0x0043, /*!< Transport library initialization error. */
   1044   AACENC_INIT_META_ERROR =
   1045       0x0044, /*!< Meta data library initialization error. */
   1046   AACENC_INIT_MPS_ERROR = 0x0045, /*!< MPS library initialization error. */
   1047 
   1048   AACENC_ENCODE_ERROR = 0x0060, /*!< The encoding process was interrupted by an
   1049                                    unexpected error. */
   1050 
   1051   AACENC_ENCODE_EOF = 0x0080 /*!< End of file reached. */
   1052 
   1053 } AACENC_ERROR;
   1054 
   1055 /**
   1056  *  AAC encoder buffer descriptors identifier.
   1057  *  This identifier are used within buffer descriptors
   1058  * AACENC_BufDesc::bufferIdentifiers.
   1059  */
   1060 typedef enum {
   1061   /* Input buffer identifier. */
   1062   IN_AUDIO_DATA = 0,    /*!< Audio input buffer, interleaved INT_PCM samples. */
   1063   IN_ANCILLRY_DATA = 1, /*!< Ancillary data to be embedded into bitstream. */
   1064   IN_METADATA_SETUP = 2, /*!< Setup structure for embedding meta data. */
   1065 
   1066   /* Output buffer identifier. */
   1067   OUT_BITSTREAM_DATA = 3, /*!< Buffer holds bitstream output data. */
   1068   OUT_AU_SIZES =
   1069       4 /*!< Buffer contains sizes of each access unit. This information
   1070              is necessary for superframing. */
   1071 
   1072 } AACENC_BufferIdentifier;
   1073 
   1074 /**
   1075  *  AAC encoder handle.
   1076  */
   1077 typedef struct AACENCODER *HANDLE_AACENCODER;
   1078 
   1079 /**
   1080  *  Provides some info about the encoder configuration.
   1081  */
   1082 typedef struct {
   1083   UINT maxOutBufBytes; /*!< Maximum number of encoder bitstream bytes within one
   1084                           frame. Size depends on maximum number of supported
   1085                           channels in encoder instance. For superframing (as
   1086                           used for example in DAB+), size has to be a multiple
   1087                           accordingly. */
   1088 
   1089   UINT maxAncBytes; /*!< Maximum number of ancillary data bytes which can be
   1090                        inserted into bitstream within one frame. */
   1091 
   1092   UINT inBufFillLevel; /*!< Internal input buffer fill level in samples per
   1093                           channel. This parameter will automatically be cleared
   1094                           if samplingrate or channel(Mode/Order) changes. */
   1095 
   1096   UINT inputChannels; /*!< Number of input channels expected in encoding
   1097                          process. */
   1098 
   1099   UINT frameLength; /*!< Amount of input audio samples consumed each frame per
   1100                        channel, depending on audio object type configuration. */
   1101 
   1102   UINT nDelay; /*!< Codec delay in PCM samples/channel. Depends on framelength
   1103                   and AOT. Does not include framing delay for filling up encoder
   1104                   PCM input buffer. */
   1105 
   1106   UINT nDelayCore; /*!< Codec delay in PCM samples/channel, w/o delay caused by
   1107                       the decoder SBR module. This delay is needed to correctly
   1108                       write edit lists for gapless playback. The decoder may not
   1109                       know how much delay is introdcued by SBR, since it may not
   1110                       know if SBR is active at all (implicit signaling),
   1111                       therefore the deocder must take into account any delay
   1112                       caused by the SBR module. */
   1113 
   1114   UCHAR confBuf[64]; /*!< Configuration buffer in binary format as an
   1115                         AudioSpecificConfig or StreamMuxConfig according to the
   1116                         selected transport type. */
   1117 
   1118   UINT confSize; /*!< Number of valid bytes in confBuf. */
   1119 
   1120 } AACENC_InfoStruct;
   1121 
   1122 /**
   1123  *  Describes the input and output buffers for an aacEncEncode() call.
   1124  */
   1125 typedef struct {
   1126   INT numBufs;            /*!< Number of buffers. */
   1127   void **bufs;            /*!< Pointer to vector containing buffer addresses. */
   1128   INT *bufferIdentifiers; /*!< Identifier of each buffer element. See
   1129                              ::AACENC_BufferIdentifier. */
   1130   INT *bufSizes;          /*!< Size of each buffer in 8-bit bytes. */
   1131   INT *bufElSizes;        /*!< Size of each buffer element in bytes. */
   1132 
   1133 } AACENC_BufDesc;
   1134 
   1135 /**
   1136  *  Defines the input arguments for an aacEncEncode() call.
   1137  */
   1138 typedef struct {
   1139   INT numInSamples; /*!< Number of valid input audio samples (multiple of input
   1140                        channels). */
   1141   INT numAncBytes;  /*!< Number of ancillary data bytes to be encoded. */
   1142 
   1143 } AACENC_InArgs;
   1144 
   1145 /**
   1146  *  Defines the output arguments for an aacEncEncode() call.
   1147  */
   1148 typedef struct {
   1149   INT numOutBytes;  /*!< Number of valid bitstream bytes generated during
   1150                        aacEncEncode(). */
   1151   INT numInSamples; /*!< Number of input audio samples consumed by the encoder.
   1152                      */
   1153   INT numAncBytes;  /*!< Number of ancillary data bytes consumed by the encoder.
   1154                      */
   1155   INT bitResState;  /*!< State of the bit reservoir in bits. */
   1156 
   1157 } AACENC_OutArgs;
   1158 
   1159 /**
   1160  *  Meta Data Compression Profiles.
   1161  */
   1162 typedef enum {
   1163   AACENC_METADATA_DRC_NONE = 0,          /*!< None. */
   1164   AACENC_METADATA_DRC_FILMSTANDARD = 1,  /*!< Film standard. */
   1165   AACENC_METADATA_DRC_FILMLIGHT = 2,     /*!< Film light. */
   1166   AACENC_METADATA_DRC_MUSICSTANDARD = 3, /*!< Music standard. */
   1167   AACENC_METADATA_DRC_MUSICLIGHT = 4,    /*!< Music light. */
   1168   AACENC_METADATA_DRC_SPEECH = 5,        /*!< Speech. */
   1169   AACENC_METADATA_DRC_NOT_PRESENT =
   1170       256 /*!< Disable writing gain factor (used for comp_profile only). */
   1171 
   1172 } AACENC_METADATA_DRC_PROFILE;
   1173 
   1174 /**
   1175  *  Meta Data setup structure.
   1176  */
   1177 typedef struct {
   1178   AACENC_METADATA_DRC_PROFILE
   1179   drc_profile; /*!< MPEG DRC compression profile. See
   1180                   ::AACENC_METADATA_DRC_PROFILE. */
   1181   AACENC_METADATA_DRC_PROFILE
   1182   comp_profile; /*!< ETSI heavy compression profile. See
   1183                    ::AACENC_METADATA_DRC_PROFILE. */
   1184 
   1185   INT drc_TargetRefLevel;  /*!< Used to define expected level to:
   1186                                 Scaled with 16 bit. x*2^16. */
   1187   INT comp_TargetRefLevel; /*!< Adjust limiter to avoid overload.
   1188                                 Scaled with 16 bit. x*2^16. */
   1189 
   1190   INT prog_ref_level_present; /*!< Flag, if prog_ref_level is present */
   1191   INT prog_ref_level;         /*!< Programme Reference Level = Dialogue Level:
   1192                                    -31.75dB .. 0 dB ; stepsize: 0.25dB
   1193                                    Scaled with 16 bit. x*2^16.*/
   1194 
   1195   UCHAR PCE_mixdown_idx_present; /*!< Flag, if dmx-idx should be written in
   1196                                     programme config element */
   1197   UCHAR ETSI_DmxLvl_present;     /*!< Flag, if dmx-lvl should be written in
   1198                                     ETSI-ancData */
   1199 
   1200   SCHAR centerMixLevel; /*!< Center downmix level (0...7, according to table) */
   1201   SCHAR surroundMixLevel; /*!< Surround downmix level (0...7, according to
   1202                              table) */
   1203 
   1204   UCHAR
   1205   dolbySurroundMode; /*!< Indication for Dolby Surround Encoding Mode.
   1206                           - 0: Dolby Surround mode not indicated
   1207                           - 1: 2-ch audio part is not Dolby surround encoded
   1208                           - 2: 2-ch audio part is Dolby surround encoded */
   1209 
   1210   UCHAR drcPresentationMode; /*!< Indicatin for DRC Presentation Mode.
   1211                                   - 0: Presentation mode not inticated
   1212                                   - 1: Presentation mode 1
   1213                                   - 2: Presentation mode 2 */
   1214 
   1215   struct {
   1216     /* extended ancillary data */
   1217     UCHAR extAncDataEnable; /*< Indicates if MPEG4_ext_ancillary_data() exists.
   1218                                 - 0: No MPEG4_ext_ancillary_data().
   1219                                 - 1: Insert MPEG4_ext_ancillary_data(). */
   1220 
   1221     UCHAR
   1222     extDownmixLevelEnable;   /*< Indicates if ext_downmixing_levels() exists.
   1223                                  - 0: No ext_downmixing_levels().
   1224                                  - 1: Insert ext_downmixing_levels(). */
   1225     UCHAR extDownmixLevel_A; /*< Downmix level index A (0...7, according to
   1226                                 table) */
   1227     UCHAR extDownmixLevel_B; /*< Downmix level index B (0...7, according to
   1228                                 table) */
   1229 
   1230     UCHAR dmxGainEnable; /*< Indicates if ext_downmixing_global_gains() exists.
   1231                              - 0: No ext_downmixing_global_gains().
   1232                              - 1: Insert ext_downmixing_global_gains(). */
   1233     INT dmxGain5;        /*< Gain factor for downmix to 5 channels.
   1234                               -15.75dB .. -15.75dB; stepsize: 0.25dB
   1235                               Scaled with 16 bit. x*2^16.*/
   1236     INT dmxGain2;        /*< Gain factor for downmix to 2 channels.
   1237                               -15.75dB .. -15.75dB; stepsize: 0.25dB
   1238                               Scaled with 16 bit. x*2^16.*/
   1239 
   1240     UCHAR lfeDmxEnable; /*< Indicates if ext_downmixing_lfe_level() exists.
   1241                             - 0: No ext_downmixing_lfe_level().
   1242                             - 1: Insert ext_downmixing_lfe_level(). */
   1243     UCHAR lfeDmxLevel;  /*< Downmix level index for LFE (0..15, according to
   1244                            table) */
   1245 
   1246   } ExtMetaData;
   1247 
   1248 } AACENC_MetaData;
   1249 
   1250 /**
   1251  * AAC encoder control flags.
   1252  *
   1253  * In interaction with the ::AACENC_CONTROL_STATE parameter it is possible to
   1254  * get information about the internal initialization process. It is also
   1255  * possible to overwrite the internal state from extern when necessary.
   1256  */
   1257 typedef enum {
   1258   AACENC_INIT_NONE = 0x0000, /*!< Do not trigger initialization. */
   1259   AACENC_INIT_CONFIG =
   1260       0x0001, /*!< Initialize all encoder modules configuration. */
   1261   AACENC_INIT_STATES = 0x0002, /*!< Reset all encoder modules history buffer. */
   1262   AACENC_INIT_TRANSPORT =
   1263       0x1000, /*!< Initialize transport lib with new parameters. */
   1264   AACENC_RESET_INBUFFER =
   1265       0x2000,              /*!< Reset fill level of internal input buffer. */
   1266   AACENC_INIT_ALL = 0xFFFF /*!< Initialize all. */
   1267 } AACENC_CTRLFLAGS;
   1268 
   1269 /**
   1270  * \brief  AAC encoder setting parameters.
   1271  *
   1272  * Use aacEncoder_SetParam() function to configure, or use aacEncoder_GetParam()
   1273  * function to read the internal status of the following parameters.
   1274  */
   1275 typedef enum {
   1276   AACENC_AOT =
   1277       0x0100, /*!< Audio object type. See ::AUDIO_OBJECT_TYPE in FDK_audio.h.
   1278                    - 2: MPEG-4 AAC Low Complexity.
   1279                    - 5: MPEG-4 AAC Low Complexity with Spectral Band Replication
   1280                  (HE-AAC).
   1281                    - 29: MPEG-4 AAC Low Complexity with Spectral Band
   1282                  Replication and Parametric Stereo (HE-AAC v2). This
   1283                  configuration can be used only with stereo input audio data.
   1284                    - 23: MPEG-4 AAC Low-Delay.
   1285                    - 39: MPEG-4 AAC Enhanced Low-Delay. Since there is no
   1286                  ::AUDIO_OBJECT_TYPE for ELD in combination with SBR defined,
   1287                  enable SBR explicitely by ::AACENC_SBR_MODE parameter. The ELD
   1288                  v2 212 configuration can be configured by ::AACENC_CHANNELMODE
   1289                  parameter.
   1290                    - 129: MPEG-2 AAC Low Complexity.
   1291                    - 132: MPEG-2 AAC Low Complexity with Spectral Band
   1292                  Replication (HE-AAC).
   1293 
   1294                    Please note that the virtual MPEG-2 AOT's basically disables
   1295                  non-existing Perceptual Noise Substitution tool in AAC encoder
   1296                  and controls the MPEG_ID flag in adts header. The virtual
   1297                  MPEG-2 AOT doesn't prohibit specific transport formats. */
   1298 
   1299   AACENC_BITRATE = 0x0101, /*!< Total encoder bitrate. This parameter is
   1300                               mandatory and interacts with ::AACENC_BITRATEMODE.
   1301                                 - CBR: Bitrate in bits/second.
   1302                                 - VBR: Variable bitrate. Bitrate argument will
   1303                               be ignored. See \ref suppBitrates for details. */
   1304 
   1305   AACENC_BITRATEMODE = 0x0102, /*!< Bitrate mode. Configuration can be different
   1306                                   kind of bitrate configurations:
   1307                                     - 0: Constant bitrate, use bitrate according
   1308                                   to ::AACENC_BITRATE. (default) Within none
   1309                                   LD/ELD ::AUDIO_OBJECT_TYPE, the CBR mode makes
   1310                                   use of full allowed bitreservoir. In contrast,
   1311                                   at Low-Delay ::AUDIO_OBJECT_TYPE the
   1312                                   bitreservoir is kept very small.
   1313                                     - 1: Variable bitrate mode, \ref vbrmode
   1314                                   "very low bitrate".
   1315                                     - 2: Variable bitrate mode, \ref vbrmode
   1316                                   "low bitrate".
   1317                                     - 3: Variable bitrate mode, \ref vbrmode
   1318                                   "medium bitrate".
   1319                                     - 4: Variable bitrate mode, \ref vbrmode
   1320                                   "high bitrate".
   1321                                     - 5: Variable bitrate mode, \ref vbrmode
   1322                                   "very high bitrate". */
   1323 
   1324   AACENC_SAMPLERATE = 0x0103, /*!< Audio input data sampling rate. Encoder
   1325                                  supports following sampling rates: 8000, 11025,
   1326                                  12000, 16000, 22050, 24000, 32000, 44100,
   1327                                  48000, 64000, 88200, 96000 */
   1328 
   1329   AACENC_SBR_MODE = 0x0104, /*!< Configure SBR independently of the chosen Audio
   1330                                Object Type ::AUDIO_OBJECT_TYPE. This parameter
   1331                                is for ELD audio object type only.
   1332                                  - -1: Use ELD SBR auto configurator (default).
   1333                                  - 0: Disable Spectral Band Replication.
   1334                                  - 1: Enable Spectral Band Replication. */
   1335 
   1336   AACENC_GRANULE_LENGTH =
   1337       0x0105, /*!< Core encoder (AAC) audio frame length in samples:
   1338                    - 1024: Default configuration.
   1339                    - 512: Default length in LD/ELD configuration.
   1340                    - 480: Length in LD/ELD configuration.
   1341                    - 256: Length for ELD reduced delay mode (x2).
   1342                    - 240: Length for ELD reduced delay mode (x2).
   1343                    - 128: Length for ELD reduced delay mode (x4).
   1344                    - 120: Length for ELD reduced delay mode (x4). */
   1345 
   1346   AACENC_CHANNELMODE = 0x0106, /*!< Set explicit channel mode. Channel mode must
   1347                                   match with number of input channels.
   1348                                     - 1-7, 11,12,14 and 33,34: MPEG channel
   1349                                   modes supported, see ::CHANNEL_MODE in
   1350                                   FDK_audio.h. */
   1351 
   1352   AACENC_CHANNELORDER =
   1353       0x0107, /*!< Input audio data channel ordering scheme:
   1354                    - 0: MPEG channel ordering (e. g. 5.1: C, L, R, SL, SR, LFE).
   1355                  (default)
   1356                    - 1: WAVE file format channel ordering (e. g. 5.1: L, R, C,
   1357                  LFE, SL, SR). */
   1358 
   1359   AACENC_SBR_RATIO =
   1360       0x0108, /*!<  Controls activation of downsampled SBR. With downsampled
   1361                  SBR, the delay will be shorter. On the other hand, for
   1362                  achieving the same quality level, downsampled SBR needs more
   1363                  bits than dual-rate SBR. With downsampled SBR, the AAC encoder
   1364                  will work at the same sampling rate as the SBR encoder (single
   1365                  rate). Downsampled SBR is supported for AAC-ELD and HE-AACv1.
   1366                     - 1: Downsampled SBR (default for ELD).
   1367                     - 2: Dual-rate SBR   (default for HE-AAC). */
   1368 
   1369   AACENC_AFTERBURNER =
   1370       0x0200, /*!< This parameter controls the use of the afterburner feature.
   1371                    The afterburner is a type of analysis by synthesis algorithm
   1372                  which increases the audio quality but also the required
   1373                  processing power. It is recommended to always activate this if
   1374                  additional memory consumption and processing power consumption
   1375                    is not a problem. If increased MHz and memory consumption are
   1376                  an issue then the MHz and memory cost of this optional module
   1377                  need to be evaluated against the improvement in audio quality
   1378                  on a case by case basis.
   1379                    - 0: Disable afterburner (default).
   1380                    - 1: Enable afterburner. */
   1381 
   1382   AACENC_BANDWIDTH = 0x0203, /*!< Core encoder audio bandwidth:
   1383                                   - 0: Determine audio bandwidth internally
   1384                                 (default, see chapter \ref BEHAVIOUR_BANDWIDTH).
   1385                                   - 1 to fs/2: Audio bandwidth in Hertz. Limited
   1386                                 to 20kHz max. Not usable if SBR is active. This
   1387                                 setting is for experts only, better do not touch
   1388                                 this value to avoid degraded audio quality. */
   1389 
   1390   AACENC_PEAK_BITRATE =
   1391       0x0207, /*!< Peak bitrate configuration parameter to adjust maximum bits
   1392                  per audio frame. Bitrate is in bits/second. The peak bitrate
   1393                  will internally be limited to the chosen bitrate
   1394                  ::AACENC_BITRATE as lower limit and the
   1395                  number_of_effective_channels*6144 bit as upper limit.
   1396 
   1397                    Setting the peak bitrate equal to ::AACENC_BITRATE does not
   1398                  necessarily mean that the audio frames will be of constant
   1399                  size. Since the peak bitate is in bits/second, the frame sizes
   1400                  can vary by one byte in one or the other direction over various
   1401                  frames. However, it is not recommended to reduce the peak
   1402                  pitrate to ::AACENC_BITRATE - it would disable the
   1403                  bitreservoir, which would affect the audio quality by a large
   1404                  amount. */
   1405 
   1406   AACENC_TRANSMUX = 0x0300, /*!< Transport type to be used. See ::TRANSPORT_TYPE
   1407                                in FDK_audio.h. Following types can be configured
   1408                                in encoder library:
   1409                                  - 0: raw access units
   1410                                  - 1: ADIF bitstream format
   1411                                  - 2: ADTS bitstream format
   1412                                  - 6: Audio Mux Elements (LATM) with
   1413                                muxConfigPresent = 1
   1414                                  - 7: Audio Mux Elements (LATM) with
   1415                                muxConfigPresent = 0, out of band StreamMuxConfig
   1416                                  - 10: Audio Sync Stream (LOAS) */
   1417 
   1418   AACENC_HEADER_PERIOD =
   1419       0x0301, /*!< Frame count period for sending in-band configuration buffers
   1420                  within LATM/LOAS transport layer. Additionally this parameter
   1421                  configures the PCE repetition period in raw_data_block(). See
   1422                  \ref encPCE.
   1423                    - 0xFF: auto-mode default 10 for TT_MP4_ADTS, TT_MP4_LOAS and
   1424                  TT_MP4_LATM_MCP1, otherwise 0.
   1425                    - n: Frame count period. */
   1426 
   1427   AACENC_SIGNALING_MODE =
   1428       0x0302, /*!< Signaling mode of the extension AOT:
   1429                    - 0: Implicit backward compatible signaling (default for
   1430                  non-MPEG-4 based AOT's and for the transport formats ADIF and
   1431                  ADTS)
   1432                         - A stream that uses implicit signaling can be decoded
   1433                  by every AAC decoder, even AAC-LC-only decoders
   1434                         - An AAC-LC-only decoder will only decode the
   1435                  low-frequency part of the stream, resulting in a band-limited
   1436                  output
   1437                         - This method works with all transport formats
   1438                         - This method does not work with downsampled SBR
   1439                    - 1: Explicit backward compatible signaling
   1440                         - A stream that uses explicit backward compatible
   1441                  signaling can be decoded by every AAC decoder, even AAC-LC-only
   1442                  decoders
   1443                         - An AAC-LC-only decoder will only decode the
   1444                  low-frequency part of the stream, resulting in a band-limited
   1445                  output
   1446                         - A decoder not capable of decoding PS will only decode
   1447                  the AAC-LC+SBR part. If the stream contained PS, the result
   1448                  will be a a decoded mono downmix
   1449                         - This method does not work with ADIF or ADTS. For
   1450                  LOAS/LATM, it only works with AudioMuxVersion==1
   1451                         - This method does work with downsampled SBR
   1452                    - 2: Explicit hierarchical signaling (default for MPEG-4
   1453                  based AOT's and for all transport formats excluding ADIF and
   1454                  ADTS)
   1455                         - A stream that uses explicit hierarchical signaling can
   1456                  be decoded only by HE-AAC decoders
   1457                         - An AAC-LC-only decoder will not decode a stream that
   1458                  uses explicit hierarchical signaling
   1459                         - A decoder not capable of decoding PS will not decode
   1460                  the stream at all if it contained PS
   1461                         - This method does not work with ADIF or ADTS. It works
   1462                  with LOAS/LATM and the MPEG-4 File format
   1463                         - This method does work with downsampled SBR
   1464 
   1465                     For making sure that the listener always experiences the
   1466                  best audio quality, explicit hierarchical signaling should be
   1467                  used. This makes sure that only a full HE-AAC-capable decoder
   1468                  will decode those streams. The audio is played at full
   1469                  bandwidth. For best backwards compatibility, it is recommended
   1470                  to encode with implicit SBR signaling. A decoder capable of
   1471                  AAC-LC only will then only decode the AAC part, which means the
   1472                  decoded audio will sound band-limited.
   1473 
   1474                     For MPEG-2 transport types (ADTS,ADIF), only implicit
   1475                  signaling is possible.
   1476 
   1477                     For LOAS and LATM, explicit backwards compatible signaling
   1478                  only works together with AudioMuxVersion==1. The reason is
   1479                  that, for explicit backwards compatible signaling, additional
   1480                  information will be appended to the ASC. A decoder that is only
   1481                  capable of decoding AAC-LC will skip this part. Nevertheless,
   1482                  for jumping to the end of the ASC, it needs to know the ASC
   1483                  length. Transmitting the length of the ASC is a feature of
   1484                  AudioMuxVersion==1, it is not possible to transmit the length
   1485                  of the ASC with AudioMuxVersion==0, therefore an AAC-LC-only
   1486                  decoder will not be able to parse a LOAS/LATM stream that was
   1487                  being encoded with AudioMuxVersion==0.
   1488 
   1489                     For downsampled SBR, explicit signaling is mandatory. The
   1490                  reason for this is that the extension sampling frequency (which
   1491                  is in case of SBR the sampling frequqncy of the SBR part) can
   1492                  only be signaled in explicit mode.
   1493 
   1494                     For AAC-ELD, the SBR information is transmitted in the
   1495                  ELDSpecific Config, which is part of the AudioSpecificConfig.
   1496                  Therefore, the settings here will have no effect on AAC-ELD.*/
   1497 
   1498   AACENC_TPSUBFRAMES =
   1499       0x0303, /*!< Number of sub frames in a transport frame for LOAS/LATM or
   1500                  ADTS (default 1).
   1501                    - ADTS: Maximum number of sub frames restricted to 4.
   1502                    - LOAS/LATM: Maximum number of sub frames restricted to 2.*/
   1503 
   1504   AACENC_AUDIOMUXVER =
   1505       0x0304, /*!< AudioMuxVersion to be used for LATM. (AudioMuxVersionA,
   1506                  currently not implemented):
   1507                    - 0: Default, no transmission of tara Buffer fullness, no ASC
   1508                  length and including actual latm Buffer fullnes.
   1509                    - 1: Transmission of tara Buffer fullness, ASC length and
   1510                  actual latm Buffer fullness.
   1511                    - 2: Transmission of tara Buffer fullness, ASC length and
   1512                  maximum level of latm Buffer fullness. */
   1513 
   1514   AACENC_PROTECTION = 0x0306, /*!< Configure protection in transport layer:
   1515                                    - 0: No protection. (default)
   1516                                    - 1: CRC active for ADTS transport format. */
   1517 
   1518   AACENC_ANCILLARY_BITRATE =
   1519       0x0500, /*!< Constant ancillary data bitrate in bits/second.
   1520                    - 0: Either no ancillary data or insert exact number of
   1521                  bytes, denoted via input parameter, numAncBytes in
   1522                  AACENC_InArgs.
   1523                    - else: Insert ancillary data with specified bitrate. */
   1524 
   1525   AACENC_METADATA_MODE = 0x0600, /*!< Configure Meta Data. See ::AACENC_MetaData
   1526                                     for further details:
   1527                                       - 0: Do not embed any metadata.
   1528                                       - 1: Embed dynamic_range_info metadata.
   1529                                       - 2: Embed dynamic_range_info and
   1530                                     ancillary_data metadata.
   1531                                       - 3: Embed ancillary_data metadata. */
   1532 
   1533   AACENC_CONTROL_STATE =
   1534       0xFF00, /*!< There is an automatic process which internally reconfigures
   1535                  the encoder instance when a configuration parameter changed or
   1536                  an error occured. This paramerter allows overwriting or getting
   1537                  the control status of this process. See ::AACENC_CTRLFLAGS. */
   1538 
   1539   AACENC_NONE = 0xFFFF /*!< ------ */
   1540 
   1541 } AACENC_PARAM;
   1542 
   1543 #ifdef __cplusplus
   1544 extern "C" {
   1545 #endif
   1546 
   1547 /**
   1548  * \brief  Open an instance of the encoder.
   1549  *
   1550  * Allocate memory for an encoder instance with a functional range denoted by
   1551  * the function parameters. Preinitialize encoder instance with default
   1552  * configuration.
   1553  *
   1554  * \param phAacEncoder  A pointer to an encoder handle. Initialized on return.
   1555  * \param encModules    Specify encoder modules to be supported in this encoder
   1556  * instance:
   1557  *                      - 0x0: Allocate memory for all available encoder
   1558  * modules.
   1559  *                      - else: Select memory allocation regarding encoder
   1560  * modules. Following flags are possible and can be combined.
   1561  *                              - 0x01: AAC module.
   1562  *                              - 0x02: SBR module.
   1563  *                              - 0x04: PS module.
   1564  *                              - 0x08: MPS module.
   1565  *                              - 0x10: Metadata module.
   1566  *                              - example: (0x01|0x02|0x04|0x08|0x10) allocates
   1567  * all modules and is equivalent to default configuration denotet by 0x0.
   1568  * \param maxChannels   Number of channels to be allocated. This parameter can
   1569  * be used in different ways:
   1570  *                      - 0: Allocate maximum number of AAC and SBR channels as
   1571  * supported by the library.
   1572  *                      - nChannels: Use same maximum number of channels for
   1573  * allocating memory in AAC and SBR module.
   1574  *                      - nChannels | (nSbrCh<<8): Number of SBR channels can be
   1575  * different to AAC channels to save data memory.
   1576  *
   1577  * \return
   1578  *          - AACENC_OK, on succes.
   1579  *          - AACENC_INVALID_HANDLE, AACENC_MEMORY_ERROR, AACENC_INVALID_CONFIG,
   1580  * on failure.
   1581  */
   1582 AACENC_ERROR aacEncOpen(HANDLE_AACENCODER *phAacEncoder, const UINT encModules,
   1583                         const UINT maxChannels);
   1584 
   1585 /**
   1586  * \brief  Close the encoder instance.
   1587  *
   1588  * Deallocate encoder instance and free whole memory.
   1589  *
   1590  * \param phAacEncoder  Pointer to the encoder handle to be deallocated.
   1591  *
   1592  * \return
   1593  *          - AACENC_OK, on success.
   1594  *          - AACENC_INVALID_HANDLE, on failure.
   1595  */
   1596 AACENC_ERROR aacEncClose(HANDLE_AACENCODER *phAacEncoder);
   1597 
   1598 /**
   1599  * \brief Encode audio data.
   1600  *
   1601  * This function is mainly for encoding audio data. In addition the function can
   1602  * be used for an encoder (re)configuration process.
   1603  * - PCM input data will be retrieved from external input buffer until the fill
   1604  * level allows encoding a single frame. This functionality allows an external
   1605  * buffer with reduced size in comparison to the AAC or HE-AAC audio frame
   1606  * length.
   1607  * - If the value of the input samples argument is zero, just internal
   1608  * reinitialization will be applied if it is requested.
   1609  * - At the end of a file the flushing process can be triggerd via setting the
   1610  * value of the input samples argument to -1. The encoder delay lines are fully
   1611  * flushed when the encoder returns no valid bitstream data
   1612  * AACENC_OutArgs::numOutBytes. Furthermore the end of file is signaled by the
   1613  * return value AACENC_ENCODE_EOF.
   1614  * - If an error occured in the previous frame or any of the encoder parameters
   1615  * changed, an internal reinitialization process will be applied before encoding
   1616  * the incoming audio samples.
   1617  * - The function can also be used for an independent reconfiguration process
   1618  * without encoding. The first parameter has to be a valid encoder handle and
   1619  * all other parameters can be set to NULL.
   1620  * - If the size of the external bitbuffer in outBufDesc is not sufficient for
   1621  * writing the whole bitstream, an internal error will be the return value and a
   1622  * reconfiguration will be triggered.
   1623  *
   1624  * \param hAacEncoder           A valid AAC encoder handle.
   1625  * \param inBufDesc             Input buffer descriptor, see AACENC_BufDesc:
   1626  *                              - At least one input buffer with audio data is
   1627  * expected.
   1628  *                              - Optionally a second input buffer with
   1629  * ancillary data can be fed.
   1630  * \param outBufDesc            Output buffer descriptor, see AACENC_BufDesc:
   1631  *                              - Provide one output buffer for the encoded
   1632  * bitstream.
   1633  * \param inargs                Input arguments, see AACENC_InArgs.
   1634  * \param outargs               Output arguments, AACENC_OutArgs.
   1635  *
   1636  * \return
   1637  *          - AACENC_OK, on success.
   1638  *          - AACENC_INVALID_HANDLE, AACENC_ENCODE_ERROR, on failure in encoding
   1639  * process.
   1640  *          - AACENC_INVALID_CONFIG, AACENC_INIT_ERROR, AACENC_INIT_AAC_ERROR,
   1641  * AACENC_INIT_SBR_ERROR, AACENC_INIT_TP_ERROR, AACENC_INIT_META_ERROR,
   1642  * AACENC_INIT_MPS_ERROR, on failure in encoder initialization.
   1643  *          - AACENC_UNSUPPORTED_PARAMETER, on incorrect input or output buffer
   1644  * descriptor initialization.
   1645  *          - AACENC_ENCODE_EOF, when flushing fully concluded.
   1646  */
   1647 AACENC_ERROR aacEncEncode(const HANDLE_AACENCODER hAacEncoder,
   1648                           const AACENC_BufDesc *inBufDesc,
   1649                           const AACENC_BufDesc *outBufDesc,
   1650                           const AACENC_InArgs *inargs, AACENC_OutArgs *outargs);
   1651 
   1652 /**
   1653  * \brief  Acquire info about present encoder instance.
   1654  *
   1655  * This function retrieves information of the encoder configuration. In addition
   1656  * to informative internal states, a configuration data block of the current
   1657  * encoder settings will be returned. The format is either Audio Specific Config
   1658  * in case of Raw Packets transport format or StreamMuxConfig in case of
   1659  * LOAS/LATM transport format. The configuration data block is binary coded as
   1660  * specified in ISO/IEC 14496-3 (MPEG-4 audio), to be used directly for MPEG-4
   1661  * File Format or RFC3016 or RFC3640 applications.
   1662  *
   1663  * \param hAacEncoder           A valid AAC encoder handle.
   1664  * \param pInfo                 Pointer to AACENC_InfoStruct. Filled on return.
   1665  *
   1666  * \return
   1667  *          - AACENC_OK, on succes.
   1668  *          - AACENC_INIT_ERROR, on failure.
   1669  */
   1670 AACENC_ERROR aacEncInfo(const HANDLE_AACENCODER hAacEncoder,
   1671                         AACENC_InfoStruct *pInfo);
   1672 
   1673 /**
   1674  * \brief  Set one single AAC encoder parameter.
   1675  *
   1676  * This function allows configuration of all encoder parameters specified in
   1677  * ::AACENC_PARAM. Each parameter must be set with a separate function call. An
   1678  * internal validation of the configuration value range will be done and an
   1679  * internal reconfiguration will be signaled. The actual configuration adoption
   1680  * is part of the subsequent aacEncEncode() call.
   1681  *
   1682  * \param hAacEncoder           A valid AAC encoder handle.
   1683  * \param param                 Parameter to be set. See ::AACENC_PARAM.
   1684  * \param value                 Parameter value. See parameter description in
   1685  * ::AACENC_PARAM.
   1686  *
   1687  * \return
   1688  *          - AACENC_OK, on success.
   1689  *          - AACENC_INVALID_HANDLE, AACENC_UNSUPPORTED_PARAMETER,
   1690  * AACENC_INVALID_CONFIG, on failure.
   1691  */
   1692 AACENC_ERROR aacEncoder_SetParam(const HANDLE_AACENCODER hAacEncoder,
   1693                                  const AACENC_PARAM param, const UINT value);
   1694 
   1695 /**
   1696  * \brief  Get one single AAC encoder parameter.
   1697  *
   1698  * This function is the complement to aacEncoder_SetParam(). After encoder
   1699  * reinitialization with user defined settings, the internal status can be
   1700  * obtained of each parameter, specified with ::AACENC_PARAM.
   1701  *
   1702  * \param hAacEncoder           A valid AAC encoder handle.
   1703  * \param param                 Parameter to be returned. See ::AACENC_PARAM.
   1704  *
   1705  * \return  Internal configuration value of specifed parameter ::AACENC_PARAM.
   1706  */
   1707 UINT aacEncoder_GetParam(const HANDLE_AACENCODER hAacEncoder,
   1708                          const AACENC_PARAM param);
   1709 
   1710 /**
   1711  * \brief  Get information about encoder library build.
   1712  *
   1713  * Fill a given LIB_INFO structure with library version information.
   1714  *
   1715  * \param info  Pointer to an allocated LIB_INFO struct.
   1716  *
   1717  * \return
   1718  *          - AACENC_OK, on success.
   1719  *          - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure.
   1720  */
   1721 AACENC_ERROR aacEncGetLibInfo(LIB_INFO *info);
   1722 
   1723 #ifdef __cplusplus
   1724 }
   1725 #endif
   1726 
   1727 #endif /* AACENC_LIB_H */
   1728