<html>

<head>
<title>libvorbisenc - API Overview</title>
<link rel=stylesheet href="style.css" type="text/css">
</head>

<body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
<table border=0 width=100%>
<tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

<h1>Libvorbisenc API Overview</h1>

<p>Libvorbisenc is an encoding convenience library intended to
encapsulate the elaborate setup that libvorbis requires for encoding.
Libvorbisenc gives easy access to all high-level adjustments an
application may require when encoding and also exposes some low-level
tuning parameters to allow applications to make detailed adjustments
to the encoding process. <p>

All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".<p>

<em>Note: libvorbis and libvorbisenc always
encode in a single pass. Thus, all possible encoding setups will work
properly with live input and produce streams that decode properly when
streamed.  See the subsection titled <a href="#BBR">"managed bitrate
modes"</a> for details on setting limits on bitrate usage when Vorbis
streams are used in a limited-bandwidth environment.</em>

<h2>workflow</h2>

<p>Libvorbisenc is used only during encoder setup; its function
is to automate initialization of a multitude of settings in a
<tt>vorbis_info</tt> structure which libvorbis then uses as a reference
during the encoding process.  Libvorbisenc plays no part in the
encoding process after setup.

<p>Encode setup using libvorbisenc consists of three steps:

<ol>
<li>high-level initialization of a <tt>vorbis_info</tt> structure by
calling either <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
with the basic input audio parameters (rate and channels) and the
basic desired encoded audio output parameters (VBR quality or ABR/CBR
bitrate)<p>

<li>optional adjustment of the basic setup defaults using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>

<li>calling <a
href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
finalize the high-level setup into the detailed low-level reference
values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
structure is then ready to use for encoding by libvorbis.<p>

</ol>

These three steps can be collapsed into a single call by using <a
href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr()</a> to set up a
quality-based VBR stream or <a
href="vorbis_encode_init.html">vorbis_encode_init()</a> to set up a managed
bitrate (ABR or CBR) stream.<p>

<h2>adjustable encoding parameters</h2>

<h3>input audio parameters</h3>

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
	<td><b>parameter</b></td>
	<td><b>description</b></td>
</tr>
<tr valign=top>
<td>sampling rate</td>
<td>
The sampling rate (in samples per second) of the input audio.  Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT.  Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.

</td>
</tr>
<tr valign=top>
<td>channels</td>
<td>

The number of channels encoded in each input sample.  By default,
stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
that the stereo relationship between the channels is taken into account
when encoding.  Stereo coupling may be disabled by using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.

</td>
</tr>
</table>
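<p>To make the sample-counting convention concrete, the sketch below
counts samples in an interleaved PCM buffer.  The function names and
the assumption of interleaved, fixed-width PCM values are illustrative
only; they are not part of libvorbisenc.</p>

```c
#include <stddef.h>

/* Number of samples in an interleaved PCM buffer.  One sample holds one
   value per channel, so a stereo sample occupies twice the bytes of a
   mono sample but still counts once. */
size_t pcm_sample_count(size_t buffer_bytes, unsigned channels,
                        unsigned bytes_per_value)
{
    size_t bytes_per_sample = (size_t)channels * bytes_per_value;

    if (bytes_per_sample == 0)
        return 0;               /* avoid dividing by zero on bad input */
    return buffer_bytes / bytes_per_sample;
}

/* Playback duration of that many samples at the given sampling rate. */
double pcm_duration_seconds(size_t samples, long rate)
{
    return rate > 0 ? (double)samples / (double)rate : 0.0;
}
```

<p>For example, 176400 bytes of 16-bit stereo audio and 88200 bytes of
16-bit mono audio both contain 44100 samples, which is one second at a
sampling rate of 44100.</p>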

<h3>quality and VBR modes</h3>

Vorbis is natively a VBR codec; a user requests a given constant
<em>quality</em> and the encoder keeps the encoding quality constant
while allowing the bitrate to vary.  'Quality' modes (Variable BitRate)
will always produce the most consistent encoding results as well as
the highest quality for the number of bits used.

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
	<td><b>parameter</b></td>
	<td><b>description</b></td>
</tr>
<tr valign=top>
<td>quality</td>
<td>
A decimal float value requesting a desired quality.  Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency.  Quality settings 0.0 and above are intended to produce consistent results at all times.

</td>
</tr>
</table>
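<p>Applications often present quality on a coarser scale than the
float that libvorbisenc expects.  The sketch below assumes a user-facing
scale of -1 through 10 (the convention used by the <tt>oggenc</tt>
command-line tool) and maps it onto the -0.1 through +1.0 range
described above.  The function name is hypothetical.</p>

```c
/* Map a user-facing quality of -1..10 onto the -0.1..1.0 range accepted
   by libvorbisenc 1.1, clamping out-of-range requests to the nearest
   valid setting. */
float quality_from_user_scale(float q)
{
    if (q < -1.0f)
        q = -1.0f;
    if (q > 10.0f)
        q = 10.0f;
    return q / 10.0f;
}
```

<p>The result is suitable as the quality argument of <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a>.</p>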

<a name="BBR"></a>
<h3>managed bitrate modes</h3>

Although the Vorbis codec is natively VBR, libvorbis includes
infrastructure for 'managing' the bitrate of streams by setting
minimum and maximum usage constraints, as well as functionality for
nudging a stream toward a desired average value.  These features
should <em>only</em> be used when there is a requirement to limit
bitrate in some way.  Although the difference is usually slight,
managed bitrate modes will always produce output inferior to VBR
(given equal bitrate usage). Setting overly or impossibly tight
bitrate management requirements can affect output quality dramatically
for the worse.<p>

Beginning in libvorbis 1.1, bitrate management is implemented using a
<em>bit-reservoir</em> algorithm. The encoder has a fixed-size
reservoir used as a 'savings account' in encoding.  When a frame is
smaller than the target rate, the unused bits go into the reservoir so
that they may be used by future frames.  When a frame is larger than
the target rate, it draws 'banked' bits out of the reservoir.  Encoding
is managed so that the reservoir never goes negative (when a maximum
bitrate is specified) or fills beyond a fixed limit (when a minimum
bitrate is specified).  An 'average bitrate' request is used as the
set-point in a long-range bitrate tracker which adjusts the encoder's
aggressiveness up or down depending on whether frames are coming
in larger or smaller than the requested average.
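<p>The bookkeeping behind the 'savings account' can be sketched in a
few lines of C.  This is a simplified model written for illustration,
not the libvorbis implementation: it assumes a single per-frame target
that serves as both the minimum and the maximum rate, and the type and
function names are hypothetical.</p>

```c
/* Simplified bit-reservoir model. */
typedef struct {
    long capacity_bits;  /* fixed size of the reservoir */
    long banked_bits;    /* bits currently saved, 0..capacity_bits */
} bit_reservoir;

/* Account for one frame that would like to use wanted_bits when the
   per-frame target is target_bits.  Returns the frame size the
   reservoir permits and updates the banked total. */
long reservoir_encode_frame(bit_reservoir *r, long target_bits, long wanted_bits)
{
    long frame_bits = wanted_bits;
    long fill = r->banked_bits + (target_bits - frame_bits);

    if (fill < 0) {
        /* Not enough banked bits to cover the oversized frame: the
           frame must shrink until the reservoir is exactly empty. */
        frame_bits = target_bits + r->banked_bits;
        fill = 0;
    } else if (fill > r->capacity_bits) {
        /* The reservoir would overflow with unused bits: the frame
           must grow until the reservoir is exactly full. */
        frame_bits = target_bits - (r->capacity_bits - r->banked_bits);
        fill = r->capacity_bits;
    }

    r->banked_bits = fill;
    return frame_bits;
}
```

<p>With a 1000-bit reservoir holding 500 bits and a 400-bit target, a
300-bit frame banks 100 bits; a following 1200-bit request is cut back
to 1000 bits, which empties the reservoir.</p>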

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
	<td><b>parameter</b></td>
	<td><b>description</b></td>
</tr>
<tr valign=top>
<td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
per second.  If the bitrate would otherwise rise such that oversized
frames would underflow the bit-reservoir by consuming banked bits,
bitrate management will force the encoder to use fewer bits per frame
by encoding with a more aggressive psychoacoustic model.<p> This
setting is a hard limit; the bitstream will never be allowed, under
any circumstances, to rise above the specified bitrate when averaged
over the period set by the reservoir.  It may momentarily exceed the
limit if inspected at a granularity much finer than that averaging
period.  Normally, the encoder will conserve bits gracefully by
using more aggressive psychoacoustics to shrink a frame when forced
to.  However, if the encoder runs out of means of gracefully shrinking
a frame, it will simply take the smallest frame it can otherwise
generate and truncate it to the maximum allowed length.  Note that
this is not an error; although it will obviously adversely affect
audio quality, a Vorbis decoder will be able to decode a truncated
frame into audio.

</td>
</tr>

<tr valign=top>
<td>average bitrate</td>

<td>

The average desired bitrate of a stream, set
in bits per second.  Average bitrate is tracked via a reservoir, as
minimum and maximum bitrate are; however, the averaging reservoir does
not impose a hard limit.  It is used to nudge the bitrate toward the
desired average by slowly adjusting the psychoacoustic aggressiveness.
As such, the reservoir size does not affect the average bitrate
behavior.  Because this setting alone is not used to impose hard
bitrate limits, the bitrate of a stream produced using only the
<tt>average bitrate</tt> constraint will track the average over time
but not necessarily adhere strictly to that average for any given
period.  Should a strict localized average be required, <tt>average
bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
<tt>maximum bitrate</tt>.
</td>

</tr>

<tr valign=top>
<td>minimum bitrate</td>
<td>
 The minimum allowed bitrate, set in bits per second.  If
the bitrate would otherwise fall such that undersized frames would
overflow the bit-reservoir with unused bits, bitrate management will
force the encoder to use more bits per frame by encoding with a less
aggressive psychoacoustic model.<p> This setting is a hard limit; the
bitstream will never be allowed, under any circumstances, to drop
below the specified bitrate when averaged over the period set by the
reservoir.  It may momentarily fall below the limit if inspected at a
granularity much finer than that averaging period.  Normally,
the encoder will fill out undersized frames with additional useful
coding information by increasing the perceived quality of the stream.
If the encoder runs out of useful ways to consume more bits, it will
pad frames out with zeroes.
</td>
</tr>

<tr valign=top>
<td>reservoir size</td> <td> The size of the minimum/maximum bitrate
tracking reservoir, set in bits.  The reservoir is used as a 'bit
bank' to average out localized surges and dips in bitrate while
providing predictable, guaranteed buffering behavior for streams to be
used in situations with constrained transport bandwidth.  The default
setting is two seconds of average bitrate.<p>

When a single frame is larger than the maximum allowed overall
bitrate, the extra bits are 'borrowed' from the bitrate reservoir; if
the reservoir contains insufficient bits to cover the deficit, the
encoder must find some way to reduce the frame size. <p>

When a frame is under the minimum limit, the surplus bits are placed
into the reservoir, banking them for future use.  If the reservoir is
already full of banked bits, the encoder is forced to find some way to
make the frame larger.<p>

If the frame size is between the minimum and maximum rates (thus
implying the minimum and maximum allowed rates are different), the
reservoir gravitates toward a fill point configured by the
<tt>reservoir bias</tt> setting described next.  If the reservoir is
fuller than the fill point (a 'surplus of surplus'), the encoder will
consume a number of bits from the reservoir equal to the number of
bits by which the frame exceeds minimum size.  If the reservoir is
emptier than the fill point (a 'surplus of deficit'), bits are returned
to the reservoir equaling the current frame's number of bits under the
maximum frame size.  The idea of the fill point is to buffer against
both underruns and overruns by trying to hold the reservoir to a
middle course.
</td>
</tr>

<tr valign=top>
<td>reservoir bias</td>

<td>

Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
management toward smoothing bitrate spikes (0.0) or bitrate drops
(1.0); the default setting is 0.1.<p>

Using settings toward 0.0 causes the bitrate manager to hoard bits in
the bit reservoir such that there is a large pool of banked surplus to
draw upon during short spikes in bitrate.  As a result, the encoder
will react less aggressively and less drastically to curtail frame size
during brief surges in bitrate.<p>

Using settings toward 1.0 causes the bitrate manager to empty the bit
reservoir such that there is a large buffer available to store surplus
bits during sudden drops in bitrate.  As a result, the encoder will
react less aggressively and less drastically to support minimum frame
sizes during drops in bitrate and will tend not to store any extra
bits in the reservoir for future bitrate spikes.<p>

</td>
</tr>

<tr valign=top>
<td>average track damping</td>
<td>

A decimal value, in seconds, that controls how quickly the average
bitrate tracker is allowed to slew from enforcing minimum frame sizes
to maximum frame sizes and vice versa.  The default value is 1.5
seconds.<p>

When the 'average bitrate' setting is in use, the average bitrate
tracker uses an unbounded reservoir to track overall bitrate-to-date
in the stream.  When the bitrate is too low, the tracker will try to
nudge it up, and when the bitrate is too high, nudge it down.
The damping value regulates the maximum strength of the nudge; it
describes, in seconds, how quickly the tracker may transition from an
extreme nudge in one direction to an extreme nudge in the other.<p>

</td>
</tr>

</table>
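<p>The arithmetic behind the reservoir defaults can be sketched as
follows.  The first function restates the documented default of two
seconds of average bitrate.  The second is an assumption made purely for
illustration: it treats the fill point as a linear function of the bias,
in the 'banked bits' sense used above, so that 0.0 keeps the reservoir
full and 1.0 keeps it empty.  Both function names are hypothetical.</p>

```c
/* Default reservoir size: two seconds' worth of the average bitrate,
   expressed in bits. */
long default_reservoir_bits(long average_bps)
{
    return 2 * average_bps;
}

/* ASSUMPTION: linear mapping from bias to the number of banked bits the
   reservoir gravitates toward.  Bias 0.0 hoards bits (full reservoir);
   bias 1.0 leaves room for surplus (empty reservoir). */
long banked_fill_target(long reservoir_bits, double bias)
{
    if (bias < 0.0)
        bias = 0.0;
    if (bias > 1.0)
        bias = 1.0;
    /* Round to the nearest whole bit. */
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}
```

<p>Under these assumptions, a 128000 bits-per-second average gives a
256000-bit reservoir, and the default bias of 0.1 holds it at about
230400 banked bits.</p>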

<h3>encoding model adjustments</h3>

The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
a generalized interface for making encoding setup adjustments to the
basic high-level setup provided by <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
In reality, these two calls use <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
most of the parameters set by other calls.<p>

In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
adjust the following additional parameters not described elsewhere:

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
	<td><b>parameter</b></td>
	<td><b>description</b></td>
</tr>
<tr valign=top>
<td>management mode</td> <td> Configures whether bitrate
management is in use.  Normally, this value is set implicitly
during encoding setup; however, the supported means of selecting a
quality mode by bitrate (that is, requesting a true VBR stream, but
doing so by asking for an approximate bitrate) is to use <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
and then to explicitly turn off bitrate management by calling <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
</td>
</tr>

<tr valign=top>
<td>coupling</td> <td> Stereo streams (and in the future, surround
streams) are normally encoded assuming the channels form a stereo
image and that lossy-stereo modelling is appropriate; this is called
'coupling'.  Stereo coupling may be explicitly enabled or disabled.
</td>
</tr>
<tr valign=top>
<td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
this may be used to conserve a few bits in high-rate audio that has
limited bandwidth, or in testing of the encoder's acoustic model.  The
encoder is generally already configured with ideal lowpasses (if any
at all) for given modes; use of this parameter is strongly discouraged
if the point is to try to 'improve' a given encoding mode for general
encoding.
</td>
</tr>

<tr valign=top>
<td>impulse coding aggressiveness</td> <td>By default, libvorbis
attempts to compromise between preventing wide bitrate swings and
high-resolution impulse coding (which is required for the crispest
possible attacks, but also requires a relatively large momentary
bitrate increase).  This parameter allows an application to tune the
compromise or eliminate it; a value of 0.0 indicates normal behavior
while a value of -15.0 requests maximum possible impulse
resolution.</td>
</tr>

</table>


<br><br>
<hr noshade>
<table border=0 width=100%>
<tr valign=top>
<td><p class=tiny>copyright &copy; 2004 Vorbis team</p></td>
<td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td>
</tr><tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

</body>

</html>