Home | History | Annotate | Download | only in doc
      1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
      2 %!TEX root = Vorbis_I_spec.tex
      3 % $Id$
      4 \section{Probability Model and Codebooks} \label{vorbis:spec:codebook}
      5 
      6 \subsection{Overview}
      7 
      8 Unlike practically every other mainstream audio codec, Vorbis has no
      9 statically configured probability model, instead packing all entropy
     10 decoding configuration, VQ and Huffman, into the bitstream itself in
     11 the third header, the codec setup header.  This packed configuration
     12 consists of multiple 'codebooks', each containing a specific
     13 Huffman-equivalent representation for decoding compressed codewords as
     14 well as an optional lookup table of output vector values to which a
     15 decoded Huffman value is applied as an offset, generating the final
     16 decoded output corresponding to a given compressed codeword.
     17 
     18 \subsubsection{Bitwise operation}
     19 The codebook mechanism is built on top of the vorbis bitpacker. Both
     20 the codebooks themselves and the codewords they decode are unrolled
     21 from a packet as a series of arbitrary-width values read from the
     22 stream according to \xref{vorbis:spec:bitpacking}.
     23 
     24 
     25 
     26 
     27 \subsection{Packed codebook format}
     28 
     29 For purposes of the examples below, we assume that the storage
     30 system's native byte width is eight bits.  This is not universally
     31 true; see \xref{vorbis:spec:bitpacking} for discussion
     32 relating to non-eight-bit bytes.
     33 
     34 \subsubsection{codebook decode}
     35 
     36 A codebook begins with a 24 bit sync pattern, 0x564342:
     37 
     38 \begin{Verbatim}[commandchars=\\\{\}]
     39 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
     40 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
     41 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
     42 \end{Verbatim}
     43 
     44 16 bit \varname{[codebook_dimensions]} and 24 bit \varname{[codebook_entries]} fields:
     45 
     46 \begin{Verbatim}[commandchars=\\\{\}]
     47 
     48 byte 3: [ X X X X X X X X ]
     49 byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
     50 
     51 byte 5: [ X X X X X X X X ]
     52 byte 6: [ X X X X X X X X ]
     53 byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
     54 
     55 \end{Verbatim}
     56 
     57 Next is the \varname{[ordered]} bit flag:
     58 
     59 \begin{Verbatim}[commandchars=\\\{\}]
     60 
     61 byte 8: [               X ] [ordered] (1 bit)
     62 
     63 \end{Verbatim}
     64 
     65 Each entry, numbering a
     66 total of \varname{[codebook_entries]}, is assigned a codeword length.
     67 We now read the list of codeword lengths and store these lengths in
     68 the array \varname{[codebook_codeword_lengths]}. Decode of lengths is
     69 according to whether the \varname{[ordered]} flag is set or unset.
     70 
     71 \begin{itemize}
     72 \item
     73   If the \varname{[ordered]} flag is unset, the codeword list is not
     74   length ordered and the decoder needs to read each codeword length
     75   one-by-one.
     76 
     77   The decoder first reads one additional bit flag, the
     78   \varname{[sparse]} flag.  This flag determines whether or not the
     79   codebook contains unused entries that are not to be included in the
     80   codeword decode tree:
     81 
     82 \begin{Verbatim}[commandchars=\\\{\}]
     83 byte 8: [             X 1 ] [sparse] flag (1 bit)
     84 \end{Verbatim}
     85 
     86   The decoder now performs for each of the \varname{[codebook_entries]}
     87   codebook entries:
     88 
     89 \begin{Verbatim}[commandchars=\\\{\}]
     90 
     91   1) if([sparse] is set) \{
     92 
     93          2) [flag] = read one bit;
     94          3) if([flag] is set) \{
     95 
     96               4) [length] = read a five bit unsigned integer;
     97               5) codeword length for this entry is [length]+1;
     98 
     99             \} else \{
    100 
    101               6) this entry is unused.  mark it as such.
    102 
    103             \}
    104 
    105      \} else the sparse flag is not set \{
    106 
    107         7) [length] = read a five bit unsigned integer;
    108         8) the codeword length for this entry is [length]+1;
    109 
    110      \}
    111 
    112 \end{Verbatim}
    113 
    114 \item
    115   If the \varname{[ordered]} flag is set, the codeword list for this
    116   codebook is encoded in ascending length order.  Rather than reading
    117   a length for every codeword, the encoder reads the number of
    118   codewords per length.  That is, beginning at entry zero:
    119 
    120 \begin{Verbatim}[commandchars=\\\{\}]
    121   1) [current_entry] = 0;
    122   2) [current_length] = read a five bit unsigned integer and add 1;
    123   3) [number] = read \link{vorbis:spec:ilog}{ilog}([codebook_entries] - [current_entry]) bits as an unsigned integer
    124   4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive,
    125     of the [codebook_codeword_lengths] array to [current_length]
    126   5) set [current_entry] to [number] + [current_entry]
    127   6) increment [current_length] by 1
    128   7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION;
    129     the decoder will not be able to read this stream.
    130   8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
    131   9) done.
    132 \end{Verbatim}
    133 
    134 \end{itemize}
    135 
    136 After all codeword lengths have been decoded, the decoder reads the
    137 vector lookup table.  Vorbis I supports three lookup types:
    138 \begin{enumerate}
    139 \item
    140 No lookup
    141 \item
    142 Implicitly populated value mapping (lattice VQ)
    143 \item
    144 Explicitly populated value mapping (tessellated or 'foam'
    145 VQ)
    146 \end{enumerate}
    147 
    148 
    149 The lookup table type is read as a four bit unsigned integer:
    150 \begin{Verbatim}[commandchars=\\\{\}]
    151   1) [codebook_lookup_type] = read four bits as an unsigned integer
    152 \end{Verbatim}
    153 
    154 Codebook decode precedes according to \varname{[codebook_lookup_type]}:
    155 \begin{itemize}
    156 \item
    157 Lookup type zero indicates no lookup to be read.  Proceed past
    158 lookup decode.
    159 \item
    160 Lookup types one and two are similar, differing only in the
    161 number of lookup values to be read.  Lookup type one reads a list of
    162 values that are permuted in a set pattern to build a list of vectors,
    163 each vector of order \varname{[codebook_dimensions]} scalars.  Lookup
    164 type two builds the same vector list, but reads each scalar for each
    165 vector explicitly, rather than building vectors from a smaller list of
    166 possible scalar values.  Lookup decode proceeds as follows:
    167 
    168 \begin{Verbatim}[commandchars=\\\{\}]
    169   1) [codebook_minimum_value] = \link{vorbis:spec:float32:unpack}{float32_unpack}( read 32 bits as an unsigned integer)
    170   2) [codebook_delta_value] = \link{vorbis:spec:float32:unpack}{float32_unpack}( read 32 bits as an unsigned integer)
    171   3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
    172   4) [codebook_sequence_p] = read 1 bit as a boolean flag
    173 
    174   if ( [codebook_lookup_type] is 1 ) \{
    175 
    176      5) [codebook_lookup_values] = \link{vorbis:spec:lookup1:values}{lookup1_values}(\varname{[codebook_entries]}, \varname{[codebook_dimensions]} )
    177 
    178   \} else \{
    179 
    180      6) [codebook_lookup_values] = \varname{[codebook_entries]} * \varname{[codebook_dimensions]}
    181 
    182   \}
    183 
    184   7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each;
    185      store these in order in the array [codebook_multiplicands]
    186 \end{Verbatim}
    187 \item
    188 A \varname{[codebook_lookup_type]} of greater than two is reserved
    189 and indicates a stream that is not decodable by the specification in this
    190 document.
    191 
    192 \end{itemize}
    193 
    194 
    195 An 'end of packet' during any read operation in the above steps is
    196 considered an error condition rendering the stream undecodable.
    197 
    198 \paragraph{Huffman decision tree representation}
    199 
    200 The \varname{[codebook_codeword_lengths]} array and
    201 \varname{[codebook_entries]} value uniquely define the Huffman decision
    202 tree used for entropy decoding.
    203 
    204 Briefly, each used codebook entry (recall that length-unordered
    205 codebooks support unused codeword entries) is assigned, in order, the
    206 lowest valued unused binary Huffman codeword possible.  Assume the
    207 following codeword length list:
    208 
    209 \begin{Verbatim}[commandchars=\\\{\}]
    210 entry 0: length 2
    211 entry 1: length 4
    212 entry 2: length 4
    213 entry 3: length 4
    214 entry 4: length 4
    215 entry 5: length 2
    216 entry 6: length 3
    217 entry 7: length 3
    218 \end{Verbatim}
    219 
    220 Assigning codewords in order (lowest possible value of the appropriate
    221 length to highest) results in the following codeword list:
    222 
    223 \begin{Verbatim}[commandchars=\\\{\}]
    224 entry 0: length 2 codeword 00
    225 entry 1: length 4 codeword 0100
    226 entry 2: length 4 codeword 0101
    227 entry 3: length 4 codeword 0110
    228 entry 4: length 4 codeword 0111
    229 entry 5: length 2 codeword 10
    230 entry 6: length 3 codeword 110
    231 entry 7: length 3 codeword 111
    232 \end{Verbatim}
    233 
    234 
    235 \begin{note}
    236 Unlike most binary numerical values in this document, we
    237 intend the above codewords to be read and used bit by bit from left to
    238 right, thus the codeword '001' is the bit string 'zero, zero, one'.
    239 When determining 'lowest possible value' in the assignment definition
    240 above, the leftmost bit is the MSb.
    241 \end{note}
    242 
    243 It is clear that the codeword length list represents a Huffman
    244 decision tree with the entry numbers equivalent to the leaves numbered
    245 left-to-right:
    246 
    247 \begin{center}
    248 \includegraphics[width=10cm]{hufftree}
    249 \captionof{figure}{huffman tree illustration}
    250 \end{center}
    251 
    252 
    253 As we assign codewords in order, we see that each choice constructs a
    254 new leaf in the leftmost possible position.
    255 
    256 Note that it's possible to underspecify or overspecify a Huffman tree
    257 via the length list.  In the above example, if codeword seven were
    258 eliminated, it's clear that the tree is unfinished:
    259 
    260 \begin{center}
    261 \includegraphics[width=10cm]{hufftree-under}
    262 \captionof{figure}{underspecified huffman tree illustration}
    263 \end{center}
    264 
    265 
    266 Similarly, in the original codebook, it's clear that the tree is fully
    267 populated and a ninth codeword is impossible.  Both underspecified and
    268 overspecified trees are an error condition rendering the stream
    269 undecodable. Take special care that a codebook with a single used
    270 entry is handled properly; it consists of a single codework of zero
    271 bits and 'reading' a value out of such a codebook always returns the
    272 single used value and sinks zero bits.  
    273 
    274 Codebook entries marked 'unused' are simply skipped in the assigning
    275 process.  They have no codeword and do not appear in the decision
    276 tree, thus it's impossible for any bit pattern read from the stream to
    277 decode to that entry number.
    278 
    279 
    280 
    281 \paragraph{VQ lookup table vector representation}
    282 
    283 Unpacking the VQ lookup table vectors relies on the following values:
    284 \begin{programlisting}
    285 the [codebook_multiplicands] array
    286 [codebook_minimum_value]
    287 [codebook_delta_value]
    288 [codebook_sequence_p]
    289 [codebook_lookup_type]
    290 [codebook_entries]
    291 [codebook_dimensions]
    292 [codebook_lookup_values]
    293 \end{programlisting}
    294 
    295 \bigskip
    296 
    297 Decoding (unpacking) a specific vector in the vector lookup table
    298 proceeds according to \varname{[codebook_lookup_type]}.  The unpacked
    299 vector values are what a codebook would return during audio packet
    300 decode in a VQ context.
    301 
    302 \paragraph{Vector value decode: Lookup type 1}
    303 
    304 Lookup type one specifies a lattice VQ lookup table built
    305 algorithmically from a list of scalar values.  Calculate (unpack) the
    306 final values of a codebook entry vector from the entries in
    307 \varname{[codebook_multiplicands]} as follows (\varname{[value_vector]}
    308 is the output vector representing the vector of values for entry number
    309 \varname{[lookup_offset]} in this codebook):
    310 
    311 \begin{Verbatim}[commandchars=\\\{\}]
    312   1) [last] = 0;
    313   2) [index_divisor] = 1;
    314   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) \{
    315 
    316        4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer
    317           division ) integer modulo [codebook_lookup_values]
    318 
    319        5) vector [value_vector] element [i] =
    320             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
    321             [codebook_delta_value] + [codebook_minimum_value] + [last];
    322 
    323        6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
    324 
    325        7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
    326 
    327      \}
    328 
    329   8) vector calculation completed.
    330 \end{Verbatim}
    331 
    332 
    333 
    334 \paragraph{Vector value decode: Lookup type 2}
    335 
    336 Lookup type two specifies a VQ lookup table in which each scalar in
    337 each vector is explicitly set by the \varname{[codebook_multiplicands]}
    338 array in a one-to-one mapping.  Calculate [unpack] the
    339 final values of a codebook entry vector from the entries in
    340 \varname{[codebook_multiplicands]} as follows (\varname{[value_vector]}
    341 is the output vector representing the vector of values for entry number
    342 \varname{[lookup_offset]} in this codebook):
    343 
    344 \begin{Verbatim}[commandchars=\\\{\}]
    345   1) [last] = 0;
    346   2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
    347   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) \{
    348 
    349        4) vector [value_vector] element [i] =
    350             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
    351             [codebook_delta_value] + [codebook_minimum_value] + [last];
    352 
    353        5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
    354 
    355        6) increment [multiplicand_offset]
    356 
    357      \}
    358 
    359   7) vector calculation completed.
    360 \end{Verbatim}
    361 
    362 
    363 
    364 
    365 
    366 
    367 
    368 
    369 
    370 \subsection{Use of the codebook abstraction}
    371 
    372 The decoder uses the codebook abstraction much as it does the
    373 bit-unpacking convention; a specific codebook reads a
    374 codeword from the bitstream, decoding it into an entry number, and then
    375 returns that entry number to the decoder (when used in a scalar
    376 entropy coding context), or uses that entry number as an offset into
    377 the VQ lookup table, returning a vector of values (when used in a context
    378 desiring a VQ value). Scalar or VQ context is always explicit; any call
    379 to the codebook mechanism requests either a scalar entry number or a
    380 lookup vector.
    381 
    382 Note that VQ lookup type zero indicates that there is no lookup table;
    383 requesting decode using a codebook of lookup type 0 in any context
    384 expecting a vector return value (even in a case where a vector of
    385 dimension one) is forbidden.  If decoder setup or decode requests such
    386 an action, that is an error condition rendering the packet
    387 undecodable.
    388 
    389 Using a codebook to read from the packet bitstream consists first of
    390 reading and decoding the next codeword in the bitstream. The decoder
    391 reads bits until the accumulated bits match a codeword in the
    392 codebook.  This process can be though of as logically walking the
    393 Huffman decode tree by reading one bit at a time from the bitstream,
    394 and using the bit as a decision boolean to take the 0 branch (left in
    395 the above examples) or the 1 branch (right in the above examples).
    396 Walking the tree finishes when the decode process hits a leaf in the
    397 decision tree; the result is the entry number corresponding to that
    398 leaf.  Reading past the end of a packet propagates the 'end-of-stream'
    399 condition to the decoder.
    400 
    401 When used in a scalar context, the resulting codeword entry is the
    402 desired return value.
    403 
    404 When used in a VQ context, the codeword entry number is used as an
    405 offset into the VQ lookup table.  The value returned to the decoder is
    406 the vector of scalars corresponding to this offset.
    407