1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- 2 %!TEX root = Vorbis_I_spec.tex 3 % $Id$ 4 \section{Residue setup and decode} \label{vorbis:spec:residue} 5 6 \subsection{Overview} 7 8 A residue vector represents the fine detail of the audio spectrum of 9 one channel in an audio frame after the encoder subtracts the floor 10 curve and performs any channel coupling. A residue vector may 11 represent spectral lines, spectral magnitude, spectral phase or 12 hybrids as mixed by channel coupling. The exact semantic content of 13 the vector does not matter to the residue abstraction. 14 15 Whatever the exact qualities, the Vorbis residue abstraction codes the 16 residue vectors into the bitstream packet, and then reconstructs the 17 vectors during decode. Vorbis makes use of three different encoding 18 variants (numbered 0, 1 and 2) of the same basic vector encoding 19 abstraction. 20 21 22 23 \subsection{Residue format} 24 25 Residue format partitions each vector in the vector bundle into chunks, 26 classifies each chunk, encodes the chunk classifications and finally 27 encodes the chunks themselves using the the specific VQ arrangement 28 defined for each selected classification. 29 The exact interleaving and partitioning vary by residue encoding number, 30 however the high-level process used to classify and encode the residue 31 vector is the same in all three variants. 32 33 A set of coded residue vectors are all of the same length. High level 34 coding structure, ignoring for the moment exactly how a partition is 35 encoded and simply trusting that it is, is as follows: 36 37 \begin{itemize} 38 \item Each vector is partitioned into multiple equal sized chunks 39 according to configuration specified. If we have a vector size of 40 \emph{n}, a partition size \emph{residue_partition_size}, and a total 41 of \emph{ch} residue vectors, the total number of partitioned chunks 42 coded is \emph{n}/\emph{residue_partition_size}*\emph{ch}. It is 43 important to note that the integer division truncates. In the below 44 example, we assume an example \emph{residue_partition_size} of 8. 45 46 \item Each partition in each vector has a classification number that 47 specifies which of multiple configured VQ codebook setups are used to 48 decode that partition. The classification numbers of each partition 49 can be thought of as forming a vector in their own right, as in the 50 illustration below. Just as the residue vectors are coded in grouped 51 partitions to increase encoding efficiency, the classification vector 52 is also partitioned into chunks. The integer elements of each scalar 53 in a classification chunk are built into a single scalar that 54 represents the classification numbers in that chunk. In the below 55 example, the classification codeword encodes two classification 56 numbers. 57 58 \item The values in a residue vector may be encoded monolithically in a 59 single pass through the residue vector, but more often efficient 60 codebook design dictates that each vector is encoded as the additive 61 sum of several passes through the residue vector using more than one 62 VQ codebook. Thus, each residue value potentially accumulates values 63 from multiple decode passes. The classification value associated with 64 a partition is the same in each pass, thus the classification codeword 65 is coded only in the first pass. 66 67 \end{itemize} 68 69 70 \begin{center} 71 \includegraphics[width=\textwidth]{residue-pack} 72 \captionof{figure}{illustration of residue vector format} 73 \end{center} 74 75 76 77 \subsection{residue 0} 78 79 Residue 0 and 1 differ only in the way the values within a residue 80 partition are interleaved during partition encoding (visually treated 81 as a black box--or cyan box or brown box--in the above figure). 82 83 Residue encoding 0 interleaves VQ encoding according to the 84 dimension of the codebook used to encode a partition in a specific 85 pass. The dimension of the codebook need not be the same in multiple 86 passes, however the partition size must be an even multiple of the 87 codebook dimension. 88 89 As an example, assume a partition vector of size eight, to be encoded 90 by residue 0 using codebook sizes of 8, 4, 2 and 1: 91 92 \begin{programlisting} 93 94 original residue vector: [ 0 1 2 3 4 5 6 7 ] 95 96 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] 97 98 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ] 99 100 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ] 101 102 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] 103 104 \end{programlisting} 105 106 It is worth mentioning at this point that no configurable value in the 107 residue coding setup is restricted to a power of two. 108 109 110 111 \subsection{residue 1} 112 113 Residue 1 does not interleave VQ encoding. It represents partition 114 vector scalars in order. As with residue 0, however, partition length 115 must be an integer multiple of the codebook dimension, although 116 dimension may vary from pass to pass. 117 118 As an example, assume a partition vector of size eight, to be encoded 119 by residue 0 using codebook sizes of 8, 4, 2 and 1: 120 121 \begin{programlisting} 122 123 original residue vector: [ 0 1 2 3 4 5 6 7 ] 124 125 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] 126 127 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ] 128 129 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ] 130 131 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] 132 133 \end{programlisting} 134 135 136 137 \subsection{residue 2} 138 139 Residue type two can be thought of as a variant of residue type 1. 140 Rather than encoding multiple passed-in vectors as in residue type 1, 141 the \emph{ch} passed in vectors of length \emph{n} are first 142 interleaved and flattened into a single vector of length 143 \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is 144 as in type 1 with decode interleave reversed. If operating on a single 145 vector to begin with, residue type 1 and type 2 are equivalent. 146 147 \begin{center} 148 \includegraphics[width=\textwidth]{residue2} 149 \captionof{figure}{illustration of residue type 2} 150 \end{center} 151 152 153 \subsection{Residue decode} 154 155 \subsubsection{header decode} 156 157 Header decode for all three residue types is identical. 158 \begin{programlisting} 159 1) [residue_begin] = read 24 bits as unsigned integer 160 2) [residue_end] = read 24 bits as unsigned integer 161 3) [residue_partition_size] = read 24 bits as unsigned integer and add one 162 4) [residue_classifications] = read 6 bits as unsigned integer and add one 163 5) [residue_classbook] = read 8 bits as unsigned integer 164 \end{programlisting} 165 166 \varname{[residue_begin]} and 167 \varname{[residue_end]} select the specific sub-portion of 168 each vector that is actually coded; it implements akin to a bandpass 169 where, for coding purposes, the vector effectively begins at element 170 \varname{[residue_begin]} and ends at 171 \varname{[residue_end]}. Preceding and following values in 172 the unpacked vectors are zeroed. Note that for residue type 2, these 173 values as well as \varname{[residue_partition_size]}apply to 174 the interleaved vector, not the individual vectors before interleave. 175 \varname{[residue_partition_size]} is as explained above, 176 \varname{[residue_classifications]} is the number of possible 177 classification to which a partition can belong and 178 \varname{[residue_classbook]} is the codebook number used to 179 code classification codewords. The number of dimensions in book 180 \varname{[residue_classbook]} determines how many 181 classification values are grouped into a single classification 182 codeword. Note that the number of entries and dimensions in book 183 \varname{[residue_classbook]}, along with 184 \varname{[residue_classifications]}, overdetermines to 185 possible number of classification codewords. 186 If \varname{[residue_classifications]}\^{}\varname{[residue_classbook]}.dimensions 187 exceeds \varname{[residue_classbook]}.entries, the 188 bitstream should be regarded to be undecodable. 189 190 Next we read a bitmap pattern that specifies which partition classes 191 code values in which passes. 192 193 \begin{programlisting} 194 1) iterate [i] over the range 0 ... [residue_classifications]-1 { 195 196 2) [high_bits] = 0 197 3) [low_bits] = read 3 bits as unsigned integer 198 4) [bitflag] = read one bit as boolean 199 5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer 200 6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits] 201 } 202 7) done 203 \end{programlisting} 204 205 Finally, we read in a list of book numbers, each corresponding to 206 specific bit set in the cascade bitmap. We loop over the possible 207 codebook classifications and the maximum possible number of encoding 208 stages (8 in Vorbis I, as constrained by the elements of the cascade 209 bitmap being eight bits): 210 211 \begin{programlisting} 212 1) iterate [i] over the range 0 ... [residue_classifications]-1 { 213 214 2) iterate [j] over the range 0 ... 7 { 215 216 3) if ( vector [residue_cascade] element [i] bit [j] is set ) { 217 218 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer 219 220 } else { 221 222 5) array [residue_books] element [i][j] = unused 223 224 } 225 } 226 } 227 228 6) done 229 \end{programlisting} 230 231 An end-of-packet condition at any point in header decode renders the 232 stream undecodable. In addition, any codebook number greater than the 233 maximum numbered codebook set up in this stream also renders the 234 stream undecodable. All codebooks in array [residue_books] are 235 required to have a value mapping. The presence of codebook in array 236 [residue_books] without a value mapping (maptype equals zero) renders 237 the stream undecodable. 238 239 240 241 \subsubsection{packet decode} 242 243 Format 0 and 1 packet decode is identical except for specific 244 partition interleave. Format 2 packet decode can be built out of the 245 format 1 decode process. Thus we describe first the decode 246 infrastructure identical to all three formats. 247 248 In addition to configuration information, the residue decode process 249 is passed the number of vectors in the submap bundle and a vector of 250 flags indicating if any of the vectors are not to be decoded. If the 251 passed in number of vectors is 3 and vector number 1 is marked 'do not 252 decode', decode skips vector 1 during the decode loop. However, even 253 'do not decode' vectors are allocated and zeroed. 254 255 Depending on the values of \varname{[residue_begin]} and 256 \varname{[residue_end]}, it is obvious that the encoded 257 portion of a residue vector may be the entire possible residue vector 258 or some other strict subset of the actual residue vector size with 259 zero padding at either uncoded end. However, it is also possible to 260 set \varname{[residue_begin]} and 261 \varname{[residue_end]} to specify a range partially or 262 wholly beyond the maximum vector size. Before beginning residue 263 decode, limit \varname{[residue_begin]} and 264 \varname{[residue_end]} to the maximum possible vector size 265 as follows. We assume that the number of vectors being encoded, 266 \varname{[ch]} is provided by the higher level decoding 267 process. 268 269 \begin{programlisting} 270 1) [actual_size] = current blocksize/2; 271 2) if residue encoding is format 2 272 3) [actual_size] = [actual_size] * [ch]; 273 4) [limit_residue_begin] = maximum of ([residue_begin],[actual_size]); 274 5) [limit_residue_end] = maximum of ([residue_end],[actual_size]); 275 \end{programlisting} 276 277 The following convenience values are conceptually useful to clarifying 278 the decode process: 279 280 \begin{programlisting} 281 1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook] 282 2) [n_to_read] = [limit_residue_end] - [limit_residue_begin] 283 3) [partitions_to_read] = [n_to_read] / [residue_partition_size] 284 \end{programlisting} 285 286 Packet decode proceeds as follows, matching the description offered earlier in the document. 287 \begin{programlisting} 288 1) allocate and zero all vectors that will be returned. 289 2) if ([n_to_read] is zero), stop; there is no residue to decode. 290 3) iterate [pass] over the range 0 ... 7 { 291 292 4) [partition_count] = 0 293 294 5) while [partition_count] is less than [partitions_to_read] 295 296 6) if ([pass] is zero) { 297 298 7) iterate [j] over the range 0 .. [ch]-1 { 299 300 8) if vector [j] is not marked 'do not decode' { 301 302 9) [temp] = read from packet using codebook [residue_classbook] in scalar context 303 10) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 { 304 305 11) array [classifications] element [j],([i]+[partition_count]) = 306 [temp] integer modulo [residue_classifications] 307 12) [temp] = [temp] / [residue_classifications] using integer division 308 309 } 310 311 } 312 313 } 314 315 } 316 317 13) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count] 318 is also less than [partitions_to_read] { 319 320 14) iterate [j] over the range 0 .. [ch]-1 { 321 322 15) if vector [j] is not marked 'do not decode' { 323 324 16) [vqclass] = array [classifications] element [j],[partition_count] 325 17) [vqbook] = array [residue_books] element [vqclass],[pass] 326 18) if ([vqbook] is not 'unused') { 327 328 19) decode partition into output vector number [j], starting at scalar 329 offset [limit_residue_begin]+[partition_count]*[residue_partition_size] using 330 codebook number [vqbook] in VQ context 331 } 332 } 333 334 20) increment [partition_count] by one 335 336 } 337 } 338 } 339 340 21) done 341 342 \end{programlisting} 343 344 An end-of-packet condition during packet decode is to be considered a 345 nominal occurrence. Decode returns the result of vector decode up to 346 that point. 347 348 349 350 \subsubsection{format 0 specifics} 351 352 Format zero decodes partitions exactly as described earlier in the 353 'Residue Format: residue 0' section. The following pseudocode 354 presents the same algorithm. Assume: 355 356 \begin{itemize} 357 \item \varname{[n]} is the value in \varname{[residue_partition_size]} 358 \item \varname{[v]} is the residue vector 359 \item \varname{[offset]} is the beginning read offset in [v] 360 \end{itemize} 361 362 363 \begin{programlisting} 364 1) [step] = [n] / [codebook_dimensions] 365 2) iterate [i] over the range 0 ... [step]-1 { 366 367 3) vector [entry_temp] = read vector from packet using current codebook in VQ context 368 4) iterate [j] over the range 0 ... [codebook_dimensions]-1 { 369 370 5) vector [v] element ([offset]+[i]+[j]*[step]) = 371 vector [v] element ([offset]+[i]+[j]*[step]) + 372 vector [entry_temp] element [j] 373 374 } 375 376 } 377 378 6) done 379 380 \end{programlisting} 381 382 383 384 \subsubsection{format 1 specifics} 385 386 Format 1 decodes partitions exactly as described earlier in the 387 'Residue Format: residue 1' section. The following pseudocode 388 presents the same algorithm. Assume: 389 390 \begin{itemize} 391 \item \varname{[n]} is the value in 392 \varname{[residue_partition_size]} 393 \item \varname{[v]} is the residue vector 394 \item \varname{[offset]} is the beginning read offset in [v] 395 \end{itemize} 396 397 398 \begin{programlisting} 399 1) [i] = 0 400 2) vector [entry_temp] = read vector from packet using current codebook in VQ context 401 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 { 402 403 4) vector [v] element ([offset]+[i]) = 404 vector [v] element ([offset]+[i]) + 405 vector [entry_temp] element [j] 406 5) increment [i] 407 408 } 409 410 6) if ( [i] is less than [n] ) continue at step 2 411 7) done 412 \end{programlisting} 413 414 415 416 \subsubsection{format 2 specifics} 417 418 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode. 419 420 421 Format 2 handles 'do not decode' vectors differently than residue 0 or 422 1; if all vectors are marked 'do not decode', no decode occurrs. 423 However, if at least one vector is to be decoded, all the vectors are 424 decoded. We then request normal format 1 to decode a single vector 425 representing all output channels, rather than a vector for each 426 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is: 427 428 \begin{enumerate} 429 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step. 430 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}. 431 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to: 432 \begin{programlisting} 433 1) iterate [i] over the range 0 ... [n]-1 { 434 435 2) iterate [j] over the range 0 ... [ch]-1 { 436 437 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j]) 438 439 } 440 } 441 442 4) done 443 \end{programlisting} 444 445 \end{enumerate} 446 447 448 449 450 451 452 453