Home | History | Annotate | Download | only in doc
      1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
      2 %!TEX root = Vorbis_I_spec.tex
      3 % $Id$
      4 \section{Bitpacking Convention} \label{vorbis:spec:bitpacking}
      5 
      6 \subsection{Overview}
      7 
      8 The Vorbis codec uses relatively unstructured raw packets containing
      9 arbitrary-width binary integer fields.  Logically, these packets are a
     10 bitstream in which bits are coded one-by-one by the encoder and then
     11 read one-by-one in the same monotonically increasing order by the
     12 decoder.  Most current binary storage arrangements group bits into a
     13 native word size of eight bits (octets), sixteen bits, thirty-two bits
     14 or, less commonly other fixed word sizes.  The Vorbis bitpacking
     15 convention specifies the correct mapping of the logical packet
     16 bitstream into an actual representation in fixed-width words.
     17 
     18 
     19 \subsubsection{octets, bytes and words}
     20 
     21 In most contemporary architectures, a 'byte' is synonymous with an
     22 'octet', that is, eight bits.  This has not always been the case;
     23 seven, ten, eleven and sixteen bit 'bytes' have been used.  For
     24 purposes of the bitpacking convention, a byte implies the native,
     25 smallest integer storage representation offered by a platform.  On
     26 modern platforms, this is generally assumed to be eight bits (not
     27 necessarily because of the processor but because of the
     28 filesystem/memory architecture.  Modern filesystems invariably offer
     29 bytes as the fundamental atom of storage).  A 'word' is an integer
     30 size that is a grouped multiple of this smallest size.
     31 
     32 The most ubiquitous architectures today consider a 'byte' to be an
     33 octet (eight bits) and a word to be a group of two, four or eight
     34 bytes (16, 32 or 64 bits).  Note however that the Vorbis bitpacking
     35 convention is still well defined for any native byte size; Vorbis uses
     36 the native bit-width of a given storage system. This document assumes
     37 that a byte is one octet for purposes of example.
     38 
     39 \subsubsection{bit order}
     40 
     41 A byte has a well-defined 'least significant' bit (LSb), which is the
     42 only bit set when the byte is storing the two's complement integer
     43 value +1.  A byte's 'most significant' bit (MSb) is at the opposite
     44 end of the byte. Bits in a byte are numbered from zero at the LSb to
     45 $n$ ($n=7$ in an octet) for the
     46 MSb.
     47 
     48 
     49 
     50 \subsubsection{byte order}
     51 
     52 Words are native groupings of multiple bytes.  Several byte orderings
     53 are possible in a word; the common ones are 3-2-1-0 ('big endian' or
     54 'most significant byte first' in which the highest-valued byte comes
     55 first), 0-1-2-3 ('little endian' or 'least significant byte first' in
     56 which the lowest value byte comes first) and less commonly 3-1-2-0 and
     57 0-2-1-3 ('mixed endian').
     58 
     59 The Vorbis bitpacking convention specifies storage and bitstream
     60 manipulation at the byte, not word, level, thus host word ordering is
     61 of a concern only during optimization when writing high performance
     62 code that operates on a word of storage at a time rather than by byte.
     63 Logically, bytes are always coded and decoded in order from byte zero
     64 through byte $n$.
     65 
     66 
     67 
     68 \subsubsection{coding bits into byte sequences}
     69 
     70 The Vorbis codec has need to code arbitrary bit-width integers, from
     71 zero to 32 bits wide, into packets.  These integer fields are not
     72 aligned to the boundaries of the byte representation; the next field
     73 is written at the bit position at which the previous field ends.
     74 
     75 The encoder logically packs integers by writing the LSb of a binary
     76 integer to the logical bitstream first, followed by next least
     77 significant bit, etc, until the requested number of bits have been
     78 coded.  When packing the bits into bytes, the encoder begins by
     79 placing the LSb of the integer to be written into the least
     80 significant unused bit position of the destination byte, followed by
     81 the next-least significant bit of the source integer and so on up to
     82 the requested number of bits.  When all bits of the destination byte
     83 have been filled, encoding continues by zeroing all bits of the next
     84 byte and writing the next bit into the bit position 0 of that byte.
     85 Decoding follows the same process as encoding, but by reading bits
     86 from the byte stream and reassembling them into integers.
     87 
     88 
     89 
     90 \subsubsection{signedness}
     91 
     92 The signedness of a specific number resulting from decode is to be
     93 interpreted by the decoder given decode context.  That is, the three
     94 bit binary pattern 'b111' can be taken to represent either 'seven' as
     95 an unsigned integer, or '-1' as a signed, two's complement integer.
     96 The encoder and decoder are responsible for knowing if fields are to
     97 be treated as signed or unsigned.
     98 
     99 
    100 
    101 \subsubsection{coding example}
    102 
    103 Code the 4 bit integer value '12' [b1100] into an empty bytestream.
    104 Bytestream result:
    105 
    106 \begin{Verbatim}[commandchars=\\\{\}]
    107               |
    108               V
    109 
    110         7 6 5 4 3 2 1 0
    111 byte 0 [0 0 0 0 1 1 0 0]  <-
    112 byte 1 [               ]
    113 byte 2 [               ]
    114 byte 3 [               ]
    115              ...
    116 byte n [               ]  bytestream length == 1 byte
    117 
    118 \end{Verbatim}
    119 
    120 
    121 Continue by coding the 3 bit integer value '-1' [b111]:
    122 
    123 \begin{Verbatim}[commandchars=\\\{\}]
    124         |
    125         V
    126 
    127         7 6 5 4 3 2 1 0
    128 byte 0 [0 1 1 1 1 1 0 0]  <-
    129 byte 1 [               ]
    130 byte 2 [               ]
    131 byte 3 [               ]
    132              ...
    133 byte n [               ]  bytestream length == 1 byte
    134 \end{Verbatim}
    135 
    136 
    137 Continue by coding the 7 bit integer value '17' [b0010001]:
    138 
    139 \begin{Verbatim}[commandchars=\\\{\}]
    140           |
    141           V
    142 
    143         7 6 5 4 3 2 1 0
    144 byte 0 [1 1 1 1 1 1 0 0]
    145 byte 1 [0 0 0 0 1 0 0 0]  <-
    146 byte 2 [               ]
    147 byte 3 [               ]
    148              ...
    149 byte n [               ]  bytestream length == 2 bytes
    150                           bit cursor == 6
    151 \end{Verbatim}
    152 
    153 
    154 Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:
    155 
    156 \begin{Verbatim}[commandchars=\\\{\}]
    157                 |
    158                 V
    159 
    160         7 6 5 4 3 2 1 0
    161 byte 0 [1 1 1 1 1 1 0 0]
    162 byte 1 [0 1 0 0 1 0 0 0]
    163 byte 2 [1 1 0 0 1 1 1 0]
    164 byte 3 [0 0 0 0 0 1 1 0]  <-
    165              ...
    166 byte n [               ]  bytestream length == 4 bytes
    167 
    168 \end{Verbatim}
    169 
    170 
    171 
    172 
    173 \subsubsection{decoding example}
    174 
    175 Reading from the beginning of the bytestream encoded in the above example:
    176 
    177 \begin{Verbatim}[commandchars=\\\{\}]
    178                       |
    179                       V
    180 
    181         7 6 5 4 3 2 1 0
    182 byte 0 [1 1 1 1 1 1 0 0]  <-
    183 byte 1 [0 1 0 0 1 0 0 0]
    184 byte 2 [1 1 0 0 1 1 1 0]
    185 byte 3 [0 0 0 0 0 1 1 0]  bytestream length == 4 bytes
    186 
    187 \end{Verbatim}
    188 
    189 
    190 We read two, two-bit integer fields, resulting in the returned numbers
    191 'b00' and 'b11'.  Two things are worth noting here:
    192 
    193 \begin{itemize}
    194 \item Although these four bits were originally written as a single
    195 four-bit integer, reading some other combination of bit-widths from the
    196 bitstream is well defined.  There are no artificial alignment
    197 boundaries maintained in the bitstream.
    198 
    199 \item The second value is the
    200 two-bit-wide integer 'b11'.  This value may be interpreted either as
    201 the unsigned value '3', or the signed value '-1'.  Signedness is
    202 dependent on decode context.
    203 \end{itemize}
    204 
    205 
    206 
    207 
    208 \subsubsection{end-of-packet alignment}
    209 
    210 The typical use of bitpacking is to produce many independent
    211 byte-aligned packets which are embedded into a larger byte-aligned
    212 container structure, such as an Ogg transport bitstream.  Externally,
    213 each bytestream (encoded bitstream) must begin and end on a byte
    214 boundary.  Often, the encoded bitstream is not an integer number of
    215 bytes, and so there is unused (uncoded) space in the last byte of a
    216 packet.
    217 
    218 Unused space in the last byte of a bytestream is always zeroed during
    219 the coding process.  Thus, should this unused space be read, it will
    220 return binary zeroes.
    221 
    222 Attempting to read past the end of an encoded packet results in an
    223 'end-of-packet' condition.  End-of-packet is not to be considered an
    224 error; it is merely a state indicating that there is insufficient
    225 remaining data to fulfill the desired read size.  Vorbis uses truncated
    226 packets as a normal mode of operation, and as such, decoders must
    227 handle reading past the end of a packet as a typical mode of
    228 operation. Any further read operations after an 'end-of-packet'
    229 condition shall also return 'end-of-packet'.
    230 
    231 
    232 
    233 \subsubsection{reading zero bits}
    234 
    235 Reading a zero-bit-wide integer returns the value '0' and does not
    236 increment the stream cursor.  Reading to the end of the packet (but
    237 not past, such that an 'end-of-packet' condition has not triggered)
    238 and then reading a zero bit integer shall succeed, returning 0, and
    239 not trigger an end-of-packet condition.  Reading a zero-bit-wide
    240 integer after a previous read sets 'end-of-packet' shall also fail
    241 with 'end-of-packet'.
    242 
    243 
    244 
    245 
    246 
    247 
    248