Home | History | Annotate | Download | only in DOC
      1 LZMA specification (DRAFT version)
      2 ----------------------------------
      3 
      4 Author: Igor Pavlov
      5 Date: 2013-07-28
      6 
      7 This specification defines the format of LZMA compressed data and lzma file format.
      8 
      9 Notation 
     10 --------
     11 
     12 We use the syntax of C++ programming language.
     13 We use the following types in C++ code:
     14   unsigned - unsigned integer, at least 16 bits in size
     15   int      - signed integer, at least 16 bits in size
     16   UInt64   - 64-bit unsigned integer
     17   UInt32   - 32-bit unsigned integer
     18   UInt16   - 16-bit unsigned integer
     19   Byte     - 8-bit unsigned integer
     20   bool     - boolean type with two possible values: false, true
     21 
     22 
     23 lzma file format
     24 ================
     25 
     26 The lzma file contains the raw LZMA stream and the header with related properties.
     27 
     28 The files in that format use ".lzma" extension.
     29 
     30 The lzma file format layout:
     31 
     32 Offset Size Description
     33 
     34   0     1   LZMA model properties (lc, lp, pb) in encoded form
     35   1     4   Dictionary size (32-bit unsigned integer, little-endian)
     36   5     8   Uncompressed size (64-bit unsigned integer, little-endian)
     37  13         Compressed data (LZMA stream)
     38 
     39 LZMA properties:
     40 
     41     name  Range          Description
     42 
     43       lc  [0, 8]         the number of "literal context" bits
     44       lp  [0, 4]         the number of "literal pos" bits
     45       pb  [0, 4]         the number of "pos" bits
     46 dictSize  [0, 2^32 - 1]  the dictionary size 
     47 
     48 The following code encodes LZMA properties:
     49 
     50 void EncodeProperties(Byte *properties)
     51 {
     52   properties[0] = (Byte)((pb * 5 + lp) * 9 + lc);
     53   Set_UInt32_LittleEndian(properties + 1, dictSize);
     54 }
     55 
     56 If the value of dictionary size in properties is smaller than (1 << 12),
     57 the LZMA decoder must set the dictionary size variable to (1 << 12).
     58 
     59 #define LZMA_DIC_MIN (1 << 12)
     60 
     61   unsigned lc, pb, lp;
     62   UInt32 dictSize;
     63   UInt32 dictSizeInProperties;
     64 
     65   void DecodeProperties(const Byte *properties)
     66   {
     67     unsigned d = properties[0];
     68     if (d >= (9 * 5 * 5))
     69       throw "Incorrect LZMA properties";
     70     lc = d % 9;
     71     d /= 9;
     72     pb = d / 5;
     73     lp = d % 5;
     74     dictSizeInProperties = 0;
     75     for (int i = 0; i < 4; i++)
     76       dictSizeInProperties |= (UInt32)properties[i + 1] << (8 * i);
     77     dictSize = dictSizeInProperties;
     78     if (dictSize < LZMA_DIC_MIN)
     79       dictSize = LZMA_DIC_MIN;
     80   }
     81 
     82 If "Uncompressed size" field contains ones in all 64 bits, it means that
     83 uncompressed size is unknown and there is the "end marker" in stream,
     84 that indicates the end of decoding point.
     85 In opposite case, if the value from "Uncompressed size" field is not
     86 equal to ((2^64) - 1), the LZMA stream decoding must be finished after
     87 specified number of bytes (Uncompressed size) is decoded. And if there 
     88 is the "end marker", the LZMA decoder must read that marker also.
     89 
     90 
     91 The new scheme to encode LZMA properties
     92 ----------------------------------------
     93 
     94 If LZMA compression is used for some another format, it's recommended to
     95 use a new improved scheme to encode LZMA properties. That new scheme was
     96 used in xz format that uses the LZMA2 compression algorithm.
     97 The LZMA2 is a new compression algorithm that is based on the LZMA algorithm.
     98 
     99 The dictionary size in LZMA2 is encoded with just one byte and LZMA2 supports
    100 only reduced set of dictionary sizes:
    101   (2 << 11), (3 << 11),
    102   (2 << 12), (3 << 12),
    103   ...
    104   (2 << 30), (3 << 30),
    105   (2 << 31) - 1
    106 
    107 The dictionary size can be extracted from encoded value with the following code:
    108 
    109   dictSize = (p == 40) ? 0xFFFFFFFF : (((UInt32)2 | ((p) & 1)) << ((p) / 2 + 11));
    110 
    111 Also there is additional limitation (lc + lp <= 4) in LZMA2 for values of 
    112 "lc" and "lp" properties:
    113 
    114   if (lc + lp > 4)
    115     throw "Unsupported properties: (lc + lp) > 4";
    116 
    117 There are some advantages for LZMA decoder with such (lc + lp) value
    118 limitation. It reduces the maximum size of tables allocated by decoder.
    119 And it reduces the complexity of initialization procedure, that can be 
    120 important to keep high speed of decoding of big number of small LZMA streams.
    121 
    122 It's recommended to use that limitation (lc + lp <= 4) for any new format
    123 that uses LZMA compression. Note that the combinations of "lc" and "lp" 
    124 parameters, where (lc + lp > 4), can provide significant improvement in 
    125 compression ratio only in some rare cases.
    126 
    127 The LZMA properties can be encoded into two bytes in new scheme:
    128 
    129 Offset Size Description
    130 
    131   0     1   The dictionary size encoded with LZMA2 scheme
    132   1     1   LZMA model properties (lc, lp, pb) in encoded form
    133 
    134 
    135 The RAM usage 
    136 =============
    137 
    138 The RAM usage for LZMA decoder is determined by the following parts:
    139 
    140 1) The Sliding Window (from 4 KiB to 4 GiB).
    141 2) The probability model counter arrays (arrays of 16-bit variables).
    142 3) Some additional state variables (about 10 variables of 32-bit integers).
    143 
    144 
    145 The RAM usage for Sliding Window
    146 --------------------------------
    147 
    148 There are two main scenarios of decoding:
    149 
    150 1) The decoding of full stream to one RAM buffer.
    151 
    152   If we decode full LZMA stream to one output buffer in RAM, the decoder 
    153   can use that output buffer as sliding window. So the decoder doesn't 
    154   need additional buffer allocated for sliding window.
    155 
    156 2) The decoding to some external storage.
    157 
    158   If we decode LZMA stream to external storage, the decoder must allocate
    159   the buffer for sliding window. The size of that buffer must be equal 
    160   or larger than the value of dictionary size from properties of LZMA stream.
    161 
    162 In this specification we describe the code for decoding to some external
    163 storage. The optimized version of code for decoding of full stream to one
    164 output RAM buffer can require some minor changes in code.
    165 
    166 
    167 The RAM usage for the probability model counters
    168 ------------------------------------------------
    169 
    170 The size of the probability model counter arrays is calculated with the 
    171 following formula:
    172 
    173 size_of_prob_arrays = 1846 + 768 * (1 << (lp + lc))
    174 
    175 Each probability model counter is 11-bit unsigned integer.
    176 If we use 16-bit integer variables (2-byte integers) for these probability 
    177 model counters, the RAM usage required by probability model counter arrays 
    178 can be estimated with the following formula:
    179 
    180   RAM = 4 KiB + 1.5 KiB * (1 << (lp + lc))
    181 
    182 For example, for default LZMA parameters (lp = 0 and lc = 3), the RAM usage is
    183 
    184   RAM_lc3_lp0 = 4 KiB + 1.5 KiB * 8 = 16 KiB
    185 
    186 The maximum RAM state usage is required for decoding the stream with lp = 4 
    187 and lc = 8:
    188 
    189   RAM_lc8_lp4 = 4 KiB + 1.5 KiB * 4096 = 6148 KiB
    190 
    191 If the decoder uses LZMA2's limited property condition 
    192 (lc + lp <= 4), the RAM usage will be not larger than
    193 
    194   RAM_lc_lp_4 = 4 KiB + 1.5 KiB * 16 = 28 KiB
    195 
    196 
    197 The RAM usage for encoder
    198 -------------------------
    199 
    200 There are many variants for LZMA encoding code.
    201 These variants have different values for memory consumption.
    202 Note that memory consumption for LZMA Encoder can not be 
    203 smaller than memory consumption of LZMA Decoder for same stream.
    204 
    205 The RAM usage required by modern effective implementation of 
    206 LZMA Encoder can be estimated with the following formula:
    207 
    208   Encoder_RAM_Usage = 4 MiB + 11 * dictionarySize.
    209 
    210 But there are some modes of the encoder that require less memory.
    211 
    212 
    213 LZMA Decoding
    214 =============
    215 
    216 The LZMA compression algorithm uses LZ-based compression with Sliding Window
    217 and Range Encoding as entropy coding method.
    218 
    219 
    220 Sliding Window
    221 --------------
    222 
    223 LZMA uses Sliding Window compression similar to LZ77 algorithm.
    224 
    225 LZMA stream must be decoded to the sequence that consists
    226 of MATCHES and LITERALS:
    227   
    228   - a LITERAL is a 8-bit character (one byte).
    229     The decoder just puts that LITERAL to the uncompressed stream.
    230   
    231   - a MATCH is a pair of two numbers (DISTANCE-LENGTH pair).
    232     The decoder takes one byte exactly "DISTANCE" characters behind
    233     current position in the uncompressed stream and puts it to 
    234     uncompressed stream. The decoder must repeat it "LENGTH" times.
    235 
    236 The "DISTANCE" can not be larger than dictionary size.
    237 And the "DISTANCE" can not be larger than the number of bytes in
    238 the uncompressed stream that were decoded before that match.
    239 
    240 In this specification we use cyclic buffer to implement Sliding Window
    241 for LZMA decoder:
    242 
    243 class COutWindow
    244 {
    245   Byte *Buf;
    246   UInt32 Pos;
    247   UInt32 Size;
    248   bool IsFull;
    249 
    250 public:
    251   unsigned TotalPos;
    252   COutStream OutStream;
    253 
    254   COutWindow(): Buf(NULL) {}
    255   ~COutWindow() { delete []Buf; }
    256  
    257   void Create(UInt32 dictSize)
    258   {
    259     Buf = new Byte[dictSize];
    260     Pos = 0;
    261     Size = dictSize;
    262     IsFull = false;
    263     TotalPos = 0;
    264   }
    265 
    266   void PutByte(Byte b)
    267   {
    268     TotalPos++;
    269     Buf[Pos++] = b;
    270     if (Pos == Size)
    271     {
    272       Pos = 0;
    273       IsFull = true;
    274     }
    275     OutStream.WriteByte(b);
    276   }
    277 
    278   Byte GetByte(UInt32 dist) const
    279   {
    280     return Buf[dist <= Pos ? Pos - dist : Size - dist + Pos];
    281   }
    282 
    283   void CopyMatch(UInt32 dist, unsigned len)
    284   {
    285     for (; len > 0; len--)
    286       PutByte(GetByte(dist));
    287   }
    288 
    289   bool CheckDistance(UInt32 dist) const
    290   {
    291     return dist <= Pos || IsFull;
    292   }
    293 
    294   bool IsEmpty() const
    295   {
    296     return Pos == 0 && !IsFull;
    297   }
    298 };
    299 
    300 
    301 In another implementation it's possible to use one buffer that contains 
    302 Sliding Window and the whole data stream after uncompressing.
    303 
    304 
    305 Range Decoder
    306 -------------
    307 
    308 LZMA algorithm uses Range Encoding (1) as entropy coding method.
    309 
    310 LZMA stream contains just one very big number in big-endian encoding.
    311 LZMA decoder uses the Range Decoder to extract a sequence of binary
    312 symbols from that big number.
    313 
    314 The state of the Range Decoder:
    315 
    316 struct CRangeDecoder
    317 {
    318   UInt32 Range; 
    319   UInt32 Code;
    320   InputStream *InStream;
    321 
    322   bool Corrupted;
    323 }
    324 
    325 The notes about UInt32 type for the "Range" and "Code" variables:
    326 
    327   It's possible to use 64-bit (unsigned or signed) integer type
    328   for the "Range" and the "Code" variables instead of 32-bit unsigned,
    329   but some additional code must be used to truncate the values to 
    330   low 32-bits after some operations.
    331 
    332   If the programming language does not support 32-bit unsigned integer type 
    333   (like in case of JAVA language), it's possible to use 32-bit signed integer, 
    334   but some code must be changed. For example, it's required to change the code
    335   that uses comparison operations for UInt32 variables in this specification.
    336 
    337 The Range Decoder can be in some states that can be treated as 
    338 "Corruption" in LZMA stream. The Range Decoder uses the variable "Corrupted":
    339 
    340   (Corrupted == false), if the Range Decoder has not detected any corruption.
    341   (Corrupted == true), if the Range Decoder has detected some corruption.
    342 
    343 The reference LZMA Decoder ignores the value of the "Corrupted" variable.
    344 So it continues to decode the stream, even if the corruption can be detected
    345 in the Range Decoder. To provide the full compatibility with output of the 
    346 reference LZMA Decoder, another LZMA Decoder implementations must also 
    347 ignore the value of the "Corrupted" variable.
    348 
    349 The LZMA Encoder is required to create only such LZMA streams, that will not 
    350 lead the Range Decoder to states, where the "Corrupted" variable is set to true.
    351 
    352 The Range Decoder reads first 5 bytes from input stream to initialize
    353 the state:
    354 
    355 void CRangeDecoder::Init()
    356 {
    357   Corrupted = false;
    358   
    359   if (InStream->ReadByte() != 0)
    360     Corrupted = true;
    361   
    362   Range = 0xFFFFFFFF;
    363   Code = 0;
    364   for (int i = 0; i < 4; i++)
    365     Code = (Code << 8) | InStream->ReadByte();
    366   
    367   if (Code == Range)
    368     Corrupted = true;
    369 }
    370 
    371 The LZMA Encoder always writes ZERO in initial byte of compressed stream.
    372 That scheme allows to simplify the code of the Range Encoder in the 
    373 LZMA Encoder.
    374 
    375 After the last bit of data was decoded by Range Decoder, the value of the
    376 "Code" variable must be equal to 0. The LZMA Decoder must check it by 
    377 calling the IsFinishedOK() function:
    378 
    379   bool IsFinishedOK() const { return Code == 0; }
    380 
    381 If there is corruption in data stream, there is big probability that
    382 the "Code" value will be not equal to 0 in the Finish() function. So that
    383 check in the IsFinishedOK() function provides very good feature for 
    384 corruption detection.
    385 
    386 The value of the "Range" variable before each bit decoding can not be smaller 
    387 than ((UInt32)1 << 24). The Normalize() function keeps the "Range" value in 
    388 described range.
    389 
    390 #define kTopValue ((UInt32)1 << 24)
    391 
    392 void CRangeDecoder::Normalize()
    393 {
    394   if (Range < kTopValue)
    395   {
    396     Range <<= 8;
    397     Code = (Code << 8) | InStream->ReadByte();
    398   }
    399 }
    400 
    401 Notes: if the size of the "Code" variable is larger than 32 bits, it's
    402 required to keep only low 32 bits of the "Code" variable after the change
    403 in Normalize() function.
    404 
    405 If the LZMA Stream is not corrupted, the value of the "Code" variable is
    406 always smaller than value of the "Range" variable.
    407 But the Range Decoder ignores some types of corruptions, so the value of
    408 the "Code" variable can be equal or larger than value of the "Range" variable
    409 for some "Corrupted" archives.
    410 
    411 
    412 LZMA uses Range Encoding only with binary symbols of two types:
    413   1) binary symbols with fixed and equal probabilities (direct bits)
    414   2) binary symbols with predicted probabilities
    415 
    416 The DecodeDirectBits() function decodes the sequence of direct bits:
    417 
    418 UInt32 CRangeDecoder::DecodeDirectBits(unsigned numBits)
    419 {
    420   UInt32 res = 0;
    421   do
    422   {
    423     Range >>= 1;
    424     Code -= Range;
    425     UInt32 t = 0 - ((UInt32)Code >> 31);
    426     Code += Range & t;
    427     
    428     if (Code == Range)
    429       Corrupted = true;
    430     
    431     Normalize();
    432     res <<= 1;
    433     res += t + 1;
    434   }
    435   while (--numBits);
    436   return res;
    437 }
    438 
    439 
    440 The Bit Decoding with Probability Model
    441 ---------------------------------------
    442 
    443 The task of Bit Probability Model is to estimate probabilities of binary
    444 symbols. And then it provides the Range Decoder with that information.
    445 The better prediction provides better compression ratio.
    446 The Bit Probability Model uses statistical data of previous decoded
    447 symbols.
    448 
    449 That estimated probability is presented as 11-bit unsigned integer value
    450 that represents the probability of symbol "0".
    451 
    452 #define kNumBitModelTotalBits 11
    453 
    454 Mathematical probabilities can be presented with the following formulas:
    455      probability(symbol_0) = prob / 2048.
    456      probability(symbol_1) =  1 - Probability(symbol_0) =  
    457                            =  1 - prob / 2048 =  
    458                            =  (2048 - prob) / 2048
    459 where the "prob" variable contains 11-bit integer probability counter.
    460 
    461 It's recommended to use 16-bit unsigned integer type, to store these 11-bit
    462 probability values:
    463 
    464 typedef UInt16 CProb;
    465 
    466 Each probability value must be initialized with value ((1 << 11) / 2),
    467 that represents the state, where probabilities of symbols 0 and 1 
    468 are equal to 0.5:
    469 
    470 #define PROB_INIT_VAL ((1 << kNumBitModelTotalBits) / 2)
    471 
    472 The INIT_PROBS macro is used to initialize the array of CProb variables:
    473 
    474 #define INIT_PROBS(p) \
    475  { for (unsigned i = 0; i < sizeof(p) / sizeof(p[0]); i++) p[i] = PROB_INIT_VAL; }
    476 
    477 
    478 The DecodeBit() function decodes one bit.
    479 The LZMA decoder provides the pointer to CProb variable that contains 
    480 information about estimated probability for symbol 0 and the Range Decoder 
    481 updates that CProb variable after decoding. The Range Decoder increases 
    482 estimated probability of the symbol that was decoded:
    483 
    484 #define kNumMoveBits 5
    485 
    486 unsigned CRangeDecoder::DecodeBit(CProb *prob)
    487 {
    488   unsigned v = *prob;
    489   UInt32 bound = (Range >> kNumBitModelTotalBits) * v;
    490   unsigned symbol;
    491   if (Code < bound)
    492   {
    493     v += ((1 << kNumBitModelTotalBits) - v) >> kNumMoveBits;
    494     Range = bound;
    495     symbol = 0;
    496   }
    497   else
    498   {
    499     v -= v >> kNumMoveBits;
    500     Code -= bound;
    501     Range -= bound;
    502     symbol = 1;
    503   }
    504   *prob = (CProb)v;
    505   Normalize();
    506   return symbol;
    507 }
    508 
    509 
    510 The Binary Tree of bit model counters
    511 -------------------------------------
    512 
    513 LZMA uses a tree of Bit model variables to decode symbol that needs
    514 several bits for storing. There are two versions of such trees in LZMA:
    515   1) the tree that decodes bits from high bit to low bit (the normal scheme).
    516   2) the tree that decodes bits from low bit to high bit (the reverse scheme).
    517 
    518 Each binary tree structure supports different size of decoded symbol
    519 (the size of binary sequence that contains value of symbol).
    520 If that size of decoded symbol is "NumBits" bits, the tree structure 
    521 uses the array of (2 << NumBits) counters of CProb type. 
    522 But only ((2 << NumBits) - 1) items are used by encoder and decoder.
    523 The first item (the item with index equal to 0) in array is unused.
    524 That scheme with unused array's item allows to simplify the code.
    525 
    526 unsigned BitTreeReverseDecode(CProb *probs, unsigned numBits, CRangeDecoder *rc)
    527 {
    528   unsigned m = 1;
    529   unsigned symbol = 0;
    530   for (unsigned i = 0; i < numBits; i++)
    531   {
    532     unsigned bit = rc->DecodeBit(&probs[m]);
    533     m <<= 1;
    534     m += bit;
    535     symbol |= (bit << i);
    536   }
    537   return symbol;
    538 }
    539 
    540 template <unsigned NumBits>
    541 class CBitTreeDecoder
    542 {
    543   CProb Probs[(unsigned)1 << NumBits];
    544 
    545 public:
    546 
    547   void Init()
    548   {
    549     INIT_PROBS(Probs);
    550   }
    551 
    552   unsigned Decode(CRangeDecoder *rc)
    553   {
    554     unsigned m = 1;
    555     for (unsigned i = 0; i < NumBits; i++)
    556       m = (m << 1) + rc->DecodeBit(&Probs[m]);
    557     return m - ((unsigned)1 << NumBits);
    558   }
    559 
    560   unsigned ReverseDecode(CRangeDecoder *rc)
    561   {
    562     return BitTreeReverseDecode(Probs, NumBits, rc);
    563   }
    564 };
    565 
    566 
    567 LZ part of LZMA 
    568 ---------------
    569 
    570 LZ part of LZMA describes details about the decoding of MATCHES and LITERALS.
    571 
    572 
    573 The Literal Decoding
    574 --------------------
    575 
    576 The LZMA Decoder uses (1 << (lc + lp)) tables with CProb values, where 
    577 each table contains 0x300 CProb values:
    578 
    579   CProb *LitProbs;
    580 
    581   void CreateLiterals()
    582   {
    583     LitProbs = new CProb[(UInt32)0x300 << (lc + lp)];
    584   }
    585   
    586   void InitLiterals()
    587   {
    588     UInt32 num = (UInt32)0x300 << (lc + lp);
    589     for (UInt32 i = 0; i < num; i++)
    590       LitProbs[i] = PROB_INIT_VAL;
    591   }
    592 
    593 To select the table for decoding it uses the context that consists of
    594 (lc) high bits from previous literal and (lp) low bits from value that
    595 represents current position in outputStream.
    596 
    597 If (State > 7), the Literal Decoder also uses "matchByte" that represents 
    598 the byte in OutputStream at position the is the DISTANCE bytes before 
    599 current position, where the DISTANCE is the distance in DISTANCE-LENGTH pair
    600 of latest decoded match.
    601 
    602 The following code decodes one literal and puts it to Sliding Window buffer:
    603 
    604   void DecodeLiteral(unsigned state, UInt32 rep0)
    605   {
    606     unsigned prevByte = 0;
    607     if (!OutWindow.IsEmpty())
    608       prevByte = OutWindow.GetByte(1);
    609     
    610     unsigned symbol = 1;
    611     unsigned litState = ((OutWindow.TotalPos & ((1 << lp) - 1)) << lc) + (prevByte >> (8 - lc));
    612     CProb *probs = &LitProbs[(UInt32)0x300 * litState];
    613     
    614     if (state >= 7)
    615     {
    616       unsigned matchByte = OutWindow.GetByte(rep0 + 1);
    617       do
    618       {
    619         unsigned matchBit = (matchByte >> 7) & 1;
    620         matchByte <<= 1;
    621         unsigned bit = RangeDec.DecodeBit(&probs[((1 + matchBit) << 8) + symbol]);
    622         symbol = (symbol << 1) | bit;
    623         if (matchBit != bit)
    624           break;
    625       }
    626       while (symbol < 0x100);
    627     }
    628     while (symbol < 0x100)
    629       symbol = (symbol << 1) | RangeDec.DecodeBit(&probs[symbol]);
    630     OutWindow.PutByte((Byte)(symbol - 0x100));
    631   }
    632 
    633 
    634 The match length decoding
    635 -------------------------
    636 
    637 The match length decoder returns normalized (zero-based value) 
    638 length of match. That value can be converted to real length of the match 
    639 with the following code:
    640 
    641 #define kMatchMinLen 2
    642 
    643     matchLen = len + kMatchMinLen;
    644 
    645 The match length decoder can return the values from 0 to 271.
    646 And the corresponded real match length values can be in the range 
    647 from 2 to 273.
    648 
    649 The following scheme is used for the match length encoding:
    650 
    651   Binary encoding    Binary Tree structure    Zero-based match length 
    652   sequence                                    (binary + decimal):
    653 
    654   0 xxx              LowCoder[posState]       xxx
    655   1 0 yyy            MidCoder[posState]       yyy + 8
    656   1 1 zzzzzzzz       HighCoder                zzzzzzzz + 16
    657 
    658 LZMA uses bit model variable "Choice" to decode the first selection bit.
    659 
    660 If the first selection bit is equal to 0, the decoder uses binary tree 
    661   LowCoder[posState] to decode 3-bit zero-based match length (xxx).
    662 
    663 If the first selection bit is equal to 1, the decoder uses bit model 
    664   variable "Choice2" to decode the second selection bit.
    665 
    666   If the second selection bit is equal to 0, the decoder uses binary tree 
    667     MidCoder[posState] to decode 3-bit "yyy" value, and zero-based match
    668     length is equal to (yyy + 8).
    669 
    670   If the second selection bit is equal to 1, the decoder uses binary tree 
    671     HighCoder to decode 8-bit "zzzzzzzz" value, and zero-based 
    672     match length is equal to (zzzzzzzz + 16).
    673 
    674 LZMA uses "posState" value as context to select the binary tree 
    675 from LowCoder and MidCoder binary tree arrays:
    676 
    677     unsigned posState = OutWindow.TotalPos & ((1 << pb) - 1);
    678 
    679 The full code of the length decoder:
    680 
    681 class CLenDecoder
    682 {
    683   CProb Choice;
    684   CProb Choice2;
    685   CBitTreeDecoder<3> LowCoder[1 << kNumPosBitsMax];
    686   CBitTreeDecoder<3> MidCoder[1 << kNumPosBitsMax];
    687   CBitTreeDecoder<8> HighCoder;
    688 
    689 public:
    690 
    691   void Init()
    692   {
    693     Choice = PROB_INIT_VAL;
    694     Choice2 = PROB_INIT_VAL;
    695     HighCoder.Init();
    696     for (unsigned i = 0; i < (1 << kNumPosBitsMax); i++)
    697     {
    698       LowCoder[i].Init();
    699       MidCoder[i].Init();
    700     }
    701   }
    702 
    703   unsigned Decode(CRangeDecoder *rc, unsigned posState)
    704   {
    705     if (rc->DecodeBit(&Choice) == 0)
    706       return LowCoder[posState].Decode(rc);
    707     if (rc->DecodeBit(&Choice2) == 0)
    708       return 8 + MidCoder[posState].Decode(rc);
    709     return 16 + HighCoder.Decode(rc);
    710   }
    711 };
    712 
    713 The LZMA decoder uses two instances of CLenDecoder class.
    714 The first instance is for the matches of "Simple Match" type,
    715 and the second instance is for the matches of "Rep Match" type:
    716 
    717   CLenDecoder LenDecoder;
    718   CLenDecoder RepLenDecoder;
    719 
    720 
    721 The match distance decoding
    722 ---------------------------
    723 
    724 LZMA supports dictionary sizes up to 4 GiB minus 1.
    725 The value of match distance (decoded by distance decoder) can be 
    726 from 1 to 2^32. But the distance value that is equal to 2^32 is used to
    727 indicate the "End of stream" marker. So real largest match distance 
    728 that is used for LZ-window match is (2^32 - 1).
    729 
    730 LZMA uses normalized match length (zero-based length) 
    731 to calculate the context state "lenState" do decode the distance value:
    732 
    733 #define kNumLenToPosStates 4
    734 
    735     unsigned lenState = len;
    736     if (lenState > kNumLenToPosStates - 1)
    737       lenState = kNumLenToPosStates - 1;
    738 
    739 The distance decoder returns the "dist" value that is zero-based value 
    740 of match distance. The real match distance can be calculated with the
    741 following code:
    742   
    743   matchDistance = dist + 1; 
    744 
    745 The state of the distance decoder and the initialization code: 
    746 
    747   #define kEndPosModelIndex 14
    748   #define kNumFullDistances (1 << (kEndPosModelIndex >> 1))
    749   #define kNumAlignBits 4
    750 
    751   CBitTreeDecoder<6> PosSlotDecoder[kNumLenToPosStates];
    752   CProb PosDecoders[1 + kNumFullDistances - kEndPosModelIndex];
    753   CBitTreeDecoder<kNumAlignBits> AlignDecoder;
    754 
    755   void InitDist()
    756   {
    757     for (unsigned i = 0; i < kNumLenToPosStates; i++)
    758       PosSlotDecoder[i].Init();
    759     AlignDecoder.Init();
    760     INIT_PROBS(PosDecoders);
    761   }
    762 
    763 At first stage the distance decoder decodes 6-bit "posSlot" value with bit
    764 tree decoder from PosSlotDecoder array. It's possible to get 2^6=64 different 
    765 "posSlot" values.
    766 
    767     unsigned posSlot = PosSlotDecoder[lenState].Decode(&RangeDec);
    768 
    769 The encoding scheme for distance value is shown in the following table:
    770 
    771 posSlot (decimal) /
    772       zero-based distance (binary)
    773  0    0
    774  1    1
    775  2    10
    776  3    11
    777 
    778  4    10 x
    779  5    11 x
    780  6    10 xx
    781  7    11 xx
    782  8    10 xxx
    783  9    11 xxx
    784 10    10 xxxx
    785 11    11 xxxx
    786 12    10 xxxxx
    787 13    11 xxxxx
    788 
    789 14    10 yy zzzz
    790 15    11 yy zzzz
    791 16    10 yyy zzzz
    792 17    11 yyy zzzz
    793 ...
    794 62    10 yyyyyyyyyyyyyyyyyyyyyyyyyy zzzz
    795 63    11 yyyyyyyyyyyyyyyyyyyyyyyyyy zzzz
    796 
    797 where 
    798   "x ... x" means the sequence of binary symbols encoded with binary tree and 
    799       "Reverse" scheme. It uses separated binary tree for each posSlot from 4 to 13.
    800   "y" means direct bit encoded with range coder.
    801   "zzzz" means the sequence of four binary symbols encoded with binary
    802       tree with "Reverse" scheme, where one common binary tree "AlignDecoder"
    803       is used for all posSlot values.
    804 
    805 If (posSlot < 4), the "dist" value is equal to posSlot value.
    806 
    807 If (posSlot >= 4), the decoder uses "posSlot" value to calculate the value of
    808   the high bits of "dist" value and the number of the low bits.
    809 
    810   If (4 <= posSlot < kEndPosModelIndex), the decoder uses bit tree decoders.
    811     (one separated bit tree decoder per one posSlot value) and "Reverse" scheme.
    812     In this implementation we use one CProb array "PosDecoders" that contains 
    813     all CProb variables for all these bit decoders.
    814   
    815   if (posSlot >= kEndPosModelIndex), the middle bits are decoded as direct 
    816     bits from RangeDecoder and the low 4 bits are decoded with a bit tree 
    817     decoder "AlignDecoder" with "Reverse" scheme.
    818 
    819 The code to decode zero-based match distance:
    820   
    821   unsigned DecodeDistance(unsigned len)
    822   {
    823     unsigned lenState = len;
    824     if (lenState > kNumLenToPosStates - 1)
    825       lenState = kNumLenToPosStates - 1;
    826     
    827     unsigned posSlot = PosSlotDecoder[lenState].Decode(&RangeDec);
    828     if (posSlot < 4)
    829       return posSlot;
    830     
    831     unsigned numDirectBits = (unsigned)((posSlot >> 1) - 1);
    832     UInt32 dist = ((2 | (posSlot & 1)) << numDirectBits);
    833     if (posSlot < kEndPosModelIndex)
    834       dist += BitTreeReverseDecode(PosDecoders + dist - posSlot, numDirectBits, &RangeDec);
    835     else
    836     {
    837       dist += RangeDec.DecodeDirectBits(numDirectBits - kNumAlignBits) << kNumAlignBits;
    838       dist += AlignDecoder.ReverseDecode(&RangeDec);
    839     }
    840     return dist;
    841   }
    842 
    843 
    844 
    845 LZMA Decoding modes
    846 -------------------
    847 
    848 There are 2 types of LZMA streams:
    849 
    850 1) The stream with "End of stream" marker.
    851 2) The stream without "End of stream" marker.
    852 
    853 And the LZMA Decoder supports 3 modes of decoding:
    854 
    855 1) The unpack size is undefined. The LZMA decoder stops decoding after 
    856    getting "End of stream" marker. 
    857    The input variables for that case:
    858     
    859       markerIsMandatory = true
    860       unpackSizeDefined = false
    861       unpackSize contains any value
    862 
    863 2) The unpack size is defined and LZMA decoder supports both variants, 
    864    where the stream can contain "End of stream" marker or the stream is
    865    finished without "End of stream" marker. The LZMA decoder must detect 
    866    any of these situations.
    867    The input variables for that case:
    868     
    869       markerIsMandatory = false
    870       unpackSizeDefined = true
    871       unpackSize contains unpack size
    872 
    873 3) The unpack size is defined and the LZMA stream must contain 
    874    "End of stream" marker
    875    The input variables for that case:
    876     
    877       markerIsMandatory = true
    878       unpackSizeDefined = true
    879       unpackSize contains unpack size
    880 
    881 
    882 The main loop of decoder
    883 ------------------------
    884 
    885 The main loop of LZMA decoder:
    886 
    887 Initialize the LZMA state.
    888 loop
    889 {
    890   // begin of loop
    891   Check "end of stream" conditions.
    892   Decode Type of MATCH / LITERAL. 
    893     If it's LITERAL, decode LITERAL value and put the LITERAL to Window.
    894     If it's MATCH, decode the length of match and the match distance. 
    895         Check error conditions, check end of stream conditions and copy
    896         the sequence of match bytes from sliding window to current position
    897         in window.
    898   Go to begin of loop
    899 }
    900 
    901 The reference implementation of LZMA decoder uses "unpackSize" variable
    902 to keep the number of remaining bytes in output stream. So it reduces 
    903 "unpackSize" value after each decoded LITERAL or MATCH.
    904 
    905 The following code contains the "end of stream" condition check at the start
    906 of the loop:
    907 
    908     if (unpackSizeDefined && unpackSize == 0 && !markerIsMandatory)
    909       if (RangeDec.IsFinishedOK())
    910         return LZMA_RES_FINISHED_WITHOUT_MARKER;
    911 
    912 LZMA uses three types of matches:
    913 
    914 1) "Simple Match" -     the match with distance value encoded with bit models.
    915 
    916 2) "Rep Match" -        the match that uses the distance from distance
    917                         history table.
    918 
    919 3) "Short Rep Match" -  the match of single byte length, that uses the latest 
    920                         distance from distance history table.
    921 
    922 The LZMA decoder keeps the history of latest 4 match distances that were used 
    923 by decoder. That set of 4 variables contains zero-based match distances and 
    924 these variables are initialized with zero values:
    925 
    926   UInt32 rep0 = 0, rep1 = 0, rep2 = 0, rep3 = 0;
    927 
    928 The LZMA decoder uses binary model variables to select type of MATCH or LITERAL:
    929 
    930 #define kNumStates 12
    931 #define kNumPosBitsMax 4
    932 
    933   CProb IsMatch[kNumStates << kNumPosBitsMax];
    934   CProb IsRep[kNumStates];
    935   CProb IsRepG0[kNumStates];
    936   CProb IsRepG1[kNumStates];
    937   CProb IsRepG2[kNumStates];
    938   CProb IsRep0Long[kNumStates << kNumPosBitsMax];
    939 
    940 The decoder uses "state" variable value to select exact variable 
    941 from "IsRep", "IsRepG0", "IsRepG1" and "IsRepG2" arrays.
    942 The "state" variable can get the value from 0 to 11.
    943 Initial value for "state" variable is zero:
    944 
    945   unsigned state = 0;
    946 
    947 The "state" variable is updated after each LITERAL or MATCH with one of the
    948 following functions:
    949 
    950 unsigned UpdateState_Literal(unsigned state)
    951 {
    952   if (state < 4) return 0;
    953   else if (state < 10) return state - 3;
    954   else return state - 6;
    955 }
    956 unsigned UpdateState_Match   (unsigned state) { return state < 7 ? 7 : 10; }
    957 unsigned UpdateState_Rep     (unsigned state) { return state < 7 ? 8 : 11; }
    958 unsigned UpdateState_ShortRep(unsigned state) { return state < 7 ? 9 : 11; }
    959 
    960 The decoder calculates "state2" variable value to select exact variable from 
    961 "IsMatch" and "IsRep0Long" arrays:
    962 
    963 unsigned posState = OutWindow.TotalPos & ((1 << pb) - 1);
    964 unsigned state2 = (state << kNumPosBitsMax) + posState;
    965 
    966 The decoder uses the following code flow scheme to select exact 
    967 type of LITERAL or MATCH:
    968 
    969 IsMatch[state2] decode
    970   0 - the Literal
    971   1 - the Match
    972     IsRep[state] decode
    973       0 - Simple Match
    974       1 - Rep Match
    975         IsRepG0[state] decode
    976           0 - the distance is rep0
    977             IsRep0Long[state2] decode
    978               0 - Short Rep Match
    979               1 - Rep Match 0
    980           1 - 
    981             IsRepG1[state] decode
    982               0 - Rep Match 1
    983               1 - 
    984                 IsRepG2[state] decode
    985                   0 - Rep Match 2
    986                   1 - Rep Match 3
    987 
    988 
    989 LITERAL symbol
    990 --------------
    991 If the value "0" was decoded with IsMatch[state2] decoding, we have "LITERAL" type.
    992 
    993 At first the LZMA decoder must check that it doesn't exceed 
    994 specified uncompressed size:
    995 
    996       if (unpackSizeDefined && unpackSize == 0)
    997         return LZMA_RES_ERROR;
    998 
    999 Then it decodes literal value and puts it to sliding window:
   1000 
   1001       DecodeLiteral(state, rep0);
   1002 
   1003 Then the decoder must update the "state" value and "unpackSize" value;
   1004 
   1005       state = UpdateState_Literal(state);
   1006       unpackSize--;
   1007 
   1008 Then the decoder must go to the begin of main loop to decode next Match or Literal.
   1009 
   1010 
   1011 Simple Match
   1012 ------------
   1013 
   1014 If the value "1" was decoded with IsMatch[state2] decoding,
   1015 we have the "Simple Match" type.
   1016 
   1017 The distance history table is updated with the following scheme:
   1018     
   1019       rep3 = rep2;
   1020       rep2 = rep1;
   1021       rep1 = rep0;
   1022 
   1023 The zero-based length is decoded with "LenDecoder":
   1024 
   1025       len = LenDecoder.Decode(&RangeDec, posState);
   1026 
   1027 The state is update with UpdateState_Match function:
   1028 
   1029       state = UpdateState_Match(state);
   1030 
   1031 and the new "rep0" value is decoded with DecodeDistance:
   1032 
   1033       rep0 = DecodeDistance(len);
   1034 
   1035 That "rep0" will be used as zero-based distance for current match.
   1036 
   1037 If the value of "rep0" is equal to 0xFFFFFFFF, it means that we have 
   1038 "End of stream" marker, so we can stop decoding and check finishing 
   1039 condition in Range Decoder:
   1040 
   1041       if (rep0 == 0xFFFFFFFF)
   1042         return RangeDec.IsFinishedOK() ?
   1043             LZMA_RES_FINISHED_WITH_MARKER :
   1044             LZMA_RES_ERROR;
   1045 
   1046 If uncompressed size is defined, LZMA decoder must check that it doesn't 
   1047 exceed that specified uncompressed size:
   1048 
   1049       if (unpackSizeDefined && unpackSize == 0)
   1050         return LZMA_RES_ERROR;
   1051 
   1052 Also the decoder must check that "rep0" value is not larger than dictionary size
   1053 and is not larger than the number of already decoded bytes:
   1054 
   1055       if (rep0 >= dictSize || !OutWindow.CheckDistance(rep0))
   1056         return LZMA_RES_ERROR;
   1057 
   1058 Then the decoder must copy match bytes as described in 
   1059 "The match symbols copying" section.
   1060 
   1061 
   1062 Rep Match
   1063 ---------
   1064 
   1065 If the LZMA decoder has decoded the value "1" with IsRep[state] variable,
   1066 we have "Rep Match" type.
   1067 
   1068 At first the LZMA decoder must check that it doesn't exceed 
   1069 specified uncompressed size:
   1070 
   1071       if (unpackSizeDefined && unpackSize == 0)
   1072         return LZMA_RES_ERROR;
   1073 
   1074 Also the decoder must return error, if the LZ window is empty:
   1075 
   1076       if (OutWindow.IsEmpty())
   1077         return LZMA_RES_ERROR;
   1078 
   1079 If the match type is "Rep Match", the decoder uses one of the 4 variables of
   1080 distance history table to get the value of distance for current match.
   1081 And there are 4 corresponding ways of decoding flow. 
   1082 
   1083 The decoder updates the distance history with the following scheme 
   1084 depending from type of match:
   1085 
   1086 - "Rep Match 0" or "Short Rep Match":
   1087       ; LZMA doesn't update the distance history    
   1088 
   1089 - "Rep Match 1":
   1090       UInt32 dist = rep1;
   1091       rep1 = rep0;
   1092       rep0 = dist;
   1093 
   1094 - "Rep Match 2":
   1095       UInt32 dist = rep2;
   1096       rep2 = rep1;
   1097       rep1 = rep0;
   1098       rep0 = dist;
   1099 
   1100 - "Rep Match 3":
   1101       UInt32 dist = rep3;
   1102       rep3 = rep2;
   1103       rep2 = rep1;
   1104       rep1 = rep0;
   1105       rep0 = dist;
   1106 
   1107 Then the decoder decodes exact subtype of "Rep Match" using "IsRepG0", "IsRep0Long",
   1108 "IsRepG1", "IsRepG2".
   1109 
   1110 If the subtype is "Short Rep Match", the decoder updates the state, puts 
   1111 the one byte from window to current position in window and goes to next 
   1112 MATCH/LITERAL symbol (the begin of main loop):
   1113 
   1114           state = UpdateState_ShortRep(state);
   1115           OutWindow.PutByte(OutWindow.GetByte(rep0 + 1));
   1116           unpackSize--;
   1117           continue;
   1118 
   1119 In other cases (Rep Match 0/1/2/3), it decodes the zero-based 
   1120 length of match with "RepLenDecoder" decoder:
   1121 
   1122       len = RepLenDecoder.Decode(&RangeDec, posState);
   1123 
   1124 Then it updates the state:
   1125 
   1126       state = UpdateState_Rep(state);
   1127 
   1128 Then the decoder must copy match bytes as described in 
   1129 "The Match symbols copying" section.
   1130 
   1131 
   1132 The match symbols copying
   1133 -------------------------
   1134 
   1135 If we have the match (Simple Match or Rep Match 0/1/2/3), the decoder must
   1136 copy the sequence of bytes with calculated match distance and match length.
   1137 If uncompressed size is defined, LZMA decoder must check that it doesn't 
   1138 exceed that specified uncompressed size:
   1139 
   1140     len += kMatchMinLen;
   1141     bool isError = false;
   1142     if (unpackSizeDefined && unpackSize < len)
   1143     {
   1144       len = (unsigned)unpackSize;
   1145       isError = true;
   1146     }
   1147     OutWindow.CopyMatch(rep0 + 1, len);
   1148     unpackSize -= len;
   1149     if (isError)
   1150       return LZMA_RES_ERROR;
   1151 
   1152 Then the decoder must go to the begin of main loop to decode next MATCH or LITERAL.
   1153 
   1154 
   1155 
   1156 NOTES
   1157 -----
   1158 
   1159 This specification doesn't describe the variant of decoder implementation 
   1160 that supports partial decoding. Such partial decoding case can require some 
   1161 changes in "end of stream" condition checks code. Also such code 
   1162 can use additional status codes, returned by decoder.
   1163 
   1164 This specification uses C++ code with templates to simplify describing.
   1165 The optimized version of LZMA decoder doesn't need templates.
   1166 Such optimized version can use just two arrays of CProb variables:
   1167   1) The dynamic array of CProb variables allocated for the Literal Decoder.
   1168   2) The one common array that contains all other CProb variables.
   1169 
   1170 
   1171 References:      
   1172 
   1173 1. G. N. N. Martin, Range encoding: an algorithm for removing redundancy 
   1174    from a digitized message, Video & Data Recording Conference, 
   1175    Southampton, UK, July 24-27, 1979.
   1176