LZMA SDK 16.04
--------------

LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
decompression fast and decompression memory requirements low.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA, along with some other
improvements.

7z is a file format for data compression and file archiving.
7z is the main file format of the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC checks, filters for
improved compression ratio, and splitting into blocks and streams.


LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for
any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses. For example, you can
include it in GNU GPL or GNU LGPL code.


LZMA SDK Contents
-----------------

  Source code:

    - C / C++ / C# / Java   - LZMA compression and decompression
    - C / C++               - LZMA2 compression and decompression
    - C / C++               - XZ compression and decompression
    - C                     - 7z decompression
    -     C++               - 7z compression and decompression
    - C                     - small SFXs for installers (7z decompression)
    -     C++               - SFXs and SFXs for installers (7z decompression)

  Precompiled binaries:

    - console programs for lzma / 7z / xz compression and decompression
    - SFX modules for installers.


UNIX/Linux version
------------------
To compile the C++ version of file->file LZMA encoding, go to the directory
CPP/7zip/Bundles/LzmaCon
and run make to rebuild it:
  make -f makefile.gcc clean all

On some UNIX/Linux systems you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static

You can also use p7zip (a port of 7-Zip for POSIX systems such as Unix or Linux):

  http://p7zip.sourceforge.net/


Files
-----

DOC/7zC.txt          - 7z ANSI-C Decoder description
DOC/7zFormat.txt     - 7z Format description
DOC/installer.txt    - information about 7-Zip for installers
DOC/lzma.txt         - LZMA compression description
DOC/lzma-sdk.txt     - LZMA SDK description (this file)
DOC/lzma-history.txt - history of LZMA SDK
DOC/lzma-specification.txt - Specification of LZMA
DOC/Methods.txt      - Compression method IDs for .7z

bin/installer/   - example script to create an installer that uses an SFX module

bin/7zdec.exe    - simplified 7z archive decoder
bin/7zr.exe      - 7-Zip console program (reduced version)
bin/x64/7zr.exe  - 7-Zip console program (reduced version) (x64 version)
bin/lzma.exe     - file->file LZMA encoder/decoder for Windows
bin/7zS2.sfx     - small SFX module for installers (GUI version)
bin/7zS2con.sfx  - small SFX module for installers (Console version)
bin/7zSD.sfx     - SFX module for installers.


7zDec.exe
---------
7zDec.exe is a simplified 7z archive decoder.
It supports only the LZMA, LZMA2, and PPMd methods.
7zDec decodes a whole solid block from a 7z archive into RAM,
so RAM consumption can be high.



Source code structure
---------------------


Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)

C/  - C files (compression / decompression and other)
  Util/
    7z       - 7z decoder program (decoding 7z files)
    Lzma     - LZMA program (file->file LZMA encoder/decoder).
    LzmaLib  - LZMA library (.DLL for Windows)
    SfxSetup - small SFX module for installers

CPP/ - C++ files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles  - Modules that are bundles of other modules (files)

      Alone7z           - 7zr.exe: Standalone 7-Zip console program (reduced version)
      Format7zExtractR  - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
      Format7zR         - 7zr.dll:  Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      LzmaCon           - lzma.exe: LZMA compression/decompression
      LzmaSpec          - example code for LZMA Specification
      SFXCon            - 7zCon.sfx: Console 7z SFX module
      SFXSetup          - 7zS.sfx: 7z SFX module for installers
      SFXWin            - 7z.sfx: GUI 7z SFX module

    Common   - common files for 7-Zip

    Compress - files for compression/decompression

    Crypto   - files for encryption / decryption

    UI       - User Interface files

      Client7z    - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common      - Common UI files
      Console     - Code for console program (7z.exe)
      Explorer    - Some code from 7-Zip Shell extension
      FileManager - Some GUI code from 7-Zip File Manager
      GUI         - Some GUI code from 7-Zip


CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      LzmaAlone    - file->file LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression    - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)


Note:
  Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.
  7-Zip's source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compression speed: about 2 MB/s on a 2 GHz CPU
  - Estimated decompression speed:
      - 20-30 MB/s on a modern 2 GHz CPU
      - 1-2 MB/s on a simple 200 MHz RISC CPU (ARM, MIPS, PowerPC)
  - Small memory requirements for decompression (16 KB + DictionarySize)
  - Small code size for decompression: 5-8 KB
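
The decompression memory figure in the list above can be sketched as simple
arithmetic. The Python snippet below is only an illustration, not SDK code.
It assumes that the 16 KB term is the decoder's probability table, sized as
2 * (1846 + (0x300 << (lc + lp))) bytes as in the SDK's C decoder. These
constants and all function names are assumptions that this document does not
state. lc and lp are the literal context bits and literal pos bits described
in the switches section.

```python
# Assumed constants (not stated in this document): sizes of the decoder's
# probability table as believed to be defined in the SDK's C decoder.
LZMA_BASE_SIZE = 1846   # fixed number of probability entries (assumption)
LZMA_LIT_SIZE = 0x300   # literal-coder entries per literal context (assumption)
PROB_BYTES = 2          # each probability stored as a 16-bit value (assumption)


def decoder_state_bytes(lc: int = 3, lp: int = 0) -> int:
    """Bytes needed for the probability table, excluding the dictionary."""
    return PROB_BYTES * (LZMA_BASE_SIZE + (LZMA_LIT_SIZE << (lc + lp)))


def decoder_memory_bytes(dict_bits: int, lc: int = 3, lp: int = 0) -> int:
    """Probability table plus a dictionary of 2**dict_bits bytes."""
    return decoder_state_bytes(lc, lp) + (1 << dict_bits)


if __name__ == "__main__":
    for bits in (16, 23, 30):
        print(f"2^{bits} dictionary: {decoder_memory_bytes(bits):,} bytes")
```

Under these assumptions the default lc=3, lp=0 gives 15980 bytes of state,
which matches the "16 KB" figure, and lc=0 lowers it to 5228 bytes.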

The LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU under some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (the penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression depends mostly on CPU speed.
Memory speed matters little. However, if your CPU has a small data cache,
memory speed will have slightly more influence on overall speed.


How To Use
----------

Using LZMA encoder/decoder executable
-------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with the LZMA method. The benchmark shows a rating in MIPS (million
     instructions per second). The rating is calculated from the
     measured speed and is normalized against Intel Core 2 results.
     The benchmark also checks for possible hardware errors (RAM
     errors in most cases). The benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     The default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode: 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8 MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by the LZMA method with dictionary
          size D = 2^N, you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in the table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          an eos marker, since the LZMA decoder knows the uncompressed size
          stored in the .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
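
The memory column of the MF_ID table can be turned into a quick estimate of
encoder memory for a given -d and -mf choice. The Python sketch below only
restates the table's formulas. The function names are hypothetical, and it
assumes that "4MB" means 4 * 2^20 bytes.

```python
MIB = 1 << 20

# Multipliers copied from the MF_ID table: memory = d * factor + 4 MB.
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}


def dictionary_bytes(n: int) -> int:
    """DictionarySize = 2^N bytes for the -d{N} switch, N in [0, 30]."""
    if not 0 <= n <= 30:
        raise ValueError("N must be in the range [0, 30]")
    return 1 << n


def encoder_memory_bytes(mf_id: str, n: int) -> int:
    """Estimated encoder memory for match finder mf_id and -d{N}."""
    return int(dictionary_bytes(n) * MATCH_FINDER_FACTOR[mf_id]) + 4 * MIB


if __name__ == "__main__":
    for mf_id in MATCH_FINDER_FACTOR:
        mib = encoder_memory_bytes(mf_id, 23) / MIB
        print(f"-mf{mf_id} -d23: about {mib:.0f} MB")
```

With the defaults (-d23, bt4) this gives 8 MB * 11.5 + 4 MB = 96 MB for
compression, while decompression of the same file needs only about 8 MB.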


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with a 64 KB dictionary (2^16 = 64K)
and 0 literal context bits. -lc0 reduces memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
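
For readers without the lzma executable at hand, the three examples can be
approximated with Python's standard lzma module. Note the assumptions: that
module is built on liblzma from XZ Utils, not on LZMA SDK, and its options
dict_size, lc, lp, pb and nice_len are used here as rough counterparts of
-d, -lc, -lp, -pb and -fb. liblzma has its own limits (for example
lc + lp <= 4 and a dictionary of at least 4 KB), so this is a sketch of the
same idea, not a replacement for the tool. The helper names are hypothetical.

```python
import lzma


def lzma_alone_compress(data, dict_bits=23, lc=3, lp=0, pb=2, fb=128):
    """Compress to the legacy .lzma container with explicit LZMA options."""
    options = {
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,   # counterpart of -d{N}
        "lc": lc,                      # counterpart of -lc{N}
        "lp": lp,                      # counterpart of -lp{N}
        "pb": pb,                      # counterpart of -pb{N}
        "nice_len": fb,                # counterpart of -fb{N}
    }
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=[options])


def lzma_alone_decompress(blob):
    """Decompress a .lzma stream produced by lzma_alone_compress()."""
    return lzma.decompress(blob, format=lzma.FORMAT_ALONE)


# 32-bit periodical sample data: consecutive little-endian integers.
sample = b"".join(i.to_bytes(4, "little") for i in range(4096))

example1 = lzma_alone_compress(sample, dict_bits=16, lc=0)   # like example 1
example2 = lzma_alone_compress(sample, lc=0, lp=2)           # like example 2
restored = lzma_alone_decompress(example2)                   # like example 3
```

Comparing len(example1) and len(example2) on your own data is a quick way to
see whether the lp setting helps for a given file.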


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio of LZMA, it's desirable
to have aligned data (if possible), and it's also desirable to arrange
the data so that code is grouped in one place and data is
grouped in another (this is better than mixing them: code, data, code,
data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find the C source code of these filters in the C/Bra*.* files.

You can check the compression ratio gain of these filters with the following
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works like this:
Compressing    = Filter_encoding + LZMA_encoding
Decompressing  = LZMA_decoding + Filter_decoding
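
The same two-stage order can be tried from Python's standard lzma module,
which accepts a filter chain. This is a sketch under assumptions, not the
7-Zip pipeline: the module is built on liblzma, it writes the XZ container
instead of 7z, and it pairs the ARM filter with LZMA2 rather than LZMA.
The function names are hypothetical.

```python
import lzma

# Filters are applied in list order when compressing (ARM filter first,
# then LZMA2) and undone in reverse order when decompressing.
ARM_CHAIN = [
    {"id": lzma.FILTER_ARM},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]


def compress_arm_code(code: bytes) -> bytes:
    """Filter_encoding followed by LZMA2 encoding, stored as an .xz stream."""
    return lzma.compress(code, format=lzma.FORMAT_XZ, filters=ARM_CHAIN)


def decompress_arm_code(blob: bytes) -> bytes:
    """LZMA2 decoding followed by Filter_decoding; the chain is read from the stream."""
    return lzma.decompress(blob, format=lzma.FORMAT_XZ)
```

Compressing the same file once with ARM_CHAIN and once with only the LZMA2
entry shows the gain, if any, for that particular input.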

The compression and decompression speed of these filters is very high,
so they do not increase decompression time much.
Moreover, filtering reduces the time spent in LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (procedure call) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.
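
The relative-to-absolute conversion can be illustrated with a simplified
Python sketch for little-endian ARM code. It is not the code from C/Bra*.*.
It assumes that only the unconditional BL instruction (top byte 0xEB) is
converted, that the stream starts at a 4-byte aligned address, and that the
branch target is the instruction address + 8 + 4 * offset field. All names
are hypothetical.

```python
def arm_bl_filter(code: bytes, start_ip: int = 0, encoding: bool = True) -> bytes:
    """Convert BL offsets between relative and absolute form (simplified)."""
    out = bytearray(code)
    for i in range(0, len(out) - 3, 4):
        if out[i + 3] != 0xEB:            # not an unconditional BL instruction
            continue
        field = out[i] | (out[i + 1] << 8) | (out[i + 2] << 16)
        pc = start_ip + i + 8             # ARM reads PC as instruction address + 8
        if encoding:
            value = (field << 2) + pc     # relative offset -> absolute address
        else:
            value = (field << 2) - pc     # absolute address -> relative offset
        field = (value >> 2) & 0xFFFFFF   # keep the 24-bit instruction field
        out[i:i + 3] = field.to_bytes(3, "little")
    return bytes(out)


def make_bl(at: int, target: int) -> bytes:
    """Build a BL instruction located at address `at` that calls `target`."""
    field = ((target - (at + 8)) >> 2) & 0xFFFFFF
    return field.to_bytes(3, "little") + b"\xEB"


NOP = bytes.fromhex("0000a0e1")           # MOV r0, r0
# Two calls to the same procedure at 0x100, made from addresses 0x00 and 0x40.
code = make_bl(0x00, 0x100) + NOP * 15 + make_bl(0x40, 0x100)
encoded = arm_bl_filter(code, encoding=True)
decoded = arm_bl_filter(encoded, encoding=False)
```

Before filtering, the two calls are stored as different byte sequences
because their relative offsets differ. After filtering, both hold the same
absolute target, so LZMA can match the second call against the first.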

For some ISAs (for example, MIPS), it's impossible to get a gain from such a filter.



---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html