:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines
              compatible with gzip.

--------------

For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) The result
   is an unsigned 32-bit integer. If *value* is present, it is used as
   the starting value of the checksum; otherwise, a default value of 1
   is used. Passing in *value* allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   .. versionchanged:: 3.0
      Always returns an unsigned value.
      To generate the same numeric value across all Python versions and
      platforms, use ``adler32(data) & 0xffffffff``.


.. function:: compress(data, level=-1)

   Compresses the bytes in *data*, returning a bytes object containing compressed data.
   *level* is an integer from ``0`` to ``9`` or ``-1`` controlling the level of compression;
   ``1`` (Z_BEST_SPEED) is fastest and produces the least compression, ``9`` (Z_BEST_COMPRESSION)
   is slowest and produces the most. ``0`` (Z_NO_COMPRESSION) is no compression.
   The default value is ``-1`` (Z_DEFAULT_COMPRESSION). Z_DEFAULT_COMPRESSION represents a default
   compromise between speed and compression (currently equivalent to level 6).
   Raises the :exc:`error` exception if any error occurs.

   .. versionchanged:: 3.6
      *level* can now be used as a keyword parameter.


.. function:: compressobj(level=-1, method=DEFLATED, wbits=MAX_WBITS, memLevel=DEF_MEM_LEVEL, strategy=Z_DEFAULT_STRATEGY[, zdict])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once.

   *level* is the compression level -- an integer from ``0`` to ``9`` or ``-1``.
   A value of ``1`` (Z_BEST_SPEED) is fastest and produces the least compression,
   while a value of ``9`` (Z_BEST_COMPRESSION) is slowest and produces the most.
   ``0`` (Z_NO_COMPRESSION) is no compression. The default value is ``-1`` (Z_DEFAULT_COMPRESSION).
   Z_DEFAULT_COMPRESSION represents a default compromise between speed and compression
   (currently equivalent to level 6).

   *method* is the compression algorithm. Currently, the only supported value is
   :const:`DEFLATED`.

   The *wbits* argument controls the size of the history buffer (or the
   "window size") used when compressing data, and whether a header and
   trailer are included in the output.
   It can take several ranges of values,
   defaulting to ``15`` (MAX_WBITS):

   * +9 to +15: The base-two logarithm of the window size, which
     therefore ranges between 512 and 32768. Larger values produce
     better compression at the expense of greater memory usage. The
     resulting output will include a zlib-specific header and trailer.

   * −9 to −15: Uses the absolute value of *wbits* as the
     window size logarithm, while producing a raw output stream with no
     header or trailing checksum.

   * +25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the
     window size logarithm, while including a basic :program:`gzip` header
     and trailing checksum in the output.

   The *memLevel* argument controls the amount of memory used for the
   internal compression state. Valid values range from ``1`` to ``9``.
   Higher values use more memory, but are faster and produce smaller output.

   *strategy* is used to tune the compression algorithm. Possible values are
   :const:`Z_DEFAULT_STRATEGY`, :const:`Z_FILTERED`, :const:`Z_HUFFMAN_ONLY`,
   :const:`Z_RLE` (zlib 1.2.0.1) and :const:`Z_FIXED` (zlib 1.2.2.2).

   *zdict* is a predefined compression dictionary. This is a sequence of bytes
   (such as a :class:`bytes` object) containing subsequences that are expected
   to occur frequently in the data that is to be compressed. Those subsequences
   that are expected to be most common should come at the end of the dictionary.

   .. versionchanged:: 3.3
      Added the *zdict* parameter and keyword argument support.


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. The
   result is an unsigned 32-bit integer. If *value* is present, it is used
   as the starting value of the checksum; otherwise, a default value of 0
   is used.
   Passing in *value* allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   .. versionchanged:: 3.0
      Always returns an unsigned value.
      To generate the same numeric value across all Python versions and
      platforms, use ``crc32(data) & 0xffffffff``.


.. function:: decompress(data, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)

   Decompresses the bytes in *data*, returning a bytes object containing the
   uncompressed data. The *wbits* parameter depends on
   the format of *data*, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   .. _decompress-wbits:

   The *wbits* parameter controls the size of the history buffer
   (or "window size"), and what header and trailer format is expected.
   It is similar to the parameter for :func:`compressobj`, but accepts
   more ranges of values:

   * +8 to +15: The base-two logarithm of the window size. The input
     must include a zlib header and trailer.

   * 0: Automatically determine the window size from the zlib header.
     Only supported since zlib 1.2.3.5.

   * −8 to −15: Uses the absolute value of *wbits* as the window size
     logarithm. The input must be a raw stream with no header or trailer.

   * +24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm. The input must include a gzip header and
     trailer.

   * +40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm, and automatically accepts either
     the zlib or gzip format.
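
As a rough illustration of the *wbits* ranges listed above, the sketch below round-trips one payload through the zlib, raw and gzip formats. The variable names (``payload``, ``raw_stream`` and so on) are illustrative only, and the gzip stream is produced with :func:`gzip.compress` purely for convenience; it assumes a Python version in which *wbits* is accepted as a keyword argument.

```python
import gzip
import zlib

payload = b"example payload " * 64

# zlib format (header and trailer): the default wbits (MAX_WBITS) applies.
zlib_stream = zlib.compress(payload)
from_zlib = zlib.decompress(zlib_stream, wbits=zlib.MAX_WBITS)

# Raw stream: written with a negative wbits, so it must be read with one too.
raw_compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw_stream = raw_compressor.compress(payload) + raw_compressor.flush()
from_raw = zlib.decompress(raw_stream, wbits=-zlib.MAX_WBITS)

# gzip format: 16 + window size logarithm.
gzip_stream = gzip.compress(payload)
from_gzip = zlib.decompress(gzip_stream, wbits=16 + zlib.MAX_WBITS)

# 32 + window size logarithm: accept either the zlib or the gzip format.
auto_wbits = 32 + zlib.MAX_WBITS
from_auto_zlib = zlib.decompress(zlib_stream, wbits=auto_wbits)
from_auto_gzip = zlib.decompress(gzip_stream, wbits=auto_wbits)
```

In each case the value passed to the decompressor has to match the format that the compressor wrote; only the ``32 +`` form detects the container automatically, and it does not cover raw streams.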

   When decompressing a stream, the window size must not be smaller
   than the size originally used to compress the stream; using a too-small
   value may result in an :exc:`error` exception. The default *wbits* value
   corresponds to the largest window size and requires a zlib header and
   trailer to be included.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :c:func:`malloc`.

   .. versionchanged:: 3.6
      *wbits* and *bufsize* can be used as keyword arguments.

.. function:: decompressobj(wbits=MAX_WBITS[, zdict])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once.

   The *wbits* parameter controls the size of the history buffer (or the
   "window size"), and what header and trailer format is expected. It has
   the same meaning as `described for decompress() <#decompress-wbits>`__.

   The *zdict* parameter specifies a predefined compression dictionary. If
   provided, this must be the same dictionary as was used by the compressor that
   produced the data that is to be decompressed.

   .. note::

      If *zdict* is a mutable object (such as a :class:`bytearray`), you must not
      modify its contents between the call to :func:`decompressobj` and the first
      call to the decompressor's ``decompress()`` method.

   .. versionchanged:: 3.3
      Added the *zdict* parameter.


Compression objects support the following methods:


.. method:: Compress.compress(data)

   Compress *data*, returning a bytes object containing compressed data for at least
   part of the data in *data*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method.
   Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a bytes object containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_NO_FLUSH`, :const:`Z_PARTIAL_FLUSH`, :const:`Z_SYNC_FLUSH`,
   :const:`Z_FULL_FLUSH`, :const:`Z_BLOCK` (zlib 1.2.3.4), or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. Except :const:`Z_FINISH`, all constants
   allow compressing further bytestrings of data, while :const:`Z_FINISH` finishes the
   compressed stream and prevents compressing any more data. After calling :meth:`flush`
   with *mode* set to :const:`Z_FINISH`, the :meth:`compress` method cannot be called again;
   the only realistic action is to delete the object.


.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.


Decompression objects support the following methods and attributes:


.. attribute:: Decompress.unused_data

   A bytes object which contains any bytes past the end of the compressed data. That is,
   this remains ``b""`` until the last byte that contains compression data is
   available. If the whole bytestring turned out to contain compressed data, this is
   ``b""``, an empty bytes object.


.. attribute:: Decompress.unconsumed_tail

   A bytes object that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. attribute:: Decompress.eof

   A boolean indicating whether the end of the compressed data stream has been
   reached.

   This makes it possible to distinguish between a properly-formed compressed
   stream, and an incomplete or truncated one.

   .. versionadded:: 3.3


.. method:: Decompress.decompress(data, max_length=0)

   Decompress *data*, returning a bytes object containing the uncompressed data
   corresponding to at least part of the data in *data*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is non-zero then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed; any unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This bytestring must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is zero
   then the whole input is decompressed, and :attr:`unconsumed_tail` is empty.

   .. versionchanged:: 3.6
      *max_length* can be used as a keyword argument.


.. method:: Decompress.flush([length])

   All pending input is processed, and a bytes object containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.
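
The decompression object methods and attributes described above can be combined into a bounded-memory loop. The following is a minimal sketch, not a reference implementation: the 4096-byte limit and the names ``original``, ``pending`` and ``restored`` are arbitrary choices made for illustration.

```python
import zlib

original = b"0123456789" * 10_000
compressed = zlib.compress(original)

decompressor = zlib.decompressobj()
chunks = []
pending = compressed

# Feed the input back in until none of it is left unconsumed.
while pending and not decompressor.eof:
    # The second argument is max_length: at most 4096 bytes come back per call.
    chunks.append(decompressor.decompress(pending, 4096))
    # Input that did not fit under the limit must be passed to the next call.
    pending = decompressor.unconsumed_tail

# Collect whatever output is still held internally.
chunks.append(decompressor.flush())
restored = b"".join(chunks)
```

Because each :meth:`decompress` call is capped by *max_length*, the loop never holds more than one bounded chunk of fresh output at a time, which is the usual reason for using a decompression object rather than :func:`decompress`.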


Information about the version of the zlib library in use is available through
the following constants:


.. data:: ZLIB_VERSION

   The version string of the zlib library that was used for building the module.
   This may be different from the zlib library actually used at runtime, which
   is available as :const:`ZLIB_RUNTIME_VERSION`.


.. data:: ZLIB_RUNTIME_VERSION

   The version string of the zlib library actually loaded by the interpreter.

   .. versionadded:: 3.3


.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.
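
As a closing illustration of the compression objects and checksum functions documented in this module, the sketch below compresses data that arrives in pieces while keeping a running CRC. The chunk contents and variable names are made up for the example; the starting value ``0`` is the default documented for :func:`crc32`.

```python
import zlib

chunks = [b"first block, ", b"second block, ", b"third block"]

compressor = zlib.compressobj(level=6)
checksum = 0  # documented default starting value for crc32
compressed_parts = []

for chunk in chunks:
    # Passing the previous result as *value* extends the checksum
    # over the concatenation of all chunks seen so far.
    checksum = zlib.crc32(chunk, checksum)
    compressed_parts.append(compressor.compress(chunk))

# flush() defaults to Z_FINISH, which completes the stream.
compressed_parts.append(compressor.flush())
compressed = b"".join(compressed_parts)

whole = b"".join(chunks)
round_trip = zlib.decompress(compressed)
```

The running checksum ends up equal to ``zlib.crc32(whole)``, so the complete input never has to be held in memory just to compute it.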