:mod:`gzip` --- Support for :program:`gzip` files
=================================================

.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

**Source code:** :source:`Lib/gzip.py`

--------------

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.

The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the
:func:`.open`, :func:`compress` and :func:`decompress` convenience functions.
The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files,
automatically compressing or decompressing the data so that it looks like an
ordinary :term:`file object`.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.

The module defines the following items:


.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

   Open a gzip-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be an actual filename (a :class:`str` or
   :class:`bytes` object), or an existing file object to read from or write to.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``,
   ``'w'``, ``'wb'``, ``'x'`` or ``'xb'`` for binary mode, or ``'rt'``,
   ``'at'``, ``'wt'``, or ``'xt'`` for text mode. The default is ``'rb'``.

   The *compresslevel* argument is an integer from 0 to 9, as for the
   :class:`GzipFile` constructor.

   For binary mode, this function is equivalent to the :class:`GzipFile`
   constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the
   *encoding*, *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`GzipFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).
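
   As a rough illustration of the text-mode behaviour described above, the
   following sketch writes and then reads a small file. The temporary
   directory and the file name ``example.txt.gz`` are placeholders chosen for
   this example only::

      import gzip
      import os
      import tempfile

      # Hypothetical location; any writable path would do.
      path = os.path.join(tempfile.mkdtemp(), "example.txt.gz")

      # 'wt' wraps the GzipFile in an io.TextIOWrapper, so str is written.
      with gzip.open(path, "wt", encoding="utf-8") as f:
          f.write("first line\nsecond line\n")

      # 'rt' decodes the decompressed bytes back to str.
      with gzip.open(path, "rt", encoding="utf-8") as f:
          lines = f.readlines()

   Opening the same path with ``'rb'`` instead would return the decompressed
   content as :class:`bytes`, with no decoding applied.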

   .. versionchanged:: 3.3
      Added support for *filename* being a file object, support for text mode,
      and the *encoding*, *errors* and *newline* arguments.

   .. versionchanged:: 3.4
      Added support for the ``'x'``, ``'xb'`` and ``'xt'`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.

.. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

   Constructor for the :class:`GzipFile` class, which simulates most of the
   methods of a :term:`file object`, with the exception of the :meth:`truncate`
   method.  At least one of *fileobj* and *filename* must be given a non-trivial
   value.

   The new class instance is based on *fileobj*, which can be a regular file, an
   :class:`io.BytesIO` object, or any other object which simulates a file.  It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is used only for
   inclusion in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file.  It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   ``'wb'``, ``'x'``, or ``'xb'``, depending on whether the file will be read or
   written.  The default is the mode of *fileobj* if discernible; otherwise, the
   default is ``'rb'``.

   Note that the file is always opened in binary mode. To open a compressed file
   in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an
   :class:`io.TextIOWrapper`).

   The *compresslevel* argument is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least
   compression, and ``9`` is slowest and produces the most compression. ``0``
   is no compression. The default is ``9``.

   The *mtime* argument is an optional numeric timestamp to be written to
   the last modification time field in the stream when compressing.  It
   should only be provided in compression mode.  If omitted or ``None``, the
   current time is used.  See the :attr:`mtime` attribute for more details.
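
   One practical consequence of the *mtime* argument is sketched below:
   because the timestamp is part of the header, fixing it should make the
   compressed output reproducible for identical input. The helper name
   ``gzip_bytes`` is hypothetical and exists only for this example::

      import gzip
      import io

      def gzip_bytes(payload, mtime):
          """Compress *payload* in memory with a fixed header timestamp."""
          buf = io.BytesIO()
          with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
              f.write(payload)
          return buf.getvalue()

      # Same input, same timestamp: the two byte strings should match.
      first = gzip_bytes(b"same input", mtime=0)
      second = gzip_bytes(b"same input", mtime=0)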

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data.  This also allows you to pass an :class:`io.BytesIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`io.BytesIO` object's :meth:`~io.BytesIO.getvalue` method.
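
   The in-memory pattern described above might look like the following
   sketch; the variable names are illustrative only::

      import gzip
      import io

      buf = io.BytesIO()
      with gzip.GzipFile(fileobj=buf, mode="wb") as f:
          f.write(b"payload")

      # The GzipFile is closed at this point, but the BytesIO object is not,
      # so the compressed bytes can still be retrieved from it.
      still_open = not buf.closed
      compressed = buf.getvalue()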

   :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface,
   including iteration and the :keyword:`with` statement.  Only the
   :meth:`truncate` method isn't implemented.

   :class:`GzipFile` also provides the following method and attribute:

   .. method:: peek(n)

      Read *n* uncompressed bytes without advancing the file position.
      At most one read on the compressed stream is done to satisfy
      the call.  The number of bytes returned may be more or less than
      requested.

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`GzipFile`, it may change the position of the underlying
         file object (e.g. if the :class:`GzipFile` was constructed with the
         *fileobj* parameter).
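
      A small sketch of how :meth:`peek` might be used to inspect the start of
      a stream before reading it. The sample payload is made up for this
      example, and the code does not rely on the exact number of bytes that
      :meth:`peek` returns::

         import gzip
         import io

         payload = b"header: value\nbody"
         raw = io.BytesIO(gzip.compress(payload))

         with gzip.GzipFile(fileobj=raw, mode="rb") as f:
             peeked = f.peek(7)   # may hold more or fewer than 7 bytes
             start = f.read(7)    # the read still begins at position 0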

      .. versionadded:: 3.2

   .. attribute:: mtime

      When decompressing, the value of the last modification time field in
      the most recently read header may be read from this attribute, as an
      integer.  The initial value before reading any headers is ``None``.

      All :program:`gzip` compressed streams are required to contain this
      timestamp field.  Some programs, such as :program:`gunzip`\ , make use
      of the timestamp.  The format is the same as the return value of
      :func:`time.time` and the :attr:`~os.stat_result.st_mtime` attribute of
      the object returned by :func:`os.stat`.
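
      The following sketch shows one way to observe the attribute when
      decompressing. The timestamp ``1234567890`` is an arbitrary value
      chosen for the example::

         import gzip
         import io

         buf = io.BytesIO()
         with gzip.GzipFile(fileobj=buf, mode="wb", mtime=1234567890) as f:
             f.write(b"data")

         buf.seek(0)
         with gzip.GzipFile(fileobj=buf, mode="rb") as f:
             before = f.mtime   # no header has been read yet
             f.read()
             after = f.mtime    # timestamp taken from the header just read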

   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added, along with the
      *mtime* constructor argument and :attr:`mtime` attribute.

   .. versionchanged:: 3.2
      Support for zero-padded and unseekable files was added.

   .. versionchanged:: 3.3
      The :meth:`io.BufferedIOBase.read1` method is now implemented.

   .. versionchanged:: 3.4
      Added support for the ``'x'`` and ``'xb'`` modes.

   .. versionchanged:: 3.5
      Added support for writing arbitrary
      :term:`bytes-like objects <bytes-like object>`.
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.


.. function:: compress(data, compresslevel=9)

   Compress the *data*, returning a :class:`bytes` object containing
   the compressed data.  *compresslevel* has the same meaning as in
   the :class:`GzipFile` constructor above.

   .. versionadded:: 3.2

.. function:: decompress(data)

   Decompress the *data*, returning a :class:`bytes` object containing the
   uncompressed data.
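
   A minimal round-trip sketch using :func:`compress` and :func:`decompress`
   together. The repeated sample data is made up so that compression has
   something to work with::

      import gzip

      data = b"Lots of content here " * 100

      fast = gzip.compress(data, compresslevel=1)    # favours speed
      small = gzip.compress(data, compresslevel=9)   # favours size

      # Either result decompresses back to the original bytes.
      restored = gzip.decompress(small)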

   .. versionadded:: 3.2


.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
       file_content = f.read()

Example of how to create a compressed GZIP file::

   import gzip
   content = b"Lots of content here"
   with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
       f.write(content)

Example of how to GZIP compress an existing file::

   import gzip
   import shutil
   with open('/home/joe/file.txt', 'rb') as f_in:
       with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
           shutil.copyfileobj(f_in, f_out)

Example of how to GZIP compress a binary string::

   import gzip
   s_in = b"Lots of content here"
   s_out = gzip.compress(s_in)

.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.
    214