:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>

**Source code:** :source:`Lib/tarfile.py`

--------------

The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip, bz2 and lzma compression.
Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the
higher-level functions in :ref:`shutil <archiving-operations>`.

Some facts and figures:

* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
  if the respective modules are available.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for all variants of the *sparse* extension
  including restoration of sparse files.

* read/write support for the POSIX.1-2001 (pax) format.

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

.. versionchanged:: 3.3
   Added support for :mod:`lzma` compression.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, \*\*kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* has to be a string of the form ``'filemode[:compression]'``; it defaults
   to ``'r'``. Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | Mode             | Action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'r:xz'``       | Open for reading with lzma compression.     |
   +------------------+---------------------------------------------+
   | ``'x'`` or       | Create a tarfile exclusively without        |
   | ``'x:'``         | compression.                                |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:gz'``       | Create a tarfile with gzip compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:bz2'``      | Create a tarfile with bzip2 compression.    |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:xz'``       | Create a tarfile with lzma compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+
   | ``'w:xz'``       | Open for lzma compressed writing.           |
   +------------------+---------------------------------------------+

   Note that ``'a:gz'``, ``'a:bz2'`` and ``'a:xz'`` are not possible. If *mode*
   is not suitable to open a certain (compressed) file for reading,
   :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this.  If a
   compression method is not supported, :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a :term:`file object`
   opened in binary mode for *name*. It is supposed to be at position 0.

   For modes ``'w:gz'``, ``'r:gz'``, ``'w:bz2'``, ``'r:bz2'``, ``'x:gz'``,
   ``'x:bz2'``, :func:`tarfile.open` accepts the keyword argument
   *compresslevel* (default ``9``) to specify the compression level of the file.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``.  :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks.  No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket :term:`file object` or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access; see :ref:`tar-examples`.  The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|xz'``  | Open an lzma compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|xz'``  | Open an lzma compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.
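
As a rough illustration of the mode strings accepted by :func:`tarfile.open`,
the following sketch writes a gzip compressed archive, reads it back with
transparent compression detection, and shows that ``'x:gz'`` refuses to
overwrite an existing file. The file names ``hello.txt`` and ``demo.tar.gz``
and the temporary directory are placeholders chosen for this example only.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "hello.txt")
    archive = os.path.join(tmp, "demo.tar.gz")
    with open(source, "w") as f:
        f.write("hello")

    # 'w:gz' creates (or overwrites) a gzip compressed archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname="hello.txt")

    # 'r' (the same as 'r:*') detects the compression automatically.
    with tarfile.open(archive, "r") as tar:
        names = tar.getnames()

    # 'x:gz' raises FileExistsError because the archive already exists.
    try:
        tarfile.open(archive, "x:gz")
    except FileExistsError:
        exclusive_failed = True
    else:
        exclusive_failed = False

print(names, exclusive_failed)
```

Reading with plain ``'r'`` is usually the safer choice because the caller does
not have to know how the archive was compressed.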


.. class:: TarFile

   Class for reading and writing tar archives. Do not use this class directly:
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read.
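
A minimal sketch of :func:`is_tarfile`, assuming two throwaway files created
in a temporary directory (``notes.txt`` and ``notes.tar`` are names invented
for this example):

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "notes.txt")
    tar_path = os.path.join(tmp, "notes.tar")
    with open(text_path, "w") as f:
        f.write("plain text, not an archive")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(text_path, arcname="notes.txt")

    tar_ok = tarfile.is_tarfile(tar_path)    # a readable archive
    text_ok = tarfile.is_tarfile(text_path)  # an ordinary text file

print(tar_ok, text_ok)
```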


The :mod:`tarfile` module defines the following exceptions:


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.
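
The sketch below triggers two of these exceptions on purpose. It assumes small
in-memory archives built with :class:`io.BytesIO`; the member names ``a.txt``
and ``b.txt`` are placeholders. The :exc:`StreamError` part relies on the
stream implementation refusing to seek backwards, which is one instance of
the limitations mentioned above rather than an exhaustive list.

```python
import io
import tarfile

# ReadError: the data is not something tarfile can interpret as an archive.
try:
    tarfile.open(fileobj=io.BytesIO(b"this is not a tar archive"))
except tarfile.ReadError as exc:
    read_error = exc

# Build a small two-member archive in memory for the stream example.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("a.txt", "b.txt"):
        payload = name.encode()
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

# StreamError: a stream ('r|') cannot go back to a member it has passed.
with tarfile.open(fileobj=buf, mode="r|") as tar:
    members = list(tar)  # consumes the whole stream
    try:
        tar.extractfile(members[0]).read()
    except tarfile.StreamError as exc:
        stream_error = exc

print(type(read_error).__name__, type(stream_error).__name__)
```

Both exceptions derive from :exc:`TarError`, so a single ``except
tarfile.TarError`` clause is enough when the distinction does not matter.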


The following constants are available at the module level:

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows, the value returned by
   :func:`sys.getfilesystemencoding` otherwise.


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`GNU_FORMAT`.


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   :ref:`archiving-operations`
      Documentation of the higher-level archiving facilities provided by the
      standard :mod:`shutil` module.

   `GNU tar manual, Basic Tar Format <https://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.

A :class:`TarFile` object can be used as a context manager in a :keyword:`with`
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
:ref:`tar-examples` section for a use case.

.. versionadded:: 3.2
   Added support for the context management protocol.

.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=0)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. *name* may be a :term:`path-like object`.
   It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file, ``'w'`` to create a new file overwriting an existing
   one, or ``'x'`` to create a new file only if it does not already exist.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level.

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled.  If ``1``, all *fatal* errors are raised as :exc:`OSError`
   exceptions. If ``2``, all *non-fatal* errors are raised as :exc:`TarError`
   exceptions as well.

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.
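
The keyword arguments described above can also be passed through
:func:`tarfile.open`. The following sketch, which assumes an in-memory
:class:`io.BytesIO` buffer and an invented ``comment`` header, requests the
pax format explicitly, attaches a global header, and reads it back. It also
shows that the buffer passed as *fileobj* is still open after the
:class:`TarFile` has been closed.

```python
import io
import tarfile

buf = io.BytesIO()

# Request the pax format explicitly and attach a global header.
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                  pax_headers={"comment": "nightly backup"}) as tar:
    info = tarfile.TarInfo("data.bin")
    info.size = 4
    tar.addfile(info, io.BytesIO(b"\x00\x01\x02\x03"))

# The TarFile is closed here, but buf is not: it can be rewound and reused.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    comment = tar.pax_headers.get("comment")

print(comment)
```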


.. classmethod:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True, *, members=None)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced. If optional *members* is
   given, it must be a subset of the list returned by :meth:`getmembers`.

   .. versionchanged:: 3.5
      Added the *members* parameter.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if no more
   members are available.
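
A short sketch of the lookup methods above, run against an in-memory archive
with two placeholder members (``a.txt`` and ``b.txt``):

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, payload in (("a.txt", b"first"), ("b.txt", b"second!")):
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    names = tar.getnames()                      # same order as the archive
    sizes = [m.size for m in tar.getmembers()]  # one TarInfo per member
    b_size = tar.getmember("b.txt").size
    try:
        tar.getmember("missing.txt")
    except KeyError:
        missing = True
    else:
        missing = False

print(names, sizes, b_size, missing)
```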


.. method:: TarFile.extractall(path=".", members=None, *, numeric_owner=False)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.
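
One way to perform the "prior inspection" mentioned in the warning is to pass
a filtered sequence as *members*. The helper ``safe_members`` below is a
hypothetical function written for this example and is not part of the
:mod:`tarfile` module. It is only a sketch: it rejects members whose resolved
path leaves the destination directory and, to stay conservative, skips all
links. It should not be treated as a complete defence against hostile
archives.

```python
import io
import os
import tarfile
import tempfile


def safe_members(tar, dest):
    """Yield only members whose extraction path stays inside *dest*."""
    root = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(root, member.name))
        if os.path.commonpath([root, target]) != root:
            continue  # absolute name, or '..' escaping the destination
        if member.issym() or member.islnk():
            continue  # links may point outside dest; skipped in this sketch
        yield member


# Build an in-memory archive with one harmless and two hostile names.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("ok.txt", "../escape.txt", "/abs.txt"):
        info = tarfile.TarInfo(name)
        info.size = 2
        tar.addfile(info, io.BytesIO(b"hi"))
buf.seek(0)

with tempfile.TemporaryDirectory() as dest, tarfile.open(fileobj=buf) as tar:
    kept = [m.name for m in safe_members(tar, dest)]
    tar.extractall(path=dest, members=safe_members(tar, dest))
    extracted = sorted(os.listdir(dest))

print(kept, extracted)
```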


.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False)

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*. *path* may be a :term:`path-like object`.
   File attributes (owner, mtime, mode) are set unless *set_attrs* is false.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.

   .. versionchanged:: 3.2
      Added the *set_attrs* parameter.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be a filename
   or a :class:`TarInfo` object. If *member* is a regular file or a link, an
   :class:`io.BufferedReader` object is returned. Otherwise, :const:`None` is
   returned.

   .. versionchanged:: 3.3
      Return an :class:`io.BufferedReader` object.
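
:meth:`TarFile.extractfile` is convenient when a member should be read without
writing anything to disk. A minimal sketch, assuming an in-memory archive with
a placeholder file ``menu.txt`` and a placeholder directory ``empty-dir``:

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"spam and eggs\n"
    info = tarfile.TarInfo("menu.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

    folder = tarfile.TarInfo("empty-dir")
    folder.type = tarfile.DIRTYPE
    tar.addfile(folder)

buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    # A regular file yields a readable binary file object.
    with tar.extractfile("menu.txt") as member_file:
        text = member_file.read().decode("ascii")
    # A directory is neither a regular file nor a link, so None is returned.
    directory_result = tar.extractfile("empty-dir")

print(repr(text), directory_result)
```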


.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)

   Add the file *name* to the archive. *name* may be any type of file
   (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an
   alternative name for the file in the archive. Directories are added
   recursively by default. This can be avoided by setting *recursive* to
   :const:`False`. Recursion adds entries in sorted order.
   If *filter* is given, it
   should be a function that takes a :class:`TarInfo` object argument and
   returns the changed :class:`TarInfo` object. If it instead returns
   :const:`None` the :class:`TarInfo` object will be excluded from the
   archive. See :ref:`tar-examples` for an example.

   .. versionchanged:: 3.2
      Added the *filter* parameter.

   .. versionchanged:: 3.7
      Recursion adds entries in sorted order.
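
To illustrate the exclusion behaviour of *filter*, the following sketch adds a
directory recursively while leaving out ``.pyc`` files. The function
``skip_bytecode`` and the ``project`` directory layout are invented for this
example.

```python
import os
import tarfile
import tempfile


def skip_bytecode(tarinfo):
    """Return None for .pyc files so that add() leaves them out."""
    if tarinfo.name.endswith(".pyc"):
        return None
    return tarinfo


with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "project")
    os.mkdir(project)
    for filename in ("main.py", "main.pyc"):
        with open(os.path.join(project, filename), "w") as f:
            f.write("# placeholder\n")

    archive = os.path.join(tmp, "project.tar")
    with tarfile.open(archive, "w") as tar:
        # arcname replaces the temporary path with a stable name.
        tar.add(project, arcname="project", filter=skip_bytecode)

    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(names)
```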


.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   it should be a :term:`binary file`, and
   ``tarinfo.size`` bytes are read from it and added to the archive.  You can
   create :class:`TarInfo` objects directly, or by using :meth:`gettarinfo`.
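
Because :meth:`addfile` only needs a :class:`TarInfo` and a binary file object,
it can store data that never existed on disk. The sketch below creates the
:class:`TarInfo` directly; the member name ``report.txt`` and the payload are
assumptions made for illustration.

```python
import io
import tarfile
import time

payload = "generated at runtime\n".encode("utf-8")

info = tarfile.TarInfo(name="report.txt")
info.size = len(payload)       # addfile() reads exactly this many bytes
info.mtime = int(time.time())  # a new TarInfo otherwise has mtime 0
info.mode = 0o644

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    stored = tar.getmember("report.txt")
    content = tar.extractfile(stored).read()

print(stored.size, oct(stored.mode), content)
```

Setting ``size`` is the step that is easiest to forget: a freshly created
:class:`TarInfo` has a size of 0, in which case no data would be read from
*fileobj*.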


.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object from the result of :func:`os.stat` or
   equivalent on an existing file.  The file is either named by *name*, or
   specified as a :term:`file object` *fileobj* with a file descriptor.
   *name* may be a :term:`path-like object`.  If
   given, *arcname* specifies an alternative name for the file in the
   archive, otherwise, the name is taken from *fileobj*'s
   :attr:`~io.FileIO.name` attribute, or the *name* argument.  The name
   should be a text string.

   You can modify
   some of the :class:`TarInfo`'s attributes before you add it using :meth:`addfile`.
   If the file object is not an ordinary file object positioned at the
   beginning of the file, attributes such as :attr:`~TarInfo.size` may need
   modifying.  This is the case for objects such as :class:`~gzip.GzipFile`.
   The :attr:`~TarInfo.name` may also be modified, in which case *arcname*
   could be a dummy string.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.



.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.


.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. classmethod:: TarInfo.frombuf(buf, encoding, errors)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   Raises :exc:`HeaderError` if the buffer is invalid.


.. classmethod:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.
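
The following sketch round-trips a header through :meth:`TarInfo.tobuf` and
:meth:`TarInfo.frombuf`. It assumes the ustar format and a short ASCII name
(``hello.txt``, chosen for this example), for which the header is a single
512-byte block; in Python 3 the buffer is a :class:`bytes` object.

```python
import tarfile

info = tarfile.TarInfo("hello.txt")
info.size = 5

# Serialise the header; for ustar and a short name this is one block.
header = info.tobuf(format=tarfile.USTAR_FORMAT)

# Parse it back. frombuf() needs the encoding and error handler explicitly.
parsed = tarfile.TarInfo.frombuf(header, tarfile.ENCODING, "surrogateescape")

# A block that is not a valid header raises HeaderError.
try:
    tarfile.TarInfo.frombuf(b"\x01" * 512, tarfile.ENCODING, "surrogateescape")
except tarfile.HeaderError:
    rejected = True
else:
    rejected = False

print(len(header), parsed.name, parsed.size, rejected)
```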


A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name

   Name of the archive member.


.. attribute:: TarInfo.size

   Size in bytes.


.. attribute:: TarInfo.mtime

   Time of last modification.


.. attribute:: TarInfo.mode

   Permission bits.


.. attribute:: TarInfo.type

   File type.  *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`.  To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is*()`` methods below.


.. attribute:: TarInfo.linkname

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid

   User ID of the user who originally stored this member.


.. attribute:: TarInfo.gid

   Group ID of the user who originally stored this member.


.. attribute:: TarInfo.uname

   User name.


.. attribute:: TarInfo.gname

   Group name.


.. attribute:: TarInfo.pax_headers

   A dictionary containing key-value pairs of an associated pax extended header.


A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.


.. _tarfile-commandline:
.. program:: tarfile

Command-Line Interface
----------------------

.. versionadded:: 3.4

The :mod:`tarfile` module provides a simple command-line interface to interact
with tar archives.

If you want to create a new tar archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar life-of-brian_1979/

If you want to extract a tar archive into the current directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar

You can also extract a tar archive into a different directory by passing the
directory's name:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar other-dir/

For a list of the files in a tar archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m tarfile -l monty.tar


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <tarfile>
               --list <tarfile>

   List files in a tarfile.

.. cmdoption:: -c <tarfile> <source1> ... <sourceN>
               --create <tarfile> <source1> ... <sourceN>

   Create tarfile from source files.

.. cmdoption:: -e <tarfile> [<output_dir>]
               --extract <tarfile> [<output_dir>]

   Extract tarfile into the current directory if *output_dir* is not specified.

.. cmdoption:: -t <tarfile>
               --test <tarfile>

   Test whether the tarfile is valid or not.

.. cmdoption:: -v, --verbose

   Verbose output.
    734 
    735 .. _tar-examples:
    736 
    737 Examples
    738 --------
    739 
    740 How to extract an entire tar archive to the current working directory::
    741 
    742    import tarfile
    743    tar = tarfile.open("sample.tar.gz")
    744    tar.extractall()
    745    tar.close()
    746 
    747 How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
    748 a generator function instead of a list::
    749 
    750    import os
    751    import tarfile
    752 
    753    def py_files(members):
    754        for tarinfo in members:
    755            if os.path.splitext(tarinfo.name)[1] == ".py":
    756                yield tarinfo
    757 
    758    tar = tarfile.open("sample.tar.gz")
    759    tar.extractall(members=py_files(tar))
    760    tar.close()
    761 
    762 How to create an uncompressed tar archive from a list of filenames::
    763 
    764    import tarfile
    765    tar = tarfile.open("sample.tar", "w")
    766    for name in ["foo", "bar", "quux"]:
    767        tar.add(name)
    768    tar.close()
    769 
    770 The same example using the :keyword:`with` statement::
    771 
    772     import tarfile
    773     with tarfile.open("sample.tar", "w") as tar:
    774         for name in ["foo", "bar", "quux"]:
    775             tar.add(name)
    776 
    777 How to read a gzip compressed tar archive and display some member information::
    778 
    779    import tarfile
    780    tar = tarfile.open("sample.tar.gz", "r:gz")
    781    for tarinfo in tar:
    782        print(tarinfo.name, "is", tarinfo.size, "bytes in size and is", end=" ")
    783        if tarinfo.isreg():
    784            print("a regular file.")
    785        elif tarinfo.isdir():
    786            print("a directory.")
    787        else:
    788            print("something else.")
    789    tar.close()
    790 
    791 How to create an archive and reset the user information using the *filter*
    792 parameter in :meth:`TarFile.add`::
    793 
    794     import tarfile
    795     def reset(tarinfo):
    796         tarinfo.uid = tarinfo.gid = 0
    797         tarinfo.uname = tarinfo.gname = "root"
    798         return tarinfo
    799     tar = tarfile.open("sample.tar.gz", "w:gz")
    800     tar.add("foo", filter=reset)
    801     tar.close()
    802 
    803 
    804 .. _tar-formats:
    805 
    806 Supported tar formats
    807 ---------------------
    808 
    809 There are three tar formats that can be created with the :mod:`tarfile` module:
    810 
    811 * The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
    812   of up to 256 characters at best (depending on where the path can be split)
    813   and linknames of up to 100 characters. The maximum file size is 8 GiB. This
    814   is an old and limited but widely supported format.
    815 
    816 * The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
    817   linknames, files bigger than 8 GiB and sparse files. It is the de facto
    818   standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
    819   extensions for long names; sparse file support is read-only.
    820 
    821 * The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
    822   format with virtually no limits. It supports long filenames and linknames, large
    823   files and stores pathnames in a portable way. However, not all tar
    824   implementations today are able to handle pax archives properly.
    825 
    826   The *pax* format is an extension to the existing *ustar* format. It uses extra
    827   headers for information that cannot be stored otherwise. There are two flavours
    828   of pax headers: extended headers only affect the subsequent file header, whereas
    829   global headers are valid for the complete archive and affect all following files.
    830   All the data in a pax header is encoded in *UTF-8* for portability reasons.
    831 
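The practical difference between the three writable formats can be seen with a
name that is too long for a ustar header. The following is a minimal sketch
that works on an in-memory buffer; the 300-character name and the helper
``try_store`` are made up for illustration::

   import io
   import tarfile

   # Illustrative name: 300 characters and no "/", so it cannot be split
   # into a ustar prefix and name.
   long_name = "x" * 300

   def try_store(fmt):
       """Return the names read back, or None if the format rejects the name."""
       buf = io.BytesIO()
       try:
           with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
               tar.addfile(tarfile.TarInfo(long_name))  # empty regular file
       except ValueError:
           return None
       buf.seek(0)
       with tarfile.open(fileobj=buf, mode="r:") as tar:
           return tar.getnames()

   ustar_result = try_store(tarfile.USTAR_FORMAT)
   gnu_result = try_store(tarfile.GNU_FORMAT)
   pax_result = try_store(tarfile.PAX_FORMAT)

Under these assumptions ``ustar_result`` is ``None`` because the name does not
fit, while the GNU and pax archives return the full name when read back.
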
    832 There are some more variants of the tar format which can be read, but not
    833 created:
    834 
    835 * The ancient V7 format. This is the first tar format from Unix Seventh Edition,
    836   storing only regular files and directories. Names must not be longer than 100
    837   characters, and there is no user/group name information. Some archives have
    838   miscalculated header checksums in case of fields with non-ASCII characters.
    839 
    840 * The SunOS tar extended format. This format is a variant of the POSIX.1-2001
    841   pax format, but is not compatible.
    842 
    843 .. _tar-unicode:
    844 
    845 Unicode issues
    846 --------------
    847 
    848 The tar format was originally conceived to make backups on tape drives with the
    849 main focus on preserving file system information. Nowadays tar archives are
    850 commonly used for file distribution and exchanging archives over networks. One
    851 problem of the original format (which is the basis of all other formats) is
    852 that there is no concept of supporting different character encodings. For
    853 example, an ordinary tar archive created on a *UTF-8* system cannot be read
    854 correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
    855 metadata (like filenames, linknames, user/group names) will appear damaged.
    856 Unfortunately, there is no way to autodetect the encoding of an archive. The
    857 pax format was designed to solve this problem. It stores non-ASCII metadata
    858 using the universal character encoding *UTF-8*.
    859 
    860 The details of character conversion in :mod:`tarfile` are controlled by the
    861 *encoding* and *errors* keyword arguments of the :class:`TarFile` class.
    862 
    863 *encoding* defines the character encoding to use for the metadata in the
    864 archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
    865 as a fallback. Depending on whether the archive is read or written, the
    866 metadata must be either decoded or encoded. If *encoding* is not set
    867 appropriately, this conversion may fail.
    868 
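The effect of a mismatched *encoding* can be reproduced in memory. This sketch
writes a GNU format archive with UTF-8 metadata and reads it back twice; the
member name and the helper ``names_with_encoding`` are illustrative only::

   import io
   import tarfile

   buf = io.BytesIO()
   with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                     encoding="utf-8") as tar:
       tar.addfile(tarfile.TarInfo("caf\u00e9.txt"))  # illustrative name

   def names_with_encoding(encoding):
       # Re-read the same bytes, decoding the metadata with *encoding*.
       buf.seek(0)
       with tarfile.open(fileobj=buf, mode="r:", encoding=encoding) as tar:
           return tar.getnames()

   matching = names_with_encoding("utf-8")
   mismatched = names_with_encoding("latin-1")

Here ``matching`` contains the original name, whereas ``mismatched`` contains
``'caf\xc3\xa9.txt'``, because each byte of the UTF-8 sequence is decoded as a
separate Latin-1 character.
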
    869 The *errors* argument defines how characters that cannot be converted are
    870 treated. Possible values are listed in section :ref:`error-handlers`.
    871 The default scheme is ``'surrogateescape'``, which Python also uses for its
    872 file system calls; see :ref:`os-filenames`.
    873 
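The following sketch compares two *errors* schemes when a name cannot be
decoded. It assumes an archive whose metadata was written in Latin-1 and is
read as UTF-8; the member name and the helper ``names_with_errors`` are
illustrative only::

   import io
   import tarfile

   buf = io.BytesIO()
   with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                     encoding="latin-1") as tar:
       tar.addfile(tarfile.TarInfo("caf\u00e9.txt"))  # stored as b"caf\xe9.txt"

   def names_with_errors(errors):
       # Decode the Latin-1 metadata as UTF-8 using the given error scheme.
       buf.seek(0)
       with tarfile.open(fileobj=buf, mode="r:", encoding="utf-8",
                         errors=errors) as tar:
           return tar.getnames()

   escaped = names_with_errors("surrogateescape")
   replaced = names_with_errors("replace")

   # With 'surrogateescape' the undecodable byte survives as a lone surrogate,
   # so the original bytes can be recovered.
   original_bytes = escaped[0].encode("utf-8", "surrogateescape")

With ``'replace'`` the undecodable byte becomes U+FFFD and the original value
is lost, while ``original_bytes`` equals ``b'caf\xe9.txt'``.
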
    874 In case of :const:`PAX_FORMAT` archives, *encoding* is generally not needed
    875 because all the metadata is stored using *UTF-8*. *encoding* is only used in
    876 the rare cases when binary pax headers are decoded or when strings with
    877 surrogate characters are stored.
    878
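As a minimal sketch of this behaviour, a non-ASCII name written to a
:const:`PAX_FORMAT` archive can be read back with different *encoding* values.
The member name and the set of encodings are arbitrary choices for
illustration::

   import io
   import tarfile

   buf = io.BytesIO()
   with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
       tar.addfile(tarfile.TarInfo("caf\u00e9.txt"))  # illustrative name

   results = {}
   for encoding in ("utf-8", "latin-1", "ascii"):
       buf.seek(0)
       with tarfile.open(fileobj=buf, mode="r:", encoding=encoding) as tar:
           results[encoding] = tar.getnames()

Each entry in ``results`` holds the original name, because the name is taken
from the UTF-8 encoded pax header rather than from the fixed-width header
field.
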