:mod:`multifile` --- Support for files containing distinct parts
================================================================

.. module:: multifile
   :synopsis: Support for reading files which contain distinct parts, such as some MIME data.
   :deprecated:
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>


.. deprecated:: 2.5
   The :mod:`email` package should be used in preference to the :mod:`multifile`
   module. This module is present only to maintain backward compatibility.

The :class:`MultiFile` object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by :meth:`readline` when a
given delimiter pattern is encountered.  The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding its methods it can easily be adapted for more general use.


.. class:: MultiFile(fp[, seekable])

   Create a multi-file.  You must instantiate this class with an input object
   argument for the :class:`MultiFile` instance to get lines from, such as a file
   object returned by :func:`open`.

   :class:`MultiFile` only ever looks at the input object's :meth:`readline`,
   :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you
   want random access to the individual MIME parts. To use :class:`MultiFile` on a
   non-seekable stream object, set the optional *seekable* argument to false; this
   prevents :class:`MultiFile` from using the input object's :meth:`seek` and
   :meth:`tell` methods.

In :class:`MultiFile`'s view of the world, text is composed of three kinds of
lines: data, section-dividers, and end-markers.  :class:`MultiFile` is designed
to support parsing of messages that may have multiple nested message parts, each
with its own pattern for section-divider and end-marker lines.

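The three kinds of lines can be illustrated without the module itself. The
following is a minimal sketch, assuming the default MIME decorations described
later on this page; ``classify_line`` is a hypothetical helper written for this
illustration and is not part of :mod:`multifile`.

```python
def classify_line(line, boundary):
    """Classify one input line relative to a single boundary string.

    Hypothetical helper, not part of multifile.  It assumes the default MIME
    decorations: '--' + boundary for a section-divider and
    '--' + boundary + '--' for an end-marker.
    """
    # Trailing whitespace (including LF or CR-LF) is ignored when comparing.
    marker = line.rstrip()
    if marker == "--" + boundary:
        return "section-divider"
    if marker == "--" + boundary + "--":
        return "end-marker"
    return "data"


message = [
    "--XYZ\n",
    "first part\n",
    "--XYZ\r\n",
    "second part\n",
    "--XYZ--\n",
]
kinds = [classify_line(line, "XYZ") for line in message]
```

Here ``kinds`` holds one label per input line. A line that starts with ``'--'``
but does not match the boundary is still classified as data.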

.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`multifile` module.


.. _multifile-objects:

MultiFile Objects
-----------------

A :class:`MultiFile` instance has the following methods:


.. method:: MultiFile.readline()

   Read a line.  If the line is data (not a section-divider, an end-marker, or
   real EOF), return it.  If the line matches the most-recently-stacked boundary,
   return ``''`` and set ``self.last`` to 1 if the match is an end-marker and to 0
   otherwise.  If the line matches any other stacked boundary, raise an error.  On
   encountering end-of-file on the underlying stream object, the method raises
   :exc:`Error` unless all boundaries have been popped.


.. method:: MultiFile.readlines()

   Return all lines remaining in this part as a list of strings.


.. method:: MultiFile.read()

   Read all lines, up to the next section.  Return them as a single (multiline)
   string.  Note that this doesn't take a size argument!


.. method:: MultiFile.seek(pos[, whence])

   Seek.  Seek indices are relative to the start of the current section. The *pos*
   and *whence* arguments are interpreted as for a file seek.


.. method:: MultiFile.tell()

   Return the file position relative to the start of the current section.


.. method:: MultiFile.next()

   Skip lines to the next section (that is, read lines until a section-divider or
   end-marker has been consumed).  Return true if there is such a section, false if
   an end-marker is seen.  Re-enable the most-recently-pushed boundary.


.. method:: MultiFile.is_data(str)

   Return true if *str* is data and false if it might be a section boundary.  As
   written, it tests for a prefix other than ``'--'`` at start of line (which
   all MIME boundaries have) but it is declared so it can be overridden in derived
   classes.

   Note that this test is intended as a fast guard for the real boundary
   tests; if it always returns false it will merely slow processing, not cause it
   to fail.

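To see why an overly cautious guard only costs time, consider the sketch
below. It is an illustration under stated assumptions, not the module's code:
``is_boundary`` and ``never_data`` are hypothetical names, and the exact
comparison assumes the default ``'--'`` decorations.

```python
def is_data(line):
    # Default guard described above: anything not starting with '--' is data.
    return line[:2] != "--"


def never_data(line):
    # A guard that always returns false: every line gets the full test.
    return False


def is_boundary(line, boundary, guard):
    """Hypothetical helper: run the cheap guard, then the exact comparison."""
    if guard(line):
        return False  # fast path, no boundary comparison needed
    marker = line.rstrip()
    return marker in ("--" + boundary, "--" + boundary + "--")


lines = ["plain text\n", "--XYZ\n", "-- not a boundary\n", "--XYZ--\n"]
fast = [is_boundary(line, "XYZ", is_data) for line in lines]
slow = [is_boundary(line, "XYZ", never_data) for line in lines]
```

Both guards give the same answers; ``never_data`` simply sends every line
through the exact comparison. A guard that wrongly returned true for a real
boundary line would be a different matter, since that line would then be
treated as data.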

.. method:: MultiFile.push(str)

   Push a boundary string.  When a decorated version of this boundary is found as
   an input line, it will be interpreted as a section-divider or end-marker
   (depending on the decoration, see :rfc:`2046`).  Once such a line has been
   found, all subsequent reads will return the empty string to indicate
   end-of-file, until a call to :meth:`pop` removes the boundary or a
   :meth:`.next` call re-enables it.

   It is possible to push more than one boundary.  Encountering the
   most-recently-pushed boundary will return EOF; encountering any other
   boundary will raise an error.

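The rule for several pushed boundaries can be modelled with a plain list. This
is a rough sketch of the documented behaviour rather than the module's
implementation; ``check_line`` and ``BoundaryError`` are hypothetical names,
and the default ``'--'`` decorations are assumed.

```python
class BoundaryError(Exception):
    """Stand-in for multifile.Error in this sketch."""


def check_line(line, stack):
    """Return 'data', 'divider' or 'end' for line, given a boundary stack.

    Hypothetical model of the rule above; stack[-1] is the most recently
    pushed boundary.
    """
    marker = line.rstrip()
    for depth, boundary in enumerate(reversed(stack)):
        if marker == "--" + boundary:
            kind = "divider"
        elif marker == "--" + boundary + "--":
            kind = "end"
        else:
            continue
        if depth > 0:
            # An enclosing boundary turned up before the inner part ended.
            raise BoundaryError("missing end-marker for %r" % stack[-1])
        return kind
    return "data"


stack = []
stack.append("outer")   # plays the role of push("outer")
stack.append("inner")   # plays the role of push("inner")

inner_kind = check_line("--inner--\n", stack)   # innermost boundary: EOF
try:
    check_line("--outer\n", stack)              # enclosing boundary: error
    outer_raised = False
except BoundaryError:
    outer_raised = True

stack.pop()                                     # plays the role of pop()
after_pop = check_line("--inner\n", stack)      # no longer a boundary
outer_kind = check_line("--outer\n", stack)     # now the innermost boundary
```

After the inner boundary has been popped, a line carrying it is ordinary data
again, and the outer boundary becomes the one that signals end-of-file.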

.. method:: MultiFile.pop()

   Pop a section boundary.  This boundary will no longer be interpreted as EOF.


.. method:: MultiFile.section_divider(str)

   Turn a boundary into a section-divider line.  By default, this method
   prepends ``'--'`` (which MIME section boundaries have) but it is declared so
   it can be overridden in derived classes.  This method need not append LF or
   CR-LF, as comparison with the result ignores trailing whitespace.


.. method:: MultiFile.end_marker(str)

   Turn a boundary string into an end-marker line.  By default, this method
   prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
   marker) but it is declared so it can be overridden in derived classes.  This
   method need not append LF or CR-LF, as comparison with the result ignores
   trailing whitespace.

Finally, :class:`MultiFile` instances have two public instance variables:


.. attribute:: MultiFile.level

   Nesting depth of the current part.


.. attribute:: MultiFile.last

   True if the last end-of-file was for an end-of-message marker.


.. _multifile-example:

:class:`MultiFile` Example
--------------------------

.. sectionauthor:: Skip Montanaro <skip@pobox.com>


::

   import mimetools
   import multifile
   import StringIO

   def extract_mime_part_matching(stream, mimetype):
       """Return the first element in a multipart MIME message on stream
       matching mimetype."""

       # Parse the top-level headers; stream is left positioned at the body.
       msg = mimetools.Message(stream)
       msgtype = msg.gettype()

       data = StringIO.StringIO()
       if msgtype[:10] == "multipart/":

           file = multifile.MultiFile(stream)
           # Treat the message's MIME boundary as the end of each part.
           file.push(msg.getparam("boundary"))
           # next() skips to the start of the following part, if there is one.
           while file.next():
               # Read the part's own headers, then decode its body into data.
               submsg = mimetools.Message(file)
               try:
                   data = StringIO.StringIO()
                   mimetools.decode(file, data, submsg.getencoding())
               except ValueError:
                   # Unknown transfer encoding: skip this part.
                   continue
               if submsg.gettype() == mimetype:
                   break
           file.pop()
       # Note: if no part matched, data still holds whatever was decoded last.
       return data.getvalue()

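Since the :mod:`email` package supersedes this module, the same task might be
written against it roughly as follows. This is a sketch, not a drop-in
replacement: ``extract_part_matching`` and the sample message are made up for
illustration, the function takes the message text rather than a stream, and it
returns ``None`` when no part matches.

```python
import email


def extract_part_matching(text, mimetype):
    """Return the decoded payload of the first part whose type is mimetype.

    Hypothetical counterpart to the example above; returns None when no
    part matches.
    """
    msg = email.message_from_string(text)
    for part in msg.walk():
        # walk() also yields the multipart containers themselves; skip them.
        if part.is_multipart():
            continue
        if part.get_content_type() == mimetype:
            # decode=True undoes base64 or quoted-printable encoding.
            return part.get_payload(decode=True)
    return None


sample = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "hello",
    "--XYZ",
    "Content-Type: text/html",
    "",
    "<p>hello</p>",
    "--XYZ--",
    "",
])
html = extract_part_matching(sample, "text/html")
```

Boundary handling and nesting are done by the parser, so no explicit
``push``/``pop`` calls are needed. On Python 3 the decoded payload is a
``bytes`` object.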