
:mod:`multifile` --- Support for files containing distinct parts
================================================================

.. module:: multifile
   :synopsis: Support for reading files which contain distinct parts, such as some MIME data.
   :deprecated:
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>


.. deprecated:: 2.5
   The :mod:`email` package should be used in preference to the :mod:`multifile`
   module. This module is present only to maintain backward compatibility.

The :class:`MultiFile` object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by :meth:`readline` when a
given delimiter pattern is encountered. The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding methods it can be easily adapted for more general use.


.. class:: MultiFile(fp[, seekable])

   Create a multi-file. You must instantiate this class with an input object
   argument for the :class:`MultiFile` instance to get lines from, such as a file
   object returned by :func:`open`.

   :class:`MultiFile` only ever looks at the input object's :meth:`readline`,
   :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you
   want random access to the individual MIME parts. To use :class:`MultiFile` on a
   non-seekable stream object, set the optional *seekable* argument to false; this
   will prevent using the input object's :meth:`seek` and :meth:`tell` methods.

In :class:`MultiFile`'s view of the world, text is composed of three kinds of
lines: data, section-dividers, and end-markers. :class:`MultiFile` is designed
to support parsing of messages that may have multiple nested message parts, each
with its own pattern for section-divider and end-marker lines.


.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`multifile` module.


.. _multifile-objects:

MultiFile Objects
-----------------

A :class:`MultiFile` instance has the following methods:


.. method:: MultiFile.readline()

   Read a line. If the line is data (not a section-divider, end-marker, or real
   EOF), return it. If the line matches the most-recently-stacked boundary, return
   ``''`` and set ``self.last`` to 1 if the match is an end-marker and to 0
   otherwise. If the line matches any other stacked boundary, raise :exc:`Error`.
   On encountering end-of-file on the underlying stream object, the method raises
   :exc:`Error` unless all boundaries have been popped.


.. method:: MultiFile.readlines()

   Return all lines remaining in this part as a list of strings.


.. method:: MultiFile.read()

   Read all lines, up to the next section. Return them as a single (multiline)
   string. Note that this method does not take a size argument.


.. method:: MultiFile.seek(pos[, whence])

   Seek. Seek indices are relative to the start of the current section. The *pos*
   and *whence* arguments are interpreted as for a file seek.


.. method:: MultiFile.tell()

   Return the file position relative to the start of the current section.


.. method:: MultiFile.next()

   Skip lines to the next section (that is, read lines until a section-divider or
   end-marker has been consumed). Return true if there is such a section, false if
   an end-marker is seen. Re-enable the most-recently-pushed boundary.


.. method:: MultiFile.is_data(str)

   Return true if *str* is data and false if it might be a section boundary. As
   written, it tests for a prefix other than ``'--'`` at the start of the line
   (which all MIME boundaries have), but it is declared so it can be overridden in
   derived classes.

   Note that this test is intended as a fast guard for the real boundary
   tests; if it always returns false it will merely slow processing, not cause it
   to fail.


.. method:: MultiFile.push(str)

   Push a boundary string. When a decorated version of this boundary is found as
   an input line, it will be interpreted as a section-divider or end-marker
   (depending on the decoration, see :rfc:`2045`). All subsequent reads will
   return the empty string to indicate end-of-file, until a call to :meth:`pop`
   removes the boundary or a :meth:`.next` call re-enables it.

   It is possible to push more than one boundary. Encountering the
   most-recently-pushed boundary will return EOF; encountering any other
   boundary will raise an error.


.. method:: MultiFile.pop()

   Pop a section boundary. This boundary will no longer be interpreted as EOF.


.. method:: MultiFile.section_divider(str)

   Turn a boundary into a section-divider line. By default, this method
   prepends ``'--'`` (which MIME section boundaries have), but it is declared so
   it can be overridden in derived classes. This method need not append LF or
   CR-LF, as comparison with the result ignores trailing whitespace.


.. method:: MultiFile.end_marker(str)

   Turn a boundary string into an end-marker line. By default, this method
   prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
   marker), but it is declared so it can be overridden in derived classes. This
   method need not append LF or CR-LF, as comparison with the result ignores
   trailing whitespace.

Finally, :class:`MultiFile` instances have two public instance variables:


.. attribute:: MultiFile.level

   Nesting depth of the current part.


.. attribute:: MultiFile.last

   True if the last end-of-file was for an end-of-message marker.


.. _multifile-example:

:class:`MultiFile` Example
--------------------------

.. sectionauthor:: Skip Montanaro <skip@pobox.com>


::

   import mimetools
   import multifile
   import StringIO

   def extract_mime_part_matching(stream, mimetype):
       """Return the first element in a multipart MIME message on stream
       matching mimetype."""

       # Parse the top-level headers; the stream is left at the start of the body.
       msg = mimetools.Message(stream)
       msgtype = msg.gettype()

       data = StringIO.StringIO()
       if msgtype[:10] == "multipart/":

           # Treat each part between boundary lines as its own file-like section.
           file = multifile.MultiFile(stream)
           file.push(msg.getparam("boundary"))
           while file.next():
               # Parse the headers of this part, then decode its body.
               submsg = mimetools.Message(file)
               try:
                   data = StringIO.StringIO()
                   mimetools.decode(file, data, submsg.getencoding())
               except ValueError:
                   # Unknown transfer encoding: skip this part.
                   continue
               if submsg.gettype() == mimetype:
                   break
           # Note: if no part matches, data still holds the last part decoded.
           file.pop()
       return data.getvalue()

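
Because this module is deprecated in favor of the :mod:`email` package, the
same extraction can also be sketched with :mod:`email`. The following is a
minimal sketch for Python 3, not a drop-in replacement: it takes the raw
message as bytes rather than a stream, it returns ``None`` when no part
matches (unlike the example above), and the ``SAMPLE`` message is a made-up
illustration.

```python
import email


def extract_mime_part_matching(raw_bytes, mimetype):
    """Return the decoded payload (bytes) of the first non-multipart part
    whose content type equals mimetype, or None if there is none."""
    msg = email.message_from_bytes(raw_bytes)
    if not msg.is_multipart():
        return None
    # walk() visits the message and all nested parts, depth first.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        if part.get_content_type() == mimetype:
            # decode=True undoes base64 or quoted-printable transfer encoding.
            return part.get_payload(decode=True)
    return None


# Hypothetical two-part message used only for illustration.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="frontier"\r\n'
    b"\r\n"
    b"--frontier\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"plain body\r\n"
    b"--frontier\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>html body</p>\r\n"
    b"--frontier--\r\n"
)

html = extract_mime_part_matching(SAMPLE, "text/html")
missing = extract_mime_part_matching(SAMPLE, "image/png")
```

Here the parser handles the boundary bookkeeping that :meth:`push`,
:meth:`.next` and :meth:`pop` expose explicitly in :class:`MultiFile`.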