
:mod:`rfc822` --- Parse RFC 2822 mail headers
=============================================

.. module:: rfc822
   :synopsis: Parse 2822 style mail messages.
   :deprecated:


.. deprecated:: 2.3
   The :mod:`email` package should be used in preference to the :mod:`rfc822`
   module. This module is present only to maintain backward compatibility, and
   has been removed in Python 3.

This module defines a class, :class:`Message`, which represents an "email
message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
consist of a collection of message headers, and a message body. This module
also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
addresses. Please refer to the RFC for information on the specific syntax of
:rfc:`2822` messages.

.. index:: module: mailbox

The :mod:`mailbox` module provides classes to read mailboxes produced by
various end-user mail programs.


.. class:: Message(file[, seekable])

   A :class:`Message` instance is instantiated with an input object as parameter.
   Message relies only on the input object having a :meth:`readline` method; in
   particular, ordinary file objects qualify. Instantiation reads headers from the
   input object up to a delimiter line (normally a blank line) and stores them in
   the instance. The message body, following the headers, is not consumed.

   This class can work with any input object that supports a :meth:`readline`
   method. If the input object has seek and tell capability, the
   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
   onto the input stream. If the input object lacks seek but has an :meth:`unread`
   method that can push back a line of input, :class:`Message` will use that to
   push back illegal lines. Thus this class can be used to parse messages coming
   from a buffered stream.

   The optional *seekable* argument is provided as a workaround for certain stdio
   libraries in which :c:func:`tell` discards buffered data before discovering that
   the :c:func:`lseek` system call doesn't work. For maximum portability, you
   should set the seekable argument to zero to prevent that initial :meth:`tell`
   when passing in an unseekable object such as a file object created from a socket
   object.

   Input lines as read from the file may either be terminated by CR-LF or by a
   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
   line is stored.

   All header matching is done independent of upper or lower case; e.g.
   ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.


.. class:: AddressList(field)

   You may instantiate the :class:`AddressList` helper class using a single string
   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
   parameter ``None`` yields an empty list.)


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* ends and
   begins with double quotes, they are stripped off. Likewise if *str* ends and
   begins with angle brackets, they are stripped off.


.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
   "email address" parts. Returns a 2-tuple of that information, unless the parse
   fails, in which case a 2-tuple ``(None, None)`` is returned.


.. function:: dump_address_pair(pair)

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset, in seconds, of the date's
   timezone from UTC (which is the official term for Greenwich Mean Time). (Note
   that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.) If the input string
   has no timezone, the last element of the tuple returned is ``None``. Note that
   indexes 6, 7, and 8 of the result tuple are not usable.


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
   the timezone item in the tuple is ``None``, assume local time.
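
   As a minimal sketch of how the two date helpers fit together, the snippet
   below parses the sample date string from the :func:`parsedate` description and
   converts it to a UTC timestamp. Because :mod:`rfc822` is not available in
   Python 3, it imports the same-named functions from :mod:`email.utils`;
   treating them as drop-in equivalents for this input is an assumption of the
   example, not something stated on this page.

   ```python
   import time
   from email.utils import mktime_tz, parsedate_tz

   # Sample date string taken from the parsedate() description.
   stamp = 'Mon, 20 Nov 1995 19:12:08 -0500'

   parsed = parsedate_tz(stamp)      # 10-tuple, or None if unparsable
   fields = parsed[:6]               # year, month, day, hour, minute, second
   offset = parsed[9]                # offset from UTC in seconds; -0500 gives -18000

   # mktime_tz() folds the offset in and returns seconds since the epoch (UTC).
   utc_seconds = mktime_tz(parsed)
   utc_fields = time.gmtime(utc_seconds)[:6]

   print(fields)      # (1995, 11, 20, 19, 12, 8)
   print(offset)      # -18000
   print(utc_fields)  # (1995, 11, 21, 0, 12, 8)
   ```

   The negative offset shows the sign convention: a zone west of Greenwich
   yields a negative value, so the UTC time is five hours later than the
   wall-clock fields in the header.
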

   Minor deficiency: this first interprets the first 8 elements as a local time and
   then compensates for the timezone difference; this may yield a slight error
   around daylight saving time switch dates. Not enough to worry about for common
   use.


.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`rfc822` module.

   Module :mod:`mailbox`
      Classes to read various mailbox formats produced by end-user mail programs.

   Module :mod:`mimetools`
      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.


.. _message-objects:

Message Objects
---------------

A :class:`Message` instance has the following methods:


.. method:: Message.rewindbody()

   Seek to the start of the message body. This only works if the file object is
   seekable.


.. method:: Message.isheader(line)

   Returns a line's canonicalized fieldname (the dictionary key that will be used
   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
   ``None`` (implying that parsing should stop here and the line be pushed back on
   the input stream). It is sometimes useful to override this method in a
   subclass.


.. method:: Message.islast(line)

   Return true if the given line is a delimiter on which Message should stop. The
   delimiter line is consumed, and the file object's read location positioned
   immediately after it. By default this method just checks that the line is
   blank, but you can override it in a subclass.


.. method:: Message.iscomment(line)

   Return ``True`` if the given line should be ignored entirely, just skipped. By
   default this is a stub that always returns ``False``, but you can override it in
   a subclass.


.. method:: Message.getallmatchingheaders(name)

   Return a list of lines consisting of all headers matching *name*, if any.
   Each physical line, whether it is a continuation line or not, is a separate
   list item. Return the empty list if no header matches *name*.


.. method:: Message.getfirstmatchingheader(name)

   Return a list of lines comprising the first header matching *name*, and its
   continuation line(s), if any. Return ``None`` if there is no header matching
   *name*.


.. method:: Message.getrawheader(name)

   Return a single string consisting of the text after the colon in the first
   header matching *name*. This includes leading whitespace, the trailing
   linefeed, and internal linefeeds and whitespace if any continuation line(s)
   were present. Return ``None`` if there is no header matching *name*.


.. method:: Message.getheader(name[, default])

   Return a single string consisting of the last header matching *name*,
   but strip leading and trailing whitespace.
   Internal whitespace is not stripped. The optional *default* argument can be
   used to specify a different default to be returned when there is no header
   matching *name*; it defaults to ``None``.
   This is the preferred way to get parsed headers.


.. method:: Message.get(name[, default])

   An alias for :meth:`getheader`, to make the interface more compatible with
   regular dictionaries.


.. method:: Message.getaddr(name)

   Return a pair ``(full name, email address)`` parsed from the string returned by
   ``getheader(name)``. If no header matching *name* exists, return ``(None,
   None)``; otherwise both the full name and the address are (possibly empty)
   strings.

   Example: If *m*'s first :mailheader:`From` header contains the string
   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
   <jack@cwi.nl>'`` instead, it would yield the exact same result.


.. method:: Message.getaddrlist(name)

   This is similar to ``getaddr(name)``, but parses a header containing a list of
   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
   name, email address)`` pairs (even if there was only one address in the header).
   If there is no header matching *name*, return an empty list.

   If multiple headers exist that match the named header (e.g. if there are several
   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
   the named headers contain are also parsed.


.. method:: Message.getdate(name)

   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
   there is no header matching *name*, or it is unparsable, return ``None``.

   Date parsing appears to be a black art, and not all mailers adhere to the
   standard. While it has been tested and found correct on a large collection of
   email from many sources, it is still possible that this function may
   occasionally yield an incorrect result.


.. method:: Message.getdate_tz(name)

   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
   10th is a number giving the offset of the date's timezone from UTC. Note that
   fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
   no header matching *name*, or it is unparsable, return ``None``.

:class:`Message` instances also support a limited mapping interface.
In particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the mapping writable interface ``m[name]
= value`` and ``del m[name]``. :class:`Message` objects do not support the
:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
added in Python 2.2.)

Finally, :class:`Message` instances have some public instance variables:


.. attribute:: Message.headers

   A list containing the entire set of header lines, in the order in which they
   were read (except that setitem calls may disturb this order). Each line contains
   a trailing newline. The blank line terminating the headers is not contained in
   the list.


.. attribute:: Message.fp

   The file or file-like object passed at instantiation time. This can be used to
   read the message content.


.. attribute:: Message.unixfrom

   The Unix ``From`` line, if the message had one, or an empty string. This is
   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
   mailbox file.


.. _addresslist-objects:

AddressList Objects
-------------------

An :class:`AddressList` instance has the following methods:


.. method:: AddressList.__len__()

   Return the number of addresses in the address list.


.. method:: AddressList.__str__()

   Return a canonicalized string representation of the address list. Addresses are
   rendered in "name" <host@domain> form, comma-separated.


.. method:: AddressList.__add__(alist)

   Return a new :class:`AddressList` instance that contains all addresses in both
   :class:`AddressList` operands, with duplicates removed (set union).


.. method:: AddressList.__iadd__(alist)

   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
   into the union of itself and the right-hand instance, *alist*.


.. method:: AddressList.__sub__(alist)

   Return a new :class:`AddressList` instance that contains every address in the
   left-hand :class:`AddressList` operand that is not present in the right-hand
   address operand (set difference).


.. method:: AddressList.__isub__(alist)

   In-place version of :meth:`__sub__`, removing addresses in this list which are
   also in *alist*.

Finally, :class:`AddressList` instances have one public instance variable:


.. attribute:: AddressList.addresslist

   A list of 2-tuples of strings, one per address. In each member, the first is the
   canonicalized name part, the second is the actual route-address (``'@'``\
   -separated username-host.domain pair).

.. rubric:: Footnotes

.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
   :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
   semantics have changed since :rfc:`822`.
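
.. rubric:: Example

The set-like behaviour described under AddressList Objects can be approximated
on plain lists of ``(name, address)`` pairs, the same shape as
:attr:`AddressList.addresslist`. The sketch below is illustrative only: since
:mod:`rfc822` is not available in Python 3, it obtains the pairs from
``email.utils.getaddresses`` instead of :class:`AddressList`, and the helper
names ``parse``, ``union`` and ``difference`` are hypothetical, not part of
either module.

```python
from email.utils import getaddresses


def parse(field):
    """Parse a comma-separated address field into (name, address) pairs."""
    return getaddresses([field])


def union(left, right):
    """Pairs from both lists, left first, skipping pairs already in left."""
    return left + [pair for pair in right if pair not in left]


def difference(left, right):
    """Pairs from left that do not occur in right."""
    return [pair for pair in left if pair not in right]


a = parse('Jack Jansen <jack@cwi.nl>, guido@python.org')
b = parse('guido@python.org, barry@python.org')

both = union(a, b)         # mirrors the documented __add__ (set union)
only_a = difference(a, b)  # mirrors the documented __sub__ (set difference)

print(both)
# [('Jack Jansen', 'jack@cwi.nl'), ('', 'guido@python.org'), ('', 'barry@python.org')]
print(only_a)
# [('Jack Jansen', 'jack@cwi.nl')]
```

Pairs are compared as whole tuples here, so the same mailbox written with two
different display names would count as two distinct entries.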