:mod:`email.message`: Representing an email message
---------------------------------------------------

.. module:: email.message
   :synopsis: The base class representing email messages.


The central class in the :mod:`email` package is the :class:`Message` class,
imported from the :mod:`email.message` module.  It is the base class for the
:mod:`email` object model.  :class:`Message` provides the core functionality for
setting and querying header fields, and for accessing message bodies.

Conceptually, a :class:`Message` object consists of *headers* and *payloads*.
Headers are :rfc:`2822` style field names and values where the field name and
value are separated by a colon.  The colon is not part of either the field name
or the field value.

Headers are stored and returned in case-preserving form but are matched
case-insensitively.  There may also be a single envelope header, also known as
the *Unix-From* header or the ``From_`` header.  The payload is either a string
in the case of simple message objects or a list of :class:`Message` objects for
MIME container documents (e.g. :mimetype:`multipart/\*` and
:mimetype:`message/rfc822`).

:class:`Message` objects provide a mapping style interface for accessing the
message headers, and an explicit interface for accessing both the headers and
the payload.  They also provide convenience methods for generating a flat text
representation of the message object tree, for accessing commonly used header
parameters, and for recursively walking over the object tree.
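
As a minimal sketch of the model described above, the following builds a simple
message and flattens it.  The addresses and text are made-up illustrations, not
values taken from this documentation.

```python
from email.message import Message

# A simple (non-multipart) message: headers plus a string payload.
msg = Message()
msg['From'] = 'alice@example.com'    # hypothetical address
msg['To'] = 'bob@example.com'        # hypothetical address
msg['Subject'] = 'Greetings'
msg.set_payload('Hello, Bob!\n')

# Field names keep their case but are matched case-insensitively.
subject = msg['SUBJECT']

# Flatten the headers and payload into one string.
text = msg.as_string()
```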

Here are the methods of the :class:`Message` class:


.. class:: Message()

   The constructor takes no arguments.


   .. method:: as_string([unixfrom])

      Return the entire message flattened as a string.  When optional *unixfrom*
      is ``True``, the envelope header is included in the returned string.
      *unixfrom* defaults to ``False``.  Flattening the message may trigger
      changes to the :class:`Message` if defaults need to be filled in to
      complete the transformation to a string (for example, MIME boundaries may
      be generated or modified).

      Note that this method is provided as a convenience and may not always
      format the message the way you want.  For example, by default it mangles
      lines that begin with ``From``.  For more flexibility, instantiate a
      :class:`~email.generator.Generator` instance and use its
      :meth:`~email.generator.Generator.flatten` method directly.  For example::

         from cStringIO import StringIO
         from email.generator import Generator
         fp = StringIO()
         g = Generator(fp, mangle_from_=False, maxheaderlen=60)
         g.flatten(msg)
         text = fp.getvalue()


   .. method:: __str__()

      Equivalent to ``as_string(unixfrom=True)``.


   .. method:: is_multipart()

      Return ``True`` if the message's payload is a list of sub-\
      :class:`Message` objects, otherwise return ``False``.  When
      :meth:`is_multipart` returns ``False``, the payload should be a string object.


   .. method:: set_unixfrom(unixfrom)

      Set the message's envelope header to *unixfrom*, which should be a string.


   .. method:: get_unixfrom()

      Return the message's envelope header.  Defaults to ``None`` if the
      envelope header was never set.


   .. method:: attach(payload)

      Add the given *payload* to the current payload, which must be ``None`` or
      a list of :class:`Message` objects before the call. After the call, the
      payload will always be a list of :class:`Message` objects.  If you want to
      set the payload to a scalar object (e.g. a string), use
      :meth:`set_payload` instead.
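
      A short sketch of how :meth:`attach` turns an empty payload into a list
      might look like this.  The container type and the part text are
      illustrative choices only.

      ```python
      from email.message import Message

      outer = Message()
      outer['Content-Type'] = 'multipart/mixed'   # container type chosen for the sketch

      part = Message()
      part.set_payload('first part')

      # The payload starts out as None, so attach() creates the list.
      outer.attach(part)
      parts = outer.get_payload()
      ```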


   .. method:: get_payload([i[, decode]])

      Return the current payload, which will be a list of
      :class:`Message` objects when :meth:`is_multipart` is ``True``, or a
      string when :meth:`is_multipart` is ``False``.  If the payload is a list
      and you mutate the list object, you modify the message's payload in place.

      With optional argument *i*, :meth:`get_payload` will return the *i*-th
      element of the payload, counting from zero, if :meth:`is_multipart` is
      ``True``.  An :exc:`IndexError` will be raised if *i* is less than 0 or
      greater than or equal to the number of items in the payload.  If the
      payload is a string (i.e.  :meth:`is_multipart` is ``False``) and *i* is
      given, a :exc:`TypeError` is raised.

      Optional *decode* is a flag indicating whether the payload should be
      decoded or not, according to the :mailheader:`Content-Transfer-Encoding`
      header. When ``True`` and the message is not a multipart, the payload will
      be decoded if this header's value is ``quoted-printable`` or ``base64``.
      If some other encoding is used, or :mailheader:`Content-Transfer-Encoding`
      header is missing, or if the payload has bogus base64 data, the payload is
      returned as-is (undecoded).  If the message is a multipart and the
      *decode* flag is ``True``, then ``None`` is returned.  The default for
      *decode* is ``False``.
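
      The effect of the *decode* flag can be sketched as follows, assuming a
      hand-written base64 payload.  On newer Python versions the decoded
      value is a bytes object, which is why the comparison in the sketch uses
      a bytes literal.

      ```python
      from email.message import Message

      msg = Message()
      msg['Content-Transfer-Encoding'] = 'base64'
      msg.set_payload('aGVsbG8gd29ybGQ=')      # base64 for "hello world"

      raw = msg.get_payload()                  # returned exactly as stored
      decoded = msg.get_payload(decode=True)   # transfer encoding undone
      is_hello = decoded == b'hello world'
      ```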


   .. method:: set_payload(payload[, charset])

      Set the entire message object's payload to *payload*.  It is the client's
      responsibility to ensure the payload invariants.  Optional *charset* sets
      the message's default character set; see :meth:`set_charset` for details.

      .. versionchanged:: 2.2.2
         *charset* argument added.


   .. method:: set_charset(charset)

      Set the character set of the payload to *charset*, which can either be a
      :class:`~email.charset.Charset` instance (see :mod:`email.charset`), a
      string naming a character set, or ``None``.  If it is a string, it will
      be converted to a :class:`~email.charset.Charset` instance.  If *charset*
      is ``None``, the ``charset`` parameter will be removed from the
      :mailheader:`Content-Type` header (the message will not be otherwise
      modified).  Anything else will generate a :exc:`TypeError`.

      If there is no existing :mailheader:`MIME-Version` header one will be
      added.  If there is no existing :mailheader:`Content-Type` header, one
      will be added with a value of :mimetype:`text/plain`.  Whether the
      :mailheader:`Content-Type` header already exists or not, its ``charset``
      parameter will be set to *charset.output_charset*.   If
      *charset.input_charset* and *charset.output_charset* differ, the payload
      will be re-encoded to the *output_charset*.  If there is no existing
      :mailheader:`Content-Transfer-Encoding` header, then the payload will be
      transfer-encoded, if needed, using the specified
      :class:`~email.charset.Charset`, and a header with the appropriate value
      will be added.  If a :mailheader:`Content-Transfer-Encoding` header
      already exists, the payload is assumed to already be correctly encoded
      using that :mailheader:`Content-Transfer-Encoding` and is not modified.

      The message will be assumed to be of type :mimetype:`text/\*`, with the
      payload either in unicode or encoded with *charset.input_charset*.
      It will be encoded or converted to *charset.output_charset*
      and transfer encoded properly, if needed, when generating the plain text
      representation of the message.  MIME headers (:mailheader:`MIME-Version`,
      :mailheader:`Content-Type`, :mailheader:`Content-Transfer-Encoding`) will
      be added as needed.
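
      As a rough illustration of the headers that :meth:`set_charset` fills
      in, consider a message that starts with a payload and no headers.  The
      payload text and the choice of ``utf-8`` are assumptions made for the
      sketch.

      ```python
      from email.message import Message

      msg = Message()
      msg.set_payload('Hello')
      msg.set_charset('utf-8')     # a charset name is converted to a Charset

      mime_version = msg['MIME-Version']
      content_type = msg.get_content_type()
      charset_name = msg.get_content_charset()
      has_cte = 'Content-Transfer-Encoding' in msg
      ```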

      .. versionadded:: 2.2.2


   .. method:: get_charset()

      Return the :class:`~email.charset.Charset` instance associated with the
      message's payload.

      .. versionadded:: 2.2.2

   The following methods implement a mapping-like interface for accessing the
   message's :rfc:`2822` headers.  Note that there are some semantic differences
   between these methods and a normal mapping (i.e. dictionary) interface.  For
   example, in a dictionary there are no duplicate keys, but here there may be
   duplicate message headers.  Also, in dictionaries there is no guaranteed
   order to the keys returned by :meth:`keys`, but in a :class:`Message` object,
   headers are always returned in the order they appeared in the original
   message, or were added to the message later.  Any header that is deleted and
   then re-added is always appended to the end of the header list.

   These semantic differences are intentional and are biased toward maximal
   convenience.

   Note that in all cases, any envelope header present in the message is not
   included in the mapping interface.
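
   The differences from a plain dictionary can be sketched like this.  The
   header values are hypothetical and only serve to show duplicates and
   ordering.

   ```python
   from email.message import Message

   msg = Message()
   msg['Received'] = 'from a.example'   # hypothetical values
   msg['Subject'] = 'demo'
   msg['Received'] = 'from b.example'   # same name again: appended, not replaced

   count = len(msg)                     # duplicates are counted
   names = list(msg.keys())             # insertion order, duplicates kept

   # Deleting and re-adding a header moves it to the end of the list.
   del msg['Subject']
   msg['Subject'] = 'demo 2'
   reordered = list(msg.keys())
   ```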


   .. method:: __len__()

      Return the total number of headers, including duplicates.


   .. method:: __contains__(name)

      Return true if the message object has a field named *name*. Matching is
      done case-insensitively and *name* should not include the trailing colon.
      Used for the ``in`` operator, e.g.::

         if 'message-id' in myMessage:
             print 'Message-ID:', myMessage['message-id']


   .. method:: __getitem__(name)

      Return the value of the named header field.  *name* should not include the
      colon field separator.  If the header is missing, ``None`` is returned; a
      :exc:`KeyError` is never raised.

      Note that if the named field appears more than once in the message's
      headers, exactly which of those field values will be returned is
      undefined.  Use the :meth:`get_all` method to get the values of all the
      extant named headers.


   .. method:: __setitem__(name, val)

      Add a header to the message with field name *name* and value *val*.  The
      field is appended to the end of the message's existing fields.

      Note that this does *not* overwrite or delete any existing header with the same
      name.  If you want to ensure that the new header is the only one present in the
      message with field name *name*, delete the field first, e.g.::

         del msg['subject']
         msg['subject'] = 'Python roolz!'


   .. method:: __delitem__(name)

      Delete all occurrences of the field with name *name* from the message's
      headers.  No exception is raised if the named field isn't present in the headers.


   .. method:: has_key(name)

      Return true if the message contains a header field named *name*, otherwise
      return false.


   .. method:: keys()

      Return a list of all the message's header field names.


   .. method:: values()

      Return a list of all the message's field values.


   .. method:: items()

      Return a list of 2-tuples containing all the message's field names and
      values.


   .. method:: get(name[, failobj])

      Return the value of the named header field.  This is identical to
      :meth:`__getitem__` except that optional *failobj* is returned if the
      named header is missing (defaults to ``None``).

   Here are some additional useful methods:


   .. method:: get_all(name[, failobj])

      Return a list of all the values for the field named *name*. If there are
      no such named headers in the message, *failobj* is returned (defaults to
      ``None``).
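
      A small sketch of :meth:`get_all` next to :meth:`get`, using
      hypothetical header values:

      ```python
      from email.message import Message

      msg = Message()
      msg['Received'] = 'from a.example'         # hypothetical values
      msg['Received'] = 'from b.example'

      all_received = msg.get_all('Received')     # every value, in header order
      no_cc = msg.get_all('Cc', [])              # failobj when the field is absent
      fallback = msg.get('Reply-To', 'nobody')   # get() accepts a failobj too
      ```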


   .. method:: add_header(_name, _value, **_params)

      Extended header setting.  This method is similar to :meth:`__setitem__`
      except that additional header parameters can be provided as keyword
      arguments.  *_name* is the header field to add and *_value* is the
      *primary* value for the header.

      For each item in the keyword argument dictionary *_params*, the key is
      taken as the parameter name, with underscores converted to dashes (since
      dashes are illegal in Python identifiers).  Normally, the parameter will
      be added as ``key="value"`` unless the value is ``None``, in which case
      only the key will be added.  If the value contains non-ASCII characters,
      it must be specified as a three tuple in the format
      ``(CHARSET, LANGUAGE, VALUE)``, where ``CHARSET`` is a string naming the
      charset to be used to encode the value, ``LANGUAGE`` can usually be set
      to ``None`` or the empty string (see :rfc:`2231` for other possibilities),
      and ``VALUE`` is the string value containing non-ASCII code points.

      Here's an example::

         msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')

      This will add a header that looks like ::

         Content-Disposition: attachment; filename="bud.gif"

      An example with non-ASCII characters::

         msg.add_header('Content-Disposition', 'attachment',
                        filename=('iso-8859-1', '', 'Fußballer.ppt'))

      Which produces ::

         Content-Disposition: attachment; filename*="iso-8859-1''Fu%DFballer.ppt"


   .. method:: replace_header(_name, _value)

      Replace a header.  Replace the first header found in the message that
      matches *_name*, retaining header order and field name case.  If no
      matching header was found, a :exc:`KeyError` is raised.
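
      The difference from deleting and re-adding a header can be sketched as
      follows; the header values are illustrative.

      ```python
      from email.message import Message

      msg = Message()
      msg['Subject'] = 'draft'
      msg['To'] = 'bob@example.com'        # hypothetical address

      # The header keeps its position and its original field-name case.
      msg.replace_header('subject', 'final')
      order = list(msg.keys())

      # A header that is not present cannot be replaced.
      try:
          msg.replace_header('Cc', 'carol@example.com')
          missing_raises = False
      except KeyError:
          missing_raises = True
      ```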

      .. versionadded:: 2.2.2


   .. method:: get_content_type()

      Return the message's content type.  The returned string is coerced to
      lower case of the form :mimetype:`maintype/subtype`.  If there was no
      :mailheader:`Content-Type` header in the message the default type as given
      by :meth:`get_default_type` will be returned.  Since according to
      :rfc:`2045`, messages always have a default type, :meth:`get_content_type`
      will always return a value.

      :rfc:`2045` defines a message's default type to be :mimetype:`text/plain`
      unless it appears inside a :mimetype:`multipart/digest` container, in
      which case it would be :mimetype:`message/rfc822`.  If the
      :mailheader:`Content-Type` header has an invalid type specification,
      :rfc:`2045` mandates that the default type be :mimetype:`text/plain`.
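
      The three cases described above (no header, a valid header, an invalid
      type specification) can be sketched like this.  The header values are
      made up for the illustration.

      ```python
      from email.message import Message

      msg = Message()
      default_type = msg.get_content_type()    # no Content-Type header yet

      msg['Content-Type'] = 'Text/HTML; charset="utf-8"'
      ctype = msg.get_content_type()           # lower-cased, parameters dropped
      maintype = msg.get_content_maintype()
      subtype = msg.get_content_subtype()

      # An invalid type specification falls back to text/plain.
      bad = Message()
      bad['Content-Type'] = 'nonsense'
      bad_type = bad.get_content_type()
      ```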

      .. versionadded:: 2.2.2


   .. method:: get_content_maintype()

      Return the message's main content type.  This is the :mimetype:`maintype`
      part of the string returned by :meth:`get_content_type`.

      .. versionadded:: 2.2.2


   .. method:: get_content_subtype()

      Return the message's sub-content type.  This is the :mimetype:`subtype`
      part of the string returned by :meth:`get_content_type`.

      .. versionadded:: 2.2.2


   .. method:: get_default_type()

      Return the default content type.  Most messages have a default content
      type of :mimetype:`text/plain`, except for messages that are subparts of
      :mimetype:`multipart/digest` containers.  Such subparts have a default
      content type of :mimetype:`message/rfc822`.

      .. versionadded:: 2.2.2


   .. method:: set_default_type(ctype)

      Set the default content type.  *ctype* should either be
      :mimetype:`text/plain` or :mimetype:`message/rfc822`, although this is not
      enforced.  The default content type is not stored in the
      :mailheader:`Content-Type` header.

      .. versionadded:: 2.2.2


   .. method:: get_params([failobj[, header[, unquote]]])

      Return the message's :mailheader:`Content-Type` parameters, as a list.
      The elements of the returned list are 2-tuples of key/value pairs, as
      split on the ``'='`` sign.  The left hand side of the ``'='`` is the key,
      while the right hand side is the value.  If there is no ``'='`` sign in
      the parameter the value is the empty string, otherwise the value is as
      described in :meth:`get_param` and is unquoted if optional *unquote* is
      ``True`` (the default).

      Optional *failobj* is the object to return if there is no
      :mailheader:`Content-Type` header.  Optional *header* is the header to
      search instead of :mailheader:`Content-Type`.
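
      A brief sketch of the returned structure, assuming a hand-written
      :mailheader:`Content-Type` value.  Note that the type itself has no
      ``'='`` sign, so it shows up as a key with an empty value.

      ```python
      from email.message import Message

      msg = Message()
      msg['Content-Type'] = 'text/plain; charset="us-ascii"; format=flowed'

      params = msg.get_params()
      fmt = msg.get_param('FORMAT')                # keys compare case-insensitively
      missing = msg.get_param('boundary', 'n/a')   # failobj for an absent parameter
      ```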

      .. versionchanged:: 2.2.2
         *unquote* argument added.


   .. method:: get_param(param[, failobj[, header[, unquote]]])

      Return the value of the :mailheader:`Content-Type` header's parameter
      *param* as a string.  If the message has no :mailheader:`Content-Type`
      header or if there is no such parameter, then *failobj* is returned
      (defaults to ``None``).

      Optional *header*, if given, specifies the message header to use instead of
      :mailheader:`Content-Type`.

      Parameter keys are always compared case insensitively.  The return value
      can either be a string, or a 3-tuple if the parameter was :rfc:`2231`
      encoded.  When it's a 3-tuple, the elements of the value are of the form
      ``(CHARSET, LANGUAGE, VALUE)``.  Note that both ``CHARSET`` and
      ``LANGUAGE`` can be ``None``, in which case you should consider ``VALUE``
      to be encoded in the ``us-ascii`` charset.  You can usually ignore
      ``LANGUAGE``.

      If your application doesn't care whether the parameter was encoded as in
      :rfc:`2231`, you can collapse the parameter value by calling
      :func:`email.utils.collapse_rfc2231_value`, passing in the return value
      from :meth:`get_param`.  This will return a suitably decoded Unicode
      string when the value is a tuple, or the original string unquoted if it
      isn't.  For example::

         rawparam = msg.get_param('foo')
         param = email.utils.collapse_rfc2231_value(rawparam)

      In any case, the parameter value (either the returned string, or the
      ``VALUE`` item in the 3-tuple) is always unquoted, unless *unquote* is set
      to ``False``.

      .. versionchanged:: 2.2.2
         *unquote* argument added, and 3-tuple return value possible.


   .. method:: set_param(param, value[, header[, requote[, charset[, language]]]])

      Set a parameter in the :mailheader:`Content-Type` header.  If the
      parameter already exists in the header, its value will be replaced with
      *value*.  If the :mailheader:`Content-Type` header has not yet been defined
      for this message, it will be set to :mimetype:`text/plain` and the new
      parameter value will be appended as per :rfc:`2045`.

      Optional *header* specifies an alternative header to
      :mailheader:`Content-Type`, and all parameters will be quoted as necessary
      unless optional *requote* is ``False`` (the default is ``True``).

      If optional *charset* is specified, the parameter will be encoded
      according to :rfc:`2231`. Optional *language* specifies the RFC 2231
      language, defaulting to the empty string.  Both *charset* and *language*
      should be strings.
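
      The add, replace and remove cycle for a parameter might be sketched as
      follows.  The parameter values are illustrative, and :meth:`del_param`
      is used only to round out the example.

      ```python
      from email.message import Message

      msg = Message()

      # No Content-Type header yet, so set_param() starts from text/plain.
      msg.set_param('charset', 'utf-8')
      after_set = msg['Content-Type']

      msg.set_param('charset', 'iso-8859-1')   # existing parameter is replaced
      charset = msg.get_param('charset')

      msg.del_param('charset')
      after_del = msg.get_param('charset')
      ```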

      .. versionadded:: 2.2.2


   .. method:: del_param(param[, header[, requote]])

      Remove the given parameter completely from the :mailheader:`Content-Type`
      header.  The header will be re-written in place without the parameter or
      its value.  All values will be quoted as necessary unless *requote* is
      ``False`` (the default is ``True``).  Optional *header* specifies an
      alternative to :mailheader:`Content-Type`.

      .. versionadded:: 2.2.2


   .. method:: set_type(type[, header][, requote])

      Set the main type and subtype for the :mailheader:`Content-Type`
      header. *type* must be a string in the form :mimetype:`maintype/subtype`,
      otherwise a :exc:`ValueError` is raised.

      This method replaces the :mailheader:`Content-Type` header, keeping all
      the parameters in place.  If *requote* is ``False``, this leaves the
      existing header's quoting as is, otherwise the parameters will be quoted
      (the default).

      An alternative header can be specified in the *header* argument. When the
      :mailheader:`Content-Type` header is set a :mailheader:`MIME-Version`
      header is also added.
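
      A minimal sketch of :meth:`set_type` keeping existing parameters, with
      an assumed starting header:

      ```python
      from email.message import Message

      msg = Message()
      msg['Content-Type'] = 'text/plain; charset="utf-8"'

      # Only the type changes; the charset parameter is carried over.
      msg.set_type('text/html')
      ctype = msg.get_content_type()
      charset = msg.get_param('charset')

      # A value that is not of the form maintype/subtype is rejected.
      try:
          msg.set_type('html')
          rejected = False
      except ValueError:
          rejected = True
      ```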

      .. versionadded:: 2.2.2


   .. method:: get_filename([failobj])

      Return the value of the ``filename`` parameter of the
      :mailheader:`Content-Disposition` header of the message.  If the header
      does not have a ``filename`` parameter, this method falls back to looking
      for the ``name`` parameter on the :mailheader:`Content-Type` header.  If
      neither is found, or the header is missing, then *failobj* is returned.
      The returned string will always be unquoted as per
      :func:`email.utils.unquote`.
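
      The lookup order can be sketched with two messages, one carrying a
      ``filename`` parameter and one carrying only ``name``.  The file names
      are hypothetical.

      ```python
      from email.message import Message

      msg = Message()
      msg.add_header('Content-Disposition', 'attachment', filename='report.pdf')
      name = msg.get_filename()

      # No Content-Disposition here, but Content-Type has a name parameter.
      other = Message()
      other.add_header('Content-Type', 'application/pdf', name='fallback.pdf')
      fallback_name = other.get_filename()

      # Neither header present: failobj is returned.
      none_found = Message().get_filename('unknown')
      ```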


   .. method:: get_boundary([failobj])

      Return the value of the ``boundary`` parameter of the
      :mailheader:`Content-Type` header of the message, or *failobj* if either
      the header is missing, or has no ``boundary`` parameter.  The returned
      string will always be unquoted as per :func:`email.utils.unquote`.


   .. method:: set_boundary(boundary)

      Set the ``boundary`` parameter of the :mailheader:`Content-Type` header to
      *boundary*.  :meth:`set_boundary` will always quote *boundary* if
      necessary.  A :exc:`~email.errors.HeaderParseError` is raised if the
      message object has no :mailheader:`Content-Type` header.

      Note that using this method is subtly different from deleting the old
      :mailheader:`Content-Type` header and adding a new one with the new
      boundary via :meth:`add_header`, because :meth:`set_boundary` preserves
      the order of the :mailheader:`Content-Type` header in the list of
      headers. However, it does *not* preserve any continuation lines which may
      have been present in the original :mailheader:`Content-Type` header.
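
      A sketch of the order-preserving behaviour and of the error case.  The
      headers and boundary strings are made up for the illustration.

      ```python
      from email import errors
      from email.message import Message

      msg = Message()
      msg['Subject'] = 'demo'
      msg['Content-Type'] = 'multipart/mixed; boundary="old"'
      msg['X-Note'] = 'added last'         # hypothetical header

      msg.set_boundary('new')
      boundary = msg.get_boundary()
      order = list(msg.keys())             # Content-Type keeps its position

      # Without a Content-Type header there is nothing to update.
      try:
          Message().set_boundary('new')
          no_header_raises = False
      except errors.HeaderParseError:
          no_header_raises = True
      ```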


   .. method:: get_content_charset([failobj])

      Return the ``charset`` parameter of the :mailheader:`Content-Type` header,
      coerced to lower case.  If there is no :mailheader:`Content-Type` header, or if
      that header has no ``charset`` parameter, *failobj* is returned.

      Note that this method differs from :meth:`get_charset` which returns the
      :class:`~email.charset.Charset` instance for the default encoding of the message body.

      .. versionadded:: 2.2.2


   .. method:: get_charsets([failobj])

      Return a list containing the character set names in the message.  If the
      message is a :mimetype:`multipart`, then the list will contain one element
      for each subpart in the payload, otherwise, it will be a list of length 1.

      Each item in the list will be a string which is the value of the
      ``charset`` parameter in the :mailheader:`Content-Type` header for the
      represented subpart.  However, if the subpart has no
      :mailheader:`Content-Type` header, no ``charset`` parameter, or is not of
      the :mimetype:`text` main MIME type, then that item in the returned list
      will be *failobj*.


   .. method:: walk()

      The :meth:`walk` method is an all-purpose generator which can be used to
      iterate over all the parts and subparts of a message object tree, in
      depth-first traversal order.  You will typically use :meth:`walk` as the
      iterator in a ``for`` loop; each iteration returns the next subpart.

      Here's an example that prints the MIME type of every part of a multipart
      message structure::

         >>> for part in msg.walk():
         ...     print part.get_content_type()
         multipart/report
         text/plain
         message/delivery-status
         text/plain
         text/plain
         message/rfc822

   .. versionchanged:: 2.5
      The previously deprecated methods :meth:`get_type`, :meth:`get_main_type`, and
      :meth:`get_subtype` were removed.

   :class:`Message` objects can also optionally contain two instance attributes,
   which can be used when generating the plain text of a MIME message.


   .. attribute:: preamble

      The format of a MIME document allows for some text between the blank line
      following the headers, and the first multipart boundary string. Normally,
      this text is never visible in a MIME-aware mail reader because it falls
      outside the standard MIME armor.  However, when viewing the raw text of
      the message, or when viewing the message in a non-MIME aware reader, this
      text can become visible.

      The *preamble* attribute contains this leading extra-armor text for MIME
      documents.  When the :class:`~email.parser.Parser` discovers some text
      after the headers but before the first boundary string, it assigns this
      text to the message's *preamble* attribute.  When the
      :class:`~email.generator.Generator` is writing out the plain text
      representation of a MIME message, and it finds the
      message has a *preamble* attribute, it will write this text in the area
      between the headers and the first boundary.  See :mod:`email.parser` and
      :mod:`email.generator` for details.

      Note that if the message object has no preamble, the *preamble* attribute
      will be ``None``.
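
      As a rough sketch of where the preamble ends up when a message is
      flattened, consider the following.  The boundary string is fixed by
      hand only to make the position easy to check, and the text is
      illustrative.

      ```python
      from email.message import Message

      outer = Message()
      # A fixed boundary is an assumption of this sketch, not a requirement.
      outer['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'
      outer.preamble = 'This text precedes the first boundary.\n'

      part = Message()
      part.set_payload('body text\n')
      outer.attach(part)

      text = outer.as_string()
      preamble_pos = text.index('This text precedes')
      boundary_pos = text.index('--BOUNDARY')
      ```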


   .. attribute:: epilogue

      The *epilogue* attribute acts the same way as the *preamble* attribute,
      except that it contains text that appears between the last boundary and
      the end of the message.

      .. versionchanged:: 2.5
         You do not need to set the epilogue to the empty string in order for the
         :class:`~email.generator.Generator` to print a newline at the end of the
         file.


   .. attribute:: defects

      The *defects* attribute contains a list of all the problems found when
      parsing this message.  See :mod:`email.errors` for a detailed description
      of the possible parsing defects.

      .. versionadded:: 2.4
