:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.

**Source code:** :source:`Lib/email/generator.py`

--------------

One of the most common tasks is to generate the flat (serialized) version of
the email message represented by a message object structure.  You will need to
do this if you want to send your message via :meth:`smtplib.SMTP.sendmail` or
the :mod:`nntplib` module, or print the message on the console.  Taking a
message object structure and producing a serialized representation is the job
of the generator classes.

As with the :mod:`email.parser` module, you aren't limited to the functionality
of the bundled generator; you could write one from scratch yourself.  However,
the bundled generator knows how to generate most email in a standards-compliant
way, should handle MIME and non-MIME email messages just fine, and is designed
so that the bytes-oriented parsing and generation operations are inverses,
assuming the same non-transforming :mod:`~email.policy` is used for both.  That
is, parsing the serialized byte stream via the
:class:`~email.parser.BytesParser` class and then regenerating the serialized
byte stream using :class:`BytesGenerator` should produce output identical to
the input [#]_.  (On the other hand, using the generator on an
:class:`~email.message.EmailMessage` constructed programmatically may result
in changes to the :class:`~email.message.EmailMessage` object as defaults are
filled in.)

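As an illustration of the round trip described above, the following minimal
sketch parses a small message and regenerates it.  The message text and the
addresses are made up for the example, and the policy is cloned with
``refold_source='none'`` so that the generator does not refold any header:

.. code-block:: python

   from io import BytesIO
   from email import policy
   from email.generator import BytesGenerator
   from email.parser import BytesParser

   # Use one non-transforming policy for both parsing and generation.
   roundtrip_policy = policy.default.clone(refold_source='none')

   # Hypothetical input, invented for this example.
   raw = (b"From: sender@example.com\n"
          b"To: recipient@example.com\n"
          b"Subject: Round trip\n"
          b"\n"
          b"Hello, world.\n")

   msg = BytesParser(policy=roundtrip_policy).parsebytes(raw)

   buffer = BytesIO()
   BytesGenerator(buffer, policy=roundtrip_policy).flatten(msg)
   regenerated = buffer.getvalue()

For a simple, well-formed message such as this one, ``regenerated`` should
compare equal to ``raw``.
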
The :class:`Generator` class can be used to flatten a message into a text (as
opposed to binary) serialized representation, but since Unicode cannot
represent binary data directly, the message is of necessity transformed into
something that contains only ASCII characters, using the standard email RFC
Content Transfer Encoding techniques for encoding email messages for transport
over channels that are not "8 bit clean".


.. class:: BytesGenerator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                          policy=None)

   Return a :class:`BytesGenerator` object that will write any message provided
   to the :meth:`flatten` method, or any surrogateescape-encoded text provided
   to the :meth:`write` method, to the :term:`file-like object` *outfp*.
   *outfp* must support a ``write`` method that accepts binary data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line.  *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in Unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

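   The effect of *mangle_from_* can be seen in a small sketch like the one
   below.  The message content and the ``serialize`` helper are invented for
   the example, and *mangle_from_* is passed explicitly in both calls so that
   the result does not depend on any default:

   .. code-block:: python

      from io import BytesIO
      from email.generator import BytesGenerator
      from email.message import EmailMessage

      msg = EmailMessage()
      msg['Subject'] = 'mbox escaping'
      # The first body line starts with "From ", which mbox readers could
      # mistake for the start of a new message.
      msg.set_content("From here on, things change.\nSecond line.\n")

      def serialize(mangle):
          """Hypothetical helper: flatten msg with the given mangle_from_."""
          buffer = BytesIO()
          BytesGenerator(buffer, mangle_from_=mangle).flatten(msg)
          return buffer.getvalue()

      mangled = serialize(True)      # body line becomes ">From here on, ..."
      unmangled = serialize(False)   # body line is written unchanged

   Only body lines are escaped; the headers are the same in both outputs.
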
   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*; if it is ``0``, do not rewrap any headers.  If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

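   A rough sketch of the difference between a positive *maxheaderlen* and
   ``0`` follows.  The subject text, the limit of 40, and the ``serialize``
   helper are arbitrary choices made for the example:

   .. code-block:: python

      from io import BytesIO
      from email.generator import BytesGenerator
      from email.message import EmailMessage

      subject = " ".join(["report"] * 15)   # about 100 characters long

      msg = EmailMessage()
      msg['Subject'] = subject

      def serialize(maxheaderlen):
          """Hypothetical helper: flatten msg with the given maxheaderlen."""
          buffer = BytesIO()
          BytesGenerator(buffer, maxheaderlen=maxheaderlen).flatten(msg)
          return buffer.getvalue()

      folded = serialize(40)    # long header is split over several lines
      unfolded = serialize(0)   # header is written on a single line

   With ``maxheaderlen=0`` the subject stays on one physical line, while with
   ``40`` it is spread over several continuation lines.
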
   If *policy* is specified, use that policy to control message generation.  If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation.  See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionadded:: 3.2

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit`` (the default), copy any headers in the original parsed
      message that have not been modified to the output with any bytes with the
      high bit set reproduced as in the original, and preserve the non-ASCII
      :mailheader:`Content-Transfer-Encoding` of any body parts that have them.
      If ``cte_type`` is ``7bit``, convert the bytes with the high bit set as
      needed using an ASCII-compatible :mailheader:`Content-Transfer-Encoding`.
      That is, transform parts with non-ASCII
      :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII-compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      .. XXX: There should be an option that just does the RFC
         compliance transformation on headers but leaves CTE 8bit parts alone.

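      The two ``cte_type`` settings can be compared with a sketch along these
      lines.  The input message and the ``serialize`` helper are made up for
      the example:

      .. code-block:: python

         from io import BytesIO
         from email import policy
         from email.generator import BytesGenerator
         from email.parser import BytesParser

         # Hypothetical message with an 8bit UTF-8 body ("café").
         raw = (b"MIME-Version: 1.0\n"
                b"Content-Type: text/plain; charset=\"utf-8\"\n"
                b"Content-Transfer-Encoding: 8bit\n"
                b"\n"
                b"caf\xc3\xa9\n")

         msg = BytesParser(policy=policy.default).parsebytes(raw)

         def serialize(cte_type):
             """Hypothetical helper: flatten msg under the given cte_type."""
             buffer = BytesIO()
             gen_policy = policy.default.clone(cte_type=cte_type)
             BytesGenerator(buffer, policy=gen_policy).flatten(msg)
             return buffer.getvalue()

         eight_bit = serialize('8bit')   # high-bit bytes copied through
         seven_bit = serialize('7bit')   # body re-encoded, ASCII only

      Under ``8bit`` the original body bytes appear in the output unchanged;
      under ``7bit`` the output contains no byte with the high bit set.
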
      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object.  If the root object has
      no envelope header, craft a standard one.  The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the line separator between
      all the lines of the flattened message.  If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.


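      A short sketch of *unixfrom* and *linesep* used together is shown
      below; the message content is invented for the example:

      .. code-block:: python

         from io import BytesIO
         from email.generator import BytesGenerator
         from email.message import EmailMessage

         msg = EmailMessage()
         msg['From'] = 'sender@example.com'
         msg['Subject'] = 'Envelope line'
         msg.set_content("Body text.\n")

         buffer = BytesIO()
         # Ask for an envelope line and CRLF line endings.
         BytesGenerator(buffer).flatten(msg, unixfrom=True, linesep='\r\n')
         data = buffer.getvalue()

         # The message has no envelope header, so a standard one is crafted.
         first_line = data.split(b'\r\n', 1)[0]

      Here ``first_line`` begins with ``From nobody`` followed by a
      timestamp, and every line of ``data`` ends with ``\r\n``.
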
   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same option settings, and *fp* as the new *outfp*.


   .. method:: write(s)

      Encode *s* using the ``ASCII`` codec and the ``surrogateescape`` error
      handler, and pass it to the *write* method of the *outfp* passed to the
      :class:`BytesGenerator`'s constructor.


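   The :meth:`clone` and :meth:`write` methods can be exercised with a small
   sketch such as the following; the buffer names and the text written are
   arbitrary:

   .. code-block:: python

      from io import BytesIO
      from email.generator import BytesGenerator

      first = BytesIO()
      second = BytesIO()

      gen = BytesGenerator(first, mangle_from_=False)
      twin = gen.clone(second)   # separate generator writing to ``second``

      # '\udcff' is how surrogateescape represents the raw byte 0xFF in a str.
      twin.write("raw byte: \udcff\n")

   After this runs, ``second`` holds ``b"raw byte: \xff\n"`` and ``first`` is
   still empty, since the clone writes only to its own *outfp*.
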
As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_bytes` and ``bytes(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__bytes__`), which simplify the generation of
a serialized binary representation of a message object.  For more detail, see
:mod:`email.message`.


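The following sketch compares the convenience methods with an explicit
:class:`BytesGenerator`.  The message content is invented for the example,
and *mangle_from_* is set to ``False`` explicitly on the generator:

.. code-block:: python

   from io import BytesIO
   from email.generator import BytesGenerator
   from email.message import EmailMessage

   msg = EmailMessage()
   msg['From'] = 'sender@example.com'
   msg['To'] = 'recipient@example.com'
   msg['Subject'] = 'Convenience methods'
   msg.set_content("Hello.\n")

   # Explicit route: create a generator and flatten the message into it.
   buffer = BytesIO()
   BytesGenerator(buffer, mangle_from_=False).flatten(msg)
   explicit = buffer.getvalue()

   # Convenience route: let the message object do the same work.
   via_method = msg.as_bytes()
   via_builtin = bytes(msg)

For a simple message like this one, all three values should be equal.
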
Because strings cannot represent binary data, the :class:`Generator` class must
convert any binary data in any message it flattens to an ASCII-compatible
format by applying an ASCII-compatible
:mailheader:`Content-Transfer-Encoding`.  Using the terminology of the email
RFCs, you can think of this as :class:`Generator` serializing to an I/O stream
that is not "8 bit clean".  In other words, most applications will want
to be using :class:`BytesGenerator`, and not :class:`Generator`.

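A minimal sketch of this conversion, using a made-up message whose body is
stored with an ``8bit`` transfer encoding:

.. code-block:: python

   from io import StringIO
   from email import policy
   from email.generator import Generator
   from email.parser import BytesParser

   # Hypothetical message with an 8bit UTF-8 body ("naïve").
   raw = (b"MIME-Version: 1.0\n"
          b"Content-Type: text/plain; charset=\"utf-8\"\n"
          b"Content-Transfer-Encoding: 8bit\n"
          b"\n"
          b"na\xc3\xafve\n")

   msg = BytesParser(policy=policy.default).parsebytes(raw)

   buffer = StringIO()
   Generator(buffer, policy=policy.default).flatten(msg)
   text = buffer.getvalue()   # a str containing only ASCII characters

Although the policy's ``cte_type`` is ``8bit``, the text written by
:class:`Generator` contains only ASCII characters, because the body is
re-encoded on output.
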
.. class:: Generator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                     policy=None)

   Return a :class:`Generator` object that will write any message provided
   to the :meth:`flatten` method, or any text provided to the :meth:`write`
   method, to the :term:`file-like object` *outfp*.  *outfp* must support a
   ``write`` method that accepts string data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line.  *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in Unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*; if it is ``0``, do not rewrap any headers.  If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation.  If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation.  See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`Generator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit``, generate the message as if the option were set to ``7bit``.
      (This is required because strings cannot represent non-ASCII bytes.)
      Convert any bytes with the high bit set as needed using an
      ASCII-compatible :mailheader:`Content-Transfer-Encoding`.  That is,
      transform parts with non-ASCII :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII-compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object.  If the root object has
      no envelope header, craft a standard one.  The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the line separator between
      all the lines of the flattened message.  If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.


   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options, and *fp* as the new *outfp*.


   .. method:: write(s)

      Write *s* to the *write* method of the *outfp* passed to the
      :class:`Generator`'s constructor.  This provides just enough file-like
      API for :class:`Generator` instances to be used in the :func:`print`
      function.


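      For instance, a :class:`Generator` can be passed as the *file* argument
      of :func:`print`, as in this small sketch (the text printed is
      arbitrary):

      .. code-block:: python

         from io import StringIO
         from email.generator import Generator

         buffer = StringIO()
         gen = Generator(buffer)

         # print() only needs a write() method on its file argument.
         print("X-Note: written via print()", file=gen)
         output = buffer.getvalue()

      The printed text, including the trailing newline added by
      :func:`print`, ends up in ``buffer``.
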
As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_string` and ``str(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__str__`), which simplify the generation of
a formatted string representation of a message object.  For more detail, see
:mod:`email.message`.


The :mod:`email.generator` module also provides a derived class,
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are not serialized, but are instead
represented in the output stream by a string derived from a template filled
in with information about the part.

.. class:: DecodedGenerator(outfp, mangle_from_=None, maxheaderlen=None, \
                            fmt=None, *, policy=None)

   Act like :class:`Generator`, except that for any subpart of the message
   passed to :meth:`Generator.flatten`, if the subpart is of main type
   :mimetype:`text`, print the decoded payload of the subpart, and if the main
   type is not :mimetype:`text`, instead of printing it, fill in the string
   *fmt* using information from the part and print the resulting
   filled-in string.

   To fill in *fmt*, execute ``fmt % part_info``, where ``part_info``
   is a dictionary composed of the following keys and values:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   If *fmt* is ``None``, use the following default *fmt*:

      "[Non-text (%(type)s) part of message omitted, filename %(filename)s]"

   Optional *mangle_from_* and *maxheaderlen* are as with the
   :class:`Generator` base class.


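   The sketch below flattens a message with one text part and one image
   attachment, first with the default *fmt* and then with a custom one.  The
   message content, the placeholder image bytes, the custom template, and the
   ``summarize`` helper are all invented for the example:

   .. code-block:: python

      from io import StringIO
      from email.generator import DecodedGenerator
      from email.message import EmailMessage

      msg = EmailMessage()
      msg['Subject'] = 'Chart attached'
      msg.set_content("See the attached chart.\n")
      # Placeholder bytes standing in for real image data.
      msg.add_attachment(b"\x89PNG placeholder", maintype='image',
                         subtype='png', filename='chart.png')

      def summarize(fmt=None):
          """Hypothetical helper: flatten msg with a DecodedGenerator."""
          buffer = StringIO()
          DecodedGenerator(buffer, fmt=fmt).flatten(msg)
          return buffer.getvalue()

      default_summary = summarize()
      custom_summary = summarize("<%(maintype)s/%(subtype)s: %(filename)s>")

   In ``default_summary`` the text part appears as is and the attachment is
   replaced by the default placeholder line; in ``custom_summary`` the
   attachment is represented by ``<image/png: chart.png>`` instead.
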
.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for
       ``unixfrom``, and that there are no :mod:`~email.policy` settings
       calling for automatic adjustments (for example,
       :attr:`~email.policy.Policy.refold_source` must be ``none``, which is
       *not* the default).  It is also not 100% true, since if the message
       does not conform to the RFC standards, information about the exact
       original text is occasionally lost during parsing error recovery.  It
       is a goal to fix these latter edge cases when possible.