      2 :mod:`xml.parsers.expat` --- Fast XML parsing using Expat
      3 =========================================================
      4 
      5 .. module:: xml.parsers.expat
      6    :synopsis: An interface to the Expat non-validating XML parser.
.. moduleauthor:: Paul Prescod <paul@prescod.net>
      8 
      9 
     10 .. Markup notes:
     11 
     12    Many of the attributes of the XMLParser objects are callbacks.  Since
     13    signature information must be presented, these are described using the method
     14    directive.  Since they are attributes which are set by client code, in-text
     15    references to these attributes should be marked using the :member: role.
     16 
     17 
     18 .. warning::
     19 
     20    The :mod:`pyexpat` module is not secure against maliciously
     21    constructed data.  If you need to parse untrusted or unauthenticated data see
     22    :ref:`xml-vulnerabilities`.
     23 
     24 
     25 .. versionadded:: 2.0
     26 
     27 .. index:: single: Expat
     28 
     29 The :mod:`xml.parsers.expat` module is a Python interface to the Expat
     30 non-validating XML parser. The module provides a single extension type,
     31 :class:`xmlparser`, that represents the current state of an XML parser.  After
     32 an :class:`xmlparser` object has been created, various attributes of the object
     33 can be set to handler functions.  When an XML document is then fed to the
     34 parser, the handler functions are called for the character data and markup in
     35 the XML document.
     36 
     37 .. index:: module: pyexpat
     38 
     39 This module uses the :mod:`pyexpat` module to provide access to the Expat
     40 parser.  Direct use of the :mod:`pyexpat` module is deprecated.
     41 
     42 This module provides one exception and one type object:
     43 
     44 
     45 .. exception:: ExpatError
     46 
     47    The exception raised when Expat reports an error.  See section
     48    :ref:`expaterror-objects` for more information on interpreting Expat errors.
     49 
     50 
     51 .. exception:: error
     52 
     53    Alias for :exc:`ExpatError`.
     54 
     55 
     56 .. data:: XMLParserType
     57 
     58    The type of the return values from the :func:`ParserCreate` function.
     59 
     60 The :mod:`xml.parsers.expat` module contains two functions:
     61 
     62 
     63 .. function:: ErrorString(errno)
     64 
     65    Returns an explanatory string for a given error number *errno*.
     66 
     67 
     68 .. function:: ParserCreate([encoding[, namespace_separator]])
     69 
     70    Creates and returns a new :class:`xmlparser` object.   *encoding*, if specified,
     71    must be a string naming the encoding  used by the XML data.  Expat doesn't
     72    support as many encodings as Python does, and its repertoire of encodings can't
     73    be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII.  If
     74    *encoding* [1]_ is given it will override the implicit or explicit encoding of the
     75    document.
     76 
     77    Expat can optionally do XML namespace processing for you, enabled by providing a
     78    value for *namespace_separator*.  The value must be a one-character string; a
     79    :exc:`ValueError` will be raised if the string has an illegal length (``None``
     80    is considered the same as omission).  When namespace processing is enabled,
     81    element type names and attribute names that belong to a namespace will be
     82    expanded.  The element name passed to the element handlers
     83    :attr:`StartElementHandler` and :attr:`EndElementHandler` will be the
     84    concatenation of the namespace URI, the namespace separator character, and the
     85    local part of the name.  If the namespace separator is a zero byte (``chr(0)``)
     86    then the namespace URI and the local part will be concatenated without any
     87    separator.
     88 
     89    For example, if *namespace_separator* is set to a space character (``' '``) and
     90    the following document is parsed:
     91 
     92    .. code-block:: xml
     93 
     94       <?xml version="1.0"?>
     95       <root xmlns    = "http://default-namespace.org/"
     96             xmlns:py = "http://www.python.org/ns/">
     97         <py:elem1 />
     98         <elem2 xmlns="" />
     99       </root>
    100 
    101    :attr:`StartElementHandler` will receive the following strings for each
    102    element::
    103 
    104       http://default-namespace.org/ root
    105       http://www.python.org/ns/ elem1
    106       elem2
    107 
    108    Due to limitations in the ``Expat`` library used by :mod:`pyexpat`,
    109    the :class:`xmlparser` instance returned can only be used to parse a single
    110    XML document.  Call ``ParserCreate`` for each document to provide unique
    111    parser instances.
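
The namespace expansion described above can be reproduced with a short script. This is a minimal sketch: the helper name ``expanded_names`` is illustrative and not part of the module, and the arguments to ``ParserCreate`` are passed positionally (*encoding* left as ``None``).

```python
import xml.parsers.expat

DOCUMENT = """<?xml version="1.0"?>
<root xmlns    = "http://default-namespace.org/"
      xmlns:py = "http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>"""

def expanded_names(document, separator=" "):
    """Return the element names reported with namespace processing enabled."""
    names = []
    # A fresh parser is created for each document, because a parser
    # instance can only be used to parse a single document.
    parser = xml.parsers.expat.ParserCreate(None, separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(document, True)
    return names

names = expanded_names(DOCUMENT)
print(names)
```

Wrapping parser creation in a function like this is one way to make sure every document gets its own parser instance.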
    112 
    113 .. seealso::
    114 
    115    `The Expat XML Parser <http://www.libexpat.org/>`_
    116       Home page of the Expat project.
    117 
    118 
    119 .. _xmlparser-objects:
    120 
    121 XMLParser Objects
    122 -----------------
    123 
    124 :class:`xmlparser` objects have the following methods:
    125 
    126 
    127 .. method:: xmlparser.Parse(data[, isfinal])
    128 
    129    Parses the contents of the string *data*, calling the appropriate handler
    130    functions to process the parsed data.  *isfinal* must be true on the final call
    131    to this method; it allows the parsing of a single file in fragments,
    132    not the submission of multiple files.
    133    *data* can be the empty string at any time.
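
As an illustration of parsing one document in fragments, the sketch below feeds arbitrary pieces to ``Parse`` and finishes with an empty, final call. The helper name ``parse_in_fragments`` and the sample fragments are assumptions made for the example.

```python
import xml.parsers.expat

def parse_in_fragments(fragments):
    """Feed the string fragments of ONE document to a single parser."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    for fragment in fragments:
        parser.Parse(fragment, False)   # isfinal is false: more data follows
    parser.Parse("", True)              # empty data, isfinal true: end of document
    return seen

# Fragment boundaries are arbitrary; two of them fall inside a tag name.
elements = parse_in_fragments(["<root><fir", "st/><sec", "ond/></root>"])
print(elements)
```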
    134 
    135 
    136 .. method:: xmlparser.ParseFile(file)
    137 
    138    Parse XML data reading from the object *file*.  *file* only needs to provide
    139    the ``read(nbytes)`` method, returning the empty string when there's no more
    140    data.
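
A minimal sketch of ``ParseFile`` follows. It uses ``io.BytesIO`` as a stand-in for a file opened in binary mode, since any object with a suitable ``read(nbytes)`` method is enough. Supplying bytes rather than text is an assumption chosen so that the example also runs on Python 3 interpreters.

```python
import io
import xml.parsers.expat

texts = []
parser = xml.parsers.expat.ParserCreate()
parser.buffer_text = True                  # deliver the text in one piece
parser.CharacterDataHandler = texts.append

# io.BytesIO plays the role of a file object opened in binary mode.
source = io.BytesIO(b"<greeting>hello</greeting>")
parser.ParseFile(source)
print(texts)
```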
    141 
    142 
    143 .. method:: xmlparser.SetBase(base)
    144 
    145    Sets the base to be used for resolving relative URIs in system identifiers in
    146    declarations.  Resolving relative identifiers is left to the application: this
    147    value will be passed through as the *base* argument to the
    148    :func:`ExternalEntityRefHandler`, :func:`NotationDeclHandler`, and
    149    :func:`UnparsedEntityDeclHandler` functions.
    150 
    151 
    152 .. method:: xmlparser.GetBase()
    153 
    154    Returns a string containing the base set by a previous call to :meth:`SetBase`,
    155    or ``None`` if  :meth:`SetBase` hasn't been called.
    156 
    157 
    158 .. method:: xmlparser.GetInputContext()
    159 
    160    Returns the input data that generated the current event as a string. The data is
    161    in the encoding of the entity which contains the text. When called while an
    162    event handler is not active, the return value is ``None``.
    163 
    164    .. versionadded:: 2.1
    165 
    166 
    167 .. method:: xmlparser.ExternalEntityParserCreate(context[, encoding])
    168 
    169    Create a "child" parser which can be used to parse an external parsed entity
    170    referred to by content parsed by the parent parser.  The *context* parameter
    171    should be the string passed to the :meth:`ExternalEntityRefHandler` handler
    172    function, described below. The child parser is created with the
    173    :attr:`ordered_attributes`, :attr:`returns_unicode` and
    174    :attr:`specified_attributes` set to the values of this parser.
    175 
    176 .. method:: xmlparser.SetParamEntityParsing(flag)
    177 
    178    Control parsing of parameter entities (including the external DTD subset).
    179    Possible *flag* values are :const:`XML_PARAM_ENTITY_PARSING_NEVER`,
    180    :const:`XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE` and
    181    :const:`XML_PARAM_ENTITY_PARSING_ALWAYS`.  Return true if setting the flag
    182    was successful.
    183 
    184 .. method:: xmlparser.UseForeignDTD([flag])
    185 
    186    Calling this with a true value for *flag* (the default) will cause Expat to call
    187    the :attr:`ExternalEntityRefHandler` with :const:`None` for all arguments to
    188    allow an alternate DTD to be loaded.  If the document does not contain a
    189    document type declaration, the :attr:`ExternalEntityRefHandler` will still be
    190    called, but the :attr:`StartDoctypeDeclHandler` and
    191    :attr:`EndDoctypeDeclHandler` will not be called.
    192 
    193    Passing a false value for *flag* will cancel a previous call that passed a true
    194    value, but otherwise has no effect.
    195 
   This method can only be called before the :meth:`Parse` or :meth:`ParseFile`
   methods are called; calling it after either of those has been called causes
   :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
   :const:`errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING`.
    200 
    201    .. versionadded:: 2.3
    202 
    203 :class:`xmlparser` objects have the following attributes:
    204 
    205 
    206 .. attribute:: xmlparser.buffer_size
    207 
    208    The size of the buffer used when :attr:`buffer_text` is true.
    209    A new buffer size can be set by assigning a new integer value
    210    to this attribute.
    211    When the size is changed, the buffer will be flushed.
    212 
    213    .. versionadded:: 2.3
    214 
    215    .. versionchanged:: 2.6
    216       The buffer size can now be changed.
    217 
    218 .. attribute:: xmlparser.buffer_text
    219 
    220    Setting this to true causes the :class:`xmlparser` object to buffer textual
    221    content returned by Expat to avoid multiple calls to the
    222    :meth:`CharacterDataHandler` callback whenever possible.  This can improve
    223    performance substantially since Expat normally breaks character data into chunks
    224    at every line ending.  This attribute is false by default, and may be changed at
    225    any time.
    226 
    227    .. versionadded:: 2.3
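
The effect of ``buffer_text`` can be observed by counting the calls to the character data handler. The sketch below is illustrative only; ``character_chunks`` is a hypothetical helper, and the exact number of unbuffered chunks depends on how Expat splits the data.

```python
import xml.parsers.expat

DOCUMENT = "<note>line one\nline two</note>"

def character_chunks(document, buffer_text):
    """Return the list of strings passed to CharacterDataHandler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return chunks

unbuffered = character_chunks(DOCUMENT, False)
buffered = character_chunks(DOCUMENT, True)
print("%d unbuffered call(s), %d buffered call(s)"
      % (len(unbuffered), len(buffered)))
```

Either way the concatenated text is the same, so code that joins the chunks itself does not need buffering for correctness, only for speed.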
    228 
    229 
    230 .. attribute:: xmlparser.buffer_used
    231 
    232    If :attr:`buffer_text` is enabled, the number of bytes stored in the buffer.
    233    These bytes represent UTF-8 encoded text.  This attribute has no meaningful
    234    interpretation when :attr:`buffer_text` is false.
    235 
    236    .. versionadded:: 2.3
    237 
    238 
    239 .. attribute:: xmlparser.ordered_attributes
    240 
    241    Setting this attribute to a non-zero integer causes the attributes to be
    242    reported as a list rather than a dictionary.  The attributes are presented in
    243    the order found in the document text.  For each attribute, two list entries are
    244    presented: the attribute name and the attribute value.  (Older versions of this
    245    module also used this format.)  By default, this attribute is false; it may be
    246    changed at any time.
    247 
    248    .. versionadded:: 2.1
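
With ``ordered_attributes`` enabled, the flat list can be regrouped into name/value pairs. A small sketch, using a made-up ``point`` element:

```python
import xml.parsers.expat

captured = []
parser = xml.parsers.expat.ParserCreate()
parser.ordered_attributes = True
parser.StartElementHandler = lambda name, attrs: captured.append(attrs)
parser.Parse('<point y="2" x="1"/>', True)

flat = captured[0]                          # [name, value, name, value, ...]
pairs = list(zip(flat[0::2], flat[1::2]))   # regroup in document order
print(pairs)
```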
    249 
    250 
    251 .. attribute:: xmlparser.returns_unicode
    252 
    253    If this attribute is set to a non-zero integer, the handler functions will be
    254    passed Unicode strings.  If :attr:`returns_unicode` is :const:`False`, 8-bit
    255    strings containing UTF-8 encoded data will be passed to the handlers.  This is
    256    :const:`True` by default when Python is built with Unicode support.
    257 
    258    .. versionchanged:: 1.6
    259       Can be changed at any time to affect the result type.
    260 
    261 
    262 .. attribute:: xmlparser.specified_attributes
    263 
    264    If set to a non-zero integer, the parser will report only those attributes which
    265    were specified in the document instance and not those which were derived from
    266    attribute declarations.  Applications which set this need to be especially
    267    careful to use what additional information is available from the declarations as
    268    needed to comply with the standards for the behavior of XML processors.  By
    269    default, this attribute is false; it may be changed at any time.
    270 
    271    .. versionadded:: 2.1
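
The difference made by ``specified_attributes`` shows up when a DTD supplies a default value. The following sketch assumes a small internal subset that declares a default for a hypothetical ``status`` attribute; ``attributes_seen`` is an illustrative helper.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE item [
  <!ATTLIST item status CDATA "draft">
]>
<item id="42"/>"""

def attributes_seen(document, specified_only):
    """Return the attributes reported for the elements of *document*."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()
    parser.specified_attributes = specified_only
    parser.StartElementHandler = lambda name, attrs: seen.update(attrs)
    parser.Parse(document, True)
    return seen

with_defaults = attributes_seen(DOCUMENT, False)   # includes the DTD default
specified = attributes_seen(DOCUMENT, True)        # only what the document wrote
print(sorted(with_defaults))
print(sorted(specified))
```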
    272 
    273 The following attributes contain values relating to the most recent error
    274 encountered by an :class:`xmlparser` object, and will only have correct values
    275 once a call to :meth:`Parse` or :meth:`ParseFile` has raised an
    276 :exc:`xml.parsers.expat.ExpatError` exception.
    277 
    278 
    279 .. attribute:: xmlparser.ErrorByteIndex
    280 
    281    Byte index at which an error occurred.
    282 
    283 
    284 .. attribute:: xmlparser.ErrorCode
    285 
    286    Numeric code specifying the problem.  This value can be passed to the
    287    :func:`ErrorString` function, or compared to one of the constants defined in the
    288    ``errors`` object.
    289 
    290 
    291 .. attribute:: xmlparser.ErrorColumnNumber
    292 
    293    Column number at which an error occurred.
    294 
    295 
    296 .. attribute:: xmlparser.ErrorLineNumber
    297 
    298    Line number at which an error occurred.
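
A sketch of reading these attributes after a failed parse is shown below. The malformed input is invented for the example, and the comparison against ``errors`` goes through ``ErrorString`` on the assumption that the ``errors`` constants hold message strings, as they do in the CPython builds this was written against.

```python
import xml.parsers.expat as expat

parser = expat.ParserCreate()
try:
    # A second top-level element is not allowed after the document element.
    parser.Parse("<a></a><b/>", True)
except expat.ExpatError:
    message = expat.ErrorString(parser.ErrorCode)
    report = "%s at line %d, column %d (byte %d)" % (
        message,
        parser.ErrorLineNumber,
        parser.ErrorColumnNumber,
        parser.ErrorByteIndex,
    )
else:
    report = "no error"
print(report)
```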
    299 
    300 The following attributes contain values relating to the current parse location
    301 in an :class:`xmlparser` object.  During a callback reporting a parse event they
    302 indicate the location of the first of the sequence of characters that generated
    303 the event.  When called outside of a callback, the position indicated will be
    304 just past the last parse event (regardless of whether there was an associated
    305 callback).
    306 
    307 .. versionadded:: 2.4
    308 
    309 
    310 .. attribute:: xmlparser.CurrentByteIndex
    311 
    312    Current byte index in the parser input.
    313 
    314 
    315 .. attribute:: xmlparser.CurrentColumnNumber
    316 
    317    Current column number in the parser input.
    318 
    319 
    320 .. attribute:: xmlparser.CurrentLineNumber
    321 
    322    Current line number in the parser input.
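
The position attributes are most useful inside a handler, where they identify the start of the construct being reported. A minimal sketch with an invented three-line document:

```python
import xml.parsers.expat

DOCUMENT = "<a>\n  <b/>\n</a>"

positions = []
parser = xml.parsers.expat.ParserCreate()

def start_element(name, attrs):
    # Read inside the callback, these describe where the start tag begins.
    positions.append(
        (name, parser.CurrentLineNumber, parser.CurrentColumnNumber))

parser.StartElementHandler = start_element
parser.Parse(DOCUMENT, True)
for name, line, column in positions:
    print("%s starts at line %d, column %d" % (name, line, column))
```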
    323 
    324 Here is the list of handlers that can be set.  To set a handler on an
    325 :class:`xmlparser` object *o*, use ``o.handlername = func``.  *handlername* must
    326 be taken from the following list, and *func* must be a callable object accepting
    327 the correct number of arguments.  The arguments are all strings, unless
    328 otherwise stated.
    329 
    330 
    331 .. method:: xmlparser.XmlDeclHandler(version, encoding, standalone)
    332 
    333    Called when the XML declaration is parsed.  The XML declaration is the
    334    (optional) declaration of the applicable version of the XML recommendation, the
    335    encoding of the document text, and an optional "standalone" declaration.
    336    *version* and *encoding* will be strings of the type dictated by the
    337    :attr:`returns_unicode` attribute, and *standalone* will be ``1`` if the
    338    document is declared standalone, ``0`` if it is declared not to be standalone,
    339    or ``-1`` if the standalone clause was omitted. This is only available with
    340    Expat version 1.95.0 or newer.
    341 
    342    .. versionadded:: 2.1
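
As a brief illustration, the handler below records the three values it receives. The document is made up for the example.

```python
import xml.parsers.expat

declaration = {}

def xml_decl(version, encoding, standalone):
    declaration["version"] = version
    declaration["encoding"] = encoding
    declaration["standalone"] = standalone   # 1, 0, or -1 if omitted

parser = xml.parsers.expat.ParserCreate()
parser.XmlDeclHandler = xml_decl
parser.Parse(
    '<?xml version="1.0" encoding="utf-8" standalone="yes"?><root/>', True)
print(sorted(declaration.items()))
```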
    343 
    344 
    345 .. method:: xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset)
    346 
   Called when Expat begins parsing the document type declaration (``<!DOCTYPE
   ...``).  The *doctypeName* is provided exactly as presented.  The *systemId* and
   *publicId* parameters give the system and public identifiers if specified, or
   ``None`` if omitted.  *has_internal_subset* will be true if the document
   contains an internal document declaration subset. This requires Expat version
   1.2 or newer.
    353 
    354 
    355 .. method:: xmlparser.EndDoctypeDeclHandler()
    356 
    357    Called when Expat is done parsing the document type declaration. This requires
    358    Expat version 1.2 or newer.
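
The two doctype handlers can be paired to bracket the declaration. In this sketch the system identifier ``root.dtd`` is a hypothetical name that is never opened, since no external entity handling is set up.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE root SYSTEM "root.dtd" [
  <!ELEMENT root EMPTY>
]>
<root/>"""

events = []

def start_doctype(doctype_name, system_id, public_id, has_internal_subset):
    events.append(("start", doctype_name, system_id, public_id,
                   bool(has_internal_subset)))

def end_doctype():
    events.append(("end",))

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = start_doctype
parser.EndDoctypeDeclHandler = end_doctype
parser.Parse(DOCUMENT, True)
print(events)
```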
    359 
    360 
    361 .. method:: xmlparser.ElementDeclHandler(name, model)
    362 
    363    Called once for each element type declaration.  *name* is the name of the
    364    element type, and *model* is a representation of the content model.
    365 
    366 
    367 .. method:: xmlparser.AttlistDeclHandler(elname, attname, type, default, required)
    368 
    369    Called for each declared attribute for an element type.  If an attribute list
    370    declaration declares three attributes, this handler is called three times, once
    371    for each attribute.  *elname* is the name of the element to which the
    372    declaration applies and *attname* is the name of the attribute declared.  The
    373    attribute type is a string passed as *type*; the possible values are
    374    ``'CDATA'``, ``'ID'``, ``'IDREF'``, ... *default* gives the default value for
    375    the attribute used when the attribute is not specified by the document instance,
    376    or ``None`` if there is no default value (``#IMPLIED`` values).  If the
    377    attribute is required to be given in the document instance, *required* will be
    378    true. This requires Expat version 1.95.0 or newer.
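
To show the one-call-per-attribute behaviour, the sketch below declares two attributes in a single attribute list declaration. The element and attribute names are invented.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE item [
  <!ATTLIST item
      id     ID    #REQUIRED
      status CDATA "draft">
]>
<item id="a1"/>"""

declared = []

def attlist_decl(elname, attname, type_, default, required):
    # Called once per declared attribute, not once per <!ATTLIST ...>.
    declared.append((elname, attname, type_, default, bool(required)))

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = attlist_decl
parser.Parse(DOCUMENT, True)
for entry in declared:
    print(entry)
```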
    379 
    380 
    381 .. method:: xmlparser.StartElementHandler(name, attributes)
    382 
    383    Called for the start of every element.  *name* is a string containing the
    384    element name, and *attributes* is a dictionary mapping attribute names to their
    385    values.
    386 
    387 
    388 .. method:: xmlparser.EndElementHandler(name)
    389 
    390    Called for the end of every element.
    391 
    392 
    393 .. method:: xmlparser.ProcessingInstructionHandler(target, data)
    394 
    395    Called for every processing instruction.
    396 
    397 
    398 .. method:: xmlparser.CharacterDataHandler(data)
    399 
    400    Called for character data.  This will be called for normal character data, CDATA
    401    marked content, and ignorable whitespace.  Applications which must distinguish
    402    these cases can use the :attr:`StartCdataSectionHandler`,
    403    :attr:`EndCdataSectionHandler`, and :attr:`ElementDeclHandler` callbacks to
    404    collect the required information.
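
One way to tell CDATA content apart from ordinary character data is to track a flag from the CDATA section handlers. This is a sketch under the assumption that ``buffer_text`` is enabled, so each run of text arrives in a single call; the ``state`` dictionary is just an illustrative way to share the flag.

```python
import xml.parsers.expat

pieces = []
state = {"in_cdata": False}

def start_cdata():
    state["in_cdata"] = True

def end_cdata():
    state["in_cdata"] = False

def char_data(data):
    pieces.append((state["in_cdata"], data))

parser = xml.parsers.expat.ParserCreate()
parser.buffer_text = True
parser.StartCdataSectionHandler = start_cdata
parser.EndCdataSectionHandler = end_cdata
parser.CharacterDataHandler = char_data
parser.Parse("<r>plain<![CDATA[<raw>]]></r>", True)
print(pieces)
```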
    405 
    406 
    407 .. method:: xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName)
    408 
    409    Called for unparsed (NDATA) entity declarations.  This is only present for
    410    version 1.2 of the Expat library; for more recent versions, use
    411    :attr:`EntityDeclHandler` instead.  (The underlying function in the Expat
    412    library has been declared obsolete.)
    413 
    414 
    415 .. method:: xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName)
    416 
    417    Called for all entity declarations.  For parameter and internal entities,
    418    *value* will be a string giving the declared contents of the entity; this will
    419    be ``None`` for external entities.  The *notationName* parameter will be
    420    ``None`` for parsed entities, and the name of the notation for unparsed
    421    entities. *is_parameter_entity* will be true if the entity is a parameter entity
    422    or false for general entities (most applications only need to be concerned with
    423    general entities). This is only available starting with version 1.95.0 of the
    424    Expat library.
    425 
    426    .. versionadded:: 2.1
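
A minimal sketch of the handler for an internal general entity follows; the entity name and value are made up.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE root [
  <!ENTITY greeting "hello">
]>
<root>&greeting;</root>"""

entities = {}

def entity_decl(entity_name, is_parameter_entity, value, base,
                system_id, public_id, notation_name):
    entities[entity_name] = (
        bool(is_parameter_entity), value, system_id, notation_name)

parser = xml.parsers.expat.ParserCreate()
parser.EntityDeclHandler = entity_decl
parser.Parse(DOCUMENT, True)
print(entities)
```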
    427 
    428 
    429 .. method:: xmlparser.NotationDeclHandler(notationName, base, systemId, publicId)
    430 
   Called for notation declarations.  *notationName*, *base*, *systemId*, and
   *publicId* are strings if given.  If the public identifier is omitted,
   *publicId* will be ``None``.
    434 
    435 
    436 .. method:: xmlparser.StartNamespaceDeclHandler(prefix, uri)
    437 
    438    Called when an element contains a namespace declaration.  Namespace declarations
    439    are processed before the :attr:`StartElementHandler` is called for the element
    440    on which declarations are placed.
    441 
    442 
    443 .. method:: xmlparser.EndNamespaceDeclHandler(prefix)
    444 
    445    Called when the closing tag is reached for an element  that contained a
    446    namespace declaration.  This is called once for each namespace declaration on
    447    the element in the reverse of the order for which the
    448    :attr:`StartNamespaceDeclHandler` was called to indicate the start of each
    449    namespace declaration's scope.  Calls to this handler are made after the
    450    corresponding :attr:`EndElementHandler` for the end of the element.
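
The ordering of the namespace declaration handlers relative to the element handlers can be checked by logging events. The sketch assumes namespace processing is enabled with a space as the separator, and the document is invented for the example.

```python
import xml.parsers.expat

events = []
parser = xml.parsers.expat.ParserCreate(None, " ")
parser.StartNamespaceDeclHandler = (
    lambda prefix, uri: events.append(("ns-start", prefix)))
parser.EndNamespaceDeclHandler = (
    lambda prefix: events.append(("ns-end", prefix)))
parser.StartElementHandler = (
    lambda name, attrs: events.append(("start", name)))
parser.EndElementHandler = lambda name: events.append(("end", name))

parser.Parse(
    '<root xmlns:py="http://www.python.org/ns/"><py:item/></root>', True)
for event in events:
    print(event)
```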
    451 
    452 
    453 .. method:: xmlparser.CommentHandler(data)
    454 
    455    Called for comments.  *data* is the text of the comment, excluding the leading
    456    ``'<!-``\ ``-'`` and trailing ``'-``\ ``->'``.
    457 
    458 
    459 .. method:: xmlparser.StartCdataSectionHandler()
    460 
    461    Called at the start of a CDATA section.  This and :attr:`EndCdataSectionHandler`
    462    are needed to be able to identify the syntactical start and end for CDATA
    463    sections.
    464 
    465 
    466 .. method:: xmlparser.EndCdataSectionHandler()
    467 
    468    Called at the end of a CDATA section.
    469 
    470 
    471 .. method:: xmlparser.DefaultHandler(data)
    472 
    473    Called for any characters in the XML document for which no applicable handler
    474    has been specified.  This means characters that are part of a construct which
    475    could be reported, but for which no handler has been supplied.
    476 
    477 
    478 .. method:: xmlparser.DefaultHandlerExpand(data)
    479 
    480    This is the same as the :func:`DefaultHandler`,  but doesn't inhibit expansion
    481    of internal entities. The entity reference will not be passed to the default
    482    handler.
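
The sketch below illustrates how the default handler only receives what no other handler claims. ``unhandled_text`` is a hypothetical helper; with no other handlers set the whole document passes through, and once a comment handler is installed the comment no longer does.

```python
import xml.parsers.expat

DOCUMENT = '<r><!-- note --><a x="1">text</a></r>'

def unhandled_text(document, handle_comments):
    """Return everything that reached DefaultHandler, joined together."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.DefaultHandler = chunks.append
    if handle_comments:
        # Comments now have their own handler, so they bypass the default.
        parser.CommentHandler = lambda data: None
    parser.Parse(document, True)
    return "".join(chunks)

everything = unhandled_text(DOCUMENT, False)
without_comments = unhandled_text(DOCUMENT, True)
print(everything)
print(without_comments)
```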
    483 
    484 
    485 .. method:: xmlparser.NotStandaloneHandler()
    486 
   Called if the XML document hasn't been declared as being a standalone document.
   This happens when there is an external subset or a reference to a parameter
   entity, but the XML declaration does not set standalone to ``yes``.  If this
   handler returns ``0``, then the parser will raise an
   :const:`XML_ERROR_NOT_STANDALONE` error.  If this handler is not set, no
   exception is raised by the parser for this condition.
    493 
    494 
    495 .. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)
    496 
   Called for references to external entities.  *base* is the current base, as set
   by a previous call to :meth:`SetBase`.  The system and public identifiers,
   *systemId* and *publicId*, are strings if given; if the public identifier is not
   given, *publicId* will be ``None``.  The *context* value is opaque and should
   only be used as described below.
    502 
    503    For external entities to be parsed, this handler must be implemented. It is
    504    responsible for creating the sub-parser using
    505    ``ExternalEntityParserCreate(context)``, initializing it with the appropriate
    506    callbacks, and parsing the entity.  This handler should return an integer; if it
    507    returns ``0``, the parser will raise an
    508    :const:`XML_ERROR_EXTERNAL_ENTITY_HANDLING` error, otherwise parsing will
    509    continue.
    510 
    511    If this handler is not provided, external entities are reported by the
    512    :attr:`DefaultHandler` callback, if provided.
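
The sub-parser workflow described above might look like the following sketch. The ``ENTITY_SOURCES`` dictionary is a hypothetical in-memory stand-in for whatever mechanism an application uses to fetch entities, and it should only ever contain trusted data.

```python
import xml.parsers.expat

# Hypothetical lookup table mapping system identifiers to entity text.
ENTITY_SOURCES = {"greeting.xml": "hello from outside"}

DOCUMENT = """<!DOCTYPE root [
  <!ENTITY greeting SYSTEM "greeting.xml">
]>
<root>&greeting;</root>"""

texts = []
parser = xml.parsers.expat.ParserCreate()
parser.CharacterDataHandler = texts.append

def external_entity_ref(context, base, system_id, public_id):
    content = ENTITY_SOURCES.get(system_id)
    if content is None:
        return 0    # parent raises XML_ERROR_EXTERNAL_ENTITY_HANDLING
    # The child parser must be created from the opaque context value.
    child = parser.ExternalEntityParserCreate(context)
    child.CharacterDataHandler = texts.append
    child.Parse(content, True)
    return 1        # non-zero: the parent continues parsing

parser.ExternalEntityRefHandler = external_entity_ref
parser.Parse(DOCUMENT, True)
print("".join(texts))
```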
    513 
    514 
    515 .. _expaterror-objects:
    516 
    517 ExpatError Exceptions
    518 ---------------------
    519 
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
    521 
    522 
    523 :exc:`ExpatError` exceptions have a number of interesting attributes:
    524 
    525 
    526 .. attribute:: ExpatError.code
    527 
    528    Expat's internal error number for the specific error.  This will match one of
    529    the constants defined in the ``errors`` object from this module.
    530 
    531    .. versionadded:: 2.1
    532 
    533 
    534 .. attribute:: ExpatError.lineno
    535 
    536    Line number on which the error was detected.  The first line is numbered ``1``.
    537 
    538    .. versionadded:: 2.1
    539 
    540 
    541 .. attribute:: ExpatError.offset
    542 
    543    Character offset into the line where the error occurred.  The first column is
    544    numbered ``0``.
    545 
    546    .. versionadded:: 2.1
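
These attributes are enough to build a readable diagnostic. A minimal sketch, where ``describe_failure`` is an illustrative helper and the malformed document is invented:

```python
import xml.parsers.expat as expat

def describe_failure(document):
    """Return a one-line description of the parse error, or None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        return "%s (line %d, offset %d)" % (
            expat.ErrorString(err.code), err.lineno, err.offset)
    return None

# <item> is never closed, so the </root> on line 3 does not match.
report = describe_failure("<root>\n  <item>\n</root>")
print(report)
```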
    547 
    548 
    549 .. _expat-example:
    550 
    551 Example
    552 -------
    553 
    554 The following program defines three handlers that just print out their
    555 arguments. ::
    556 
   import xml.parsers.expat

   # 3 handler functions
   def start_element(name, attrs):
       print 'Start element:', name, attrs
   def end_element(name):
       print 'End element:', name
   def char_data(data):
       print 'Character data:', repr(data)

   p = xml.parsers.expat.ParserCreate()

   p.StartElementHandler = start_element
   p.EndElementHandler = end_element
   p.CharacterDataHandler = char_data

   p.Parse("""<?xml version="1.0"?>
   <parent id="top"><child1 name="paul">Text goes here</child1>
   <child2 name="fred">More text</child2>
   </parent>""", 1)
    577 
    578 The output from this program is::
    579 
   Start element: parent {'id': 'top'}
   Start element: child1 {'name': 'paul'}
   Character data: 'Text goes here'
   End element: child1
   Character data: '\n'
   Start element: child2 {'name': 'fred'}
   Character data: 'More text'
   End element: child2
   Character data: '\n'
   End element: parent
    590 
    591 
    592 .. _expat-content-models:
    593 
    594 Content Model Descriptions
    595 --------------------------
    596 
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
    598 
    599 
    600 Content models are described using nested tuples.  Each tuple contains four
    601 values: the type, the quantifier, the name, and a tuple of children.  Children
    602 are simply additional content model descriptions.
    603 
    604 The values of the first two fields are constants defined in the ``model`` object
    605 of the :mod:`xml.parsers.expat` module.  These constants can be collected in two
    606 groups: the model type group and the quantifier group.
    607 
    608 The constants in the model type group are:
    609 
    610 
    611 .. data:: XML_CTYPE_ANY
    612    :noindex:
    613 
    614    The element named by the model name was declared to have a content model of
    615    ``ANY``.
    616 
    617 
    618 .. data:: XML_CTYPE_CHOICE
    619    :noindex:
    620 
    621    The named element allows a choice from a number of options; this is used for
    622    content models such as ``(A | B | C)``.
    623 
    624 
    625 .. data:: XML_CTYPE_EMPTY
    626    :noindex:
    627 
    628    Elements which are declared to be ``EMPTY`` have this model type.
    629 
    630 
    631 .. data:: XML_CTYPE_MIXED
    632    :noindex:
    633 
    634 
    635 .. data:: XML_CTYPE_NAME
    636    :noindex:
    637 
    638 
    639 .. data:: XML_CTYPE_SEQ
    640    :noindex:
    641 
    642    Models which represent a series of models which follow one after the other are
    643    indicated with this model type.  This is used for models such as ``(A, B, C)``.
    644 
    645 The constants in the quantifier group are:
    646 
    647 
    648 .. data:: XML_CQUANT_NONE
    649    :noindex:
    650 
    651    No modifier is given, so it can appear exactly once, as for ``A``.
    652 
    653 
    654 .. data:: XML_CQUANT_OPT
    655    :noindex:
    656 
    657    The model is optional: it can appear once or not at all, as for ``A?``.
    658 
    659 
    660 .. data:: XML_CQUANT_PLUS
    661    :noindex:
    662 
    663    The model must occur one or more times (like ``A+``).
    664 
    665 
    666 .. data:: XML_CQUANT_REP
    667    :noindex:
    668 
    669    The model must occur zero or more times, as for ``A*``.
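
To make the nested-tuple layout concrete, the sketch below captures the model passed to ``ElementDeclHandler`` for a small, invented declaration and unpacks its four fields.

```python
import xml.parsers.expat as expat

DOCUMENT = """<!DOCTYPE root [
  <!ELEMENT root (title, note?)>
]>
<root><title/></root>"""

models = {}
parser = expat.ParserCreate()
parser.ElementDeclHandler = lambda name, model: models.update({name: model})
parser.Parse(DOCUMENT, True)

# Each model is (type, quantifier, name, children).
model_type, quantifier, name, children = models["root"]
child_summary = [(child[2], child[1]) for child in children]
print(model_type == expat.model.XML_CTYPE_SEQ)
print(child_summary)
```

Because children are themselves content model descriptions, a recursive function over the fourth field is a natural way to walk deeper models.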
    670 
    671 
    672 .. _expat-errors:
    673 
    674 Expat error constants
    675 ---------------------
    676 
    677 The following constants are provided in the ``errors`` object of the
    678 :mod:`xml.parsers.expat` module.  These constants are useful in interpreting
    679 some of the attributes of the :exc:`ExpatError` exception objects raised when an
    680 error has occurred.
    681 
    682 The ``errors`` object has the following attributes:
    683 
    684 
    685 .. data:: XML_ERROR_ASYNC_ENTITY
    686    :noindex:
    687 
    688 
    689 .. data:: XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
    690    :noindex:
    691 
    692    An entity reference in an attribute value referred to an external entity instead
    693    of an internal entity.
    694 
    695 
    696 .. data:: XML_ERROR_BAD_CHAR_REF
    697    :noindex:
    698 
    699    A character reference referred to a character which is illegal in XML (for
    700    example, character ``0``, or '``&#0;``').
    701 
    702 
    703 .. data:: XML_ERROR_BINARY_ENTITY_REF
    704    :noindex:
    705 
    706    An entity reference referred to an entity which was declared with a notation, so
    707    cannot be parsed.
    708 
    709 
    710 .. data:: XML_ERROR_DUPLICATE_ATTRIBUTE
    711    :noindex:
    712 
    713    An attribute was used more than once in a start tag.
    714 
    715 
    716 .. data:: XML_ERROR_INCORRECT_ENCODING
    717    :noindex:
    718 
    719 
    720 .. data:: XML_ERROR_INVALID_TOKEN
    721    :noindex:
    722 
    723    Raised when an input byte could not properly be assigned to a character; for
    724    example, a NUL byte (value ``0``) in a UTF-8 input stream.
    725 
    726 
    727 .. data:: XML_ERROR_JUNK_AFTER_DOC_ELEMENT
    728    :noindex:
    729 
    730    Something other than whitespace occurred after the document element.
    731 
    732 
    733 .. data:: XML_ERROR_MISPLACED_XML_PI
    734    :noindex:
    735 
    736    An XML declaration was found somewhere other than the start of the input data.
    737 
    738 
    739 .. data:: XML_ERROR_NO_ELEMENTS
    740    :noindex:
    741 
   The document contains no elements (XML requires all documents to contain exactly
   one top-level element).
    744 
    745 
    746 .. data:: XML_ERROR_NO_MEMORY
    747    :noindex:
    748 
    749    Expat was not able to allocate memory internally.
    750 
    751 
    752 .. data:: XML_ERROR_PARAM_ENTITY_REF
    753    :noindex:
    754 
    755    A parameter entity reference was found where it was not allowed.
    756 
    757 
    758 .. data:: XML_ERROR_PARTIAL_CHAR
    759    :noindex:
    760 
    761    An incomplete character was found in the input.
    762 
    763 
    764 .. data:: XML_ERROR_RECURSIVE_ENTITY_REF
    765    :noindex:
    766 
    767    An entity reference contained another reference to the same entity; possibly via
    768    a different name, and possibly indirectly.
    769 
    770 
    771 .. data:: XML_ERROR_SYNTAX
    772    :noindex:
    773 
    774    Some unspecified syntax error was encountered.
    775 
    776 
    777 .. data:: XML_ERROR_TAG_MISMATCH
    778    :noindex:
    779 
    780    An end tag did not match the innermost open start tag.


.. data:: XML_ERROR_UNCLOSED_TOKEN
   :noindex:

   Some token (such as a start tag) was not closed before the end of the stream or
   the next token was encountered.


.. data:: XML_ERROR_UNDEFINED_ENTITY
   :noindex:

   A reference was made to an entity which was not defined.


.. data:: XML_ERROR_UNKNOWN_ENCODING
   :noindex:

   The document encoding is not supported by Expat.


.. data:: XML_ERROR_UNCLOSED_CDATA_SECTION
   :noindex:

   A CDATA marked section was not closed.


.. data:: XML_ERROR_EXTERNAL_ENTITY_HANDLING
   :noindex:


.. data:: XML_ERROR_NOT_STANDALONE
   :noindex:

   The parser determined that the document was not "standalone" even though its
   XML declaration declared it to be, and the :attr:`NotStandaloneHandler` was
   set and returned ``0``.


.. data:: XML_ERROR_UNEXPECTED_STATE
   :noindex:


.. data:: XML_ERROR_ENTITY_DECLARED_IN_PE
   :noindex:


.. data:: XML_ERROR_FEATURE_REQUIRES_XML_DTD
   :noindex:

   An operation was requested that requires DTD support to be compiled in, but
   Expat was configured without DTD support.  This should never be reported by a
   standard build of the :mod:`xml.parsers.expat` module.


.. data:: XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING
   :noindex:

   A behavioral change was requested after parsing had started, but the setting
   can only be changed before parsing begins.  This is (currently) only raised by
   :meth:`UseForeignDTD`.
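
The ordering constraint can be sketched as follows.  The example assumes a
standard build with DTD support and that the failure surfaces as an
:exc:`ExpatError`; the variable name ``late_change_error`` is chosen only for
illustration.

```python
from xml.parsers import expat
from xml.parsers.expat import errors

parser = expat.ParserCreate()

# Feed a first, non-final chunk: parsing has now started.
parser.Parse("<root>", False)

try:
    # Too late: this setting must be chosen before the first Parse() call.
    parser.UseForeignDTD(True)
    late_change_error = None
except expat.ExpatError as err:
    late_change_error = expat.ErrorString(err.code)

rejected = late_change_error == errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING
```

Calling :meth:`UseForeignDTD` immediately after creating the parser avoids the
error.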


.. data:: XML_ERROR_UNBOUND_PREFIX
   :noindex:

   An undeclared prefix was found when namespace processing was enabled.
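
Because this error depends on namespace processing, the same input may be
accepted by one parser and rejected by another.  The sketch below contrasts
the two cases; the helper ``error_string`` and the sample document are
hypothetical, and namespace processing is assumed to be switched on by
passing ``namespace_separator`` to :func:`ParserCreate`.

```python
from xml.parsers import expat
from xml.parsers.expat import errors

# The prefix "p" is never declared with an xmlns:p attribute.
document = "<p:root/>"


def error_string(parser, data):
    """Parse ``data`` as a complete document; return the error string or None."""
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


# Without namespace processing the colon is just part of the element name.
plain_result = error_string(expat.ParserCreate(), document)

# With namespace processing the undeclared prefix is an error.
namespace_result = error_string(
    expat.ParserCreate(namespace_separator=" "), document)

prefix_rejected = namespace_result == errors.XML_ERROR_UNBOUND_PREFIX
```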


.. data:: XML_ERROR_UNDECLARING_PREFIX
   :noindex:

   The document attempted to remove the namespace declaration associated with a
   prefix.


.. data:: XML_ERROR_INCOMPLETE_PE
   :noindex:

   A parameter entity contained incomplete markup.


.. data:: XML_ERROR_XML_DECL
   :noindex:

   The XML declaration was not well-formed.


.. data:: XML_ERROR_TEXT_DECL
   :noindex:

   There was an error parsing a text declaration in an external entity.


.. data:: XML_ERROR_PUBLICID
   :noindex:

   Characters were found in the public id that are not allowed.


.. data:: XML_ERROR_SUSPENDED
   :noindex:

   An operation that is not allowed on a suspended parser was requested.  This
   includes attempts to provide additional input or to stop the parser.


.. data:: XML_ERROR_NOT_SUSPENDED
   :noindex:

   An attempt to resume the parser was made when the parser had not been suspended.


.. data:: XML_ERROR_ABORTED
   :noindex:

   This should not be reported to Python applications.


.. data:: XML_ERROR_FINISHED
   :noindex:

   An operation that is not allowed on a parser that has finished parsing its
   input was requested.  This includes attempts to provide additional input or to
   stop the parser.
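
The most common way to run into this error from Python is probably to reuse a
parser object for a second document.  A minimal sketch, with made-up input and
an illustrative variable name ``reuse_error``:

```python
from xml.parsers import expat
from xml.parsers.expat import errors

parser = expat.ParserCreate()

# Passing True marks the final chunk, so the parser is finished afterwards.
parser.Parse("<root/>", True)

try:
    # A finished parser does not accept further input.
    parser.Parse("<more/>", True)
    reuse_error = None
except expat.ExpatError as err:
    reuse_error = expat.ErrorString(err.code)

finished = reuse_error == errors.XML_ERROR_FINISHED
```

Creating a new parser with :func:`ParserCreate` for each document avoids the
problem.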


.. data:: XML_ERROR_SUSPEND_PE
   :noindex:


.. rubric:: Footnotes

.. [#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.