:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/__init__.py`

--------------

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python.  The package itself provides the
SAX exceptions and the convenience functions that users of the SAX API will use
most often.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data.  If you need to parse untrusted or unauthenticated data, see
   :ref:`xml-vulnerabilities`.


The convenience functions are:


.. function:: make_parser(parser_list=[])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object.  The
   first parser found will be used.  If *parser_list* is provided, it must be a
   sequence of strings which name modules that have a function named
   :func:`create_parser`.  Modules listed in *parser_list* will be used before
   modules in the default list of parsers.
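
   A minimal sketch of driving the returned reader by hand is shown below.  The
   ``ElementNames`` handler is a hypothetical helper written for this example,
   and the document is supplied from an in-memory byte stream rather than a
   file.

   .. code-block:: python

      import io
      import xml.sax


      class ElementNames(xml.sax.ContentHandler):
          """Hypothetical handler that records element names in document order."""

          def __init__(self):
              super().__init__()
              self.names = []

          def startElement(self, name, attrs):
              self.names.append(name)


      # Obtain a reader from the default parser list and attach the handler.
      parser = xml.sax.make_parser()
      handler = ElementNames()
      parser.setContentHandler(handler)

      # The reader accepts a file object as its input source.
      parser.parse(io.BytesIO(b"<root><item/><item/></root>"))
      print(handler.names)
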


.. function:: parse(filename_or_stream, handler, error_handler=handler.ErrorHandler())

   Create a SAX parser and use it to parse a document.  The document, passed in as
   *filename_or_stream*, can be a filename or a file object.  The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance.  If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.
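
   Because there is no return value, results have to be accumulated on the
   handler.  The sketch below illustrates this with a hypothetical
   ``TextCollector`` handler and an in-memory file object standing in for a
   real file.

   .. code-block:: python

      import io
      import xml.sax


      class TextCollector(xml.sax.ContentHandler):
          """Hypothetical handler that gathers character data."""

          def __init__(self):
              super().__init__()
              self.chunks = []

          def characters(self, content):
              # The parser may deliver text in several pieces, so collect them all.
              self.chunks.append(content)


      collector = TextCollector()
      xml.sax.parse(io.BytesIO(b"<greeting>Hello, SAX</greeting>"), collector)

      # Join the pieces once parsing has finished.
      text = "".join(collector.chunks)
      print(text)
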


.. function:: parseString(string, handler, error_handler=handler.ErrorHandler())

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.  *string* must be a :class:`str` instance or a
   :term:`bytes-like object`.
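
   As a brief illustration, the sketch below feeds the same document to
   :func:`parseString` once as :class:`str` and once as :class:`bytes`.  The
   ``IdCollector`` handler is a hypothetical name used only for this example.

   .. code-block:: python

      import xml.sax


      class IdCollector(xml.sax.ContentHandler):
          """Hypothetical handler that records the ``id`` attribute of each element."""

          def __init__(self):
              super().__init__()
              self.ids = []

          def startElement(self, name, attrs):
              if "id" in attrs:
                  self.ids.append(attrs["id"])


      document = '<doc><p id="a"/><p id="b"/></doc>'

      # Parse from a str instance.
      from_str = IdCollector()
      xml.sax.parseString(document, from_str)

      # Parse the same document from a bytes object.
      from_bytes = IdCollector()
      xml.sax.parseString(document.encode("utf-8"), from_bytes)

      print(from_str.ids, from_bytes.ids)
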

   .. versionchanged:: 3.5
      Added support for :class:`str` instances.

A typical SAX application uses three kinds of objects: readers, handlers and
input sources.  "Reader" in this context is another term for parser, i.e., some
piece of code that reads the bytes or characters from the input source and
produces a sequence of events.  The events then get distributed to the handler
objects, i.e., the reader invokes a method on the handler.  A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect all of these objects together.  As the final step of
preparation, the reader is called to parse the input.  During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.
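
The steps described above can be sketched as follows.  This is only an
illustration: the ``EventLog`` handler is a hypothetical class, and the input
source wraps an in-memory byte stream instead of a real file.

.. code-block:: python

   import io
   import xml.sax
   from xml.sax.xmlreader import InputSource


   class EventLog(xml.sax.ContentHandler):
       """Hypothetical handler that records the events it receives."""

       def __init__(self):
           super().__init__()
           self.events = []

       def startDocument(self):
           self.events.append("startDocument")

       def startElement(self, name, attrs):
           self.events.append("start:" + name)

       def endElement(self, name):
           self.events.append("end:" + name)

       def endDocument(self):
           self.events.append("endDocument")


   # 1. Obtain a reader.
   reader = xml.sax.make_parser()

   # 2. Create the input source.
   source = InputSource()
   source.setByteStream(io.BytesIO(b"<a><b/></a>"))

   # 3. Create the handlers and connect them to the reader.
   handler = EventLog()
   reader.setContentHandler(handler)
   reader.setErrorHandler(xml.sax.ErrorHandler())

   # 4. Ask the reader to parse; it invokes the handler methods as it goes.
   reader.parse(source)
   print(handler.events)
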

For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself.  Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes.  The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`.  The handler interfaces are defined in
:mod:`xml.sax.handler`.  For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`.  These interfaces are described below.

In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg, exception=None)

   Encapsulate an XML error or warning.  This class can contain basic error or
   warning information from either the XML parser or the application; it can be
   subclassed to provide additional functionality or to add localization.  Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception; it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.


.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors.  Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error.  This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.
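
   The sketch below shows one way to read the location information from a
   caught exception.  The malformed document and the ``report`` dictionary are
   made up for this example; the exact message text and column depend on the
   underlying parser.

   .. code-block:: python

      import xml.sax

      # The <unclosed> element is never closed, so the document is not well-formed.
      malformed = b"<root><unclosed></root>"

      report = None
      try:
          xml.sax.parseString(malformed, xml.sax.ContentHandler())
      except xml.sax.SAXParseException as err:
          # Locator methods give the position; getMessage() describes the error.
          report = {
              "message": err.getMessage(),
              "line": err.getLineNumber(),
              "column": err.getColumnNumber(),
          }

      print(report)
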


.. exception:: SAXNotRecognizedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property.  SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support.  SAX applications and extensions may use this
   class for similar purposes.
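
   The following sketch triggers both :exc:`SAXNotRecognizedException` and
   :exc:`SAXNotSupportedException`.  It assumes the default expat-based reader
   shipped with CPython, which does not support validation; other readers may
   behave differently.  The feature URI in the first call is invented for the
   example.

   .. code-block:: python

      import xml.sax
      from xml.sax.handler import feature_validation

      reader = xml.sax.make_parser()
      outcomes = {}

      # Asking about a feature name the reader has never heard of.
      try:
          reader.getFeature("http://example.org/sax/features/made-up")
      except xml.sax.SAXNotRecognizedException as err:
          outcomes["unknown feature"] = type(err).__name__

      # Asking for a known feature that this reader cannot provide
      # (assumption: the expat-based reader, which is non-validating).
      try:
          reader.setFeature(feature_validation, True)
      except xml.sax.SAXNotSupportedException as err:
          outcomes["validation"] = type(err).__name__

      print(outcomes)
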


.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API.  It provides a
      Java implementation and online documentation.  Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.
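
   A short sketch of both accessor methods follows.  The wrapped
   :exc:`ValueError` and the message strings are invented for illustration.

   .. code-block:: python

      from xml.sax import SAXException

      # An application-level error being passed along inside a SAX exception.
      cause = ValueError("bad price")
      wrapped = SAXException("could not convert element text", cause)

      message = wrapped.getMessage()    # the msg given at construction
      inner = wrapped.getException()    # the encapsulated exception

      # Without an encapsulated exception, getException() returns None.
      plain = SAXException("plain message")
      no_inner = plain.getException()

      print(message, repr(inner), no_inner)
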
    166