:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.
.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


.. versionadded:: 2.0

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python.  The package itself provides the
SAX exceptions and the convenience functions that users of the SAX API will
need most often.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data.  If you need to parse untrusted or unauthenticated data, see
   :ref:`xml-vulnerabilities`.


The convenience functions are:


.. function:: make_parser([parser_list])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object.  The
   first parser found will be used.  If *parser_list* is provided, it must be a
   sequence of strings which name modules that have a function named
   :func:`create_parser`.  Modules listed in *parser_list* will be used before
   modules in the default list of parsers.


.. function:: parse(filename_or_stream, handler[, error_handler])

   Create a SAX parser and use it to parse a document.  The document, passed in as
   *filename_or_stream*, can be a filename or a file object.  The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance.  If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.


.. function:: parseString(string, handler[, error_handler])

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.

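As a rough illustration of the handler-driven style these functions share, the
sketch below feeds a small in-memory document to :func:`parseString`.  The
``ElementCounter`` class and the sample document are made up for this example,
and the input is passed as a byte string on the assumption that this form is
accepted across Python versions.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


handler = ElementCounter()
# parseString() has no return value; results accumulate on the handler.
xml.sax.parseString(b"<doc><item/><item/></doc>", handler)
print(handler.counts)
```

Because nothing is returned, any result the application needs has to be stored
on the handler, as ``counts`` is here.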
A typical SAX application uses three kinds of objects: readers, handlers and
input sources.  "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source and
produces a sequence of events.  The events are then distributed to the handler
objects, i.e. the reader invokes a method on the handler.  A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect all of these objects together.  As the final step of
preparation, the reader is called to parse the input.  During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.

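The steps above might be wired together as follows.  This is a minimal sketch,
assuming the default parser returned by :func:`make_parser`; ``TextCollector``
and the sample document are hypothetical and serve only as illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect and join later.
        self.chunks.append(content)


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the handler and connect it to the reader.
handler = TextCollector()
reader.setContentHandler(handler)

# 3. Create an input source that wraps an in-memory byte stream.
source = InputSource()
source.setByteStream(io.BytesIO(b"<greeting>hello</greeting>"))

# 4. Ask the reader to parse; handler methods fire as events occur.
reader.parse(source)
text = "".join(handler.chunks)
print(text)
```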
For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself.  Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes.  The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`.  The handler interfaces are defined in
:mod:`xml.sax.handler`.  For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`.  These interfaces are described below.

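A quick way to see the convenience re-exports mentioned above is to compare
the names in :mod:`xml.sax` with those in the defining modules.  This is only
an illustrative check; the variable names are arbitrary.

```python
import xml.sax
import xml.sax.handler
import xml.sax.xmlreader

# Each name in xml.sax refers to the same class object as the one
# in the module that defines it.
same_input_source = xml.sax.InputSource is xml.sax.xmlreader.InputSource
same_content_handler = xml.sax.ContentHandler is xml.sax.handler.ContentHandler
same_error_handler = xml.sax.ErrorHandler is xml.sax.handler.ErrorHandler
print(same_input_source, same_content_handler, same_error_handler)
```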
In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg[, exception])

   Encapsulate an XML error or warning.  This class can contain basic error or
   warning information from either the XML parser or the application.  It can be
   subclassed to provide additional functionality or to add localization.  Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception; it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.


.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors.  Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error.  This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.


.. exception:: SAXNotRecognizedException(msg[, exception])

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property.  SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg[, exception])

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support.  SAX applications and extensions may use this
   class for similar purposes.


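The following sketch shows how two of these exceptions might surface in
practice.  It assumes the default parser and error handling; the malformed
document and the feature URI are invented for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler

try:
    # The closing tag does not match, so the input is not well-formed.
    xml.sax.parseString(b"<doc><item></doc>", ContentHandler())
except xml.sax.SAXParseException as err:
    # The exception also exposes the Locator interface.
    message = err.getMessage()
    line = err.getLineNumber()
    column = err.getColumnNumber()
    print("parse error at line %d, column %d: %s" % (line, column, message))

reader = xml.sax.make_parser()
try:
    # Hypothetical feature name that no parser is expected to know.
    reader.getFeature("http://example.org/no-such-feature")
except xml.sax.SAXNotRecognizedException as err:
    unrecognized = err.getMessage()
    print(unrecognized)
```

Since both classes derive from :exc:`SAXException`, an application that does
not need to distinguish them could catch the base class instead.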
.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API.  It provides a
      Java implementation and online documentation.  Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.

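A short sketch of both methods, using :exc:`SAXException` purely as a
container for information.  The messages and the wrapped :exc:`ValueError` are
made up for illustration.

```python
from xml.sax import SAXException

# Wrap a lower-level exception together with a readable message.
cause = ValueError("bad value")
wrapped = SAXException("could not process record", cause)

# Without the optional second argument there is no encapsulated exception.
plain = SAXException("something went wrong")

print(wrapped.getMessage())
print(wrapped.getException() is cause)
print(plain.getException())
```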
    161