****************************
  What's New in Python 2.3
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew23.tex 54631 2007-03-31 11:58:36Z georg.brandl $

This article explains the new features in Python 2.3.  Python 2.3 was released
on July 29, 2003.

The main themes for Python 2.3 are polishing some of the features added in 2.2,
adding various small but useful enhancements to the core language, and expanding
the standard library.  The new object model introduced in the previous version
has benefited from 18 months of bugfixes and from optimization efforts that have
improved the performance of new-style classes.  A few new built-in functions
have been added such as :func:`sum` and :func:`enumerate`.  The :keyword:`in`
operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns
:const:`True`).

Some of the many new library features include Boolean, set, heap, and date/time
data types, the ability to import modules from ZIP-format archives, metadata
support for the long-awaited Python catalog, an updated version of IDLE, and
modules for logging messages, wrapping text, parsing CSV files, processing
command-line options, using BerkeleyDB databases...  the list of new and
enhanced modules is lengthy.

This article doesn't attempt to provide a complete specification of the new
features, but instead provides a convenient overview.  For full details, you
should refer to the documentation for Python 2.3, such as the Python Library
Reference and the Python Reference Manual.  If you want to understand the
complete implementation and design rationale, refer to the PEP for a particular
new feature.

.. ======================================================================


PEP 218: A Standard Set Datatype
================================

The new :mod:`sets` module contains an implementation of a set datatype.  The
:class:`Set` class is for mutable sets, sets that can have members added and
removed.  The :class:`ImmutableSet` class is for sets that can't be modified,
and instances of :class:`ImmutableSet` can therefore be used as dictionary keys.
Sets are built on top of dictionaries, so the elements within a set must be
hashable.

Here's a simple example::

   >>> import sets
   >>> S = sets.Set([1,2,3])
   >>> S
   Set([1, 2, 3])
   >>> 1 in S
   True
   >>> 0 in S
   False
   >>> S.add(5)
   >>> S.remove(3)
   >>> S
   Set([1, 2, 5])
   >>>

The union and intersection of sets can be computed with the :meth:`union` and
:meth:`intersection` methods; an alternative notation uses the bitwise operators
``|`` and ``&``. Mutable sets also have in-place versions of these methods,
:meth:`union_update` and :meth:`intersection_update`. ::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([4,5,6])
   >>> S1.union(S2)
   Set([1, 2, 3, 4, 5, 6])
   >>> S1 | S2                  # Alternative notation
   Set([1, 2, 3, 4, 5, 6])
   >>> S1.intersection(S2)
   Set([])
   >>> S1 & S2                  # Alternative notation
   Set([])
   >>> S1.union_update(S2)
   >>> S1
   Set([1, 2, 3, 4, 5, 6])
   >>>

It's also possible to take the symmetric difference of two sets.  This is the
set of all elements in the union that aren't in the intersection.  Another way
of putting it is that the symmetric difference contains all elements that are in
exactly one set.  Again, there's an alternative notation (``^``), and an
in-place version with the ungainly name :meth:`symmetric_difference_update`. ::

   >>> S1 = sets.Set([1,2,3,4])
   >>> S2 = sets.Set([3,4,5,6])
   >>> S1.symmetric_difference(S2)
   Set([1, 2, 5, 6])
   >>> S1 ^ S2
   Set([1, 2, 5, 6])
   >>>

There are also :meth:`issubset` and :meth:`issuperset` methods for checking
whether one set is a subset or superset of another::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([2,3])
   >>> S2.issubset(S1)
   True
   >>> S1.issubset(S2)
   False
   >>> S1.issuperset(S2)
   True
   >>>

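The remark that :class:`ImmutableSet` instances can be used as dictionary keys
can be illustrated with a short sketch.  The fallback to the built-in ``set``
and ``frozenset`` types is an assumption made so that the example also runs on
later interpreters where the :mod:`sets` module is no longer available; the
``routes`` dictionary and its contents are invented for illustration.

.. code-block:: python

   try:
       from sets import Set, ImmutableSet      # Python 2.3's sets module
   except ImportError:
       # Assumption: a later interpreter, where built-in types replace the module.
       Set, ImmutableSet = set, frozenset

   # An immutable set is hashable, so it can serve as a dictionary key.
   endpoints = ImmutableSet(['A', 'B'])
   routes = {endpoints: 'ferry'}

   # Lookup depends only on membership, not on the order of the elements.
   found = routes[ImmutableSet(['B', 'A'])]

   # A mutable set is not hashable and is rejected as a key.
   try:
       routes[Set(['C', 'D'])] = 'bridge'
       mutable_key_rejected = False
   except TypeError:
       mutable_key_rejected = True
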

.. seealso::

   :pep:`218` - Adding a Built-In Set Object Type
      PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and
      GvR.

.. ======================================================================


.. _section-generators:

PEP 255: Simple Generators
==========================

In Python 2.2, generators were added as an optional feature, to be enabled by a
``from __future__ import generators`` directive.  In 2.3 generators no longer
need to be specially enabled, and are now always present; this means that
:keyword:`yield` is now always a keyword.  The rest of this section is a copy of
the description of generators from the "What's New in Python 2.2" document; if
you read it back when Python 2.2 came out, you can skip the rest of this
section.

You're doubtless familiar with how function calls work in Python or C. When you
call a function, it gets a private namespace where its local variables are
created.  When the function reaches a :keyword:`return` statement, the local
variables are destroyed and the resulting value is returned to the caller.  A
later call to the same function will get a fresh new set of local variables.
But, what if the local variables weren't thrown away on exiting a function?
What if you could later resume the function where it left off?  This is what
generators provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function::

   def generate_ints(N):
       for i in range(N):
           yield i

A new keyword, :keyword:`yield`, was introduced for generators.  Any function
containing a :keyword:`yield` statement is a generator function; this is
detected by Python's bytecode compiler which compiles the function specially as
a result.

When you call a generator function, it doesn't return a single value; instead it
returns a generator object that supports the iterator protocol.  On executing
the :keyword:`yield` statement, the generator outputs the value of ``i``,
similar to a :keyword:`return` statement.  The big difference between
:keyword:`yield` and a :keyword:`return` statement is that on reaching a
:keyword:`yield` the generator's state of execution is suspended and local
variables are preserved.  On the next call to the generator's ``.next()``
method, the function will resume executing immediately after the
:keyword:`yield` statement.  (For complicated reasons, the :keyword:`yield`
statement isn't allowed inside the :keyword:`try` block of a :keyword:`try`...\
:keyword:`finally` statement; read :pep:`255` for a full explanation of the
interaction between :keyword:`yield` and exceptions.)

Here's a sample usage of the :func:`generate_ints` generator::

   >>> gen = generate_ints(3)
   >>> gen
   <generator object at 0x8117f90>
   >>> gen.next()
   0
   >>> gen.next()
   1
   >>> gen.next()
   2
   >>> gen.next()
   Traceback (most recent call last):
     File "stdin", line 1, in ?
     File "stdin", line 2, in generate_ints
   StopIteration

You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
generate_ints(3)``.

Inside a generator function, the :keyword:`return` statement can only be used
without a value, and signals the end of the procession of values; afterwards the
generator cannot return any further values. :keyword:`return` with a value, such
as ``return 5``, is a syntax error inside a generator function.  The end of the
generator's results can also be indicated by raising :exc:`StopIteration`
manually, or by just letting the flow of execution fall off the bottom of the
function.

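As a small illustration of the rule just described, the following sketch uses a
bare :keyword:`return` to end a generator early.  The function name
``take_until_negative`` and its input are invented for this example.

.. code-block:: python

   def take_until_negative(values):
       for value in values:
           if value < 0:
               return          # bare return: no further values are produced
           yield value

   # Iteration stops at the first negative number; the 4 is never reached.
   results = list(take_until_negative([3, 1, -1, 4]))

Here ``results`` ends up as ``[3, 1]``.
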
You could achieve the effect of generators manually by writing your own class
and storing all the local variables of the generator as instance variables.  For
example, returning a list of integers could be done by setting ``self.count`` to
0, and having the :meth:`next` method increment ``self.count`` and return it.
However, for a moderately complicated generator, writing a corresponding class
would be much messier. :file:`Lib/test/test_generators.py` contains a number of
more interesting examples.  The simplest one implements an in-order traversal of
a tree using generators recursively. ::

   # A recursive generator that generates Tree leaves in in-order.
   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x

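The traversal above expects objects with ``left``, ``label`` and ``right``
attributes.  The ``Tree`` class below is a hypothetical stand-in written for
this sketch (it is not the class used in :file:`Lib/test/test_generators.py`),
with ``None`` representing an empty subtree.

.. code-block:: python

   class Tree:
       """Hypothetical binary tree node; None stands for an empty subtree."""
       def __init__(self, label, left=None, right=None):
           self.label = label
           self.left = left
           self.right = right

   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x

   #       2
   #      / \
   #     1   4
   #        / \
   #       3   5
   tree = Tree(2, Tree(1), Tree(4, Tree(3), Tree(5)))
   labels = list(inorder(tree))

For this tree the labels come out in sorted order, ``[1, 2, 3, 4, 5]``.
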
Two other examples in :file:`Lib/test/test_generators.py` produce solutions for
the N-Queens problem (placing *N* queens on an *N* x *N* chess board so that no
queen threatens another) and the Knight's Tour (a route that takes a knight to
every square of an *N* x *N* chessboard without visiting any square twice).

The idea of generators comes from other programming languages, especially Icon
(https://www.cs.arizona.edu/icon/), where the idea of generators is central.  In
Icon, every expression and function call behaves like a generator.  One example
from "An Overview of the Icon Programming Language" at
https://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks
like::

   sentence := "Store it in the neighboring harbor"
   if (i := find("or", sentence)) > 5 then write(i)

In Icon the :func:`find` function returns the indexes at which the substring
"or" is found: 3, 23, 33.  In the :keyword:`if` statement, ``i`` is first
assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon
retries it with the second value of 23.  23 is greater than 5, so the comparison
now succeeds, and the code prints the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a central
concept.  Generators are considered part of the core Python language, but
learning or using them isn't compulsory; if they don't solve any problems that
you have, feel free to ignore them. One novel feature of Python's interface as
compared to Icon's is that a generator's state is represented as a concrete
object (the iterator) that can be passed around to other functions or stored in
a data structure.


.. seealso::

   :pep:`255` - Simple Generators
      Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland.  Implemented mostly
      by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.

.. ======================================================================


.. _section-encodings:

PEP 263: Source Code Encodings
==============================

Python source files can now be declared as being in different character set
encodings.  Encodings are declared by including a specially formatted comment in
the first or second line of the source file.  For example, a UTF-8 file can be
declared with::

   #!/usr/bin/env python
   # -*- coding: UTF-8 -*-

Without such an encoding declaration, the default encoding used is 7-bit ASCII.
Executing or importing modules that contain string literals with 8-bit
characters and have no encoding declaration will result in a
:exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a
syntax error.

The encoding declaration only affects Unicode string literals, which will be
converted to Unicode using the specified encoding.  Note that Python identifiers
are still restricted to ASCII characters, so you can't have variable names that
use characters outside of the usual alphanumerics.

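The effect of a declaration on a Unicode literal can be sketched without
creating a file, by compiling source held as bytes.  This is only a sketch and
it assumes a Python 3 interpreter, where :func:`compile` accepts bytes and
honours the declaration; the file name ``<encoding-demo>`` and the variable
names are invented for the example.

.. code-block:: python

   # A two-line source "file" held as bytes.  The first line declares Latin-1,
   # so the single byte 0xE9 inside the literal is read as e-acute.
   source = b"# -*- coding: latin-1 -*-\nname = u'caf\xe9'\n"

   namespace = {}
   exec(compile(source, '<encoding-demo>', 'exec'), namespace)

   # The literal was decoded using the declared encoding.
   name = namespace['name']
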

.. seealso::

   :pep:`263` - Defining Python Source Code Encodings
      Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao
      and Martin von Löwis.

.. ======================================================================


PEP 273: Importing Modules from ZIP Archives
============================================

The new :mod:`zipimport` module adds support for importing modules from a
ZIP-format archive.  You don't need to import the module explicitly; it will be
automatically imported if a ZIP archive's filename is added to ``sys.path``.
For example:

.. code-block:: shell-session

   amk@nyman:~/src/python$ unzip -l /tmp/example.zip
   Archive:  /tmp/example.zip
     Length     Date   Time    Name
    --------    ----   ----    ----
        8467  11-26-02 22:30   jwzthreading.py
    --------                   -------
        8467                   1 file
   amk@nyman:~/src/python$ ./python
   Python 2.3 (#1, Aug 1 2003, 19:54:32)
   >>> import sys
   >>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
   >>> import jwzthreading
   >>> jwzthreading.__file__
   '/tmp/example.zip/jwzthreading.py'
   >>>

An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP
archive can contain any kind of files, but only files named :file:`\*.py`,
:file:`\*.pyc`, or :file:`\*.pyo` can be imported.  If an archive only contains
:file:`\*.py` files, Python will not attempt to modify the archive by adding the
corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain
:file:`\*.pyc` files, importing may be rather slow.

A path within the archive can also be specified to only import from a
subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only
import from the :file:`lib/` subdirectory within the archive.

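The same steps can be reproduced from a script instead of an interactive
session.  The sketch below builds a throwaway archive with the :mod:`zipfile`
module and then imports from it; the module name ``zipdemo_mod`` and its
contents are invented for the example, and the archive is written to a
temporary directory rather than :file:`/tmp/example.zip`.

.. code-block:: python

   import os
   import sys
   import tempfile
   import zipfile

   # Build a small archive containing one (hypothetical) module.
   workdir = tempfile.mkdtemp()
   archive = os.path.join(workdir, 'example.zip')
   zf = zipfile.ZipFile(archive, 'w')
   zf.writestr('zipdemo_mod.py', 'GREETING = "hello from the archive"\n')
   zf.close()

   # Putting the archive's filename on sys.path is all that is needed;
   # zipimport itself is never imported explicitly.
   sys.path.insert(0, archive)
   import zipdemo_mod

   greeting = zipdemo_mod.GREETING
   location = zipdemo_mod.__file__     # a path inside the archive
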

.. seealso::

   :pep:`273` - Import Modules from Zip Archives
      Written by James C. Ahlstrom,  who also provided an implementation. Python 2.3
      follows the specification in :pep:`273`,  but uses an implementation written by
      Just van Rossum  that uses the import hooks described in :pep:`302`. See section
      :ref:`section-pep302` for a description of the new import hooks.

.. ======================================================================


PEP 277: Unicode file name support for Windows NT
=================================================

On Windows NT, 2000, and XP, the system stores file names as Unicode strings.
Traditionally, Python has represented file names as byte strings, which is
inadequate because it renders some file names inaccessible.

Python now allows using arbitrary Unicode strings (within the limitations of the
file system) for all functions that expect file names, most notably the
:func:`open` built-in function. If a Unicode string is passed to
:func:`os.listdir`, Python now returns a list of Unicode strings.  A new
function, :func:`os.getcwdu`, returns the current directory as a Unicode string.

Byte strings still work as file names, and on Windows Python will transparently
convert them to Unicode using the ``mbcs`` encoding.

Other systems also allow Unicode strings as file names but convert them to byte
strings before passing them to the system, which can cause a :exc:`UnicodeError`
to be raised. Applications can test whether arbitrary Unicode strings are
supported as file names by checking :attr:`os.path.supports_unicode_filenames`,
a Boolean value.

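A minimal sketch of these two checks follows.  It assumes a Python 3
interpreter, where every name returned by :func:`os.listdir` for a text
argument is itself text; on Python 2.3 the result on non-Windows systems may
differ.  The variable names are invented for the example.

.. code-block:: python

   import os

   # Passing a Unicode string asks for Unicode file names in return.
   names = os.listdir(u'.')
   text_type = type(u'')
   all_text = True
   for name in names:
       if not isinstance(name, text_type):
           all_text = False

   # A Boolean telling whether arbitrary Unicode file names are supported
   # on this platform.
   arbitrary_names_ok = os.path.supports_unicode_filenames
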
Under MacOS, :func:`os.listdir` may now return Unicode filenames.


.. seealso::

   :pep:`277` - Unicode file name support for Windows NT
      Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark
      Hammond.

.. ======================================================================


.. index::
   single: universal newlines; What's new

PEP 278: Universal Newline Support
==================================

The three major operating systems used today are Microsoft Windows, Apple's
Macintosh OS, and the various Unix derivatives.  A minor irritation of
cross-platform work is that these three platforms all use different characters to
mark the ends of lines in text files.  Unix uses the linefeed (ASCII character
10), MacOS uses the carriage return (ASCII character 13), and Windows uses a
two-character sequence of a carriage return plus a newline.

Python's file objects can now support end of line conventions other than the
one followed by the platform on which Python is running. Opening a file with
the mode ``'U'`` or ``'rU'`` will open a file for reading in :term:`universal
newlines` mode.  All three line ending conventions will be translated to a
``'\n'`` in the strings returned by the various file methods such as
:meth:`read` and :meth:`readline`.

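The translation can be seen by reading back a file that mixes all three
conventions.  In Python 2.3 the file would be opened with mode ``'U'``; the
sketch below instead assumes a Python 3 interpreter, where ordinary text mode
applies the same translation by default, so plain ``'r'`` is used.  The file
contents are invented for the example.

.. code-block:: python

   import os
   import tempfile

   # Write one line in each convention: Unix, old MacOS, and Windows.
   fd, path = tempfile.mkstemp()
   os.write(fd, b'unix\nmac\rwindows\r\n')
   os.close(fd)

   # Reading in text mode translates every line ending to '\n'.
   f = open(path, 'r')
   lines = f.readlines()
   f.close()
   os.remove(path)

Each of the three lines comes back terminated by a single ``'\n'``.
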
Universal newline support is also used when importing modules and when executing
a file with the :func:`execfile` function.  This means that Python modules can
be shared between all three operating systems without needing to convert the
line-endings.

This feature can be disabled when compiling Python by specifying the
:option:`!--without-universal-newlines` switch when running Python's
:program:`configure` script.


.. seealso::

   :pep:`278` - Universal Newline Support
      Written and implemented by Jack Jansen.

.. ======================================================================


.. _section-enumerate:

PEP 279: enumerate()
====================

A new built-in function, :func:`enumerate`, will make certain loops a bit
clearer.  ``enumerate(thing)``, where *thing* is either an iterator or a
sequence, returns an iterator that will return ``(0, thing[0])``, ``(1,
thing[1])``, ``(2, thing[2])``, and so forth.

A common idiom to change every element of a list looks like this::

   for i in range(len(L)):
       item = L[i]
       # ... compute some result based on item ...
       L[i] = result

This can be rewritten using :func:`enumerate` as::

   for i, item in enumerate(L):
       # ... compute some result based on item ...
       L[i] = result

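A concrete version of the loop, with upper-casing standing in for the elided
computation, might look like the following sketch; the list contents are
invented for the example.

.. code-block:: python

   words = ['spam', 'eggs', 'ham']
   for i, item in enumerate(words):
       words[i] = item.upper()      # replace each element in place

   # enumerate() itself yields (index, element) pairs.
   pairs = list(enumerate('abc'))
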

.. seealso::

   :pep:`279` - The enumerate() built-in function
      Written and implemented by Raymond D. Hettinger.

.. ======================================================================


PEP 282: The logging Package
============================

A standard package for writing logs, :mod:`logging`, has been added to Python
2.3.  It provides a powerful and flexible mechanism for generating logging
output which can then be filtered and processed in various ways.  A
configuration file written in a standard format can be used to control the
logging behavior of a program.  Python includes handlers that will write log
records to standard error or to a file or socket, send them to the system log,
or even e-mail them to a particular address; of course, it's also possible to
write your own handler classes.

The :class:`Logger` class is the primary class. Most application code will deal
with one or more :class:`Logger` objects, each one used by a particular
subsystem of the application. Each :class:`Logger` is identified by a name, and
names are organized into a hierarchy using ``.``  as the component separator.
For example, you might have :class:`Logger` instances named ``server``,
``server.auth`` and ``server.network``.  The latter two instances are below
``server`` in the hierarchy.  This means that if you turn up the verbosity for
``server`` or direct ``server`` messages to a different handler, the changes
will also apply to records logged to ``server.auth`` and ``server.network``.
There's also a root :class:`Logger` that's the parent of all other loggers.

For simple uses, the :mod:`logging` package contains some convenience functions
that always use the root log::

   import logging

   logging.debug('Debugging information')
   logging.info('Informational message')
   logging.warning('Warning:config file %s not found', 'server.conf')
   logging.error('Error occurred')
   logging.critical('Critical error -- shutting down')

This produces the following output::

   WARNING:root:Warning:config file server.conf not found
   ERROR:root:Error occurred
   CRITICAL:root:Critical error -- shutting down

In the default configuration, informational and debugging messages are
suppressed and the output is sent to standard error.  You can enable the display
of informational and debugging messages by calling the :meth:`setLevel` method
on the root logger.

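A minimal sketch of raising the root logger's verosity is shown below.  So that
the result can be inspected, the output is captured in an in-memory stream
through a :class:`StreamHandler` instead of going to standard error; that
capture, the use of ``io.StringIO`` (which assumes a Python 3 interpreter), and
the variable names are choices made for this example only.

.. code-block:: python

   import io
   import logging

   # Capture output in memory rather than writing to standard error.
   stream = io.StringIO()
   handler = logging.StreamHandler(stream)

   root = logging.getLogger()          # no name: the root logger
   previous_level = root.level
   root.addHandler(handler)
   root.setLevel(logging.DEBUG)        # let debug and info records through

   logging.debug('Debugging information')
   logging.info('Informational message')

   # Restore the previous configuration.
   root.removeHandler(handler)
   root.setLevel(previous_level)
   captured = stream.getvalue()
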
Notice the :func:`warning` call's use of string formatting operators; all of the
functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and
log the string resulting from ``msg % (arg1, arg2, ...)``.

There's also an :func:`exception` function that records the most recent
traceback.  Any of the other functions will also record the traceback if you
specify a true value for the keyword argument *exc_info*. ::

   def f():
       try:    1/0
       except: logging.exception('Problem recorded')

   f()

This produces the following output::

   ERROR:root:Problem recorded
   Traceback (most recent call last):
     File "t.py", line 6, in f
       1/0
   ZeroDivisionError: integer division or modulo by zero

Slightly more advanced programs will use a logger other than the root logger.
The :func:`getLogger(name)` function is used to get a particular log, creating
it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. ::

   log = logging.getLogger('server')
    ...
   log.info('Listening on port %i', port)
    ...
   log.critical('Disk full')
    ...

Log records are usually propagated up the hierarchy, so a message logged to
``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger`
can prevent this by setting its :attr:`propagate` attribute to :const:`False`.

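Propagation can be observed with a handler attached to ``server`` only.  The
``ListHandler`` class below is a hypothetical helper written for this sketch
(it simply collects formatted messages in a list), and the log messages are
invented.

.. code-block:: python

   import logging

   class ListHandler(logging.Handler):
       """Hypothetical handler that stores each message in a list."""
       def __init__(self):
           logging.Handler.__init__(self)
           self.messages = []

       def emit(self, record):
           self.messages.append(record.getMessage())

   server = logging.getLogger('server')
   server.setLevel(logging.INFO)
   server_handler = ListHandler()
   server.addHandler(server_handler)

   auth = logging.getLogger('server.auth')

   # Propagated up the hierarchy, so the handler on 'server' sees it.
   auth.info('login accepted for %s', 'amk')

   # With propagation switched off, the record stops at 'server.auth'.
   auth.propagate = False
   auth.info('this record is not passed up')

   seen_by_server = server_handler.messages

Only the first message reaches the handler attached to ``server``.
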
There are more classes provided by the :mod:`logging` package that can be
customized.  When a :class:`Logger` instance is told to log a message, it
creates a :class:`LogRecord` instance that is sent to any number of different
:class:`Handler` instances.  Loggers and handlers can also have an attached list
of filters, and each filter can cause the :class:`LogRecord` to be ignored or
can modify the record before passing it along.  When they're finally output,
:class:`LogRecord` instances are converted to text by a :class:`Formatter`
class.  All of these classes can be replaced by your own specially-written
classes.

With all of these features the :mod:`logging` package should provide enough
flexibility for even the most complicated applications.  This is only an
incomplete overview of its features, so please see the package's reference
documentation for all of the details.  Reading :pep:`282` will also be helpful.


.. seealso::

   :pep:`282` - A Logging System
      Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.

.. ======================================================================


.. _section-bool:

PEP 285: A Boolean Type
=======================

A Boolean type was added to Python 2.3.  Two new constants were added to the
:mod:`__builtin__` module, :const:`True` and :const:`False`.  (:const:`True` and
:const:`False` constants were added to the built-ins in Python 2.2.1, but the
2.2.1 versions are simply set to integer values of 1 and 0 and aren't a
different type.)

The type object for this new type is named :class:`bool`; the constructor for it
takes any Python value and converts it to :const:`True` or :const:`False`. ::

   >>> bool(1)
   True
   >>> bool(0)
   False
   >>> bool([])
   False
   >>> bool( (1,) )
   True

Most of the standard library modules and built-in functions have been changed to
return Booleans. ::

   >>> obj = []
   >>> hasattr(obj, 'append')
   True
   >>> isinstance(obj, list)
   True
   >>> isinstance(obj, tuple)
   False

Python's Booleans were added with the primary goal of making code clearer.  For
example, if you're reading a function and encounter the statement ``return 1``,
you might wonder whether the ``1`` represents a Boolean truth value, an index,
or a coefficient that multiplies some other quantity.  If the statement is
``return True``, however, the meaning of the return value is quite clear.

Python's Booleans were *not* added for the sake of strict type-checking.  A very
strict language such as Pascal would also prevent you performing arithmetic with
Booleans, and would require that the expression in an :keyword:`if` statement
always evaluate to a Boolean result.  Python is not this strict and never will
be, as :pep:`285` explicitly says.  This means you can still use any expression
in an :keyword:`if` statement, even ones that evaluate to a list or tuple or
some random object.  The Boolean type is a subclass of the :class:`int` class so
that arithmetic using a Boolean still works. ::

   >>> True + 1
   2
   >>> False + 1
   1
   >>> False * 75
   0
   >>> True * 75
   75

To sum up :const:`True` and :const:`False` in a sentence: they're alternative
ways to spell the integer values 1 and 0, with the single difference that
:func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'``
instead of ``'1'`` and ``'0'``.

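That summary can be checked directly.  The short sketch below uses only
built-ins; the ``flags`` list is invented for the example.

.. code-block:: python

   # bool is a subclass of int, and its two values compare equal to 1 and 0.
   is_int_subclass = issubclass(bool, int)
   equal_to_ints = (True == 1) and (False == 0)

   # Because Booleans behave as integers, summing them counts the true values.
   flags = [True, False, True, True]
   true_count = sum(flags)

   # The one visible difference: how the values are converted to strings.
   as_text = (str(True), str(1))
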

.. seealso::

   :pep:`285` - Adding a bool type
      Written and implemented by GvR.

.. ======================================================================


PEP 293: Codec Error Handling Callbacks
=======================================

When encoding a Unicode string into a byte string, unencodable characters may be
encountered.  So far, Python has allowed specifying the error processing as
either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the
character), or "replace" (using a question mark in the output string), with
"strict" being the default behavior. It may be desirable to specify alternative
processing of such errors, such as inserting an XML character reference or HTML
entity reference into the converted string.

Python now has a flexible framework to add different processing strategies.  New
error handlers can be added with :func:`codecs.register_error`, and codecs then
can access the error handler with :func:`codecs.lookup_error`. An equivalent C
API has been added for codecs written in C. The error handler gets the necessary
state information such as the string being converted, the position in the string
where the error was detected, and the target encoding.  The handler can then
either raise an exception or return a replacement string.

Two additional error handlers have been implemented using this framework:
"backslashreplace" uses Python backslash quoting to represent unencodable
characters and "xmlcharrefreplace" emits XML character references.

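The sketch below tries the two new handlers and then registers a custom one.
The handler name ``'bracket'``, the function ``bracket_handler`` and the sample
text are invented for this example.  Note that a handler returns its
replacement string together with the position at which encoding should resume,
as a two-item tuple.

.. code-block:: python

   import codecs

   text = u'caf\xe9 \u20ac'        # e-acute and the euro sign are not ASCII

   # The two handlers added alongside the framework.
   xml_form = text.encode('ascii', 'xmlcharrefreplace')
   backslash_form = text.encode('ascii', 'backslashreplace')

   def bracket_handler(exc):
       """Hypothetical handler: write unencodable characters as [U+XXXX]."""
       if not isinstance(exc, UnicodeEncodeError):
           raise exc
       bad = exc.object[exc.start:exc.end]
       replacement = u''.join([u'[U+%04X]' % ord(ch) for ch in bad])
       # Return the replacement and the position where encoding resumes.
       return (replacement, exc.end)

   codecs.register_error('bracket', bracket_handler)
   bracket_form = text.encode('ascii', 'bracket')
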

.. seealso::

   :pep:`293` - Codec Error Handling Callbacks
      Written and implemented by Walter Dörwald.

.. ======================================================================


.. _section-pep301:

PEP 301: Package Index and Metadata for Distutils
=================================================

Support for the long-requested Python catalog makes its first appearance in 2.3.

The heart of the catalog is the new Distutils :command:`register` command.
Running ``python setup.py register`` will collect the metadata describing a
package, such as its name, version, maintainer, description, &c., and send it to
a central catalog server.  The resulting catalog is available from
https://pypi.python.org/pypi.

To make the catalog a bit more useful, a new optional *classifiers* keyword
argument has been added to the Distutils :func:`setup` function.  A list of
`Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help
classify the software.

    669 Here's an example :file:`setup.py` with classifiers, written to be compatible
    670 with older versions of the Distutils::
    671 
   from distutils import core
   kw = {'name': "Quixote",
         'version': "0.5.1",
         'description': "A highly Pythonic Web application framework",
         # ...
         }

   # Only pass 'classifiers' if this version of the Distutils understands it.
   if (hasattr(core, 'setup_keywords') and
       'classifiers' in core.setup_keywords):
       kw['classifiers'] = \
           ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
            'Environment :: No Input/Output (Daemon)',
            'Intended Audience :: Developers']

   core.setup(**kw)
    687 
    688 The full list of classifiers can be obtained by running  ``python setup.py
    689 register --list-classifiers``.
    690 
    691 
    692 .. seealso::
    693 
    694    :pep:`301` - Package Index and Metadata for Distutils
    695       Written and implemented by Richard Jones.
    696 
    697 .. ======================================================================
    698 
    699 
    700 .. _section-pep302:
    701 
    702 PEP 302: New Import Hooks
    703 =========================
    704 
    705 While it's been possible to write custom import hooks ever since the
    706 :mod:`ihooks` module was introduced in Python 1.3, no one has ever been really
    707 happy with it because writing new import hooks is difficult and messy.  There
    708 have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu`
modules, but none of them has ever gained much acceptance, and none of them was
    710 easily usable from C code.
    711 
    712 :pep:`302` borrows ideas from its predecessors, especially from Gordon
    713 McMillan's :mod:`iu` module.  Three new items  are added to the :mod:`sys`
    714 module:
    715 
    716 * ``sys.path_hooks`` is a list of callable objects; most  often they'll be
    717   classes.  Each callable takes a string containing a path and either returns an
    718   importer object that will handle imports from this path or raises an
    719   :exc:`ImportError` exception if it can't handle this path.
    720 
    721 * ``sys.path_importer_cache`` caches importer objects for each path, so
    722   ``sys.path_hooks`` will only need to be traversed once for each path.
    723 
    724 * ``sys.meta_path`` is a list of importer objects that will be traversed before
    725   ``sys.path`` is checked.  This list is initially empty, but user code can add
    726   objects to it.  Additional built-in and frozen modules can be imported by an
    727   object added to this list.
    728 
    729 Importer objects must have a single method, :meth:`find_module(fullname,
    730 path=None)`.  *fullname* will be a module or package name, e.g. ``string`` or
    731 ``distutils.core``.  :meth:`find_module` must return a loader object that has a
    732 single method, :meth:`load_module(fullname)`, that creates and returns the
    733 corresponding module object.
    734 
    735 Pseudo-code for Python's new import logic, therefore, looks something like this
    736 (simplified a bit; see :pep:`302` for the full details)::
    737 
   for mp in sys.meta_path:
       loader = mp.find_module(fullname)
       if loader is not None:
           <module> = loader.load_module(fullname)

   for path in sys.path:
       for hook in sys.path_hooks:
           try:
               importer = hook(path)
           except ImportError:
               # ImportError, so try the other path hooks
               pass
           else:
               loader = importer.find_module(fullname)
               if loader is not None:
                   <module> = loader.load_module(fullname)

   # Not found!
   raise ImportError
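
The same logic can be exercised as a small runnable model.  This is a sketch
under stated assumptions, not the interpreter's implementation: ``ARCHIVES``,
``DictImporter``, ``dict_path_hook`` and ``toy_import`` are names invented for
the example, and plain lists and a dictionary stand in for ``sys.meta_path``,
``sys.path``, ``sys.path_hooks`` and ``sys.path_importer_cache`` so that running
it does not affect real imports.

```python
import types

# Hypothetical in-memory "archives": path entry name -> {module name: source}.
ARCHIVES = {"demo": {"greeting": "message = 'hello'"}}

class DictImporter:
    """Toy importer that also acts as its own loader (illustration only)."""

    def __init__(self, sources):
        self.sources = sources

    def find_module(self, fullname, path=None):
        # Return a loader if this importer can handle the name, else None.
        if fullname in self.sources:
            return self
        return None

    def load_module(self, fullname):
        # Create the module object and run its source inside it.
        module = types.ModuleType(fullname)
        module.__loader__ = self
        exec(self.sources[fullname], module.__dict__)
        return module

def dict_path_hook(path):
    # A path hook returns an importer for paths it understands and
    # raises ImportError for everything else.
    if not path.startswith("dict:"):
        raise ImportError("not a dict: entry")
    return DictImporter(ARCHIVES[path[len("dict:"):]])

def toy_import(fullname, meta_path, search_path, path_hooks, importer_cache):
    # 1. Importers on the meta path get the first chance.
    for importer in meta_path:
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader.load_module(fullname)
    # 2. Otherwise walk the search path, consulting the hooks once per entry.
    for entry in search_path:
        if entry not in importer_cache:
            importer_cache[entry] = None
            for hook in path_hooks:
                try:
                    importer_cache[entry] = hook(entry)
                    break
                except ImportError:
                    continue  # try the other path hooks
        importer = importer_cache[entry]
        if importer is not None:
            loader = importer.find_module(fullname)
            if loader is not None:
                return loader.load_module(fullname)
    # Not found!
    raise ImportError("No module named %s" % fullname)

cache = {}
module = toy_import("greeting", [], ["/usr/lib", "dict:demo"],
                    [dict_path_hook], cache)
print(module.message)
```

The cache records ``None`` for path entries that no hook claims, so the hooks
are consulted only once per entry, as described for ``sys.path_importer_cache``.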
    756 
    757 
    758 .. seealso::
    759 
    760    :pep:`302` - New Import Hooks
    761       Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.
    762 
    763 .. ======================================================================
    764 
    765 
    766 .. _section-pep305:
    767 
    768 PEP 305: Comma-separated Files
    769 ==============================
    770 
    771 Comma-separated files are a format frequently used for exporting data from
    772 databases and spreadsheets.  Python 2.3 adds a parser for comma-separated files.
    773 
    774 Comma-separated format is deceptively simple at first glance::
    775 
    776    Costs,150,200,3.95
    777 
    778 Read a line and call ``line.split(',')``: what could be simpler? But toss in
    779 string data that can contain commas, and things get more complicated::
    780 
    781    "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
    782 
    783 A big ugly regular expression can parse this, but using the new  :mod:`csv`
    784 package is much simpler::
    785 
    786    import csv
    787 
    788    input = open('datafile', 'rb')
    789    reader = csv.reader(input)
    790    for line in reader:
    791        print line
    792 
    793 The :func:`reader` function takes a number of different options. The field
    794 separator isn't limited to the comma and can be changed to any character, and so
    795 can the quoting and line-ending characters.
    796 
    797 Different dialects of comma-separated files can be defined and registered;
    798 currently there are two dialects, both used by Microsoft Excel. A separate
    799 :class:`csv.writer` class will generate comma-separated files from a succession
    800 of tuples or lists, quoting strings that contain the delimiter.
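
As a rough illustration of these options, the following sketch parses the
troublesome line shown earlier and then writes a row with a different
delimiter.  It assumes a current Python 3 interpreter, where ``io.StringIO``
can stand in for a file; the sample values are made up for the example.

```python
import csv
import io

# The troublesome line from above, held in memory instead of a file.
data = u'"Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"\n'
rows = list(csv.reader(io.StringIO(data)))
print(rows[0])

# Write a row using ';' as the field separator; the field that contains
# the delimiter is quoted automatically.
output = io.StringIO()
writer = csv.writer(output, delimiter=';', lineterminator='\n')
writer.writerow(['Costs', 150, 'taxes; shipping'])
print(output.getvalue())

# Names of the dialects registered with the module.
dialects = csv.list_dialects()
print(dialects)
```

Note that the reader returns every field as a string; converting ``'150'`` to a
number is left to the caller.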
    801 
    802 
    803 .. seealso::
    804 
    805    :pep:`305` - CSV File API
      Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip
      Montanaro, and Cliff Wells.
    808 
    809 .. ======================================================================
    810 
    811 
    812 .. _section-pep307:
    813 
    814 PEP 307: Pickle Enhancements
    815 ============================
    816 
    817 The :mod:`pickle` and :mod:`cPickle` modules received some attention during the
    818 2.3 development cycle.  In 2.2, new-style classes could be pickled without
    819 difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial
    820 example where a new-style class results in a pickled string three times longer
    821 than that for a classic class.
    822 
    823 The solution was to invent a new pickle protocol.  The :func:`pickle.dumps`
    824 function has supported a text-or-binary flag  for a long time.  In 2.3, this
    825 flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle
    826 format, 1 is the old binary format, and now 2 is a new 2.3-specific format.  A
    827 new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the
    828 fanciest protocol available.
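
A minimal sketch of selecting a protocol by number follows.  The ``record``
dictionary is made-up sample data, and the code assumes a current Python 3
interpreter, where the pickled result is a bytes object and
``HIGHEST_PROTOCOL`` may be larger than 2.

```python
import pickle

# Made-up sample data containing only built-in types.
record = {'name': 'Quixote', 'version': (0, 5, 1), 'tags': ['web', 'framework']}

# Protocol 0 is the text format, 1 the old binary format,
# and 2 the format introduced in Python 2.3.
pickles = {}
for protocol in (0, 1, 2):
    pickles[protocol] = pickle.dumps(record, protocol)
    # Every protocol must round-trip to an equal object.
    assert pickle.loads(pickles[protocol]) == record
    print("protocol %d: %d bytes" % (protocol, len(pickles[protocol])))

# HIGHEST_PROTOCOL selects the newest format the running interpreter supports.
newest = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(newest)
```

The unpickler detects the protocol on its own, so ``pickle.loads`` needs no
protocol argument.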
    829 
    830 Unpickling is no longer considered a safe operation.  2.2's :mod:`pickle`
    831 provided hooks for trying to prevent unsafe classes from being unpickled
    832 (specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this
    833 code was ever audited and therefore it's all been ripped out in 2.3.  You should
    834 not unpickle untrusted data in any version of Python.
    835 
    836 To reduce the pickling overhead for new-style classes, a new interface for
    837 customizing pickling was added using three special methods:
    838 :meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`.  Consult
    839 :pep:`307` for the full semantics  of these methods.
    840 
    841 As a way to compress pickles yet further, it's now possible to use integer codes
    842 instead of long strings to identify pickled classes. The Python Software
    843 Foundation will maintain a list of standardized codes; there's also a range of
    844 codes for private use.  Currently no codes have been specified.
    845 
    846 
    847 .. seealso::
    848 
    849    :pep:`307` - Extensions to the pickle protocol
    850       Written and implemented  by Guido van Rossum and Tim Peters.
    851 
    852 .. ======================================================================
    853 
    854 
    855 .. _section-slices:
    856 
    857 Extended Slices
    858 ===============
    859 
    860 Ever since Python 1.4, the slicing syntax has supported an optional third "step"
    861 or "stride" argument.  For example, these are all legal Python syntax:
    862 ``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``.  This was added to Python at the
    863 request of the developers of Numerical Python, which uses the third argument
    864 extensively.  However, Python's built-in list, tuple, and string sequence types
    865 have never supported this feature, raising a :exc:`TypeError` if you tried it.
    866 Michael Hudson contributed a patch to fix this shortcoming.
    867 
    868 For example, you can now easily extract the elements of a list that have even
    869 indexes::
    870 
    871    >>> L = range(10)
    872    >>> L[::2]
    873    [0, 2, 4, 6, 8]
    874 
    875 Negative values also work to make a copy of the same list in reverse order::
    876 
    877    >>> L[::-1]
    878    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
    879 
    880 This also works for tuples, arrays, and strings::
    881 
    882    >>> s='abcd'
    883    >>> s[::2]
    884    'ac'
    885    >>> s[::-1]
    886    'dcba'
    887 
    888 If you have a mutable sequence such as a list or an array you can assign to or
    889 delete an extended slice, but there are some differences between assignment to
    890 extended and regular slices.  Assignment to a regular slice can be used to
    891 change the length of the sequence::
    892 
    893    >>> a = range(3)
    894    >>> a
    895    [0, 1, 2]
    896    >>> a[1:3] = [4, 5, 6]
    897    >>> a
    898    [0, 4, 5, 6]
    899 
    900 Extended slices aren't this flexible.  When assigning to an extended slice, the
    901 list on the right hand side of the statement must contain the same number of
    902 items as the slice it is replacing::
    903 
    904    >>> a = range(4)
    905    >>> a
    906    [0, 1, 2, 3]
    907    >>> a[::2]
    908    [0, 2]
    909    >>> a[::2] = [0, -1]
    910    >>> a
    911    [0, 1, -1, 3]
    912    >>> a[::2] = [0,1,2]
    913    Traceback (most recent call last):
    914      File "<stdin>", line 1, in ?
    915    ValueError: attempt to assign sequence of size 3 to extended slice of size 2
    916 
    917 Deletion is more straightforward::
    918 
    919    >>> a = range(4)
    920    >>> a
    921    [0, 1, 2, 3]
    922    >>> a[::2]
    923    [0, 2]
    924    >>> del a[::2]
    925    >>> a
    926    [1, 3]
    927 
    928 One can also now pass slice objects to the :meth:`__getitem__` methods of the
    929 built-in sequences::
    930 
    931    >>> range(10).__getitem__(slice(0, 5, 2))
    932    [0, 2, 4]
    933 
    934 Or use slice objects directly in subscripts::
    935 
    936    >>> range(10)[slice(0, 5, 2)]
    937    [0, 2, 4]
    938 
    939 To simplify implementing sequences that support extended slicing, slice objects
    940 now have a method :meth:`indices(length)` which, given the length of a sequence,
    941 returns a ``(start, stop, step)`` tuple that can be passed directly to
    942 :func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a
    943 manner consistent with regular slices (and this innocuous phrase hides a welter
    944 of confusing details!).  The method is intended to be used like this::
    945 
   class FakeSeq:
       ...
       def calc_item(self, i):
           ...
       def __getitem__(self, item):
           if isinstance(item, slice):
               indices = item.indices(len(self))
               return FakeSeq([self.calc_item(i) for i in range(*indices)])
           else:
               return self.calc_item(item)
    956 
    957 From this example you can also see that the built-in :class:`slice` object is
    958 now the type object for the slice type, and is no longer a function.  This is
    959 consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent
    960 the same change.
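
To make the behaviour of :meth:`indices` concrete, here is a small complete
class in the same spirit.  ``Squares`` is a hypothetical example, not part of
Python; for simplicity its slices return plain lists rather than new
``Squares`` instances.

```python
class Squares:
    """Hypothetical read-only sequence whose item i is i*i."""

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def calc_item(self, i):
        # Support negative indexes and reject out-of-range ones.
        if i < 0:
            i += self.length
        if not 0 <= i < self.length:
            raise IndexError("index out of range")
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() clips the slice to the sequence length and fills
            # in omitted values, giving arguments suitable for range().
            start, stop, step = item.indices(len(self))
            return [self.calc_item(i) for i in range(start, stop, step)]
        return self.calc_item(item)

sq = Squares(10)
print(slice(None, None, 3).indices(10))
print(sq[::3])
print(sq[-3:])
print(sq[::-4])
```

Because :meth:`indices` has already resolved negative and omitted values, the
slice branch never needs to look at ``None`` or at the sign of the bounds.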
    961 
    962 .. ======================================================================
    963 
    964 
    965 Other Language Changes
    966 ======================
    967 
    968 Here are all of the changes that Python 2.3 makes to the core Python language.
    969 
    970 * The :keyword:`yield` statement is now always a keyword, as described in
    971   section :ref:`section-generators` of this document.
    972 
    973 * A new built-in function :func:`enumerate` was added, as described in section
    974   :ref:`section-enumerate` of this document.
    975 
* Two new constants, :const:`True` and :const:`False`, were added along with the
    977   built-in :class:`bool` type, as described in section :ref:`section-bool` of this
    978   document.
    979 
    980 * The :func:`int` type constructor will now return a long integer instead of
    981   raising an :exc:`OverflowError` when a string or floating-point number is too
    982   large to fit into an integer.  This can lead to the paradoxical result that
    983   ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause
    984   problems in practice.
    985 
    986 * Built-in types now support the extended slicing syntax, as described in
    987   section :ref:`section-slices` of this document.
    988 
    989 * A new built-in function, :func:`sum(iterable, start=0)`,  adds up the numeric
    990   items in the iterable object and returns their sum.  :func:`sum` only accepts
    991   numbers, meaning that you can't use it to concatenate a bunch of strings.
    992   (Contributed by Alex Martelli.)
    993 
    994 * ``list.insert(pos, value)`` used to  insert *value* at the front of the list
    995   when *pos* was negative.  The behaviour has now been changed to be consistent
    996   with slice indexing, so when *pos* is -1 the value will be inserted before the
    997   last element, and so forth.
    998 
    999 * ``list.index(value)``, which searches for *value*  within the list and returns
   1000   its index, now takes optional  *start* and *stop* arguments to limit the search
   1001   to  only part of the list.
   1002 
* Dictionaries have a new method, :meth:`pop(key[, default])`, that returns
  the value corresponding to *key* and removes that key/value pair from the
  dictionary.  If the requested key isn't present in the dictionary, *default* is
  returned if it's specified and :exc:`KeyError` is raised if it isn't. ::

     >>> d = {1:2}
     >>> d
     {1: 2}
     >>> d.pop(4)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 4
     >>> d.pop(1)
     2
     >>> d.pop(1)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 'pop(): dictionary is empty'
     >>> d
     {}
     >>>
   1024 
   1025   There's also a new class method,  :meth:`dict.fromkeys(iterable, value)`, that
   1026   creates a dictionary with keys taken from the supplied iterator *iterable* and
   1027   all values set to *value*, defaulting to ``None``.
   1028 
   1029   (Patches contributed by Raymond Hettinger.)
   1030 
   1031   Also, the :func:`dict` constructor now accepts keyword arguments to simplify
   1032   creating small dictionaries::
   1033 
   1034      >>> dict(red=1, blue=2, green=3, black=4)
   1035      {'blue': 2, 'black': 4, 'green': 3, 'red': 1}
   1036 
   1037   (Contributed by Just van Rossum.)
   1038 
   1039 * The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so
   1040   you can no longer disable assertions by assigning to ``__debug__``. Running
   1041   Python with the :option:`-O` switch will still generate code that doesn't
   1042   execute any assertions.
   1043 
   1044 * Most type objects are now callable, so you can use them to create new objects
   1045   such as functions, classes, and modules.  (This means that the :mod:`new` module
   1046   can be deprecated in a future Python version, because you can now use the type
   1047   objects available in the :mod:`types` module.) For example, you can create a new
   1048   module object with the following code:
   1049 
   1050   ::
   1051 
   1052      >>> import types
   1053      >>> m = types.ModuleType('abc','docstring')
   1054      >>> m
   1055      <module 'abc' (built-in)>
   1056      >>> m.__doc__
   1057      'docstring'
   1058 
* A new warning, :exc:`PendingDeprecationWarning`, was added to indicate features
   1060   which are in the process of being deprecated.  The warning will *not* be printed
   1061   by default.  To check for use of features that will be deprecated in the future,
   1062   supply :option:`-Walways::PendingDeprecationWarning:: <-W>` on the command line or
   1063   use :func:`warnings.filterwarnings`.
   1064 
   1065 * The process of deprecating string-based exceptions, as in ``raise "Error
   1066   occurred"``, has begun.  Raising a string will now trigger
   1067   :exc:`PendingDeprecationWarning`.
   1068 
   1069 * Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`
   1070   warning.  In a future version of Python, ``None`` may finally become a keyword.
   1071 
   1072 * The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no
   1073   longer necessary because files now behave as their own iterator.
   1074   :meth:`xreadlines` was originally introduced as a faster way to loop over all
   1075   the lines in a file, but now you can simply write ``for line in file_obj``.
   1076   File objects also have a new read-only :attr:`encoding` attribute that gives the
   1077   encoding used by the file; Unicode strings written to the file will be
   1078   automatically  converted to bytes using the given encoding.
   1079 
   1080 * The method resolution order used by new-style classes has changed, though
   1081   you'll only notice the difference if you have a really complicated inheritance
   1082   hierarchy.  Classic classes are unaffected by this change.  Python 2.2
   1083   originally used a topological sort of a class's ancestors, but 2.3 now uses the
   1084   C3 algorithm as described in the paper `"A Monotonic Superclass Linearization
   1085   for Dylan" <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.3910>`_. To
   1086   understand the motivation for this change,  read Michele Simionato's article
   1087   `"Python 2.3 Method Resolution Order" <http://www.phyast.pitt.edu/~micheles/mro.html>`_, or
   1088   read the thread on python-dev starting with the message at
   1089   https://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele
   1090   Pedroni first pointed out the problem and also implemented the fix by coding the
   1091   C3 algorithm.
   1092 
   1093 * Python runs multithreaded programs by switching between threads after
   1094   executing N bytecodes.  The default value for N has been increased from 10 to
   1095   100 bytecodes, speeding up single-threaded applications by reducing the
   1096   switching overhead.  Some multithreaded applications may suffer slower response
   1097   time, but that's easily fixed by setting the limit back to a lower number using
   1098   :func:`sys.setcheckinterval(N)`. The limit can be retrieved with the new
   1099   :func:`sys.getcheckinterval` function.
   1100 
   1101 * One minor but far-reaching change is that the names of extension types defined
   1102   by the modules included with Python now contain the module and a ``'.'`` in
   1103   front of the type name.  For example, in Python 2.2, if you created a socket and
   1104   printed its :attr:`__class__`, you'd get this output::
   1105 
   1106      >>> s = socket.socket()
   1107      >>> s.__class__
   1108      <type 'socket'>
   1109 
   1110   In 2.3, you get this::
   1111 
   1112      >>> s.__class__
   1113      <type '_socket.socket'>
   1114 
   1115 * One of the noted incompatibilities between old- and new-style classes has been
   1116   removed: you can now assign to the :attr:`~definition.__name__` and :attr:`~class.__bases__`
   1117   attributes of new-style classes.  There are some restrictions on what can be
   1118   assigned to :attr:`~class.__bases__` along the lines of those relating to assigning to
   1119   an instance's :attr:`~instance.__class__` attribute.
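
Several of the list, dictionary, and :func:`sum` changes above can be checked
with a short script.  This is only a sketch with made-up sample values; it
avoids the ``print`` statement so that it should also run unchanged on a
current Python 3 interpreter.

```python
# list.insert() with a negative position now counts from the end.
letters = ['a', 'b', 'c']
letters.insert(-1, 'x')

# list.index() accepts optional start and stop arguments.
votes = ['yes', 'no', 'yes', 'no']
second_yes = votes.index('yes', 1)
try:
    votes.index('yes', 1, 2)      # only votes[1:2] is searched
    found_in_range = True
except ValueError:
    found_in_range = False

# dict.pop() with and without a default, and the dict.fromkeys() class method.
colours = {'red': 1}
missing = colours.pop('blue', 0)
red = colours.pop('red')
blank = dict.fromkeys(['x', 'y'])

# sum() takes an optional start value and refuses to join strings.
total = sum([1, 2, 3], 10)
try:
    sum(['a', 'b'], '')
    joined = True
except TypeError:
    joined = False
```

For joining strings, ``''.join(sequence)`` remains the idiom to use.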
   1120 
   1121 .. ======================================================================
   1122 
   1123 
   1124 String Changes
   1125 --------------
   1126 
   1127 * The :keyword:`in` operator now works differently for strings. Previously, when
   1128   evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single
   1129   character. That's now changed; *X* can be a string of any length, and ``X in Y``
   1130   will return :const:`True` if *X* is a substring of *Y*.  If *X* is the empty
   1131   string, the result is always :const:`True`. ::
   1132 
   1133      >>> 'ab' in 'abcd'
   1134      True
   1135      >>> 'ad' in 'abcd'
   1136      False
   1137      >>> '' in 'abcd'
   1138      True
   1139 
   1140   Note that this doesn't tell you where the substring starts; if you need that
   1141   information, use the :meth:`find` string method.
   1142 
   1143 * The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have
   1144   an optional argument for specifying the characters to strip.  The default is
   1145   still to remove all whitespace characters::
   1146 
   1147      >>> '   abc '.strip()
   1148      'abc'
   1149      >>> '><><abc<><><>'.strip('<>')
   1150      'abc'
   1151      >>> '><><abc<><><>\n'.strip('<>')
   1152      'abc<><><>\n'
   1153      >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
   1154      u'\u4001abc'
   1155      >>>
   1156 
  (Suggested by Simon Brunning and implemented by Walter Dörwald.)
   1158 
   1159 * The :meth:`startswith` and :meth:`endswith` string methods now accept negative
   1160   numbers for the *start* and *end* parameters.
   1161 
   1162 * Another new string method is :meth:`zfill`, originally a function in the
   1163   :mod:`string` module.  :meth:`zfill` pads a numeric string with zeros on the
   1164   left until it's the specified width. Note that the ``%`` operator is still more
   1165   flexible and powerful than :meth:`zfill`. ::
   1166 
   1167      >>> '45'.zfill(4)
   1168      '0045'
   1169      >>> '12345'.zfill(4)
   1170      '12345'
   1171      >>> 'goofy'.zfill(6)
   1172      '0goofy'
   1173 
  (Contributed by Walter Dörwald.)
   1175 
   1176 * A new type object, :class:`basestring`, has been added. Both 8-bit strings and
   1177   Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will
   1178   return :const:`True` for either kind of string.  It's a completely abstract
   1179   type, so you can't create :class:`basestring` instances.
   1180 
   1181 * Interned strings are no longer immortal and will now be garbage-collected in
   1182   the usual way when the only reference to them is from the internal dictionary of
   1183   interned strings.  (Implemented by Oren Tirosh.)
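
The items above that come without an example can be illustrated briefly.  The
sample string is made up, and the snippet is a sketch rather than a complete
tour of the string methods.

```python
s = 'hello world'

# "in" only reports whether the substring occurs; find() gives its offset
# (or -1 when the substring is absent).
has_world = 'world' in s
offset = s.find('world')
absent = s.find('xyz')

# Negative start and end values count from the end of the string, as in slices.
tail_match = s.startswith('wor', -5)      # tests s[-5:]
head_match = s.endswith('hello', 0, -6)   # tests s[0:-6]

# zfill() and the % operator both zero-pad a number to a given width.
padded = '45'.zfill(4)
formatted = '%04d' % 45
```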
   1184 
   1185 .. ======================================================================
   1186 
   1187 
   1188 Optimizations
   1189 -------------
   1190 
   1191 * The creation of new-style class instances has been made much faster; they're
   1192   now faster than classic classes!
   1193 
   1194 * The :meth:`sort` method of list objects has been extensively rewritten by Tim
   1195   Peters, and the implementation is significantly faster.
   1196 
   1197 * Multiplication of large long integers is now much faster thanks to an
   1198   implementation of Karatsuba multiplication, an algorithm that scales better than
   1199   the O(n\*n) required for the grade-school multiplication algorithm.  (Original
   1200   patch by Christopher A. Craig, and significantly reworked by Tim Peters.)
   1201 
   1202 * The ``SET_LINENO`` opcode is now gone.  This may provide a small speed
   1203   increase, depending on your compiler's idiosyncrasies. See section
   1204   :ref:`section-other` for a longer explanation. (Removed by Michael Hudson.)
   1205 
   1206 * :func:`xrange` objects now have their own iterator, making ``for i in
   1207   xrange(n)`` slightly faster than ``for i in range(n)``.  (Patch by Raymond
   1208   Hettinger.)
   1209 
   1210 * A number of small rearrangements have been made in various hotspots to improve
   1211   performance, such as inlining a function or removing some code.  (Implemented
   1212   mostly by GvR, but lots of people have contributed single changes.)
   1213 
   1214 The net result of the 2.3 optimizations is that Python 2.3 runs the  pystone
   1215 benchmark around 25% faster than Python 2.2.
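
For readers curious about the Karatsuba algorithm mentioned above, here is a
pure-Python sketch of the idea.  It is not the C code used by the interpreter:
the function name, the ``cutoff`` parameter and its default value are
assumptions made for this illustration, it handles only non-negative integers,
and it relies on ``int.bit_length``, which is available in later Python
versions.  The key point is that each level of recursion performs three
half-size multiplications instead of four, which gives a running time of
roughly O(n\*\*1.585) rather than O(n\*n).

```python
def karatsuba(x, y, cutoff=1 << 64):
    """Multiply two non-negative integers with Karatsuba's method (sketch)."""
    # Small operands: fall back to ordinary multiplication.
    if x < cutoff or y < cutoff:
        return x * y
    # Split both numbers around bit position n: x = x_hi * 2**n + x_lo.
    n = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << n) - 1
    x_hi, x_lo = x >> n, x & mask
    y_hi, y_lo = y >> n, y & mask
    # Three recursive products instead of the four a naive split would need.
    hi = karatsuba(x_hi, y_hi, cutoff)
    lo = karatsuba(x_lo, y_lo, cutoff)
    cross = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff) - hi - lo
    # Recombine: x*y = hi * 2**(2n) + cross * 2**n + lo.
    return (hi << (2 * n)) + (cross << n) + lo

a = 3 ** 500 + 7
b = 7 ** 400 + 3
product = karatsuba(a, b)
```

In practice there is no reason to call such a function yourself; the built-in
``*`` operator already applies the optimization to large operands.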
   1216 
   1217 .. ======================================================================
   1218 
   1219 
   1220 New, Improved, and Deprecated Modules
   1221 =====================================
   1222 
   1223 As usual, Python's standard library received a number of enhancements and bug
   1224 fixes.  Here's a partial list of the most notable changes, sorted alphabetically
   1225 by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
   1226 complete list of changes, or look through the CVS logs for all the details.
   1227 
   1228 * The :mod:`array` module now supports arrays of Unicode characters using the
   1229   ``'u'`` format character.  Arrays also now support using the ``+=`` assignment
   1230   operator to add another array's contents, and the ``*=`` assignment operator to
   1231   repeat an array. (Contributed by Jason Orendorff.)
   1232 
   1233 * The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB
   1234   <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface
   1235   to the transactional features of the BerkeleyDB library.
   1236 
   1237   The old version of the module has been renamed to  :mod:`bsddb185` and is no
   1238   longer built automatically; you'll  have to edit :file:`Modules/Setup` to enable
   1239   it.  Note that the new :mod:`bsddb` package is intended to be compatible with
   1240   the  old module, so be sure to file bugs if you discover any incompatibilities.
   1241   When upgrading to Python 2.3, if the new interpreter is compiled with a new
   1242   version of  the underlying BerkeleyDB library, you will almost certainly have to
   1243   convert your database files to the new version.  You can do this fairly easily
   1244   with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you
   1245   will find in the distribution's :file:`Tools/scripts` directory.  If you've
   1246   already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you
   1247   will have to change your ``import`` statements to import it as :mod:`bsddb`.
   1248 
   1249 * The new :mod:`bz2` module is an interface to the bz2 data compression library.
   1250   bz2-compressed data is usually smaller than  corresponding :mod:`zlib`\
   1251   -compressed data. (Contributed by Gustavo Niemeyer.)
   1252 
   1253 * A set of standard date/time types has been added in the new :mod:`datetime`
   1254   module.  See the following section for more details.
   1255 
   1256 * The Distutils :class:`Extension` class now supports an extra constructor
   1257   argument named *depends* for listing additional source files that an extension
   1258   depends on.  This lets Distutils recompile the module if any of the dependency
   1259   files are modified.  For example, if :file:`sampmodule.c` includes the header
   1260   file :file:`sample.h`, you would create the :class:`Extension` object like
   1261   this::
   1262 
   1263      ext = Extension("samp",
   1264                      sources=["sampmodule.c"],
   1265                      depends=["sample.h"])
   1266 
   1267   Modifying :file:`sample.h` would then cause the module to be recompiled.
   1268   (Contributed by Jeremy Hylton.)
   1269 
   1270 * Other minor changes to Distutils: it now checks for the :envvar:`CC`,
   1271   :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS`
   1272   environment variables, using them to override the settings in Python's
   1273   configuration (contributed by Robert Weber).
   1274 
* Previously the :mod:`doctest` module would only search the docstrings of
  public methods and functions for test cases, but it now examines private
  ones as well.  The :func:`DocTestSuite` function creates a
  :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests.
   1279 
   1280 * The new :func:`gc.get_referents(object)` function returns a list of all the
   1281   objects referenced by *object*.
   1282 
   1283 * The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that
   1284   supports the same arguments as the existing :func:`getopt` function but uses
   1285   GNU-style scanning mode. The existing :func:`getopt` stops processing options as
   1286   soon as a non-option argument is encountered, but in GNU-style mode processing
   1287   continues, meaning that options and arguments can be mixed.  For example::
   1288 
   1289      >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
   1290      ([('-f', 'filename')], ['output', '-v'])
   1291      >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
   1292      ([('-f', 'filename'), ('-v', '')], ['output'])
   1293 
  (Contributed by Peter Åstrand.)
   1295 
   1296 * The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced
   1297   tuples::
   1298 
   1299      >>> import grp
   1300      >>> g = grp.getgrnam('amk')
   1301      >>> g.gr_name, g.gr_gid
   1302      ('amk', 500)
   1303 
   1304 * The :mod:`gzip` module can now handle files exceeding 2 GiB.
   1305 
   1306 * The new :mod:`heapq` module contains an implementation of a heap queue
   1307   algorithm.  A heap is an array-like data structure that keeps items in a
   1308   partially sorted order such that, for every index *k*, ``heap[k] <=
   1309   heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``.  This makes it quick to remove the
   1310   smallest item, and inserting a new item while maintaining the heap property is
   1311   O(lg n).  (See https://xlinux.nist.gov/dads//HTML/priorityque.html for more
   1312   information about the priority queue data structure.)
   1313 
   1314   The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions
   1315   for adding and removing items while maintaining the heap property on top of some
   1316   other mutable Python sequence type.  Here's an example that uses a Python list::
   1317 
   1318      >>> import heapq
   1319      >>> heap = []
   1320      >>> for item in [3, 7, 5, 11, 1]:
   1321      ...    heapq.heappush(heap, item)
   1322      ...
   1323      >>> heap
   1324      [1, 3, 5, 11, 7]
   1325      >>> heapq.heappop(heap)
   1326      1
   1327      >>> heapq.heappop(heap)
   1328      3
   1329      >>> heap
   1330      [5, 7, 11]
   1331 
   1332   (Contributed by Kevin O'Connor.)
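
  The heap property described above can also be checked directly.  The following
  is only a sketch: the helper names ``is_heap`` and ``heap_sort`` are invented
  for this illustration and are not part of :mod:`heapq`.

  .. code-block:: python

     import heapq

     def is_heap(seq):
         """Return True if seq[k] <= seq[2*k+1] and seq[k] <= seq[2*k+2] for all k."""
         n = len(seq)
         for k in range(n):
             for child in (2 * k + 1, 2 * k + 2):
                 if child < n and seq[k] > seq[child]:
                     return False
         return True

     def heap_sort(items):
         """Sort by pushing every item, then repeatedly popping the smallest."""
         heap = []
         for item in items:
             heapq.heappush(heap, item)
             assert is_heap(heap)      # the property holds after every push
         result = []
         while heap:
             result.append(heapq.heappop(heap))
         return result

     sorted_items = heap_sort([3, 7, 5, 11, 1])

  Because each push and pop is O(lg n), sorting this way costs O(n lg n) overall.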
   1333 
   1334 * The IDLE integrated development environment has been updated using the code
   1335   from the IDLEfork project (http://idlefork.sourceforge.net).  The most notable feature is
   1336   that the code being developed is now executed in a subprocess, meaning that
   1337   there's no longer any need for manual ``reload()`` operations. IDLE's core code
   1338   has been incorporated into the standard library as the :mod:`idlelib` package.
   1339 
   1340 * The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers
   1341   Lauder and Tino Lange.)
   1342 
* The :mod:`itertools` module contains a number of useful functions for use with
  iterators, inspired by various functions provided by the ML and Haskell
  languages.  For example, ``itertools.ifilter(predicate, iterator)`` returns all
  elements in the iterator for which the function :func:`predicate` returns
  :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times.
  There are a number of other functions in the module; see the module's reference
  documentation for details.
   1350   (Contributed by Raymond Hettinger.)
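
  A short sketch of the two functions just mentioned, plus :func:`islice` and
  :func:`count` from the same module.  The ``is_even`` predicate is made up for
  the example, and the fallback to the built-in :func:`filter` is an assumption
  that only matters on later Python versions where :func:`ifilter` is no longer
  present.

  .. code-block:: python

     import itertools

     try:
         ifilter = itertools.ifilter    # name used in Python 2.3
     except AttributeError:
         ifilter = filter               # assumption: a later Python with a lazy filter()

     def is_even(n):
         return n % 2 == 0

     # Keep only the elements for which the predicate is true.
     evens = list(ifilter(is_even, iter(range(10))))

     # Produce the same object a fixed number of times.
     repeated = list(itertools.repeat('spam', 3))

     # Take the first three values from an endless counter starting at 10.
     counted = list(itertools.islice(itertools.count(10), 3))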
   1351 
   1352 * Two new functions in the :mod:`math` module, :func:`degrees(rads)` and
   1353   :func:`radians(degs)`, convert between radians and degrees.  Other functions in
   1354   the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always
   1355   required input values measured in radians.  Also, an optional *base* argument
   1356   was added to :func:`math.log` to make it easier to compute logarithms for bases
   1357   other than ``e`` and ``10``.  (Contributed by Raymond Hettinger.)
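
  As a quick illustration (the variable names are arbitrary), the conversions
  and the new *base* argument might be used like this.  Floating-point results
  are only approximately equal to the exact values, so compare them with a
  tolerance.

  .. code-block:: python

     import math

     quarter_turn = math.radians(90)          # 90 degrees expressed in radians
     half_turn = math.degrees(math.pi)        # pi radians expressed in degrees
     sine = math.sin(math.radians(30))        # sin() still expects radians
     log2_of_8 = math.log(8, 2)               # logarithm with an explicit base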
   1358 
   1359 * Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`,
  :func:`getloadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and
   1361   :func:`mknod`) were added to the :mod:`posix` module that underlies the
   1362   :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S.
   1363   Otkidach.)
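
  For instance, :func:`makedev`, :func:`major`, and :func:`minor` pack and unpack
  device numbers.  This sketch assumes a Unix-like platform where these functions
  are available; the numbers 8 and 1 are arbitrary illustration values.

  .. code-block:: python

     import os

     # Combine a major and a minor number into a single raw device number...
     device = os.makedev(8, 1)

     # ...and split it back into its two components.
     major_minor = (os.major(device), os.minor(device))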
   1364 
   1365 * In the :mod:`os` module, the :func:`\*stat` family of functions can now report
   1366   fractions of a second in a timestamp.  Such time stamps are represented as
   1367   floats, similar to the value returned by :func:`time.time`.
   1368 
   1369   During testing, it was found that some applications will break if time stamps
   1370   are floats.  For compatibility, when using the tuple interface of the
  :class:`stat_result`, time stamps will be represented as integers. When using
   1372   named fields (a feature first introduced in Python 2.2), time stamps are still
   1373   represented as integers, unless :func:`os.stat_float_times` is invoked to enable
   1374   float return values::
   1375 
   1376      >>> os.stat("/tmp").st_mtime
   1377      1034791200
   1378      >>> os.stat_float_times(True)
   1379      >>> os.stat("/tmp").st_mtime
   1380      1034791200.6335014
   1381 
   1382   In Python 2.4, the default will change to always returning floats.
   1383 
   1384   Application developers should enable this feature only if all their libraries
   1385   work properly when confronted with floating point time stamps, or if they use
   1386   the tuple API. If used, the feature should be activated on an application level
   1387   instead of trying to enable it on a per-use basis.
   1388 
   1389 * The :mod:`optparse` module contains a new parser for command-line arguments
   1390   that can convert option values to a particular Python type  and will
   1391   automatically generate a usage message.  See the following section for  more
   1392   details.
   1393 
   1394 * The old and never-documented :mod:`linuxaudiodev` module has been deprecated,
   1395   and a new version named :mod:`ossaudiodev` has been added.  The module was
   1396   renamed because the OSS sound drivers can be used on platforms other than Linux,
   1397   and the interface has also been tidied and brought up to date in various ways.
   1398   (Contributed by Greg Ward and Nicholas FitzRoy-Dale.)
   1399 
   1400 * The new :mod:`platform` module contains a number of functions that try to
   1401   determine various properties of the platform you're running on.  There are
   1402   functions for getting the architecture, CPU type, the Windows OS version, and
  even the Linux distribution version. (Contributed by Marc-André Lemburg.)
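
  A minimal sketch of collecting a few of these properties.  The ``summary``
  dictionary and its keys are invented for this example, and the values shown in
  the comments are only typical results that will differ from machine to machine.

  .. code-block:: python

     import platform

     summary = {
         'system': platform.system(),           # e.g. 'Linux' or 'Windows'
         'machine': platform.machine(),         # CPU type; may be '' if unknown
         'bits': platform.architecture()[0],    # e.g. '32bit' or '64bit'
         'python': platform.python_version(),   # e.g. '2.3.0'
     }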
   1404 
   1405 * The parser objects provided by the :mod:`pyexpat` module can now optionally
   1406   buffer character data, resulting in fewer calls to your character data handler
   1407   and therefore faster performance.  Setting the parser object's
   1408   :attr:`buffer_text` attribute to :const:`True` will enable buffering.
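
  The effect can be observed by counting handler calls.  This sketch reaches the
  parser through :mod:`xml.parsers.expat`, the documented front end to
  :mod:`pyexpat`; the ``character_chunks`` helper is made up for the
  illustration.  Without buffering, Expat typically reports line breaks as
  separate chunks, though the exact split is an implementation detail.

  .. code-block:: python

     import xml.parsers.expat

     def character_chunks(buffered):
         """Parse a small document; return the strings passed to the handler."""
         chunks = []
         parser = xml.parsers.expat.ParserCreate()
         parser.buffer_text = buffered
         parser.CharacterDataHandler = chunks.append
         parser.Parse('<doc>line one\nline two\nline three</doc>', True)
         return chunks

     unbuffered = character_chunks(False)
     buffered = character_chunks(True)

  Both runs deliver the same text overall; only the number of calls differs.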
   1409 
   1410 * The :func:`sample(population, k)` function was added to the :mod:`random`
   1411   module.  *population* is a sequence or :class:`xrange` object containing the
   1412   elements of a population, and :func:`sample` chooses *k* elements from the
   1413   population without replacing chosen elements.  *k* can be any value up to
   1414   ``len(population)``. For example::
   1415 
   1416      >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
   1417      >>> random.sample(days, 3)      # Choose 3 elements
   1418      ['St', 'Sn', 'Th']
   1419      >>> random.sample(days, 7)      # Choose 7 elements
   1420      ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
   1421      >>> random.sample(days, 7)      # Choose 7 again
   1422      ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
   1423      >>> random.sample(days, 8)      # Can't choose eight
   1424      Traceback (most recent call last):
   1425        File "<stdin>", line 1, in ?
   1426        File "random.py", line 414, in sample
   1427            raise ValueError, "sample larger than population"
   1428      ValueError: sample larger than population
   1429      >>> random.sample(xrange(1,10000,2), 10)   # Choose ten odd nos. under 10000
   1430      [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
   1431 
   1432   The :mod:`random` module now uses a new algorithm, the Mersenne Twister,
   1433   implemented in C.  It's faster and more extensively studied than the previous
   1434   algorithm.
   1435 
   1436   (All changes contributed by Raymond Hettinger.)
   1437 
   1438 * The :mod:`readline` module also gained a number of new functions:
   1439   :func:`get_history_item`, :func:`get_current_history_length`, and
   1440   :func:`redisplay`.
   1441 
   1442 * The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and
   1443   attempts to import them will fail with a :exc:`RuntimeError`.  New-style classes
   1444   provide new ways to break out of the restricted execution environment provided
   1445   by :mod:`rexec`, and no one has interest in fixing them or time to do so.  If
   1446   you have applications using :mod:`rexec`, rewrite them to use something else.
   1447 
   1448   (Sticking with Python 2.2 or 2.1 will not make your applications any safer
   1449   because there are known bugs in the :mod:`rexec` module in those versions.  To
   1450   repeat: if you're using :mod:`rexec`, stop using it immediately.)
   1451 
   1452 * The :mod:`rotor` module has been deprecated because the  algorithm it uses for
   1453   encryption is not believed to be secure.  If you need encryption, use one of the
   1454   several AES Python modules that are available separately.
   1455 
   1456 * The :mod:`shutil` module gained a :func:`move(src, dest)` function that
   1457   recursively moves a file or directory to a new location.
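
  Here is a small sketch that moves a directory tree inside a scratch directory.
  All path names are invented for the example, and the scratch directory is
  removed afterwards.

  .. code-block:: python

     import os
     import shutil
     import tempfile

     workdir = tempfile.mkdtemp()
     try:
         src = os.path.join(workdir, 'src')
         dest = os.path.join(workdir, 'dest')
         os.mkdir(src)
         f = open(os.path.join(src, 'data.txt'), 'w')
         f.write('hello')
         f.close()

         shutil.move(src, dest)          # the directory and its contents move

         source_gone = not os.path.exists(src)
         file_moved = os.path.isfile(os.path.join(dest, 'data.txt'))
     finally:
         shutil.rmtree(workdir)          # clean up the scratch directory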
   1458 
* Support for more advanced POSIX signal handling was added to the :mod:`signal`
  module but then removed again as it proved impossible to make it work reliably
  across platforms.
   1462 
   1463 * The :mod:`socket` module now supports timeouts.  You can call the
   1464   :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds.
   1465   Subsequent socket operations that take longer than *t* seconds to complete will
   1466   abort and raise a :exc:`socket.timeout` exception.
   1467 
   1468   The original timeout implementation was by Tim O'Malley.  Michael Gilfix
   1469   integrated it into the Python :mod:`socket` module and shepherded it through a
   1470   lengthy review.  After the code was checked in, Guido van Rossum rewrote parts
   1471   of it.  (This is a good example of a collaborative development process in
   1472   action.)
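
  A sketch of the timeout in action, assuming the loopback interface is usable:
  a listening socket waits for a connection that never arrives, so
  :meth:`accept` gives up after the timeout.  The 0.2 second value is arbitrary.

  .. code-block:: python

     import socket

     server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     try:
         server.bind(('127.0.0.1', 0))   # port 0 lets the OS pick a free port
         server.listen(1)
         server.settimeout(0.2)          # give up after 0.2 seconds
         try:
             server.accept()             # nobody connects, so this times out
             timed_out = False
         except socket.timeout:
             timed_out = True
     finally:
         server.close()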
   1473 
   1474 * On Windows, the :mod:`socket` module now ships with Secure  Sockets Layer
   1475   (SSL) support.
   1476 
   1477 * The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the
   1478   Python level as ``sys.api_version``.  The current exception can be cleared by
   1479   calling the new :func:`sys.exc_clear` function.
   1480 
   1481 * The new :mod:`tarfile` module  allows reading from and writing to
  :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.)
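
  A minimal sketch of writing an archive and reading it back.  The file names
  are invented for the example, and everything happens in a scratch directory
  that is deleted at the end.

  .. code-block:: python

     import os
     import shutil
     import tarfile
     import tempfile

     workdir = tempfile.mkdtemp()
     try:
         member = os.path.join(workdir, 'hello.txt')
         f = open(member, 'w')
         f.write('hello\n')
         f.close()

         # Write an uncompressed archive containing the single file.
         archive = os.path.join(workdir, 'demo.tar')
         tar = tarfile.open(archive, 'w')
         tar.add(member, arcname='hello.txt')
         tar.close()

         # Reopen the archive and list its members.
         tar = tarfile.open(archive, 'r')
         names = tar.getnames()
         tar.close()
     finally:
         shutil.rmtree(workdir)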
   1483 
   1484 * The new :mod:`textwrap` module contains functions for wrapping strings
   1485   containing paragraphs of text.  The :func:`wrap(text, width)` function takes a
   1486   string and returns a list containing the text split into lines of no more than
   1487   the chosen width.  The :func:`fill(text, width)` function returns a single
   1488   string, reformatted to fit into lines no longer than the chosen width. (As you
  can guess, :func:`fill` is built on top of :func:`wrap`.)  For example::
   1490 
   1491      >>> import textwrap
   1492      >>> paragraph = "Not a whit, we defy augury: ... more text ..."
   1493      >>> textwrap.wrap(paragraph, 60)
   1494      ["Not a whit, we defy augury: there's a special providence in",
   1495       "the fall of a sparrow. If it be now, 'tis not to come; if it",
   1496       ...]
   1497      >>> print textwrap.fill(paragraph, 35)
   1498      Not a whit, we defy augury: there's
   1499      a special providence in the fall of
   1500      a sparrow. If it be now, 'tis not
   1501      to come; if it be not to come, it
   1502      will be now; if it be not now, yet
   1503      it will come: the readiness is all.
   1504      >>>
   1505 
   1506   The module also contains a :class:`TextWrapper` class that actually implements
   1507   the text wrapping strategy.   Both the :class:`TextWrapper` class and the
   1508   :func:`wrap` and :func:`fill` functions support a number of additional keyword
   1509   arguments for fine-tuning the formatting; consult the module's documentation
   1510   for details. (Contributed by Greg Ward.)
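
  As a sketch of those keyword arguments, the following formats a sentence as a
  bulleted item with a hanging indent.  The sample text and the width of 30 are
  arbitrary choices for the illustration.

  .. code-block:: python

     import textwrap

     text = "The quick brown fox jumps over the lazy dog and keeps on running."

     wrapper = textwrap.TextWrapper(width=30,
                                    initial_indent='* ',     # first line only
                                    subsequent_indent='  ')  # all later lines
     lines = wrapper.wrap(text)          # list of lines, without newlines
     filled = wrapper.fill(text)         # the same lines joined by newlines

  The indents count towards the width, so no resulting line exceeds 30
  characters.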
   1511 
   1512 * The :mod:`thread` and :mod:`threading` modules now have companion modules,
   1513   :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing
   1514   implementation of the :mod:`thread` module's interface for platforms where
   1515   threads are not supported.  The intention is to simplify thread-aware modules
   1516   (ones that *don't* rely on threads to run) by putting the following code at the
   1517   top::
   1518 
   1519      try:
   1520          import threading as _threading
   1521      except ImportError:
   1522          import dummy_threading as _threading
   1523 
   1524   In this example, :mod:`_threading` is used as the module name to make it clear
   1525   that the module being used is not necessarily the actual :mod:`threading`
   1526   module. Code can call functions and use classes in :mod:`_threading` whether or
   1527   not threads are supported, avoiding an :keyword:`if` statement and making the
   1528   code slightly clearer.  This module will not magically make multithreaded code
   1529   run without threads; code that waits for another thread to return or to do
   1530   something will simply hang forever.
   1531 
   1532 * The :mod:`time` module's :func:`strptime` function has long been an annoyance
   1533   because it uses the platform C library's :func:`strptime` implementation, and
   1534   different platforms sometimes have odd bugs.  Brett Cannon contributed a
   1535   portable implementation that's written in pure Python and should behave
   1536   identically on all platforms.
   1537 
   1538 * The new :mod:`timeit` module helps measure how long snippets of Python code
   1539   take to execute.  The :file:`timeit.py` file can be run directly from the
   1540   command line, or the module's :class:`Timer` class can be imported and used
   1541   directly.  Here's a short example that figures out whether it's faster to
   1542   convert an 8-bit string to Unicode by appending an empty Unicode string to it or
   1543   by using the :func:`unicode` function::
   1544 
   1545      import timeit
   1546 
   1547      timer1 = timeit.Timer('unicode("abc")')
   1548      timer2 = timeit.Timer('"abc" + u""')
   1549 
   1550      # Run three trials
   1551      print timer1.repeat(repeat=3, number=100000)
   1552      print timer2.repeat(repeat=3, number=100000)
   1553 
   1554      # On my laptop this outputs:
   1555      # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
   1556      # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]
   1557 
   1558 * The :mod:`Tix` module has received various bug fixes and updates for the
   1559   current version of the Tix package.
   1560 
   1561 * The :mod:`Tkinter` module now works with a thread-enabled  version of Tcl.
   1562   Tcl's threading model requires that widgets only be accessed from the thread in
   1563   which they're created; accesses from another thread can cause Tcl to panic.  For
   1564   certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this  when a
   1565   widget is accessed from a different thread by marshalling a command, passing it
   1566   to the correct thread, and waiting for the results.  Other interfaces can't be
   1567   handled automatically but :mod:`Tkinter` will now raise an exception on such an
   1568   access so that you can at least find out about the problem.  See
   1569   https://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more
  detailed explanation of this change.  (Implemented by Martin von Löwis.)
   1571 
   1572 * Calling Tcl methods through :mod:`_tkinter` no longer  returns only strings.
   1573   Instead, if Tcl returns other objects those objects are converted to their
   1574   Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
   1575   object if no Python equivalent exists. This behavior can be controlled through
   1576   the :meth:`wantobjects` method of :class:`tkapp` objects.
   1577 
   1578   When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter
   1579   applications will), this feature is always activated. It should not cause
   1580   compatibility problems, since Tkinter would always convert string results to
   1581   Python types where possible.
   1582 
   1583   If any incompatibilities are found, the old behavior can be restored by setting
   1584   the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before
   1585   creating the first :class:`tkapp` object. ::
   1586 
   1587      import Tkinter
   1588      Tkinter.wantobjects = 0
   1589 
   1590   Any breakage caused by this change should be reported as a bug.
   1591 
   1592 * The :mod:`UserDict` module has a new :class:`DictMixin` class which defines
   1593   all dictionary methods for classes that already have a minimum mapping
   1594   interface.  This greatly simplifies writing classes that need to be
   1595   substitutable for dictionaries, such as the classes in  the :mod:`shelve`
   1596   module.
   1597 
   1598   Adding the mix-in as a superclass provides the full dictionary interface
   1599   whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`,
   1600   :meth:`__delitem__`, and :meth:`keys`. For example::
   1601 
   1602      >>> import UserDict
   1603      >>> class SeqDict(UserDict.DictMixin):
   1604      ...     """Dictionary lookalike implemented with lists."""
   1605      ...     def __init__(self):
   1606      ...         self.keylist = []
   1607      ...         self.valuelist = []
   1608      ...     def __getitem__(self, key):
   1609      ...         try:
   1610      ...             i = self.keylist.index(key)
   1611      ...         except ValueError:
   1612      ...             raise KeyError
   1613      ...         return self.valuelist[i]
   1614      ...     def __setitem__(self, key, value):
   1615      ...         try:
   1616      ...             i = self.keylist.index(key)
   1617      ...             self.valuelist[i] = value
   1618      ...         except ValueError:
   1619      ...             self.keylist.append(key)
   1620      ...             self.valuelist.append(value)
   1621      ...     def __delitem__(self, key):
   1622      ...         try:
   1623      ...             i = self.keylist.index(key)
   1624      ...         except ValueError:
   1625      ...             raise KeyError
   1626      ...         self.keylist.pop(i)
   1627      ...         self.valuelist.pop(i)
   1628      ...     def keys(self):
   1629      ...         return list(self.keylist)
   1630      ...
   1631      >>> s = SeqDict()
   1632      >>> dir(s)      # See that other dictionary methods are implemented
   1633      ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
   1634       '__init__', '__iter__', '__len__', '__module__', '__repr__',
   1635       '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
   1636       'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
   1637       'setdefault', 'update', 'valuelist', 'values']
   1638 
   1639   (Contributed by Raymond Hettinger.)
   1640 
   1641 * The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output
   1642   in a particular encoding by providing an optional encoding argument to the
   1643   :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes.
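
  For example (a sketch with a made-up one-element document), passing an
  encoding name causes it to appear in the XML declaration of the output:

  .. code-block:: python

     from xml.dom import minidom

     doc = minidom.parseString('<greeting>hello</greeting>')
     encoded = doc.toxml('utf-8')    # output is encoded and declares utf-8
     doc.unlink()                    # release the DOM tree when finished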
   1644 
   1645 * The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil
   1646   data values such as Python's ``None``.  Nil values are always supported on
   1647   unmarshalling an XML-RPC response.  To generate requests containing ``None``,
   1648   you must supply a true value for the *allow_none* parameter when creating a
   1649   :class:`Marshaller` instance.
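
  A sketch of the difference, using the :class:`Marshaller` class directly.  The
  import fallback is an assumption that only matters on later Python versions,
  where the module was renamed.

  .. code-block:: python

     try:
         import xmlrpclib                     # module name in Python 2.3
     except ImportError:
         import xmlrpc.client as xmlrpclib    # assumption: later, renamed module

     # By default, None cannot be marshalled.
     strict = xmlrpclib.Marshaller()
     try:
         strict.dumps((None,))
         rejected = False
     except TypeError:
         rejected = True

     # With allow_none set, None is written using the nil extension.
     lenient = xmlrpclib.Marshaller(allow_none=True)
     payload = lenient.dumps((None,))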
   1650 
   1651 * The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC
   1652   servers. Run it in demo mode (as a program) to see it in action.   Pointing the
   1653   Web browser to the RPC server produces pydoc-style documentation; pointing
   1654   xmlrpclib to the server allows invoking the actual methods. (Contributed by
   1655   Brian Quinlan.)
   1656 
   1657 * Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492)
   1658   has been added. The "idna" encoding can be used to convert between a Unicode
   1659   domain name and the ASCII-compatible encoding (ACE) of that name. ::
   1660 
     >>> u"www.Alliancefrançaise.nu".encode("idna")
   1662      'www.xn--alliancefranaise-npb.nu'
   1663 
   1664   The :mod:`socket` module has also been extended to transparently convert
   1665   Unicode hostnames to the ACE version before passing them to the C library.
  Modules that deal with hostnames (such as :mod:`httplib` and :mod:`ftplib`)
   1667   also support Unicode host names; :mod:`httplib` also sends HTTP ``Host``
   1668   headers using the ACE version of the domain name.  :mod:`urllib` supports
   1669   Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL
   1670   is ASCII only.
   1671 
   1672   To implement this change, the :mod:`stringprep` module, the  ``mkstringprep``
   1673   tool and the ``punycode`` encoding have been added.
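
  The relationship between the "idna" and ``punycode`` encodings can be sketched
  as follows, reusing the domain name from the example above in lower case.  The
  non-ASCII letter is written as an escape so that the source stays ASCII-only.

  .. code-block:: python

     # \xe7 is the letter c with a cedilla.
     name = u"www.alliancefran\xe7aise.nu"

     ace = name.encode("idna")             # ASCII-compatible form of the name
     round_trip = ace.decode("idna")       # back to the Unicode form

     # Each non-ASCII label is "xn--" followed by its punycode encoding.
     label = u"alliancefran\xe7aise".encode("punycode")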
   1674 
   1675 .. ======================================================================
   1676 
   1677 
   1678 Date/Time Type
   1679 --------------
   1680 
   1681 Date and time types suitable for expressing timestamps were added as the
   1682 :mod:`datetime` module.  The types don't support different calendars or many
   1683 fancy features, and just stick to the basics of representing time.
   1684 
   1685 The three primary types are: :class:`date`, representing a day, month, and year;
   1686 :class:`~datetime.time`, consisting of hour, minute, and second; and :class:`~datetime.datetime`,
   1687 which contains all the attributes of both :class:`date` and :class:`~datetime.time`.
   1688 There's also a :class:`timedelta` class representing differences between two
   1689 points in time, and time zone logic is implemented by classes inheriting from
   1690 the abstract :class:`tzinfo` class.
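
As a sketch of what such a subclass might look like, here is a time zone with a
fixed offset from UTC.  The ``FixedOffset`` class is invented for this
illustration and is not provided by the module.

.. code-block:: python

   import datetime

   class FixedOffset(datetime.tzinfo):
       """Hypothetical time zone that is always the same distance from UTC."""

       def __init__(self, minutes, name):
           self._offset = datetime.timedelta(minutes=minutes)
           self._name = name

       def utcoffset(self, dt):
           return self._offset

       def tzname(self, dt):
           return self._name

       def dst(self, dt):
           return datetime.timedelta(0)    # no daylight saving time

   utc = FixedOffset(0, 'UTC')
   cet = FixedOffset(60, 'CET')

   stamp = datetime.datetime(2003, 7, 29, 12, 0, tzinfo=cet)
   in_utc = stamp.astimezone(utc)          # same instant, expressed in UTC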
   1691 
   1692 You can create instances of :class:`date` and :class:`~datetime.time` by either supplying
   1693 keyword arguments to the appropriate constructor, e.g.
   1694 ``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of
   1695 class methods.  For example, the :meth:`date.today` class method returns the
   1696 current local date.
   1697 
   1698 Once created, instances of the date/time classes are all immutable. There are a
   1699 number of methods for producing formatted strings from objects::
   1700 
   1701    >>> import datetime
   1702    >>> now = datetime.datetime.now()
   1703    >>> now.isoformat()
   1704    '2002-12-30T21:27:03.994956'
   1705    >>> now.ctime()  # Only available on date, datetime
   1706    'Mon Dec 30 21:27:03 2002'
   1707    >>> now.strftime('%Y %d %b')
   1708    '2002 30 Dec'
   1709 
   1710 The :meth:`replace` method allows modifying one or more fields  of a
   1711 :class:`date` or :class:`~datetime.datetime` instance, returning a new instance::
   1712 
   1713    >>> d = datetime.datetime.now()
   1714    >>> d
   1715    datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
   1716    >>> d.replace(year=2001, hour = 12)
   1717    datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
   1718    >>>
   1719 
   1720 Instances can be compared, hashed, and converted to strings (the result is the
   1721 same as that of :meth:`isoformat`).  :class:`date` and :class:`~datetime.datetime`
   1722 instances can be subtracted from each other, and added to :class:`timedelta`
   1723 instances.  The largest missing feature is that there's no standard library
   1724 support for parsing strings and getting back a :class:`date` or
   1725 :class:`~datetime.datetime`.
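
A brief sketch of the arithmetic and comparisons just described; the specific
dates are arbitrary apart from the Python 2.3 release date.

.. code-block:: python

   import datetime

   release = datetime.date(2003, 7, 29)      # Python 2.3 release date
   new_year = datetime.date(2004, 1, 1)

   gap = new_year - release                  # subtracting dates gives a timedelta
   is_earlier = release < new_year           # instances can be compared

   evening = datetime.datetime(2002, 12, 30, 21, 27, 3)
   later = evening + datetime.timedelta(hours=3)   # crosses midnight

   as_text = str(release)                    # same result as release.isoformat()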
   1726 
   1727 For more information, refer to the module's reference documentation.
   1728 (Contributed by Tim Peters.)
   1729 
   1730 .. ======================================================================
   1731 
   1732 
   1733 The optparse Module
   1734 -------------------
   1735 
   1736 The :mod:`getopt` module provides simple parsing of command-line arguments.  The
   1737 new :mod:`optparse` module (originally named Optik) provides more elaborate
   1738 command-line parsing that follows the Unix conventions, automatically creates
   1739 the output for :option:`!--help`, and can perform different actions for different
   1740 options.
   1741 
   1742 You start by creating an instance of :class:`OptionParser` and telling it what
   1743 your program's options are. ::
   1744 
   1745    import sys
   1746    from optparse import OptionParser
   1747 
   1748    op = OptionParser()
   1749    op.add_option('-i', '--input',
   1750                  action='store', type='string', dest='input',
   1751                  help='set input filename')
   1752    op.add_option('-l', '--length',
   1753                  action='store', type='int', dest='length',
   1754                  help='set maximum length of output')
   1755 
   1756 Parsing a command line is then done by calling the :meth:`parse_args` method. ::
   1757 
   1758    options, args = op.parse_args(sys.argv[1:])
   1759    print options
   1760    print args
   1761 
   1762 This returns an object containing all of the option values, and a list of
   1763 strings containing the remaining arguments.
   1764 
   1765 Invoking the script with the various arguments now works as you'd expect it to.
   1766 Note that the length argument is automatically converted to an integer.
   1767 
   1768 .. code-block:: shell-session
   1769 
   1770    $ ./python opt.py -i data arg1
   1771    <Values at 0x400cad4c: {'input': 'data', 'length': None}>
   1772    ['arg1']
   1773    $ ./python opt.py --input=data --length=4
   1774    <Values at 0x400cad2c: {'input': 'data', 'length': 4}>
   1775    []
   1776    $
   1777 
   1778 The help message is automatically generated for you:
   1779 
   1780 .. code-block:: shell-session
   1781 
   1782    $ ./python opt.py --help
   1783    usage: opt.py [options]
   1784 
   1785    options:
   1786      -h, --help            show this help message and exit
   1787      -iINPUT, --input=INPUT
   1788                            set input filename
   1789      -lLENGTH, --length=LENGTH
   1790                            set maximum length of output
   1791    $
   1792 
   1793 See the module's documentation for more details.
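
As a further sketch, options can also be simple flags and can carry default
values.  The option names below are invented for the example, and the argument
list is passed explicitly so that the result doesn't depend on ``sys.argv``.

.. code-block:: python

   from optparse import OptionParser

   op = OptionParser()
   op.add_option('-v', '--verbose',
                 action='store_true', dest='verbose', default=False,
                 help='print extra messages')
   op.add_option('-l', '--length',
                 action='store', type='int', dest='length', default=80,
                 help='set maximum length of output')

   # Options given on the command line override the defaults.
   options, args = op.parse_args(['-v', '--length=4', 'arg1'])

   # With no arguments at all, every option keeps its default value.
   defaults, no_args = op.parse_args([])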
   1794 
   1795 
   1796 Optik was written by Greg Ward, with suggestions from the readers of the Getopt
   1797 SIG.
   1798 
   1799 .. ======================================================================
   1800 
   1801 
   1802 .. _section-pymalloc:
   1803 
   1804 Pymalloc: A Specialized Object Allocator
   1805 ========================================
   1806 
   1807 Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a
   1808 feature added to Python 2.1.  Pymalloc is intended to be faster than the system
   1809 :c:func:`malloc` and to have less memory overhead for allocation patterns typical
   1810 of Python programs. The allocator uses C's :c:func:`malloc` function to get large
   1811 pools of memory and then fulfills smaller memory requests from these pools.
   1812 
   1813 In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by
   1814 default; you had to explicitly enable it when compiling Python by providing the
   1815 :option:`!--with-pymalloc` option to the :program:`configure` script.  In 2.3,
   1816 pymalloc has had further enhancements and is now enabled by default; you'll have
   1817 to supply :option:`!--without-pymalloc` to disable it.
   1818 
   1819 This change is transparent to code written in Python; however, pymalloc may
   1820 expose bugs in C extensions.  Authors of C extension modules should test their
   1821 code with pymalloc enabled, because some incorrect code may cause core dumps at
   1822 runtime.
   1823 
   1824 There's one particularly common error that causes problems.  There are a number
   1825 of memory allocation functions in Python's C API that have previously just been
   1826 aliases for the C library's :c:func:`malloc` and :c:func:`free`, meaning that if
   1827 you accidentally called mismatched functions the error wouldn't be noticeable.
   1828 When the object allocator is enabled, these functions aren't aliases of
   1829 :c:func:`malloc` and :c:func:`free` any more, and calling the wrong function to
   1830 free memory may get you a core dump.  For example, if memory was allocated using
   1831 :c:func:`PyObject_Malloc`, it has to be freed using :c:func:`PyObject_Free`, not
   1832 :c:func:`free`.  A few modules included with Python fell afoul of this and had to
   1833 be fixed; doubtless there are more third-party modules that will have the same
   1834 problem.
   1835 
   1836 As part of this change, the confusing multiple interfaces for allocating memory
   1837 have been consolidated down into two API families. Memory allocated with one
   1838 family must not be manipulated with functions from the other family.  There is
   1839 one family for allocating chunks of memory and another family of functions
   1840 specifically for allocating Python objects.
   1841 
   1842 * To allocate and free an undistinguished chunk of memory use the "raw memory"
   1843   family: :c:func:`PyMem_Malloc`, :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free`.
   1844 
   1845 * The "object memory" family is the interface to the pymalloc facility described
   1846   above and is biased towards a large number of "small" allocations:
   1847   :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and :c:func:`PyObject_Free`.
   1848 
* To allocate and free Python objects, use the "object" family:
   1850   :c:func:`PyObject_New`, :c:func:`PyObject_NewVar`, and :c:func:`PyObject_Del`.
   1851 
   1852 Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging
   1853 features to catch memory overwrites and doubled frees in both extension modules
   1854 and in the interpreter itself.  To enable this support, compile a debugging
   1855 version of the Python interpreter by running :program:`configure` with
   1856 :option:`!--with-pydebug`.
   1857 
   1858 To aid extension writers, a header file :file:`Misc/pymemcompat.h` is
   1859 distributed with the source to Python 2.3 that allows Python extensions to use
   1860 the 2.3 interfaces to memory allocation while compiling against any version of
   1861 Python since 1.5.2.  You would copy the file from Python's source distribution
   1862 and bundle it with the source of your extension.
   1863 
   1864 
   1865 .. seealso::
   1866 
   1867    https://hg.python.org/cpython/file/default/Objects/obmalloc.c
   1868       For the full details of the pymalloc implementation, see the comments at
   1869       the top of the file :file:`Objects/obmalloc.c` in the Python source code.
      The above link points to the file within the python.org source browser.
   1871 
   1872 .. ======================================================================
   1873 
   1874 
   1875 Build and C API Changes
   1876 =======================
   1877 
   1878 Changes to Python's build process and to the C API include:
   1879 
   1880 * The cycle detection implementation used by the garbage collection has proven
   1881   to be stable, so it's now been made mandatory.  You can no longer compile Python
   1882   without it, and the :option:`!--with-cycle-gc` switch to :program:`configure` has
   1883   been removed.
   1884 
   1885 * Python can now optionally be built as a shared library
   1886   (:file:`libpython2.3.so`) by supplying :option:`!--enable-shared` when running
   1887   Python's :program:`configure` script.  (Contributed by Ondrej Palkovsky.)
   1888 
   1889 * The :c:macro:`DL_EXPORT` and :c:macro:`DL_IMPORT` macros are now deprecated.
   1890   Initialization functions for Python extension modules should now be declared
   1891   using the new macro :c:macro:`PyMODINIT_FUNC`, while the Python core will
   1892   generally use the :c:macro:`PyAPI_FUNC` and :c:macro:`PyAPI_DATA` macros.
   1893 
   1894 * The interpreter can be compiled without any docstrings for the built-in
   1895   functions and modules by supplying :option:`!--without-doc-strings` to the
   1896   :program:`configure` script. This makes the Python executable about 10% smaller,
   1897   but will also mean that you can't get help for Python's built-ins.  (Contributed
   1898   by Gustavo Niemeyer.)
   1899 
* The :c:func:`PyArg_NoArgs` macro is now deprecated, and code that uses it
  should be changed.  For Python 2.2 and later, the method definition table can
  specify the :const:`METH_NOARGS` flag, signalling that there are no arguments,
  and the argument checking can then be removed.  If compatibility with pre-2.2
  versions of Python is important, the code could use
  ``PyArg_ParseTuple(args, "")`` instead, but this will be slower than using
  :const:`METH_NOARGS`.

* :c:func:`PyArg_ParseTuple` accepts new format characters for various sizes of
  unsigned integers: ``B`` for :c:type:`unsigned char`, ``H`` for
  :c:type:`unsigned short int`, ``I`` for :c:type:`unsigned int`, and ``K`` for
  :c:type:`unsigned long long`.

* A new function, :c:func:`PyObject_DelItemString(mapping, char \*key)`, was added
  as shorthand for ``PyObject_DelItem(mapping, PyString_FromString(key))``.

* File objects now manage their internal string buffer differently, increasing
  it exponentially when needed.  This results in the benchmark tests in
  :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7
  seconds, according to one measurement).

* It's now possible to define class and static methods for a C extension type by
  setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flag in a
  method's :c:type:`PyMethodDef` structure.

* Python now includes a copy of the Expat XML parser's source code, removing any
  dependence on a system version or local installation of Expat.

* If you dynamically allocate type objects in your extension, you should be
  aware of a change in the rules relating to the :attr:`__module__` and
  :attr:`~definition.__name__` attributes.  In summary, you will want to ensure the type's
  dictionary contains a ``'__module__'`` key; making the module name the part of
  the type name leading up to the final period will no longer have the desired
  effect.  For more detail, read the API reference documentation or the source.

.. ======================================================================


Port-Specific Changes
---------------------

Support for a port to IBM's OS/2 using the EMX runtime environment was merged
into the main Python source tree.  EMX is a POSIX emulation layer over the OS/2
system APIs.  The Python port for EMX tries to support all the POSIX-like
capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and
:func:`fcntl` are restricted by the limitations of the underlying emulation
layer.  The standard OS/2 port, which uses IBM's Visual Age compiler, also
gained support for case-sensitive import semantics as part of the integration of
the EMX port into CVS.  (Contributed by Andrew MacIntyre.)

On MacOS, most toolbox modules have been weaklinked to improve backward
compatibility.  This means that modules will no longer fail to load if a single
routine is missing on the current OS version.  Instead, calling the missing
routine will raise an exception.  (Contributed by Jack Jansen.)

The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python
source distribution, were updated for 2.3.  (Contributed by Sean Reifschneider.)

Other new platforms now supported by Python include AtheOS
(http://atheos.cx/), GNU/Hurd, and OpenVMS.

.. ======================================================================


.. _section-other:

Other Changes and Fixes
=======================

As usual, there were a bunch of other improvements and bugfixes scattered
throughout the source tree.  A search through the CVS change logs finds there
were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3.  Both
figures are likely to be underestimates.

Some of the more notable changes are:

* If the :envvar:`PYTHONINSPECT` environment variable is set, the Python
  interpreter will enter the interactive prompt after running a Python program, as
  if Python had been invoked with the :option:`-i` option. The environment
  variable can be set before running the Python interpreter, or it can be set by
  the Python program as part of its execution.

* The :file:`regrtest.py` script now provides a way to allow "all resources
  except *foo*."  A resource name passed to the :option:`!-u` option can now be
  prefixed with a hyphen (``'-'``) to mean "remove this resource."  For example,
  the option '``-uall,-bsddb``' could be used to enable the use of all resources
  except ``bsddb``.

* The tools used to build the documentation now work under Cygwin as well as
  Unix.

* The ``SET_LINENO`` opcode has been removed.  Back in the mists of time, this
  opcode was needed to produce line numbers in tracebacks and support trace
  functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in
  tracebacks have been computed using a different mechanism that works with
  ``python -O``.  For Python 2.3 Michael Hudson implemented a similar scheme to
  determine when to call the trace function, removing the need for ``SET_LINENO``
  entirely.

  It would be difficult to detect any resulting difference from Python code, apart
  from a slight speedup when Python is run without :option:`-O`.

  C extensions that access the :attr:`f_lineno` field of frame objects should
  instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the
  added effect of making the code work as desired under ``python -O`` in earlier
  versions of Python.

  A nifty new feature is that trace functions can now assign to the
  :attr:`f_lineno` attribute of frame objects, changing the line that will be
  executed next.  A ``jump`` command has been added to the :mod:`pdb` debugger
  taking advantage of this new feature. (Implemented by Richie Hindle.)

.. ======================================================================

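The ``f_lineno`` assignment described in the last item above can be tried out
with :func:`sys.settrace`.  The following is only a minimal sketch for a
current interpreter, not the way :mod:`pdb` implements ``jump``; the names
``demo``, ``tracer`` and ``run_with_jump`` are hypothetical, and ``__code__``
was spelled ``func_code`` in Python 2.3::

   import sys

   def demo():
       steps = []
       steps.append("first")
       steps.append("skipped")   # the tracer jumps over this line
       steps.append("last")
       return steps

   # Line numbers are taken relative to the start of demo() so the sketch
   # does not depend on where it is pasted.
   FIRST = demo.__code__.co_firstlineno
   SKIP_LINE = FIRST + 3      # steps.append("skipped")
   TARGET_LINE = FIRST + 4    # steps.append("last")

   def tracer(frame, event, arg):
       # Only trace demo(); returning None leaves other frames untraced.
       if frame.f_code is not demo.__code__:
           return None
       if event == "line" and frame.f_lineno == SKIP_LINE:
           # Assigning to f_lineno changes the line that is executed next.
           frame.f_lineno = TARGET_LINE
       return tracer

   def run_with_jump():
       previous = sys.gettrace()
       sys.settrace(tracer)
       try:
           return demo()
       finally:
           sys.settrace(previous)   # always restore the old trace function

   result = run_with_jump()

With the jump in place, ``result`` should contain ``"first"`` and ``"last"``
but not ``"skipped"``.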

Porting to Python 2.3
=====================

This section lists previously described changes that may require changes to your
code:

* :keyword:`yield` is now always a keyword; if it's used as a variable name in
  your code, a different name must be chosen.

* For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one
  character long.

* The :func:`int` type constructor will now return a long integer instead of
  raising an :exc:`OverflowError` when a string or floating-point number is too
  large to fit into an integer.

* If you have Unicode strings that contain 8-bit characters, you must declare
  the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top
  of the file.  See section :ref:`section-encodings` for more information.

* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
  Instead, if Tcl returns other objects, those objects are converted to their
  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
  object if no Python equivalent exists.

* Large octal and hex literals such as ``0xffffffff`` now trigger a
  :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a
  negative value, but in Python 2.4 they'll become positive long integers.

  There are a few ways to fix this warning.  If you really need a positive number,
  just add an ``L`` to the end of the literal.  If you're trying to get a 32-bit
  integer with low bits set and have previously used an expression such as
  ``~(1 << 31)``, it's probably clearest to start with all bits set and clear the
  desired upper bits. For example, to clear just the top bit (bit 31), you could
  write ``0xffffffffL & ~(1L << 31)``.

* You can no longer disable assertions by assigning to ``__debug__``.

* The Distutils :func:`setup` function has gained various new keyword arguments
  such as *depends*.  Old versions of the Distutils will abort if passed unknown
  keywords.  A solution is to check for the presence of the new
  :func:`get_distutil_options` function in your :file:`setup.py` and only use the
  new keywords with a version of the Distutils that supports them::

     from distutils import core

     # 'sources' must be a list of file names; '...' stands for the
     # remaining Extension arguments, which are elided here.
     kw = {'sources': ['foo.c'], ...}
     if hasattr(core, 'get_distutil_options'):
         # Only pass the new keyword when this Distutils supports it.
         kw['depends'] = ['foo.h']
     ext = core.Extension(**kw)

* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`.

* Names of extension types defined by the modules included with Python now
  contain the module and a ``'.'`` in front of the type name.

.. ======================================================================

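A few of the porting notes above can be checked interactively.  The lines
below are a small sketch written so that they also run on a current
interpreter, where ``int`` and ``long`` are a single type; the variable names
are hypothetical, and on Python 2.3 itself the mask expression needs the ``L``
suffixes shown earlier::

   # Substring tests: the left operand may be longer than one character.
   has_ab = "ab" in "abc"        # True
   has_ac = "ac" in "abc"        # False: "ac" is not a contiguous substring

   # int() no longer raises OverflowError for values too large for a
   # machine integer; it returns a long integer instead.
   big_from_string = int("9" * 30)
   big_from_float = int(1e30)

   # Start with all 32 bits set and clear the top bit (bit 31).
   mask = 0xffffffff & ~(1 << 31)    # 0x7fffffff

Code that relied on the old behaviour, for example by catching
:exc:`OverflowError` from :func:`int`, is the kind that needs a second look
when porting.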

.. _23acks:

Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Jeff Bauer,
Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David
Daniels, Fred L. Drake, Jr., David Fraser, Kelly Gerber, Raymond Hettinger,
Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew
MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans
Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman
Suzi, Jason Tishler, Just van Rossum.