****************************
  What's New In Python 3.1
****************************

:Author: Raymond Hettinger

.. $Id$
   Rules for maintenance:

   * Anyone can add text to this document.  Do not spend very much time
   on the wording of your changes, because your text will probably
   get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
   changes; it's therefore more important to add your changes to
   Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
   is the purpose of Misc/NEWS.  Some changes I consider too small
   or esoteric to include.  If such a change is added to the text,
   I'll just remove it.  (This is another reason you shouldn't spend
   too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
   maintainer, add 'XXX' to the beginning of the paragraph or
   section.

   * It's OK to just add a fragmentary note about a change.  For
   example: "XXX Describe the transmogrify() function added to the
   socket module."  The maintainer will research the change and
   write the necessary text.

   * You can comment out your additions if you like, but it's not
   necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix.   Just the name is
   sufficient; the e-mail address isn't necessary.

   * It's helpful to add the bug/patch number as a comment:

   % Patch 12345
   XXX Describe the transmogrify() function added to the socket
   module.
   (Contributed by P.Y. Developer.)

   This saves the maintainer the effort of going through the SVN log
   when researching a change.

This article explains the new features in Python 3.1, compared to 3.0.


PEP 372: Ordered Dictionaries
=============================

Regular Python dictionaries iterate over key/value pairs in arbitrary order.
Over the years, a number of authors have written alternative implementations
that remember the order in which the keys were originally inserted.  Based on
the experiences from those implementations, a new
:class:`collections.OrderedDict` class has been introduced.

The OrderedDict API is substantially the same as regular dictionaries
but will iterate over keys and values in a guaranteed order depending on
when a key was first inserted.  If a new entry overwrites an existing entry,
the original insertion position is left unchanged.  Deleting an entry and
reinserting it will move it to the end.

The standard library now supports use of ordered dictionaries in several
modules.  The :mod:`configparser` module uses them by default.  This lets
configuration files be read, modified, and then written back in their original
order.  The *_asdict()* method for :func:`collections.namedtuple` now
returns an ordered dictionary with the values appearing in the same order as
the underlying tuple indices.  The :mod:`json` module has gained
an *object_pairs_hook* that allows OrderedDicts to be built by the decoder.
Support was also added for third-party tools like `PyYAML <http://pyyaml.org/>`_.
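
The ordering rules described above can be checked with a short, self-contained
sketch.  The keys and values are made up for illustration; only
:class:`collections.OrderedDict` and the *object_pairs_hook* parameter of
:func:`json.loads` come from the text:

```python
import json
from collections import OrderedDict

d = OrderedDict()
d['first'] = 1
d['second'] = 2
d['third'] = 3

# Overwriting an existing key keeps its original position.
d['first'] = 100
order_after_overwrite = list(d)     # ['first', 'second', 'third']

# Deleting a key and reinserting it moves it to the end.
del d['first']
d['first'] = 1
order_after_reinsert = list(d)      # ['second', 'third', 'first']

# The JSON decoder hands the key/value pairs to the hook in document order,
# so the resulting OrderedDict preserves that order.
decoded = json.loads('{"b": 1, "a": 2}', object_pairs_hook=OrderedDict)
decoded_keys = list(decoded)        # ['b', 'a']
```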

.. seealso::

   :pep:`372` - Ordered Dictionaries
      PEP written by Armin Ronacher and Raymond Hettinger.  Implementation
      written by Raymond Hettinger.


PEP 378: Format Specifier for Thousands Separator
=================================================

The built-in :func:`format` function and the :meth:`str.format` method use
a mini-language that now includes a simple, non-locale aware way to format
a number with a thousands separator.  This provides a way to humanize a
program's output, improving its professional appearance and readability::

    >>> from decimal import Decimal
    >>> format(1234567, ',d')
    '1,234,567'
    >>> format(1234567.89, ',.2f')
    '1,234,567.89'
    >>> format(12345.6 + 8901234.12j, ',f')
    '12,345.600000+8,901,234.120000j'
    >>> format(Decimal('1234567.89'), ',f')
    '1,234,567.89'

The supported types are :class:`int`, :class:`float`, :class:`complex`
and :class:`decimal.Decimal`.
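
Because :meth:`str.format` uses the same mini-language, the separator can also
be written inside a replacement field and combined with width and alignment.
A minimal sketch; the label text and the numbers are made up for illustration:

```python
from decimal import Decimal

# The ',' option inside str.format() replacement fields
line = 'Total: {:,d} items, {:,.2f} dollars'.format(1234567, 9876543.21)

# The separator combines with width and alignment in the usual way.
padded = '{:>15,}'.format(1234567)
aligned_decimal = format(Decimal('1234567.89'), '>15,.1f')
```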

Discussions are underway about how to specify alternative separators
like dots, spaces, apostrophes, or underscores.  Locale-aware applications
should use the existing *n* format specifier which already has some support
for thousands separators.

.. seealso::

   :pep:`378` - Format Specifier for Thousands Separator
      PEP written by Raymond Hettinger and implemented by Eric Smith and
      Mark Dickinson.


Other Language Changes
======================

Some smaller changes made to the core Python language are:

* Directories and zip archives containing a :file:`__main__.py`
  file can now be executed directly by passing their name to the
  interpreter. The directory/zipfile is automatically inserted as the
  first entry in sys.path.  (Suggestion and initial patch by Andy Chu;
  revised patch by Phillip J. Eby and Nick Coghlan; :issue:`1739468`.)

* The :class:`int` type gained a ``bit_length`` method that returns the
  number of bits necessary to represent its argument in binary::

      >>> n = 37
      >>> bin(n)
      '0b100101'
      >>> n.bit_length()
      6
      >>> n = 2**123-1
      >>> n.bit_length()
      123
      >>> (n+1).bit_length()
      124

  (Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
  and Mark Dickinson; :issue:`3439`.)

* The fields in :meth:`str.format` strings can now be automatically
  numbered::

    >>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
    'Sir Gallahad of Camelot'

  Formerly, the string would have required numbered fields such as:
  ``'Sir {0} of {1}'``.

  (Contributed by Eric Smith; :issue:`5237`.)

* The :func:`string.maketrans` function is deprecated and is replaced by new
  static methods, :meth:`bytes.maketrans` and :meth:`bytearray.maketrans`.
  This change solves the confusion around which types were supported by the
  :mod:`string` module. Now, :class:`str`, :class:`bytes`, and
  :class:`bytearray` each have their own **maketrans** and **translate**
  methods with intermediate translation tables of the appropriate type.

  (Contributed by Georg Brandl; :issue:`5675`.)

* The syntax of the :keyword:`with` statement now allows multiple context
  managers in a single statement::

    >>> with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
    ...     for line in infile:
    ...         if '<critical>' in line:
    ...             outfile.write(line)

  With the new syntax, the :func:`contextlib.nested` function is no longer
  needed and is now deprecated.

  (Contributed by Georg Brandl and Mattias Brändström;
  `appspot issue 53094 <https://codereview.appspot.com/53094>`_.)

* ``round(x, n)`` now returns an integer if *x* is an integer.
  Previously it returned a float::

    >>> round(1123, -2)
    1100

  (Contributed by Mark Dickinson; :issue:`4707`.)

* Python now uses David Gay's algorithm for finding the shortest floating
  point representation that doesn't change its value.  This should help
  mitigate some of the confusion surrounding binary floating point
  numbers.

  The significance is easily seen with a number like ``1.1`` which does not
  have an exact equivalent in binary floating point.  Since there is no exact
  equivalent, an expression like ``float('1.1')`` evaluates to the nearest
  representable value which is ``0x1.199999999999ap+0`` in hex or
  ``1.100000000000000088817841970012523233890533447265625`` in decimal. That
  nearest value was and still is used in subsequent floating point
  calculations.

  What is new is how the number gets displayed.  Formerly, Python used a
  simple approach.  The value of ``repr(1.1)`` was computed as ``format(1.1,
  '.17g')`` which evaluated to ``'1.1000000000000001'``. The advantage of
  using 17 digits was that it relied on IEEE-754 guarantees to ensure that
  ``eval(repr(1.1))`` would round-trip exactly to its original value.  The
  disadvantage is that many people found the output to be confusing (mistaking
  intrinsic limitations of binary floating point representation for a
  problem with Python itself).

  The new algorithm for ``repr(1.1)`` is smarter and returns ``'1.1'``.
  Effectively, it searches all equivalent string representations (ones that
  get stored with the same underlying float value) and returns the shortest
  representation.

  The new algorithm tends to emit cleaner representations when possible, but
  it does not change the underlying values.  So, it is still the case that
  ``1.1 + 2.2 != 3.3`` even though the representations may suggest otherwise.

  The new algorithm depends on certain features in the underlying floating
  point implementation.  If the required features are not found, the old
  algorithm will continue to be used.  Also, the text pickle protocols
  assure cross-platform portability by using the old algorithm.

  (Contributed by Eric Smith and Mark Dickinson; :issue:`1580`.)
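
Several of the items in the list above can be checked together in one short
script.  This is only an illustrative sketch: the sample values are arbitrary,
and the float results assume a platform on which the new ``repr`` algorithm is
in use:

```python
# int.bit_length(): number of bits needed to represent the value
n = 37
bits = n.bit_length()                      # bin(37) is '0b100101'

# Automatically numbered replacement fields
greeting = 'Sir {} of {}'.format('Gallahad', 'Camelot')

# bytes.maketrans() builds the table consumed by bytes.translate()
table = bytes.maketrans(b'abc', b'xyz')
translated = b'aabbcc'.translate(table)

# round() applied to an int returns an int
rounded = round(1123, -2)

# Shortest repr that still round-trips; the stored value is unchanged
short = repr(1.1)
old_style = format(1.1, '.17g')            # what repr() used to produce
round_trips = float(short) == 1.1
still_inexact = (1.1 + 2.2 != 3.3)
```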

New, Improved, and Deprecated Modules
=====================================

* Added a :class:`collections.Counter` class to support convenient
  counting of unique items in a sequence or iterable::

      >>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
      Counter({'blue': 3, 'red': 2, 'green': 1})

  (Contributed by Raymond Hettinger; :issue:`1696199`.)

* Added a new module, :mod:`tkinter.ttk`, for access to the Tk themed widget set.
  The basic idea of ttk is to separate, to the extent possible, the code
  implementing a widget's behavior from the code implementing its appearance.

  (Contributed by Guilherme Polo; :issue:`2983`.)

* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
  the context management protocol::

        >>> # Automatically close file after writing
        >>> with gzip.GzipFile(filename, "wb") as f:
        ...     f.write(b"xxx")

  (Contributed by Antoine Pitrou.)

* The :mod:`decimal` module now supports methods for creating a
  decimal object from a binary :class:`float`.  The conversion is
  exact but can sometimes be surprising::

      >>> Decimal.from_float(1.1)
      Decimal('1.100000000000000088817841970012523233890533447265625')

  The long decimal result shows the actual binary fraction being
  stored for *1.1*.  The fraction has many digits because *1.1* cannot
  be exactly represented in binary.

  (Contributed by Raymond Hettinger and Mark Dickinson.)

* The :mod:`itertools` module grew two new functions.  The
  :func:`itertools.combinations_with_replacement` function is one of
  four for generating combinatorics including permutations and Cartesian
  products.  The :func:`itertools.compress` function mimics its namesake
  from APL.  Also, the existing :func:`itertools.count` function now has
  an optional *step* argument and can accept any type of counting
  sequence including :class:`fractions.Fraction` and
  :class:`decimal.Decimal`::

    >>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
    ['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']

    >>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
    [2, 3, 5, 7]

    >>> c = count(start=Fraction(1,2), step=Fraction(1,6))
    >>> [next(c), next(c), next(c), next(c)]
    [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1)]

  (Contributed by Raymond Hettinger.)

* :func:`collections.namedtuple` now supports a keyword argument
  *rename* which lets invalid fieldnames be automatically converted to
  positional names in the form _0, _1, etc.  This is useful when
  the field names are being created by an external source such as a
  CSV header, SQL field list, or user input::

    >>> query = input()
    SELECT region, dept, count(*) FROM main GROUP BY region, dept

    >>> cursor.execute(query)
    >>> query_fields = [desc[0] for desc in cursor.description]
    >>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
    >>> pprint.pprint([UserQuery(*row) for row in cursor])
    [UserQuery(region='South', dept='Shipping', _2=185),
     UserQuery(region='North', dept='Accounting', _2=37),
     UserQuery(region='West', dept='Sales', _2=419)]

  (Contributed by Raymond Hettinger; :issue:`1818`.)

* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
  accept a flags parameter.

  (Contributed by Gregory Smith.)

* The :mod:`logging` module now implements a simple :class:`logging.NullHandler`
  class for applications that are not using logging but are calling
  library code that does.  Setting up a null handler will suppress
  spurious warnings such as "No handlers could be found for logger foo"::

    >>> h = logging.NullHandler()
    >>> logging.getLogger("foo").addHandler(h)

  (Contributed by Vinay Sajip; :issue:`4384`.)

* The :mod:`runpy` module, which supports the ``-m`` command line switch,
  now supports the execution of packages by looking for and executing
  a ``__main__`` submodule when a package name is supplied.

  (Contributed by Andi Vajda; :issue:`4195`.)

* The :mod:`pdb` module can now access and display source code loaded via
  :mod:`zipimport` (or any other conformant :pep:`302` loader).

  (Contributed by Alexander Belopolsky; :issue:`4201`.)

* :class:`functools.partial` objects can now be pickled.

  (Suggested by Antoine Pitrou and Jesse Noller.  Implemented by
  Jack Diederich; :issue:`5228`.)

* Added :mod:`pydoc` help topics for symbols so that ``help('@')``
  works as expected in the interactive environment.

  (Contributed by David Laban; :issue:`4739`.)

* The :mod:`unittest` module now supports skipping individual tests or classes
  of tests.  It also supports marking a test as an expected failure: a test
  that is known to be broken but shouldn't be counted as a failure on a
  TestResult::

    class TestGizmo(unittest.TestCase):

        @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
        def test_gizmo_on_windows(self):
            ...

        @unittest.expectedFailure
        def test_gizmo_without_required_library(self):
            ...

  Also, tests for exceptions have been extended to work with context managers
  using the :keyword:`with` statement::

      def test_division_by_zero(self):
          with self.assertRaises(ZeroDivisionError):
              1 / 0

  In addition, several new assertion methods were added including
  :meth:`assertSetEqual`, :meth:`assertDictEqual`,
  :meth:`assertDictContainsSubset`, :meth:`assertListEqual`,
  :meth:`assertTupleEqual`, :meth:`assertSequenceEqual`,
  :meth:`assertRaisesRegexp`, :meth:`assertIsNone`,
  and :meth:`assertIsNotNone`.

  (Contributed by Benjamin Peterson and Antoine Pitrou.)

* The :mod:`io` module has three new constants for the :meth:`seek`
  method: :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.

* The :attr:`sys.version_info` tuple is now a named tuple::

    >>> sys.version_info
    sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)

  (Contributed by Ross Light; :issue:`4285`.)

* The :mod:`nntplib` and :mod:`imaplib` modules now support IPv6.

  (Contributed by Derek Morr; :issue:`1655` and :issue:`1664`.)

* The :mod:`pickle` module has been adapted for better interoperability with
  Python 2.x when used with protocol 2 or lower.  The reorganization of the
  standard library changed the formal reference for many objects.  For
  example, ``__builtin__.set`` in Python 2 is called ``builtins.set`` in Python
  3. This change confounded efforts to share data between different versions of
  Python.  But now when protocol 2 or lower is selected, the pickler will
  automatically use the old Python 2 names for both loading and dumping. This
  remapping is turned on by default but can be disabled with the *fix_imports*
  option::

    >>> s = {1, 2, 3}
    >>> pickle.dumps(s, protocol=0)
    b'c__builtin__\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
    >>> pickle.dumps(s, protocol=0, fix_imports=False)
    b'cbuiltins\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'

  An unfortunate but unavoidable side-effect of this change is that protocol 2
  pickles produced by Python 3.1 won't be readable with Python 3.0. The latest
  pickle protocol, protocol 3, should be used when migrating data between
  Python 3.x implementations, as it doesn't attempt to remain compatible with
  Python 2.x.

  (Contributed by Alexandre Vassalotti and Antoine Pitrou; :issue:`6137`.)

* A new module, :mod:`importlib`, was added.  It provides a complete, portable,
  pure Python reference implementation of the :keyword:`import` statement and its
  counterpart, the :func:`__import__` function.  It represents a substantial
  step forward in documenting and defining the actions that take place during
  imports.

  (Contributed by Brett Cannon.)
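
The following sketch exercises a handful of the library additions listed above
without needing a database or any files.  The column names and sample values
are hypothetical stand-ins for data that would normally come from an external
source:

```python
import pickle
from collections import Counter, namedtuple
from decimal import Decimal
from fractions import Fraction
from itertools import combinations_with_replacement, compress, count

# Counter: tally items from any iterable
tally = Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
top = tally.most_common(1)

# namedtuple(rename=True): invalid field names become _0, _1, ...
header = ['region', 'dept', 'count(*)']      # hypothetical column names
Row = namedtuple('Row', header, rename=True)
row = Row('South', 'Shipping', 185)

# itertools additions
pairs = [p + q for p, q in combinations_with_replacement('AB', 2)]
picked = list(compress('ABCDEF', [1, 0, 1, 0, 1, 1]))
counter = count(Fraction(1, 2), Fraction(1, 6))
first_two = [next(counter), next(counter)]

# Decimal.from_float(): exact conversion of the stored binary value
exact = Decimal.from_float(0.5)              # 0.5 is exactly representable

# pickle: Python 2 names are used for protocol 2 unless fix_imports=False
s = {1, 2, 3}
with_py2_names = pickle.dumps(s, protocol=2)
with_py3_names = pickle.dumps(s, protocol=2, fix_imports=False)
```

Both pickles load back to the same set in Python 3.1 or later; only the module
name recorded inside the byte string differs.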

Optimizations
=============

Major performance enhancements have been added:

* The new I/O library (as defined in :pep:`3116`) was mostly written in
  Python and quickly proved to be a problematic bottleneck in Python 3.0.
  In Python 3.1, the I/O library has been entirely rewritten in C and is
  2 to 20 times faster depending on the task at hand. The pure Python
  version is still available for experimentation purposes through
  the ``_pyio`` module.

  (Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)

* Added a heuristic so that tuples and dicts containing only untrackable objects
  are not tracked by the garbage collector. This can reduce the size of
  collections and therefore the garbage collection overhead on long-running
  programs, depending on their particular use of datatypes.

  (Contributed by Antoine Pitrou; :issue:`4688`.)

* When the configure option ``--with-computed-gotos`` is enabled
  on compilers that support it (notably: gcc, SunPro, icc), the bytecode
  evaluation loop is compiled with a new dispatch mechanism which gives
  speedups of up to 20%, depending on the system, the compiler, and
  the benchmark.

  (Contributed by Antoine Pitrou along with a number of other participants;
  :issue:`4753`.)

* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
  faster.

  (Contributed by Antoine Pitrou and Amaury Forgeot d'Arc; :issue:`4868`.)

* The :mod:`json` module now has a C extension to substantially improve
  its performance.  In addition, the API was modified so that json works
  only with :class:`str`, not with :class:`bytes`.  That change makes the
  module closely match the `JSON specification <http://json.org/>`_
  which is defined in terms of Unicode.

  (Contributed by Bob Ippolito and converted to Py3.1 by Antoine Pitrou
  and Benjamin Peterson; :issue:`4136`.)

* Unpickling now interns the attribute names of pickled objects.  This saves
  memory and allows pickles to be smaller.

  (Contributed by Jake McGuire and Antoine Pitrou; :issue:`5084`.)
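
The pure Python I/O implementation mentioned in the first item above can be
tried side by side with the C version.  A minimal sketch, assuming a CPython
build that ships the ``_pyio`` module; the sample text is arbitrary:

```python
import io
import _pyio

text = 'spam\neggs\n'

c_buffer = io.StringIO(text)        # C implementation
py_buffer = _pyio.StringIO(text)    # pure Python implementation

# Both objects expose the same interface and return the same data.
c_lines = c_buffer.readlines()
py_lines = py_buffer.readlines()
same_result = c_lines == py_lines
```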

IDLE
====

* IDLE's format menu now provides an option to strip trailing whitespace
  from a source file.

  (Contributed by Roger D. Serwy; :issue:`5150`.)

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* Integers are now stored internally either in base 2**15 or in base
  2**30, the base being determined at build time.  Previously, they
  were always stored in base 2**15.  Using base 2**30 gives
  significant performance improvements on 64-bit machines, but
  benchmark results on 32-bit machines have been mixed.  Therefore,
  the default is to use base 2**30 on 64-bit machines and base 2**15
  on 32-bit machines; on Unix, there's a new configure option
  ``--enable-big-digits`` that can be used to override this default.

  Apart from the performance improvements this change should be invisible to
  end users, with one exception: for testing and debugging purposes there's a
  new :attr:`sys.int_info` that provides information about the
  internal format, giving the number of bits per digit and the size in bytes
  of the C type used to store each digit::

     >>> import sys
     >>> sys.int_info
     sys.int_info(bits_per_digit=30, sizeof_digit=4)

  (Contributed by Mark Dickinson; :issue:`4258`.)

* The :c:func:`PyLong_AsUnsignedLongLong()` function now handles a negative
  *pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.

  (Contributed by Mark Dickinson and Lisandro Dalcrin; :issue:`5175`.)

* Deprecated :c:func:`PyNumber_Int`.  Use :c:func:`PyNumber_Long` instead.

  (Contributed by Mark Dickinson; :issue:`4910`.)

* Added a new :c:func:`PyOS_string_to_double` function to replace the
  deprecated functions :c:func:`PyOS_ascii_strtod` and :c:func:`PyOS_ascii_atof`.

  (Contributed by Mark Dickinson; :issue:`5914`.)

* Added :c:type:`PyCapsule` as a replacement for the :c:type:`PyCObject` API.
  The principal difference is that the new type has a well-defined interface
  for passing type safety information and a less complicated signature
  for calling a destructor.  The old type had a problematic API and is now
  deprecated.

  (Contributed by Larry Hastings; :issue:`5630`.)

Porting to Python 3.1
=====================

This section lists previously described changes and other bugfixes
that may require changes to your code:

* The new floating point string representations can break existing doctests.
  For example::

    import doctest
    import math

    def e():
        '''Compute the base of natural logarithms.

        >>> e()
        2.7182818284590451

        '''
        return sum(1/math.factorial(x) for x in reversed(range(30)))

    doctest.testmod()

    **********************************************************************
    Failed example:
        e()
    Expected:
        2.7182818284590451
    Got:
        2.718281828459045
    **********************************************************************

* The automatic name remapping in the pickle module for protocol 2 or lower can
  make Python 3.1 pickles unreadable in Python 3.0.  One solution is to use
  protocol 3.  Another solution is to set the *fix_imports* option to ``False``.
  See the discussion above for more details.

    553