****************************
  What's New in Python 2.4
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.02

.. $Id: whatsnew24.tex 54632 2007-03-31 11:59:54Z georg.brandl $
.. Don't write extensive text for new sections; I'll do that.
.. Feel free to add commented-out reminders of things that need
.. to be covered. --amk

This article explains the new features in Python 2.4.1, released on March 30,
2005.

Python 2.4 is a medium-sized release. It doesn't introduce as many changes as
the radical Python 2.2, but introduces more features than the conservative 2.3
release. The most significant new language features are function decorators and
generator expressions; most other changes are to the standard library.

According to the CVS change logs, there were 481 patches applied and 502 bugs
fixed between Python 2.3 and 2.4. Both figures are likely to be underestimates.

This article doesn't attempt to provide a complete specification of every single
new feature, but instead provides a brief introduction to each feature. For
full details, you should refer to the documentation for Python 2.4, such as the
Python Library Reference and the Python Reference Manual. Often you will be
referred to the PEP for a particular new feature for explanations of the
implementation and design rationale.

.. ======================================================================


PEP 218: Built-In Set Objects
=============================

Python 2.3 introduced the :mod:`sets` module. C implementations of set data
types have now been added to the Python core as two new built-in types,
``set(iterable)`` and ``frozenset(iterable)``. They provide high speed
operations for membership testing, for eliminating duplicates from sequences,
and for mathematical operations like unions, intersections, differences, and
symmetric differences.
::

   >>> a = set('abracadabra')              # form a set from a string
   >>> 'z' in a                            # fast membership testing
   False
   >>> a                                   # unique letters in a
   set(['a', 'r', 'b', 'c', 'd'])
   >>> ''.join(a)                          # convert back into a string
   'arbcd'

   >>> b = set('alacazam')                 # form a second set
   >>> a - b                               # letters in a but not in b
   set(['r', 'd', 'b'])
   >>> a | b                               # letters in either a or b
   set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
   >>> a & b                               # letters in both a and b
   set(['a', 'c'])
   >>> a ^ b                               # letters in a or b but not both
   set(['r', 'd', 'b', 'm', 'z', 'l'])

   >>> a.add('z')                          # add a new element
   >>> a.update('wxy')                     # add multiple new elements
   >>> a
   set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
   >>> a.remove('x')                       # take one element out
   >>> a
   set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])

The :func:`frozenset` type is an immutable version of :func:`set`. Since it is
immutable and hashable, it may be used as a dictionary key or as a member of
another set.

The :mod:`sets` module remains in the standard library, and may be useful if you
wish to subclass the :class:`Set` or :class:`ImmutableSet` classes. There are
currently no plans to deprecate the module.


.. seealso::

   :pep:`218` - Adding a Built-In Set Object Type
      Originally proposed by Greg Wilson and ultimately implemented by Raymond
      Hettinger.

.. ======================================================================


PEP 237: Unifying Long Integers and Integers
============================================

The lengthy transition process for this PEP, begun in Python 2.2, takes another
step forward in Python 2.4. In 2.3, certain integer operations that would
behave differently after int/long unification triggered :exc:`FutureWarning`
warnings and returned values limited to 32 or 64 bits (depending on your
platform).
In 2.4, these expressions no longer produce a warning and instead
produce a different result that's usually a long integer.

The problematic expressions are primarily left shifts and lengthy hexadecimal
and octal constants. For example, ``2 << 32`` results in a warning in 2.3,
evaluating to 0 on 32-bit platforms. In Python 2.4, this expression now returns
the correct answer, 8589934592.


.. seealso::

   :pep:`237` - Unifying Long Integers and Integers
      Original PEP written by Moshe Zadka and GvR. The changes for 2.4 were
      implemented by Kalle Svensson.

.. ======================================================================


PEP 289: Generator Expressions
==============================

The iterator feature introduced in Python 2.2 and the :mod:`itertools` module
make it easier to write programs that loop through large data sets without
having the entire data set in memory at one time. List comprehensions don't fit
into this picture very well because they produce a Python list object containing
all of the items. This unavoidably pulls all of the objects into memory, which
can be a problem if your data set is very large. When trying to write a
functionally-styled program, it would be natural to write something like::

   links = [link for link in get_all_links() if not link.followed]
   for link in links:
       ...

instead of ::

   for link in get_all_links():
       if link.followed:
           continue
       ...

The first form is more concise and perhaps more readable, but if you're dealing
with a large number of link objects you'd have to write the second form to avoid
having all link objects in memory at the same time.

Generator expressions work similarly to list comprehensions but don't
materialize the entire list; instead they create a generator that will return
elements one by one.
The above example could be written as::

   links = (link for link in get_all_links() if not link.followed)
   for link in links:
       ...

Generator expressions always have to be written inside parentheses, as in the
above example. The parentheses signalling a function call also count, so if you
want to create an iterator that will be immediately passed to a function you
could write::

   print sum(obj.count for obj in list_all_objects())

Generator expressions differ from list comprehensions in various small ways.
Most notably, the loop variable (*obj* in the above example) is not accessible
outside of the generator expression. List comprehensions leave the variable
assigned to its last value; future versions of Python will change this, making
list comprehensions match generator expressions in this respect.


.. seealso::

   :pep:`289` - Generator Expressions
      Proposed by Raymond Hettinger and implemented by Jiwon Seo with early efforts
      steered by Hye-Shik Chang.

.. ======================================================================


PEP 292: Simpler String Substitutions
=====================================

Some new classes in the standard library provide an alternative mechanism for
substituting variables into strings; this style of substitution may be better
for applications where untrained users need to edit templates.

The usual way of substituting variables by name is the ``%`` operator::

   >>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'}
   '2: The Best of Times'

When writing the template string, it can be easy to forget the ``i`` or ``s``
after the closing parenthesis. This isn't a big problem if the template is in a
Python module, because you run the code, get an "Unsupported format character"
:exc:`ValueError`, and fix the problem.
However, consider an application such
as Mailman where template strings or translations are being edited by users who
aren't aware of the Python language. The format string's syntax is complicated
to explain to such users, and if they make a mistake, it's difficult to provide
helpful feedback to them.

PEP 292 adds a :class:`Template` class to the :mod:`string` module that uses
``$`` to indicate a substitution::

   >>> import string
   >>> t = string.Template('$page: $title')
   >>> t.substitute({'page':2, 'title': 'The Best of Times'})
   '2: The Best of Times'

If a key is missing from the dictionary, the :meth:`substitute` method will
raise a :exc:`KeyError`. There's also a :meth:`safe_substitute` method that
ignores missing keys::

   >>> t = string.Template('$page: $title')
   >>> t.safe_substitute({'page':3})
   '3: $title'


.. seealso::

   :pep:`292` - Simpler String Substitutions
      Written and implemented by Barry Warsaw.

.. ======================================================================


PEP 318: Decorators for Functions and Methods
=============================================

Python 2.2 extended Python's object model by adding static methods and class
methods, but it didn't extend Python's syntax to provide any new way of defining
static or class methods. Instead, you had to write a :keyword:`def` statement
in the usual way, and pass the resulting method to a :func:`staticmethod` or
:func:`classmethod` function that would wrap up the function as a method of the
new type. Your code would look like this::

   class C:
      def meth (cls):
          ...

      meth = classmethod(meth)   # Rebind name to wrapped-up class method

If the method was very long, it would be easy to miss or forget the
:func:`classmethod` invocation after the function body.

The intention was always to add some syntax to make such definitions more
readable, but at the time of 2.2's release a good syntax was not obvious. Today
a good syntax *still* isn't obvious but users are asking for easier access to
the feature; a new syntactic feature has been added to meet this need.

The new feature is called "function decorators". The name comes from the idea
that :func:`classmethod`, :func:`staticmethod`, and friends are storing
additional information on a function object; they're *decorating* functions with
more details.

The notation borrows from Java and uses the ``'@'`` character as an indicator.
Using the new syntax, the example above would be written::

   class C:

      @classmethod
      def meth (cls):
          ...


The ``@classmethod`` is shorthand for the ``meth=classmethod(meth)`` assignment.
More generally, if you have the following::

   @A
   @B
   @C
   def f ():
       ...

It's equivalent to the following pre-decorator code::

   def f(): ...
   f = A(B(C(f)))

Decorators must come on the line before a function definition, one decorator per
line, and can't be on the same line as the def statement, meaning that ``@A def
f(): ...`` is illegal. You can only decorate function definitions, either at
the module level or inside a class; you can't decorate class definitions.

A decorator is just a function that takes the function to be decorated as an
argument and returns either the same function or some new object. The return
value of the decorator need not be callable (though it typically is), unless
further decorators will be applied to the result. It's easy to write your own
decorators. The following simple example just sets an attribute on the function
object::

   >>> def deco(func):
   ...    func.attr = 'decorated'
   ...    return func
   ...
   >>> @deco
   ... def f(): pass
   ...
   >>> f
   <function f at 0x402ef0d4>
   >>> f.attr
   'decorated'
   >>>

As a slightly more realistic example, the following decorator checks that the
supplied argument is an integer::

   def require_int (func):
       def wrapper (arg):
           assert isinstance(arg, int)
           return func(arg)

       return wrapper

   @require_int
   def p1 (arg):
       print arg

   @require_int
   def p2(arg):
       print arg*2

An example in :pep:`318` contains a fancier version of this idea that lets you
both specify the required type and check the returned type.

Decorator functions can take arguments. If arguments are supplied, your
decorator function is called with only those arguments and must return a new
decorator function; this function must take a single function and return a
function, as previously described. In other words, ``@A @B @C(args)`` becomes::

   def f(): ...
   _deco = C(args)
   f = A(B(_deco(f)))

Getting this right can be slightly brain-bending, but it's not too difficult.

A small related change makes the :attr:`func_name` attribute of functions
writable. This attribute is used to display function names in tracebacks, so
decorators should change the name of any new function that's constructed and
returned.


.. seealso::

   :pep:`318` - Decorators for Functions, Methods and Classes
      Written by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people
      wrote patches implementing function decorators, but the one that was actually
      checked in was patch #979728, written by Mark Russell.

   https://wiki.python.org/moin/PythonDecoratorLibrary
      This Wiki page contains several examples of decorators.

.. ======================================================================


PEP 322: Reverse Iteration
==========================

A new built-in function, ``reversed(seq)``, takes a sequence and returns an
iterator that loops over the elements of the sequence in reverse order. ::

   >>> for i in reversed(xrange(1,4)):
   ...    print i
   ...
   3
   2
   1

Compared to extended slicing, such as ``range(1,4)[::-1]``, :func:`reversed` is
easier to read, runs faster, and uses substantially less memory.

Note that :func:`reversed` only accepts sequences, not arbitrary iterators. If
you want to reverse an iterator, first convert it to a list with :func:`list`.
::

   >>> input = open('/etc/passwd', 'r')
   >>> for line in reversed(list(input)):
   ...   print line
   ...
   root:*:0:0:System Administrator:/var/root:/bin/tcsh
     ...


.. seealso::

   :pep:`322` - Reverse Iteration
      Written and implemented by Raymond Hettinger.

.. ======================================================================


PEP 324: New subprocess Module
==============================

The standard library provides a number of ways to execute a subprocess, offering
different features and different levels of complexity.
``os.system(command)`` is easy to use, but slow (it runs a shell process
which executes the command) and dangerous (you have to be careful about escaping
the shell's metacharacters). The :mod:`popen2` module offers classes that can
capture standard output and standard error from the subprocess, but the naming
is confusing. The :mod:`subprocess` module cleans this up, providing a unified
interface that offers all the features you might need.

Instead of :mod:`popen2`'s collection of classes, :mod:`subprocess` contains a
single class called :class:`Popen` whose constructor supports a number of
different keyword arguments.
::

   class Popen(args, bufsize=0, executable=None,
               stdin=None, stdout=None, stderr=None,
               preexec_fn=None, close_fds=False, shell=False,
               cwd=None, env=None, universal_newlines=False,
               startupinfo=None, creationflags=0):

*args* is commonly a sequence of strings that will be the arguments to the
program executed as the subprocess. (If the *shell* argument is true, *args*
can be a string which will then be passed on to the shell for interpretation,
just as :func:`os.system` does.)

*stdin*, *stdout*, and *stderr* specify what the subprocess's input, output, and
error streams will be. You can provide a file object or a file descriptor, or
you can use the constant ``subprocess.PIPE`` to create a pipe between the
subprocess and the parent.

.. index::
   single: universal newlines; What's new

The constructor has a number of handy options:

* *close_fds* requests that all file descriptors be closed before running the
  subprocess.

* *cwd* specifies the working directory in which the subprocess will be executed
  (defaulting to whatever the parent's working directory is).

* *env* is a dictionary specifying environment variables.

* *preexec_fn* is a function that gets called before the child is started.

* *universal_newlines* opens the child's input and output using Python's
  :term:`universal newlines` feature.

Once you've created the :class:`Popen` instance, you can call its :meth:`wait`
method to pause until the subprocess has exited, :meth:`poll` to check if it's
exited without pausing, or ``communicate(data)`` to send the string *data*
to the subprocess's standard input. ``communicate(data)`` then reads any
data that the subprocess has sent to its standard output or standard error,
returning a tuple ``(stdout_data, stderr_data)``.

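
As a rough illustration of the constructor arguments and ``communicate(data)``
described above, the following sketch pipes a string through a child process.
The child program and the variable names are made up for this example, and
``sys.executable`` is used only so that the sketch doesn't depend on any
external command being installed:

```python
import subprocess
import sys

# Hypothetical child program: read all of stdin and echo it back upper-cased.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],   # args as a sequence, no shell involved
    stdin=subprocess.PIPE,                # pipes between the parent and the child
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,              # exchange text with the child
)

# communicate() sends the data, reads both output streams, and waits for exit.
stdout_data, stderr_data = proc.communicate("hello\n")
status = proc.returncode
```

Passing the arguments as a sequence rather than a single string means no shell
is involved, so nothing in the data needs to be escaped.
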

:func:`call` is a shortcut that passes its arguments along to the :class:`Popen`
constructor, waits for the command to complete, and returns the status code of
the subprocess. It can serve as a safer analog to :func:`os.system`::

   sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb'])
   if sts == 0:
       # Success
       ...
   else:
       # dpkg returned an error
       ...

The command is invoked without use of the shell. If you really do want to use
the shell, you can add ``shell=True`` as a keyword argument and provide a string
instead of a sequence::

   sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True)

The PEP takes various examples of shell and Python code and shows how they'd be
translated into Python code that uses :mod:`subprocess`. Reading this section
of the PEP is highly recommended.


.. seealso::

   :pep:`324` - subprocess - New process module
      Written and implemented by Peter Åstrand, with assistance from Fredrik Lundh and
      others.

.. ======================================================================


PEP 327: Decimal Data Type
==========================

Python has always supported floating-point (FP) numbers, based on the underlying
C :c:type:`double` type, as a data type. However, while most programming
languages provide a floating-point type, many people (even programmers) are
unaware that floating-point numbers don't represent certain decimal fractions
accurately. The new :class:`Decimal` type can represent these fractions
accurately, up to a user-specified precision limit.


Why is Decimal needed?
----------------------

The limitations arise from the representation used for floating-point numbers.
FP numbers are made up of three components:

* The sign, which is positive or negative.

* The mantissa, which is a single-digit binary number followed by a fractional
  part. For example, ``1.01`` in base-2 notation is ``1 + 0/2 + 1/4``, or 1.25 in
  decimal notation.

* The exponent, which tells where the decimal point is located in the number
  represented.

For example, the number 1.25 has positive sign, a mantissa value of 1.01 (in
binary), and an exponent of 0 (the decimal point doesn't need to be shifted).
The number 5 has the same sign and mantissa, but the exponent is 2 because the
mantissa is multiplied by 4 (2 to the power of the exponent 2); 1.25 \* 4 equals
5.

Modern systems usually provide floating-point support that conforms to a
standard called IEEE 754. C's :c:type:`double` type is usually implemented as a
64-bit IEEE 754 number, which uses 52 bits of space for the mantissa. This
means that numbers can only be specified to 52 bits of precision. If you're
trying to represent numbers whose expansion repeats endlessly, the expansion is
cut off after 52 bits. Unfortunately, most software needs to produce output in
base 10, and common fractions in base 10 are often repeating decimals in binary.
For example, 1.1 decimal is binary ``1.0001100110011 ...``; .1 = 1/16 + 1/32 +
1/256 plus an infinite number of additional terms. IEEE 754 has to chop off
that infinitely repeated decimal after 52 digits, so the representation is
slightly inaccurate.

Sometimes you can see this inaccuracy when the number is printed::

   >>> 1.1
   1.1000000000000001

The inaccuracy isn't always visible when you print the number because the
FP-to-decimal-string conversion is provided by the C library, and most C libraries try
to produce sensible output. Even if it's not displayed, however, the inaccuracy
is still there and subsequent operations can magnify the error.

For many applications this doesn't matter.
If I'm plotting points and
displaying them on my monitor, the difference between 1.1 and 1.1000000000000001
is too small to be visible. Reports often limit output to a certain number of
decimal places, and if you round the number to two or three or even eight
decimal places, the error is never apparent. However, for applications where it
does matter, it's a lot of work to implement your own custom arithmetic
routines.

Hence, the :class:`Decimal` type was created.


The :class:`Decimal` type
-------------------------

A new module, :mod:`decimal`, was added to Python's standard library. It
contains two classes, :class:`Decimal` and :class:`Context`. :class:`Decimal`
instances represent numbers, and :class:`Context` instances are used to wrap up
various settings such as the precision and default rounding mode.

:class:`Decimal` instances are immutable, like regular Python integers and FP
numbers; once it's been created, you can't change the value an instance
represents. :class:`Decimal` instances can be created from integers or
strings::

   >>> import decimal
   >>> decimal.Decimal(1972)
   Decimal("1972")
   >>> decimal.Decimal("1.1")
   Decimal("1.1")

You can also provide tuples containing the sign, the mantissa represented as a
tuple of decimal digits, and the exponent::

   >>> decimal.Decimal((1, (1, 4, 7, 5), -2))
   Decimal("-14.75")

Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is
negative.

Converting from floating-point numbers poses a bit of a problem: should the FP
number representing 1.1 turn into the decimal number for exactly 1.1, or for 1.1
plus whatever inaccuracies are introduced? The decision was to dodge the issue
and leave such a conversion out of the API.
Instead, you should convert the
floating-point number into a string using the desired precision and pass the
string to the :class:`Decimal` constructor::

   >>> f = 1.1
   >>> decimal.Decimal(str(f))
   Decimal("1.1")
   >>> decimal.Decimal('%.12f' % f)
   Decimal("1.100000000000")

Once you have :class:`Decimal` instances, you can perform the usual mathematical
operations on them. One limitation: exponentiation requires an integer
exponent::

   >>> a = decimal.Decimal('35.72')
   >>> b = decimal.Decimal('1.73')
   >>> a+b
   Decimal("37.45")
   >>> a-b
   Decimal("33.99")
   >>> a*b
   Decimal("61.7956")
   >>> a/b
   Decimal("20.64739884393063583815028902")
   >>> a ** 2
   Decimal("1275.9184")
   >>> a**b
   Traceback (most recent call last):
     ...
   decimal.InvalidOperation: x ** (non-integer)

You can combine :class:`Decimal` instances with integers, but not with
floating-point numbers::

   >>> a + 4
   Decimal("39.72")
   >>> a + 4.5
   Traceback (most recent call last):
     ...
   TypeError: You can interact Decimal only with int, long or Decimal data types.
   >>>

:class:`Decimal` numbers can be used with the :mod:`math` and :mod:`cmath`
modules, but note that they'll be immediately converted to floating-point
numbers before the operation is performed, resulting in a possible loss of
precision and accuracy. You'll also get back a regular floating-point number
and not a :class:`Decimal`. ::

   >>> import math, cmath
   >>> d = decimal.Decimal('123456789012.345')
   >>> math.sqrt(d)
   351364.18288201344
   >>> cmath.sqrt(-d)
   351364.18288201344j

:class:`Decimal` instances have a :meth:`sqrt` method that returns a
:class:`Decimal`, but if you need other things such as trigonometric functions
you'll have to implement them.
::

   >>> d.sqrt()
   Decimal("351364.1828820134592177245001")


The :class:`Context` type
-------------------------

Instances of the :class:`Context` class encapsulate several settings for
decimal operations:

* :attr:`prec` is the precision, the number of significant decimal digits kept
  in results.

* :attr:`rounding` specifies the rounding mode. The :mod:`decimal` module has
  constants for the various possibilities: :const:`ROUND_DOWN`,
  :const:`ROUND_CEILING`, :const:`ROUND_HALF_EVEN`, and various others.

* :attr:`traps` is a dictionary specifying what happens on encountering certain
  error conditions: either an exception is raised or a value is returned. Some
  examples of error conditions are division by zero, loss of precision, and
  overflow.

There's a thread-local default context available by calling :func:`getcontext`;
you can change the properties of this context to alter the default precision,
rounding, or trap handling. The following example shows the effect of changing
the precision of the default context::

   >>> decimal.getcontext().prec
   28
   >>> decimal.Decimal(1) / decimal.Decimal(7)
   Decimal("0.1428571428571428571428571429")
   >>> decimal.getcontext().prec = 9
   >>> decimal.Decimal(1) / decimal.Decimal(7)
   Decimal("0.142857143")

The default action for error conditions is selectable; the module can either
return a special value such as infinity or not-a-number, or exceptions can be
raised::

   >>> decimal.Decimal(1) / decimal.Decimal(0)
   Traceback (most recent call last):
     ...
   decimal.DivisionByZero: x / 0
   >>> decimal.getcontext().traps[decimal.DivisionByZero] = False
   >>> decimal.Decimal(1) / decimal.Decimal(0)
   Decimal("Infinity")
   >>>

The :class:`Context` instance also has various methods for formatting numbers
such as :meth:`to_eng_string` and :meth:`to_sci_string`.

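
The examples above modify the thread's default context. As a hedged sketch of
an alternative, the same precision and trap settings can be tried out on a
separate :class:`Context` instance, leaving the default context untouched. The
variable names here are illustrative only:

```python
import decimal

# A private Context, so the thread's default context is not modified.
ctx = decimal.Context(prec=9, rounding=decimal.ROUND_HALF_EVEN)

one = decimal.Decimal(1)
seven = decimal.Decimal(7)
zero = decimal.Decimal(0)

# The result is rounded according to ctx.prec.
seventh = ctx.divide(one, seven)

# With the trap enabled, dividing by zero raises an exception...
ctx.traps[decimal.DivisionByZero] = True
try:
    ctx.divide(one, zero)
    raised = False
except decimal.DivisionByZero:
    raised = True

# ...and with the trap disabled, a special value is returned instead.
ctx.traps[decimal.DivisionByZero] = False
quotient = ctx.divide(one, zero)
```

Here ``seventh`` matches the ``Decimal("0.142857143")`` result shown earlier
for a precision of 9, and ``quotient`` is the infinity value.
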

For more information, see the documentation for the :mod:`decimal` module, which
includes a quick-start tutorial and a reference.


.. seealso::

   :pep:`327` - Decimal Data Type
      Written by Facundo Batista and implemented by Facundo Batista, Eric Price,
      Raymond Hettinger, Aahz, and Tim Peters.

   http://www.lahey.com/float.htm
      The article uses Fortran code to illustrate many of the problems that
      floating-point inaccuracy can cause.

   http://speleotrove.com/decimal/
      A description of a decimal-based representation. This representation is being
      proposed as a standard, and underlies the new Python decimal type. Much of this
      material was written by Mike Cowlishaw, designer of the Rexx language.

.. ======================================================================


PEP 328: Multi-line Imports
===========================

One language change is a small syntactic tweak aimed at making it easier to
import many names from a module. In a ``from module import names`` statement,
*names* is a sequence of names separated by commas. If the sequence is very
long, you can either write multiple imports from the same module, or you can use
backslashes to escape the line endings like this::

   from SimpleXMLRPCServer import SimpleXMLRPCServer,\
               SimpleXMLRPCRequestHandler,\
               CGIXMLRPCRequestHandler,\
               resolve_dotted_attribute

The syntactic change in Python 2.4 simply allows putting the names within
parentheses. Python ignores newlines within a parenthesized expression, so the
backslashes are no longer needed::

   from SimpleXMLRPCServer import (SimpleXMLRPCServer,
                                   SimpleXMLRPCRequestHandler,
                                   CGIXMLRPCRequestHandler,
                                   resolve_dotted_attribute)

The PEP also proposes that all :keyword:`import` statements be absolute imports,
with a leading ``.`` character to indicate a relative import.
This part of the 723 PEP was not implemented for Python 2.4, but was completed for Python 2.5. 724 725 726 .. seealso:: 727 728 :pep:`328` - Imports: Multi-Line and Absolute/Relative 729 Written by Aahz. Multi-line imports were implemented by Dima Dorfman. 730 731 .. ====================================================================== 732 733 734 PEP 331: Locale-Independent Float/String Conversions 735 ==================================================== 736 737 The :mod:`locale` modules lets Python software select various conversions and 738 display conventions that are localized to a particular country or language. 739 However, the module was careful to not change the numeric locale because various 740 functions in Python's implementation required that the numeric locale remain set 741 to the ``'C'`` locale. Often this was because the code was using the C 742 library's :c:func:`atof` function. 743 744 Not setting the numeric locale caused trouble for extensions that used third-party 745 C libraries, however, because they wouldn't have the correct locale set. 746 The motivating example was GTK+, whose user interface widgets weren't displaying 747 numbers in the current locale. 748 749 The solution described in the PEP is to add three new functions to the Python 750 API that perform ASCII-only conversions, ignoring the locale setting: 751 752 * ``PyOS_ascii_strtod(str, ptr)`` and ``PyOS_ascii_atof(str, ptr)`` 753 both convert a string to a C :c:type:`double`. 754 755 * ``PyOS_ascii_formatd(buffer, buf_len, format, d)`` converts a 756 :c:type:`double` to an ASCII string. 757 758 The code for these functions came from the GLib library 759 (https://developer.gnome.org/glib/stable/), whose developers kindly 760 relicensed the relevant functions and donated them to the Python Software 761 Foundation. The :mod:`locale` module can now change the numeric locale, 762 letting extensions such as GTK+ produce the correct results. 763 764 765 .. 
seealso:: 766 767 :pep:`331` - Locale-Independent Float/String Conversions 768 Written by Christian R. Reis, and implemented by Gustavo Carneiro. 769 770 .. ====================================================================== 771 772 773 Other Language Changes 774 ====================== 775 776 Here are all of the changes that Python 2.4 makes to the core Python language. 777 778 * Decorators for functions and methods were added (:pep:`318`). 779 780 * Built-in :func:`set` and :func:`frozenset` types were added (:pep:`218`). 781 Other new built-ins include the ``reversed(seq)`` function (:pep:`322`). 782 783 * Generator expressions were added (:pep:`289`). 784 785 * Certain numeric expressions no longer return values restricted to 32 or 64 786 bits (:pep:`237`). 787 788 * You can now put parentheses around the list of names in a ``from module import 789 names`` statement (:pep:`328`). 790 791 * The :meth:`dict.update` method now accepts the same argument forms as the 792 :class:`dict` constructor. This includes any mapping, any iterable of key/value 793 pairs, and keyword arguments. (Contributed by Raymond Hettinger.) 794 795 * The string methods :meth:`ljust`, :meth:`rjust`, and :meth:`center` now take 796 an optional argument for specifying a fill character other than a space. 797 (Contributed by Raymond Hettinger.) 798 799 * Strings also gained an :meth:`rsplit` method that works like the :meth:`split` 800 method but splits from the end of the string. (Contributed by Sean 801 Reifschneider.) :: 802 803 >>> 'www.python.org'.split('.', 1) 804 ['www', 'python.org'] 805 'www.python.org'.rsplit('.', 1) 806 ['www.python', 'org'] 807 808 * Three keyword parameters, *cmp*, *key*, and *reverse*, were added to the 809 :meth:`sort` method of lists. These parameters make some common usages of 810 :meth:`sort` simpler. All of these parameters are optional. 

  For the *cmp* parameter, the value should be a comparison function that takes
  two parameters and returns -1, 0, or +1 depending on how the parameters compare.
  This function will then be used to sort the list. Previously this was the only
  parameter that could be provided to :meth:`sort`.

  *key* should be a single-parameter function that takes a list element and
  returns a comparison key for the element. The list is then sorted using the
  comparison keys. The following example sorts a list case-insensitively::

     >>> L = ['A', 'b', 'c', 'D']
     >>> L.sort()                 # Case-sensitive sort
     >>> L
     ['A', 'D', 'b', 'c']
     >>> # Using 'key' parameter to sort list
     >>> L.sort(key=lambda x: x.lower())
     >>> L
     ['A', 'b', 'c', 'D']
     >>> # Old-fashioned way
     >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
     >>> L
     ['A', 'b', 'c', 'D']

  The last example, which uses the *cmp* parameter, is the old way to perform a
  case-insensitive sort. It works but is slower than using a *key* parameter.
  Using *key* calls the :meth:`lower` method once for each element in the list,
  while using *cmp* calls it twice for each comparison, so using *key* saves on
  invocations of the :meth:`lower` method.

  For simple key functions and comparison functions, it is often possible to avoid
  a :keyword:`lambda` expression by using an unbound method instead. For example,
  the above case-insensitive sort is best written as::

     >>> L.sort(key=str.lower)
     >>> L
     ['A', 'b', 'c', 'D']

  Finally, the *reverse* parameter takes a Boolean value. If the value is true,
  the list will be sorted into reverse order. Instead of ``L.sort();
  L.reverse()``, you can now write ``L.sort(reverse=True)``.

  The results of sorting are now guaranteed to be stable. This means that two
  entries with equal keys will be returned in the same order as they were input.
  For example, you can sort a list of people by name, and then sort the list by
  age, resulting in a list sorted by age where people with the same age are in
  name-sorted order.

  (All changes to :meth:`sort` contributed by Raymond Hettinger.)

* There is a new built-in function ``sorted(iterable)`` that works like the
  in-place :meth:`list.sort` method but can be used in expressions. The
  differences are:

  * the input may be any iterable;

  * a newly formed copy is sorted, leaving the original intact; and

  * the expression returns the new sorted copy.

  ::

     >>> L = [9,7,8,3,2,4,1,6,5]
     >>> [10+i for i in sorted(L)]       # usable in a list comprehension
     [11, 12, 13, 14, 15, 16, 17, 18, 19]
     >>> L                               # original is left unchanged
     [9, 7, 8, 3, 2, 4, 1, 6, 5]
     >>> sorted('Monty Python')          # any iterable may be an input
     [' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

     >>> # List the contents of a dict sorted by key values
     >>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
     >>> for k, v in sorted(colormap.iteritems()):
     ...     print k, v
     ...
     black 4
     blue 2
     green 3
     red 1
     yellow 5

  (Contributed by Raymond Hettinger.)

* Integer operations will no longer trigger an :exc:`OverflowWarning`. The
  :exc:`OverflowWarning` warning will disappear in Python 2.5.

* The interpreter gained a new switch, :option:`-m`, that takes a name, searches
  for the corresponding module on ``sys.path``, and runs the module as a script.
  For example, you can now run the Python profiler with ``python -m profile``.
  (Contributed by Nick Coghlan.)

* The ``eval(expr, globals, locals)`` and ``execfile(filename, globals,
  locals)`` functions and the :keyword:`exec` statement now accept any mapping type
  for the *locals* parameter. Previously this had to be a regular Python
  dictionary. (Contributed by Raymond Hettinger.)
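
As a minimal sketch of the *locals* change just described, the following passes
a custom mapping to :func:`eval`. The ``DefaultingNamespace`` class, and its
rule that unknown names evaluate to 0, are hypothetical and exist only for
illustration; the code is written so that it should also run unchanged on later
Python versions.

```python
class DefaultingNamespace(object):
    """Hypothetical mapping: names that are not known evaluate to 0."""

    def __init__(self, **known):
        self.known = dict(known)
        self.lookups = []          # record every name the expression asks for

    def __getitem__(self, name):
        self.lookups.append(name)
        return self.known.get(name, 0)


namespace = DefaultingNamespace(x=3, y=4)
# globals must still be a real dictionary; only locals may be any mapping
result = eval("x * y + z", {}, namespace)
```

Here ``result`` is computed from the two known names plus the defaulted ``z``,
and ``namespace.lookups`` records the names in the order the expression
requested them.
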

* The :func:`zip` built-in function and :func:`itertools.izip` now return an
  empty list if called with no arguments. Previously they raised a
  :exc:`TypeError` exception. This makes them more suitable for use with variable
  length argument lists::

     >>> def transpose(array):
     ...    return zip(*array)
     ...
     >>> transpose([(1,2,3), (4,5,6)])
     [(1, 4), (2, 5), (3, 6)]
     >>> transpose([])
     []

  (Contributed by Raymond Hettinger.)

* Encountering a failure while importing a module no longer leaves a partially-initialized
  module object in ``sys.modules``. The incomplete module object left
  behind would fool further imports of the same module into succeeding, leading to
  confusing errors. (Fixed by Tim Peters.)

* :const:`None` is now a constant; code that binds a new value to the name
  ``None`` is now a syntax error. (Contributed by Raymond Hettinger.)

.. ======================================================================


Optimizations
-------------

* The inner loops for list and tuple slicing were optimized and now run about
  one-third faster. The inner loops for dictionaries were also optimized,
  resulting in performance boosts for :meth:`keys`, :meth:`values`, :meth:`items`,
  :meth:`iterkeys`, :meth:`itervalues`, and :meth:`iteritems`. (Contributed by
  Raymond Hettinger.)

* The machinery for growing and shrinking lists was optimized for speed and for
  space efficiency. Appending and popping from lists now runs faster due to more
  efficient code paths and less frequent use of the underlying system
  :c:func:`realloc`. List comprehensions also benefit. :meth:`list.extend` was
  also optimized and no longer converts its argument into a temporary list before
  extending the base list. (Contributed by Raymond Hettinger.)

* :func:`list`, :func:`tuple`, :func:`map`, :func:`filter`, and :func:`zip` now
  run several times faster with non-sequence arguments that supply a
  :meth:`__len__` method. (Contributed by Raymond Hettinger.)

* The methods :meth:`list.__getitem__`, :meth:`dict.__getitem__`, and
  :meth:`dict.__contains__` are now implemented as :class:`method_descriptor`
  objects rather than :class:`wrapper_descriptor` objects. This form of access
  doubles their performance and makes them more suitable for use as arguments to
  functionals: ``map(mydict.__getitem__, keylist)``. (Contributed by Raymond
  Hettinger.)

* Added a new opcode, ``LIST_APPEND``, that simplifies the generated bytecode
  for list comprehensions and speeds them up by about a third. (Contributed by
  Raymond Hettinger.)

* The peephole bytecode optimizer has been improved to produce shorter, faster
  bytecode; remarkably, the resulting bytecode is more readable. (Enhanced by
  Raymond Hettinger.)

* String concatenations in statements of the form ``s = s + "abc"`` and ``s +=
  "abc"`` are now performed more efficiently in certain circumstances. This
  optimization won't be present in other Python implementations such as Jython, so
  you shouldn't rely on it; using the :meth:`join` method of strings is still
  recommended when you want to efficiently glue a large number of strings
  together. (Contributed by Armin Rigo.)

The net result of the 2.4 optimizations is that Python 2.4 runs the pystone
benchmark around 5% faster than Python 2.3 and 35% faster than Python 2.2.
(pystone is not a particularly good benchmark, but it's the most commonly used
measurement of Python's performance. Your own applications may show greater or
smaller benefits from Python 2.4.)

.. pystone is almost useless for comparing different versions of Python;
   instead, it excels at predicting relative Python performance on different
   machines. So, this section would be more informative if it used other tools
   such as pybench and parrotbench. For a more application oriented benchmark,
   try comparing the timings of test_decimal.py under 2.3 and 2.4.

.. ======================================================================


New, Improved, and Deprecated Modules
=====================================

As usual, Python's standard library received a number of enhancements and bug
fixes. Here's a partial list of the most notable changes, sorted alphabetically
by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the CVS logs for all the details.

* The :mod:`asyncore` module's :func:`loop` function now has a *count* parameter
  that lets you perform a limited number of passes through the polling loop. The
  default is still to loop forever.

* The :mod:`base64` module now has more complete RFC 3548 support for Base64,
  Base32, and Base16 encoding and decoding, including optional case folding and
  optional alternative alphabets. (Contributed by Barry Warsaw.)

* The :mod:`bisect` module now has an underlying C implementation for improved
  performance. (Contributed by Dmitry Vasiliev.)

* The CJKCodecs collection of East Asian codecs, maintained by Hye-Shik Chang,
  was integrated into 2.4.
  The new encodings are:

  * Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz

  * Chinese (ROC): big5, cp950

  * Japanese: cp932, euc-jis-2004, euc-jp, euc-jisx0213, iso-2022-jp,
    iso-2022-jp-1, iso-2022-jp-2, iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004,
    shift-jis, shift-jisx0213, shift-jis-2004

  * Korean: cp949, euc-kr, johab, iso-2022-kr

* Some other new encodings were added: HP Roman8, ISO_8859-11, ISO_8859-16,
  PTCP-154, and TIS-620.

* The UTF-8 and UTF-16 codecs now cope better with receiving partial input.
  Previously the :class:`StreamReader` class would try to read more data, making
  it impossible to resume decoding from the stream. The :meth:`read` method will
  now return as much data as it can and future calls will resume decoding where
  previous ones left off. (Implemented by Walter Dörwald.)

* There is a new :mod:`collections` module for various specialized collection
  datatypes. Currently it contains just one type, :class:`deque`, a double-ended
  queue that supports efficiently adding and removing elements from either
  end::

     >>> from collections import deque
     >>> d = deque('ghi')        # make a new deque with three items
     >>> d.append('j')           # add a new entry to the right side
     >>> d.appendleft('f')       # add a new entry to the left side
     >>> d                       # show the representation of the deque
     deque(['f', 'g', 'h', 'i', 'j'])
     >>> d.pop()                 # return and remove the rightmost item
     'j'
     >>> d.popleft()             # return and remove the leftmost item
     'f'
     >>> list(d)                 # list the contents of the deque
     ['g', 'h', 'i']
     >>> 'h' in d                # search the deque
     True

  Several modules, such as the :mod:`Queue` and :mod:`threading` modules, now take
  advantage of :class:`collections.deque` for improved performance. (Contributed
  by Raymond Hettinger.)
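
A queue that is consumed from the left and extended on the right is a natural
fit for :class:`deque`. The breadth-first traversal below is only a sketch: the
``graph`` dictionary and the ``breadth_first`` helper are made up for
illustration and are not part of the :mod:`collections` module.

```python
from collections import deque


def breadth_first(graph, start):
    """Return the nodes reachable from *start* in breadth-first order."""
    seen = set([start])
    order = []
    pending = deque([start])
    while pending:
        node = pending.popleft()        # cheap removal from the left end
        order.append(node)
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                pending.append(neighbour)   # new work goes on the right end
    return order


# Hypothetical adjacency lists for a four-node graph
graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
visit_order = breadth_first(graph, 'a')
```

A plain list could hold the pending nodes as well, but removing from its front
shifts every remaining element, which is the cost :class:`deque` avoids.
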

* The :mod:`ConfigParser` classes have been enhanced slightly. The :meth:`read`
  method now returns a list of the files that were successfully parsed, and the
  :meth:`set` method raises :exc:`TypeError` if passed a *value* argument that
  isn't a string. (Contributed by John Belmonte and David Goodger.)

* The :mod:`curses` module now supports the ncurses extension
  :func:`use_default_colors`. On platforms where the terminal supports
  transparency, this makes it possible to use a transparent background.
  (Contributed by Jörg Lehmann.)

* The :mod:`difflib` module now includes an :class:`HtmlDiff` class that creates
  an HTML table showing a side by side comparison of two versions of a text.
  (Contributed by Dan Gass.)

* The :mod:`email` package was updated to version 3.0, which dropped various
  deprecated APIs and removed support for Python versions earlier than 2.3. The
  3.0 version of the package uses a new incremental parser for MIME messages,
  available in the :mod:`email.FeedParser` module. The new parser doesn't require
  reading the entire message into memory, and doesn't raise exceptions if a
  message is malformed; instead it records any problems in the :attr:`defect`
  attribute of the message. (Developed by Anthony Baxter, Barry Warsaw, Thomas
  Wouters, and others.)

* The :mod:`heapq` module has been converted to C. The resulting tenfold
  improvement in speed makes the module suitable for handling high volumes of
  data. In addition, the module has two new functions :func:`nlargest` and
  :func:`nsmallest` that use heaps to find the N largest or smallest values in a
  dataset without the expense of a full sort. (Contributed by Raymond Hettinger.)

* The :mod:`httplib` module now contains constants for HTTP status codes defined
  in various HTTP-related RFC documents.
  Constants have names such as
  :const:`OK`, :const:`CREATED`, :const:`CONTINUE`, and
  :const:`MOVED_PERMANENTLY`; use pydoc to get a full list. (Contributed by
  Andrew Eland.)

* The :mod:`imaplib` module now supports IMAP's THREAD command (contributed by
  Yves Dionne) and new :meth:`deleteacl` and :meth:`myrights` methods (contributed
  by Arnaud Mazin).

* The :mod:`itertools` module gained a ``groupby(iterable[, *func*])``
  function. *iterable* is something that can be iterated over to return a stream
  of elements, and the optional *func* parameter is a function that takes an
  element and returns a key value; if omitted, the key is simply the element
  itself. :func:`groupby` then groups the elements into subsequences which have
  matching values of the key, and returns a series of 2-tuples containing the key
  value and an iterator over the subsequence.

  Here's an example to make this clearer. The *key* function simply returns
  whether a number is even or odd, so the result of :func:`groupby` is to return
  consecutive runs of odd or even numbers. ::

     >>> import itertools
     >>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
     >>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
     ...    print key_val, list(it)
     ...
     0 [2, 4, 6]
     1 [7]
     0 [8]
     1 [9, 11]
     0 [12, 14]
     >>>

  :func:`groupby` is typically used with sorted input. The logic for
  :func:`groupby` is similar to the Unix ``uniq`` filter which makes it handy for
  eliminating, counting, or identifying duplicate elements::

     >>> word = 'abracadabra'
     >>> letters = sorted(word)   # Turn string into a sorted list of letters
     >>> letters
     ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
     >>> for k, g in itertools.groupby(letters):
     ...    print k, list(g)
     ...
     a ['a', 'a', 'a', 'a', 'a']
     b ['b', 'b']
     c ['c']
     d ['d']
     r ['r', 'r']
     >>> # List unique letters
     >>> [k for k, g in itertools.groupby(letters)]
     ['a', 'b', 'c', 'd', 'r']
     >>> # Count letter occurrences
     >>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
     [('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]

  (Contributed by Hye-Shik Chang.)

* :mod:`itertools` also gained a function named ``tee(iterator, N)`` that
  returns *N* independent iterators that replicate *iterator*. If *N* is omitted,
  the default is 2. ::

     >>> L = [1,2,3]
     >>> i1, i2 = itertools.tee(L)
     >>> i1,i2
     (<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
     >>> list(i1)               # Run the first iterator to exhaustion
     [1, 2, 3]
     >>> list(i2)               # Run the second iterator to exhaustion
     [1, 2, 3]

  Note that :func:`tee` has to keep copies of the values returned by the
  iterator; in the worst case, it may need to keep all of them. This should
  therefore be used carefully if the leading iterator can run far ahead of the
  trailing iterator in a long stream of inputs. If the separation is large, then
  you might as well use :func:`list` instead. When the iterators track closely
  with one another, :func:`tee` is ideal. Possible applications include
  bookmarking, windowing, or lookahead iterators. (Contributed by Raymond
  Hettinger.)

* A number of functions were added to the :mod:`locale` module, such as
  :func:`bind_textdomain_codeset` to specify a particular encoding and a family of
  :func:`l\*gettext` functions that return messages in the chosen encoding.
  (Contributed by Gustavo Niemeyer.)

* Some keyword arguments were added to the :mod:`logging` package's
  :func:`basicConfig` function to simplify log configuration.
  The default
  behavior is to log messages to standard error, but various keyword arguments can
  be specified to log to a particular file, change the logging format, or set the
  logging level. For example::

     import logging
     logging.basicConfig(filename='/var/log/application.log',
         level=0,  # Log all messages
         format='%(levelname)s:%(process)d:%(thread)d:%(message)s')

  Other additions to the :mod:`logging` package include a ``log(level, msg)``
  convenience method, as well as a :class:`TimedRotatingFileHandler` class that
  rotates its log files at a timed interval. The module already had
  :class:`RotatingFileHandler`, which rotated logs once the file exceeded a
  certain size. Both classes derive from a new :class:`BaseRotatingHandler` class
  that can be used to implement other rotating handlers.

  (Changes implemented by Vinay Sajip.)

* The :mod:`marshal` module now shares interned strings on unpacking a data
  structure. This may shrink the size of certain pickle strings, but the primary
  effect is to make :file:`.pyc` files significantly smaller. (Contributed by
  Martin von Löwis.)

* The :mod:`nntplib` module's :class:`NNTP` class gained :meth:`description` and
  :meth:`descriptions` methods to retrieve newsgroup descriptions for a single
  group or for a range of groups. (Contributed by Jürgen A. Erhard.)

* Two new functions were added to the :mod:`operator` module,
  ``attrgetter(attr)`` and ``itemgetter(index)``. Both functions return
  callables that take a single argument and return the corresponding attribute or
  item; these callables make excellent data extractors when used with :func:`map`
  or :func:`sorted`.
  For example::

     >>> import operator
     >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
     >>> map(operator.itemgetter(0), L)
     ['c', 'd', 'a', 'b']
     >>> map(operator.itemgetter(1), L)
     [2, 1, 4, 3]
     >>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item
     [('d', 1), ('c', 2), ('b', 3), ('a', 4)]

  (Contributed by Raymond Hettinger.)

* The :mod:`optparse` module was updated in various ways. The module now passes
  its messages through :func:`gettext.gettext`, making it possible to
  internationalize Optik's help and error messages. Help messages for options can
  now include the string ``'%default'``, which will be replaced by the option's
  default value. (Contributed by Greg Ward.)

* The long-term plan is to deprecate the :mod:`rfc822` module in some future
  Python release in favor of the :mod:`email` package. To this end, the
  :func:`email.Utils.formatdate` function has been changed to make it usable as a
  replacement for :func:`rfc822.formatdate`. You may want to write new e-mail
  processing code with this in mind. (Change implemented by Anthony Baxter.)

* A new ``urandom(n)`` function was added to the :mod:`os` module, returning
  a string containing *n* bytes of random data. This function provides access to
  platform-specific sources of randomness such as :file:`/dev/urandom` on Linux or
  the Windows CryptoAPI. (Contributed by Trevor Perrin.)

* Another new function: ``os.path.lexists(path)`` returns true if the file
  specified by *path* exists, whether or not it's a symbolic link. This differs
  from the existing ``os.path.exists(path)`` function, which returns false if
  *path* is a symlink that points to a destination that doesn't exist.
  (Contributed by Beni Cherniavsky.)

* A new :func:`getsid` function was added to the :mod:`posix` module that
  underlies the :mod:`os` module. (Contributed by J. Raynor.)
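
The two :mod:`os` additions described a little earlier, ``urandom(n)`` and
``os.path.lexists(path)``, can be tried out with a short sketch. It assumes a
platform that supports symbolic links; the ``dangling`` and ``missing`` file
names are arbitrary choices made for this illustration.

```python
import binascii
import os
import tempfile

# os.urandom(n) returns n bytes from the platform's randomness source.
token = os.urandom(16)
token_hex = binascii.hexlify(token)    # printable form, two hex digits per byte

# Create a symbolic link whose destination does not exist.
workdir = tempfile.mkdtemp()
dangling = os.path.join(workdir, 'dangling')
os.symlink(os.path.join(workdir, 'missing'), dangling)

link_exists = os.path.lexists(dangling)     # the link itself is present
target_exists = os.path.exists(dangling)    # its destination is not

# Clean up the temporary files.
os.remove(dangling)
os.rmdir(workdir)
```

For the dangling link, ``lexists()`` reports true while ``exists()`` reports
false, which is exactly the difference between the two functions.
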

* The :mod:`poplib` module now supports POP over SSL. (Contributed by Hector
  Urtubia.)

* The :mod:`profile` module can now profile C extension functions. (Contributed
  by Nick Bastin.)

* The :mod:`random` module has a new method called ``getrandbits(N)`` that
  returns a long integer *N* bits in length. The existing :meth:`randrange`
  method now uses :meth:`getrandbits` where appropriate, making generation of
  arbitrarily large random numbers more efficient. (Contributed by Raymond
  Hettinger.)

* The regular expression language accepted by the :mod:`re` module was extended
  with simple conditional expressions, written as ``(?(group)A|B)``. *group* is
  either a numeric group ID or a group name defined with ``(?P<group>...)``
  earlier in the expression. If the specified group matched, the regular
  expression pattern *A* will be tested against the string; if the group didn't
  match, the pattern *B* will be used instead. (Contributed by Gustavo Niemeyer.)

* The :mod:`re` module is also no longer recursive, thanks to a massive amount
  of work by Gustavo Niemeyer. In a recursive regular expression engine, certain
  patterns result in a large amount of C stack space being consumed, and it was
  possible to overflow the stack. For example, if you matched a 30000-byte string
  of ``a`` characters against the expression ``(a|b)+``, one stack frame was
  consumed per character. Python 2.3 tried to check for stack overflow and raise
  a :exc:`RuntimeError` exception, but certain patterns could sidestep the
  checking and if you were unlucky Python could segfault. Python 2.4's regular
  expression engine can match this pattern without problems.

* The :mod:`signal` module now performs tighter error-checking on the parameters
  to the :func:`signal.signal` function.
  For example, you can't set a handler on
  the :const:`SIGKILL` signal; previous versions of Python would quietly accept
  this, but 2.4 will raise a :exc:`RuntimeError` exception.

* Two new functions were added to the :mod:`socket` module. :func:`socketpair`
  returns a pair of connected sockets and ``getservbyport(port)`` looks up the
  service name for a given port number. (Contributed by Dave Cole and Barry
  Warsaw.)

* The :func:`sys.exitfunc` function has been deprecated. Code should be using
  the existing :mod:`atexit` module, which correctly handles calling multiple exit
  functions. Eventually :func:`sys.exitfunc` will become a purely internal
  interface, accessed only by :mod:`atexit`.

* The :mod:`tarfile` module now generates GNU-format tar files by default.
  (Contributed by Lars Gustaebel.)

* The :mod:`threading` module now has an elegantly simple way to support
  thread-local data. The module contains a :class:`local` class whose attribute
  values are local to different threads. ::

     import threading

     data = threading.local()
     data.number = 42
     data.url = ('www.python.org', 80)

  Other threads can assign and retrieve their own values for the :attr:`number`
  and :attr:`url` attributes. You can subclass :class:`local` to initialize
  attributes or to add methods. (Contributed by Jim Fulton.)

* The :mod:`timeit` module now automatically disables periodic garbage
  collection during the timing loop. This change makes consecutive timings more
  comparable. (Contributed by Raymond Hettinger.)

* The :mod:`weakref` module now supports a wider variety of objects including
  Python functions, class instances, sets, frozensets, deques, arrays, files,
  sockets, and regular expression pattern objects. (Contributed by Raymond
  Hettinger.)
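
Returning to the thread-local storage provided by :class:`threading.local`, the
sketch below shows that an attribute assigned in one thread is neither visible
to nor overwritten by another. The ``worker`` function and the
``seen_by_worker`` list are hypothetical helpers used only to collect the
results.

```python
import threading

data = threading.local()
data.number = 42                 # visible only in the main thread

seen_by_worker = []


def worker():
    # The attribute set by the main thread does not exist in this thread.
    seen_by_worker.append(hasattr(data, 'number'))
    data.number = 7              # a separate, thread-private value
    seen_by_worker.append(data.number)


t = threading.Thread(target=worker)
t.start()
t.join()

main_value = data.number         # unaffected by the worker's assignment
```

The worker first finds no ``number`` attribute at all and then reads back its
own value, while the main thread's value is left untouched.
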

* The :mod:`xmlrpclib` module now supports a multi-call extension for
  transmitting multiple XML-RPC calls in a single HTTP operation. (Contributed by
  Brian Quinlan.)

* The :mod:`mpz`, :mod:`rotor`, and :mod:`xreadlines` modules have been
  removed.

.. ======================================================================
.. whole new modules get described in subsections here
.. =====================


cookielib
---------

The :mod:`cookielib` library supports client-side handling for HTTP cookies,
mirroring the :mod:`Cookie` module's server-side cookie support. Cookies are
stored in cookie jars; the library transparently stores cookies offered by the
web server in the cookie jar, and fetches the cookie from the jar when
connecting to the server. As in web browsers, policy objects control whether
cookies are accepted or not.

In order to store cookies across sessions, two implementations of cookie jars
are provided: one that stores cookies in the Netscape format so applications can
use the Mozilla or Lynx cookie files, and one that stores cookies in the same
format as the Perl libwww library.

:mod:`urllib2` has been changed to interact with :mod:`cookielib`:
:class:`HTTPCookieProcessor` manages a cookie jar that is used when accessing
URLs.

This module was contributed by John J. Lee.

.. ==================


doctest
-------

The :mod:`doctest` module underwent considerable refactoring thanks to Edward
Loper and Tim Peters.
Testing can still be as simple as running
:func:`doctest.testmod`, but the refactorings allow customizing the module's
operation in various ways.

The new :class:`DocTestFinder` class extracts the tests from a given object's
docstrings::

   import doctest

   def f (x, y):
       """>>> f(2,2)
   4
   >>> f(3,2)
   6
       """
       return x*y

   finder = doctest.DocTestFinder()

   # Get list of DocTest instances
   tests = finder.find(f)

The new :class:`DocTestRunner` class then runs individual tests and can produce
a summary of the results::

   runner = doctest.DocTestRunner()
   for t in tests:
       tried, failed = runner.run(t)

   runner.summarize(verbose=1)

The above example produces the following output::

   1 items passed all tests:
      2 tests in f
   2 tests in 1 items.
   2 passed and 0 failed.
   Test passed.

:class:`DocTestRunner` uses an instance of the :class:`OutputChecker` class to
compare the expected output with the actual output. This class takes a number
of different flags that customize its behavior; ambitious users can also write
a completely new subclass of :class:`OutputChecker`.

The default output checker provides a number of handy features.
For example,
with the :const:`doctest.ELLIPSIS` option flag, an ellipsis (``...``) in the
expected output matches any substring, making it easier to accommodate outputs
that vary in minor ways::

   def o (n):
       """>>> o(1)
   <__main__.C instance at 0x...>
   >>>
   """

Another special string, ``<BLANKLINE>``, matches a blank line::

   def p (n):
       """>>> p(1)
   <BLANKLINE>
   >>>
   """

Another new capability is producing a diff-style display of the output by
specifying the :const:`doctest.REPORT_UDIFF` (unified diffs),
:const:`doctest.REPORT_CDIFF` (context diffs), or :const:`doctest.REPORT_NDIFF`
(delta-style) option flags. For example::

   def g (n):
       """>>> g(4)
   here
   is
   a
   lengthy
   >>>"""
       L = 'here is a rather lengthy list of words'.split()
       for word in L[:n]:
           print word

Running the above function's tests with :const:`doctest.REPORT_UDIFF` specified,
you get the following output:

.. code-block:: none

   **********************************************************************
   File "t.py", line 15, in g
   Failed example:
       g(4)
   Differences (unified diff with -expected +actual):
       @@ -2,3 +2,3 @@
        is
        a
       -lengthy
       +rather
   **********************************************************************

.. ======================================================================


Build and C API Changes
=======================

Some of the changes to Python's build process and to the C API are:

* Three new convenience macros were added for common return values from
  extension functions: :c:macro:`Py_RETURN_NONE`, :c:macro:`Py_RETURN_TRUE`, and
  :c:macro:`Py_RETURN_FALSE`. (Contributed by Brett Cannon.)

* Another new macro, :c:macro:`Py_CLEAR(obj)`, decreases the reference count of
  *obj* and sets *obj* to the null pointer. (Contributed by Jim Fulton.)

* A new function, ``PyTuple_Pack(N, obj1, obj2, ..., objN)``, constructs
  tuples from a variable length argument list of Python objects. (Contributed by
  Raymond Hettinger.)

* A new function, ``PyDict_Contains(d, k)``, implements fast dictionary
  lookups without masking exceptions raised during the look-up process.
  (Contributed by Raymond Hettinger.)

* The :c:macro:`Py_IS_NAN(X)` macro returns 1 if its float or double argument
  *X* is a NaN. (Contributed by Tim Peters.)

* C code can avoid unnecessary locking by using the new
  :c:func:`PyEval_ThreadsInitialized` function to tell if any thread operations
  have been performed. If this function returns false, no lock operations are
  needed. (Contributed by Nick Coghlan.)

* A new function, :c:func:`PyArg_VaParseTupleAndKeywords`, is the same as
  :c:func:`PyArg_ParseTupleAndKeywords` but takes a :c:type:`va_list` instead of a
  number of arguments. (Contributed by Greg Chapman.)

* A new method flag, :const:`METH_COEXISTS`, allows a function defined in slots
  to co-exist with a :c:type:`PyCFunction` having the same name. This can halve
  the access time for a method such as :meth:`set.__contains__`. (Contributed by
  Raymond Hettinger.)

* Python can now be built with additional profiling for the interpreter itself,
  intended as an aid to people developing the Python core. Providing
  :option:`!--enable-profiling` to the :program:`configure` script will let you
  profile the interpreter with :program:`gprof`, and providing the
  :option:`!--with-tsc` switch enables profiling using the Pentium's
  Time-Stamp-Counter register.
  Note that the :option:`!--with-tsc` switch is slightly
  misnamed, because the profiling feature also works on the PowerPC platform,
  though that processor architecture doesn't call that register "the TSC
  register". (Contributed by Jeremy Hylton.)

* The :c:type:`tracebackobject` type has been renamed to
  :c:type:`PyTracebackObject`.

.. ======================================================================


Port-Specific Changes
---------------------

* The Windows port now builds under MSVC++ 7.1 as well as version 6.
  (Contributed by Martin von Löwis.)

.. ======================================================================


Porting to Python 2.4
=====================

This section lists previously described changes that may require changes to your
code:

* Left shifts and hexadecimal/octal constants that are too large no longer
  trigger a :exc:`FutureWarning` and return a value limited to 32 or 64 bits;
  instead they return a long integer.

* Integer operations will no longer trigger an :exc:`OverflowWarning`. The
  :exc:`OverflowWarning` warning will disappear in Python 2.5.

* The :func:`zip` built-in function and :func:`itertools.izip` now return an
  empty list instead of raising a :exc:`TypeError` exception if called with no
  arguments.

* You can no longer compare the :class:`date` and :class:`~datetime.datetime` instances
  provided by the :mod:`datetime` module. Two instances of different classes
  will now always be unequal, and relative comparisons (``<``, ``>``) will raise
  a :exc:`TypeError`.

* :func:`dircache.listdir` now passes exceptions to the caller instead of
  returning empty lists.

* :func:`LexicalHandler.startDTD` used to receive the public and system IDs in
  the wrong order.
  This has been corrected; applications relying on the wrong
  order need to be fixed.

* :func:`fcntl.ioctl` now warns if the *mutate* argument is omitted and
  relevant.

* The :mod:`tarfile` module now generates GNU-format tar files by default.

* Encountering a failure while importing a module no longer leaves a
  partially-initialized module object in ``sys.modules``.

* :const:`None` is now a constant; code that binds a new value to the name
  ``None`` is now a syntax error.

* The :func:`signal.signal` function now raises a :exc:`RuntimeError` exception
  for certain illegal values; previously these errors would pass silently. For
  example, you can no longer set a handler on the :const:`SIGKILL` signal.

.. ======================================================================


.. _24acks:

Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Koray Can,
Hye-Shik Chang, Michael Dyck, Raymond Hettinger, Brian Hurt, Hamish Lawson,
Fredrik Lundh, Sean Reifschneider, Sadruddin Rejeb.