:tocdepth: 2

===============
Programming FAQ
===============

.. only:: html

   .. contents::

General Questions
=================

Is there a source code level debugger with breakpoints, single-stepping, etc.?
------------------------------------------------------------------------------

Yes.

The pdb module is a simple but adequate console-mode debugger for Python. It is
part of the standard Python library, and is :mod:`documented in the Library
Reference Manual <pdb>`. You can also write your own debugger by using the code
for pdb as an example.

The IDLE interactive development environment, which is part of the standard
Python distribution (normally available as Tools/scripts/idle), includes a
graphical debugger.

PythonWin is a Python IDE that includes a GUI debugger based on pdb. The
Pythonwin debugger colors breakpoints and has quite a few cool features such as
debugging non-Pythonwin programs. Pythonwin is available as part of the `Python
for Windows Extensions <https://sourceforge.net/projects/pywin32/>`__ project and
as a part of the ActivePython distribution (see
https://www.activestate.com/activepython\ ).

`Boa Constructor <http://boa-constructor.sourceforge.net/>`_ is an IDE and GUI
builder that uses wxWidgets. It offers visual frame creation and manipulation,
an object inspector, many views on the source like object browsers, inheritance
hierarchies, doc string generated html documentation, an advanced debugger,
integrated help, and Zope support.

`Eric <http://eric-ide.python-projects.org/>`_ is an IDE built on PyQt
and the Scintilla editing component.

Pydb is a version of the standard Python debugger pdb, modified for use with DDD
(Data Display Debugger), a popular graphical debugger front end. Pydb can be
found at http://bashdb.sourceforge.net/pydb/ and DDD can be found at
https://www.gnu.org/software/ddd.

There are a number of commercial Python IDEs that include graphical debuggers.
They include:

* Wing IDE (https://wingware.com/)
* Komodo IDE (https://komodoide.com/)
* PyCharm (https://www.jetbrains.com/pycharm/)


Is there a tool to help find bugs or perform static analysis?
-------------------------------------------------------------

Yes.

PyChecker is a static analysis tool that finds bugs in Python source code and
warns about code complexity and style. You can get PyChecker from
http://pychecker.sourceforge.net/.

`Pylint <https://www.pylint.org/>`_ is another tool that checks
if a module satisfies a coding standard, and also makes it possible to write
plug-ins to add a custom feature. In addition to the bug checking that
PyChecker performs, Pylint offers some additional features such as checking line
length, whether variable names are well-formed according to your coding
standard, whether declared interfaces are fully implemented, and more.
https://docs.pylint.org/ provides a full list of Pylint's features.


How can I create a stand-alone binary from a Python script?
-----------------------------------------------------------

You don't need the ability to compile Python to C code if all you want is a
stand-alone program that users can download and run without having to install
the Python distribution first. There are a number of tools that determine the
set of modules required by a program and bind these modules together with a
Python binary to produce a single executable.

One option is to use the freeze tool, which is included in the Python source
tree as ``Tools/freeze``. It converts Python byte code to C arrays; with a C
compiler you can embed all your modules into a new program, which is then
linked with the standard Python modules.

It works by scanning your source recursively for import statements (in both
forms) and looking for the modules in the standard Python path as well as in the
source directory (for built-in modules). It then turns the bytecode for modules
written in Python into C code (array initializers that can be turned into code
objects using the marshal module) and creates a custom-made config file that
only contains those built-in modules which are actually used in the program. It
then compiles the generated C code and links it with the rest of the Python
interpreter to form a self-contained binary which acts exactly like your script.

Obviously, freeze requires a C compiler. There are several other utilities
which don't. One is Thomas Heller's py2exe (Windows only) at

    http://www.py2exe.org/

Another tool is Anthony Tuininga's `cx_Freeze <http://cx-freeze.sourceforge.net/>`_.


Are there coding standards or a style guide for Python programs?
----------------------------------------------------------------

Yes. The coding style required for standard library modules is documented as
:pep:`8`.


My program is too slow. How do I speed it up?
---------------------------------------------

That's a tough one, in general. There are many tricks to speed up Python code;
consider rewriting parts in C as a last resort.

In some cases it's possible to automatically translate Python to C or x86
assembly language, meaning that you don't have to modify your code to gain
increased speed.

.. XXX seems to have overlap with other questions!

`Pyrex <http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_ can compile a
slightly modified version of Python code into a C extension, and can be used on
many different platforms.

`Psyco <http://psyco.sourceforge.net>`_ is a just-in-time compiler that
translates Python code into x86 assembly language.
If you can use it, Psyco can
provide dramatic speedups for critical functions.

The rest of this answer will discuss various tricks for squeezing a bit more
speed out of Python code. *Never* apply any optimization tricks unless you know
you need them, after profiling has indicated that a particular function is the
heavily executed hot spot in the code. Optimizations almost always make the
code less clear, and you shouldn't pay the costs of reduced clarity (increased
development time, greater likelihood of bugs) unless the resulting performance
benefit is worth it.

There is a page on the wiki devoted to `performance tips
<https://wiki.python.org/moin/PythonSpeed/PerformanceTips>`_.

Guido van Rossum has written up an anecdote related to optimization at
https://www.python.org/doc/essays/list2str.

One thing to notice is that function and (especially) method calls are rather
expensive; if you have designed a purely OO interface with lots of tiny
functions that don't do much more than get or set an instance variable or call
another method, you might consider using a more direct way such as directly
accessing instance variables. Also see the standard module :mod:`profile` which
makes it possible to find out where your program is spending most of its time
(if you have some patience -- the profiling itself can slow your program down by
an order of magnitude).

Remember that many standard optimization heuristics you may know from other
programming experience may well apply to Python. For example it may be faster
to send output to output devices using larger writes rather than smaller ones in
order to reduce the overhead of kernel system calls. Thus CGI scripts that
write all output in "one shot" may be faster than those that write lots of small
pieces of output.

Also, be sure to use Python's core features where appropriate.
For example,
slicing allows programs to chop up lists and other sequence objects in a single
tick of the interpreter's mainloop using highly optimized C implementations.
Thus to get the same effect as::

   L2 = []
   for i in range(3):
       L2.append(L1[i])

it is much shorter and far faster to use ::

   L2 = list(L1[:3])  # "list" is redundant if L1 is a list.

Note that the functionally-oriented built-in functions such as :func:`map`,
:func:`zip`, and friends can be a convenient accelerator for loops that
perform a single task. For example to pair the elements of two lists
together::

   >>> zip([1, 2, 3], [4, 5, 6])
   [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines::

   >>> import math
   >>> map(math.sin, (1, 2, 3, 4))
   [0.841470984808, 0.909297426826, 0.14112000806, -0.756802495308]

The operation completes very quickly in such cases.

Other examples include the ``join()`` and ``split()`` :ref:`methods
of string objects <string-methods>`.
For example if s1..s7 are large (10K+) strings then
``"".join([s1,s2,s3,s4,s5,s6,s7])`` may be far faster than the more obvious
``s1+s2+s3+s4+s5+s6+s7``, since the "summation" will compute many
subexpressions, whereas ``join()`` does all the copying in one pass. For
manipulating strings, use the ``replace()`` and the ``format()`` :ref:`methods
on string objects <string-methods>`. Use regular expressions only when you're
not dealing with constant string patterns. You may still use :ref:`the old %
operations <string-formatting>` ``string % tuple`` and ``string % dictionary``.

Be sure to use the :meth:`list.sort` built-in method to do sorting, and see the
`sorting mini-HOWTO <https://wiki.python.org/moin/HowTo/Sorting>`_ for examples
of moderately advanced usage. :meth:`list.sort` beats other techniques for
sorting in all but the most extreme circumstances.
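
As a small illustration of the sorting advice above, the following sketch sorts
a list of records with :meth:`list.sort` and a key function. The ``records``
data and the variable names are invented for this example; it is only meant to
show the general shape of such code, not a recommended design.

```python
# Hypothetical sample data: (name, count) pairs invented for this sketch.
records = [("pear", 3), ("apple", 7), ("fig", 3)]

# Sort a copy in place by count. list.sort() is stable, so records with
# equal counts keep their original relative order.
by_count = list(records)
by_count.sort(key=lambda item: item[1])

# Sort another copy by name, in ascending order.
by_name = list(records)
by_name.sort(key=lambda item: item[0])
```

Sorting copies leaves ``records`` itself untouched. The ``key`` argument
assumes Python 2.4 or later.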

Another common trick is to "push loops into functions or methods." For example
suppose you have a program that runs slowly and you use the profiler to
determine that a Python function ``ff()`` is being called lots of times. If you
notice that ``ff()``::

   def ff(x):
       ... # do something with x computing result...
       return result

tends to be called in loops like::

   list = map(ff, oldlist)

or::

   for x in sequence:
       value = ff(x)
       ... # do something with value...

then you can often eliminate function call overhead by rewriting ``ff()`` to::

   def ffseq(seq):
       resultseq = []
       for x in seq:
           ... # do something with x computing result...
           resultseq.append(result)
       return resultseq

and rewrite the two examples to ``list = ffseq(oldlist)`` and to::

   for value in ffseq(sequence):
       ... # do something with value...

Single calls to ``ff(x)`` translate to ``ffseq([x])[0]`` with little penalty.
Of course this technique is not always appropriate and there are other variants
which you can figure out.

You can gain some performance by explicitly storing the results of a function or
method lookup into a local variable. A loop like::

   for key in token:
       dict[key] = dict.get(key, 0) + 1

resolves ``dict.get`` every iteration. If the method isn't going to change, a
slightly faster implementation is::

   dict_get = dict.get  # look up the method once
   for key in token:
       dict[key] = dict_get(key, 0) + 1

Default arguments can be used to determine values once, when the function is
defined, instead of each time it is called.
This can only be done for functions or objects which will not
be changed during program execution, such as replacing ::

   def degree_sin(deg):
       return math.sin(deg * math.pi / 180.0)

with ::

   def degree_sin(deg, factor=math.pi/180.0, sin=math.sin):
       return sin(deg * factor)

Because this trick uses default arguments for terms which should not be changed,
it should only be used when you are not concerned with presenting a possibly
confusing API to your users.


Core Language
=============

Why am I getting an UnboundLocalError when the variable has a value?
--------------------------------------------------------------------

It can be a surprise to get the UnboundLocalError in previously working
code when it is modified by adding an assignment statement somewhere in
the body of a function.

This code:

   >>> x = 10
   >>> def bar():
   ...     print x
   >>> bar()
   10

works, but this code:

   >>> x = 10
   >>> def foo():
   ...     print x
   ...     x += 1

results in an UnboundLocalError:

   >>> foo()
   Traceback (most recent call last):
     ...
   UnboundLocalError: local variable 'x' referenced before assignment

This is because when you make an assignment to a variable in a scope, that
variable becomes local to that scope and shadows any similarly named variable
in the outer scope. Since the last statement in foo assigns a new value to
``x``, the compiler recognizes it as a local variable. Consequently, when the
earlier ``print x`` attempts to print the uninitialized local variable, an
error results.

In the example above you can access the outer scope variable by declaring it
global:

   >>> x = 10
   >>> def foobar():
   ...     global x
   ...     print x
   ...     x += 1
   >>> foobar()
   10

This explicit declaration is required in order to remind you that (unlike the
superficially analogous situation with class and instance variables) you are
actually modifying the value of the variable in the outer scope:

   >>> print x
   11


What are the rules for local and global variables in Python?
------------------------------------------------------------

In Python, variables that are only referenced inside a function are implicitly
global. If a variable is assigned a value anywhere within the function's body,
it's assumed to be a local unless explicitly declared as global.

Though a bit surprising at first, a moment's consideration explains this. On
one hand, requiring :keyword:`global` for assigned variables provides a bar
against unintended side-effects. On the other hand, if ``global`` was required
for all global references, you'd be using ``global`` all the time. You'd have
to declare as global every reference to a built-in function or to a component of
an imported module. This clutter would defeat the usefulness of the ``global``
declaration for identifying side-effects.


Why do lambdas defined in a loop with different values all return the same result?
----------------------------------------------------------------------------------

Assume you use a for loop to define a few different lambdas (or even plain
functions), e.g.::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda: x**2)

This gives you a list that contains 5 lambdas that calculate ``x**2``. You
might expect that, when called, they would return, respectively, ``0``, ``1``,
``4``, ``9``, and ``16``.
However, when you actually try you will see that
they all return ``16``::

   >>> squares[2]()
   16
   >>> squares[4]()
   16

This happens because ``x`` is not local to the lambdas, but is defined in
the outer scope, and it is accessed when the lambda is called --- not when it
is defined. At the end of the loop, the value of ``x`` is ``4``, so all the
functions now return ``4**2``, i.e. ``16``. You can also verify this by
changing the value of ``x`` and seeing how the results of the lambdas change::

   >>> x = 8
   >>> squares[2]()
   64

In order to avoid this, you need to save the values in variables local to the
lambdas, so that they don't rely on the value of the global ``x``::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda n=x: n**2)

Here, ``n=x`` creates a new variable ``n`` local to the lambda and computed
when the lambda is defined so that it has the same value that ``x`` had at
that point in the loop. This means that the value of ``n`` will be ``0``
in the first lambda, ``1`` in the second, ``2`` in the third, and so on.
Therefore each lambda will now return the correct result::

   >>> squares[2]()
   4
   >>> squares[4]()
   16

Note that this behaviour is not peculiar to lambdas, but applies to regular
functions too.


How do I share global variables across modules?
------------------------------------------------

The canonical way to share information across modules within a single program is
to create a special module (often called config or cfg). Just import the config
module in all modules of your application; the module then becomes available as
a global name. Because there is only one instance of each module, any changes
made to the module object get reflected everywhere.
For example:

config.py::

   x = 0   # Default value of the 'x' configuration setting

mod.py::

   import config
   config.x = 1

main.py::

   import config
   import mod
   print config.x

Note that using a module is also the basis for implementing the Singleton design
pattern, for the same reason.


What are the "best practices" for using import in a module?
-----------------------------------------------------------

In general, don't use ``from modulename import *``. Doing so clutters the
importer's namespace, and makes it much harder for linters to detect undefined
names.

Import modules at the top of a file. Doing so makes it clear what other modules
your code requires and avoids questions of whether the module name is in scope.
Using one import per line makes it easy to add and delete module imports, but
using multiple imports per line uses less screen space.

It's good practice if you import modules in the following order:

1. standard library modules -- e.g. ``sys``, ``os``, ``getopt``, ``re``
2. third-party library modules (anything installed in Python's site-packages
   directory) -- e.g. mx.DateTime, ZODB, PIL.Image, etc.
3. locally-developed modules

Use only absolute or explicit relative package imports. If you're writing code
that's in the ``package.sub.m1`` module and want to import ``package.sub.m2``,
do not just write ``import m2``, even though it's legal. Write ``from
package.sub import m2`` or ``from . import m2`` instead.

It is sometimes necessary to move imports to a function or class to avoid
problems with circular imports. Gordon McMillan says:

   Circular imports are fine where both modules use the "import <module>" form
   of import. They fail when the 2nd module wants to grab a name out of the
   first ("from module import name") and the import is at the top level.
   That's
   because names in the 1st are not yet available, because the first module is
   busy importing the 2nd.

In this case, if the second module is only used in one function, then the import
can easily be moved into that function. By the time the import is called, the
first module will have finished initializing, and the second module can do its
import.

It may also be necessary to move imports out of the top level of code if some of
the modules are platform-specific. In that case, it may not even be possible to
import all of the modules at the top of the file. In this case, importing the
correct modules in the corresponding platform-specific code is a good option.

Only move imports into a local scope, such as inside a function definition, if
it's necessary to solve a problem such as avoiding a circular import, or if you
are trying to reduce the initialization time of a module. This technique is
especially helpful if many of the imports are unnecessary depending on how the
program executes. You may also want to move imports into a function if the
modules are only ever used in that function. Note that loading a module the
first time may be expensive because of the one time initialization of the
module, but loading a module multiple times is virtually free, costing only a
couple of dictionary lookups. Even if the module name has gone out of scope,
the module is probably available in :data:`sys.modules`.


Why are default values shared between objects?
----------------------------------------------

This type of bug commonly bites neophyte programmers. Consider this function::

   def foo(mydict={}):  # Danger: shared reference to one dict for all calls
       ... compute something ...
       mydict[key] = value
       return mydict

The first time you call this function, ``mydict`` contains a single item.
The
second time, ``mydict`` contains two items because when ``foo()`` begins
executing, ``mydict`` starts out with an item already in it.

It is often expected that a function call creates new objects for default
values. This is not what happens. Default values are created exactly once, when
the function is defined. If that object is changed, like the dictionary in this
example, subsequent calls to the function will refer to this changed object.

By definition, immutable objects such as numbers, strings, tuples, and ``None``,
are safe from change. Changes to mutable objects such as dictionaries, lists,
and class instances can lead to confusion.

Because of this feature, it is good programming practice to not use mutable
objects as default values. Instead, use ``None`` as the default value and
inside the function, check if the parameter is ``None`` and create a new
list/dictionary/whatever if it is. For example, don't write::

   def foo(mydict={}):
       ...

but::

   def foo(mydict=None):
       if mydict is None:
           mydict = {}  # create a new dict for local namespace

This feature can be useful. When you have a function that's time-consuming to
compute, a common technique is to cache the parameters and the resulting value
of each call to the function, and return the cached value if the same value is
requested again. This is called "memoizing", and can be implemented like this::

   # Callers will never provide a third parameter for this function.
   def expensive(arg1, arg2, _cache={}):
       if (arg1, arg2) in _cache:
           return _cache[(arg1, arg2)]

       # Calculate the value
       result = ... expensive computation ...
       _cache[(arg1, arg2)] = result           # Store result in the cache
       return result

You could use a global variable containing a dictionary instead of the default
value; it's a matter of taste.


How can I pass optional or keyword parameters from one function to another?
---------------------------------------------------------------------------

Collect the arguments using the ``*`` and ``**`` specifiers in the function's
parameter list; this gives you the positional arguments as a tuple and the
keyword arguments as a dictionary. You can then pass these arguments when
calling another function by using ``*`` and ``**``::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       g(x, *args, **kwargs)

In the unlikely case that you care about Python versions older than 2.0, use
:func:`apply`::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       apply(g, (x,)+args, kwargs)


.. index::
   single: argument; difference from parameter
   single: parameter; difference from argument

.. _faq-argument-vs-parameter:

What is the difference between arguments and parameters?
--------------------------------------------------------

:term:`Parameters <parameter>` are defined by the names that appear in a
function definition, whereas :term:`arguments <argument>` are the values
actually passed to a function when calling it. Parameters define what types of
arguments a function can accept. For example, given the function definition::

   def func(foo, bar=None, **kwargs):
       pass

*foo*, *bar* and *kwargs* are parameters of ``func``. However, when calling
``func``, for example::

   func(42, bar=314, extra=somevar)

the values ``42``, ``314``, and ``somevar`` are arguments.


Why did changing list 'y' also change list 'x'?
------------------------------------------------

If you wrote code like::

   >>> x = []
   >>> y = x
   >>> y.append(10)
   >>> y
   [10]
   >>> x
   [10]

you might be wondering why appending an element to ``y`` changed ``x`` too.

There are two factors that produce this result:

1) Variables are simply names that refer to objects. Doing ``y = x`` doesn't
   create a copy of the list -- it creates a new variable ``y`` that refers to
   the same object ``x`` refers to. This means that there is only one object
   (the list), and both ``x`` and ``y`` refer to it.
2) Lists are :term:`mutable`, which means that you can change their content.

After the call to :meth:`~list.append`, the content of the mutable object has
changed from ``[]`` to ``[10]``. Since both the variables refer to the same
object, using either name accesses the modified value ``[10]``.

If we instead assign an immutable object to ``x``::

   >>> x = 5  # ints are immutable
   >>> y = x
   >>> x = x + 1  # 5 can't be mutated, we are creating a new object here
   >>> x
   6
   >>> y
   5

we can see that in this case ``x`` and ``y`` are not equal anymore. This is
because integers are :term:`immutable`, and when we do ``x = x + 1`` we are not
mutating the int ``5`` by incrementing its value; instead, we are creating a
new object (the int ``6``) and assigning it to ``x`` (that is, changing which
object ``x`` refers to). After this assignment we have two objects (the ints
``6`` and ``5``) and two variables that refer to them (``x`` now refers to
``6`` but ``y`` still refers to ``5``).

Some operations (for example ``y.append(10)`` and ``y.sort()``) mutate the
object, whereas superficially similar operations (for example ``y = y + [10]``
and ``sorted(y)``) create a new object. In general in Python (and in all cases
in the standard library) a method that mutates an object will return ``None``
to help avoid getting the two types of operations confused.
So if you
mistakenly write ``y.sort()`` thinking it will give you a sorted copy of ``y``,
you'll instead end up with ``None``, which will likely cause your program to
generate an easily diagnosed error.

However, there is one class of operations where the same operation sometimes
has different behaviors with different types: the augmented assignment
operators. For example, ``+=`` mutates lists but not tuples or ints (``a_list
+= [1, 2, 3]`` is equivalent to ``a_list.extend([1, 2, 3])`` and mutates
``a_list``, whereas ``some_tuple += (1, 2, 3)`` and ``some_int += 1`` create
new objects).

In other words:

* If we have a mutable object (:class:`list`, :class:`dict`, :class:`set`,
  etc.), we can use some specific operations to mutate it and all the variables
  that refer to it will see the change.
* If we have an immutable object (:class:`str`, :class:`int`, :class:`tuple`,
  etc.), all the variables that refer to it will always see the same value,
  but operations that transform that value into a new value always return a new
  object.

If you want to know if two variables refer to the same object or not, you can
use the :keyword:`is` operator, or the built-in function :func:`id`.


How do I write a function with output parameters (call by reference)?
---------------------------------------------------------------------

Remember that arguments are passed by assignment in Python. Since assignment
just creates references to objects, there's no alias between an argument name in
the caller and callee, and so no call-by-reference per se. You can achieve the
desired effect in a number of ways.

1) By returning a tuple of the results::

      def func2(a, b):
          a = 'new-value'        # a and b are local names
          b = b + 1              # assigned to new objects
          return a, b            # return new values

      x, y = 'old-value', 99
      x, y = func2(x, y)
      print x, y                 # output: new-value 100

   This is almost always the clearest solution.

2) By using global variables. This isn't thread-safe, and is not recommended.

3) By passing a mutable (changeable in-place) object::

      def func1(a):
          a[0] = 'new-value'     # 'a' references a mutable list
          a[1] = a[1] + 1        # changes a shared object

      args = ['old-value', 99]
      func1(args)
      print args[0], args[1]     # output: new-value 100

4) By passing in a dictionary that gets mutated::

      def func3(args):
          args['a'] = 'new-value'     # args is a mutable dictionary
          args['b'] = args['b'] + 1   # change it in-place

      args = {'a': 'old-value', 'b': 99}
      func3(args)
      print args['a'], args['b']

5) Or bundle up values in a class instance::

      class callByRef:
          def __init__(self, **args):
              for (key, value) in args.items():
                  setattr(self, key, value)

      def func4(args):
          args.a = 'new-value'        # args is a mutable callByRef
          args.b = args.b + 1         # change object in-place

      args = callByRef(a='old-value', b=99)
      func4(args)
      print args.a, args.b


   There's almost never a good reason to get this complicated.

Your best choice is to return a tuple containing the multiple results.


How do you make a higher order function in Python?
--------------------------------------------------

You have two choices: you can use nested scopes or you can use callable objects.
For example, suppose you wanted to define ``linear(a,b)`` which returns a
function ``f(x)`` that computes the value ``a*x+b``.
Using nested scopes::

   def linear(a, b):
       def result(x):
           return a * x + b
       return result

Or using a callable object::

   class linear:

       def __init__(self, a, b):
           self.a, self.b = a, b

       def __call__(self, x):
           return self.a * x + self.b

In both cases, ::

   taxes = linear(0.3, 2)

gives a callable object where ``taxes(10e6) == 0.3 * 10e6 + 2``.

The callable object approach has the disadvantage that it is a bit slower and
results in slightly longer code. However, note that a collection of callables
can share their signature via inheritance::

   class exponential(linear):
       # __init__ inherited
       def __call__(self, x):
           return self.a * (x ** self.b)

Objects can encapsulate state for several methods::

   class counter:

       value = 0

       def set(self, x):
           self.value = x

       def up(self):
           self.value = self.value + 1

       def down(self):
           self.value = self.value - 1

   count = counter()
   inc, dec, reset = count.up, count.down, count.set

Here ``inc()``, ``dec()`` and ``reset()`` act like functions which share the
same counting variable.


How do I copy an object in Python?
----------------------------------

In general, try :func:`copy.copy` or :func:`copy.deepcopy`.
Not all objects can be copied, but most can.

Some objects can be copied more easily. Dictionaries have a :meth:`~dict.copy`
method::

   newdict = olddict.copy()

Sequences can be copied by slicing::

   new_l = l[:]


How can I find the methods or attributes of an object?
------------------------------------------------------

For an instance x of a user-defined class, ``dir(x)`` returns an alphabetized
list of the names containing the instance attributes and methods and attributes
defined by its class.
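
As a rough sketch of what that looks like in practice, the example below calls
``dir()`` on an instance and filters out the names that start with an
underscore. The ``Point`` class and its attributes are hypothetical and exist
only for this illustration.

```python
class Point(object):
    """Hypothetical class used only to demonstrate dir()."""

    def __init__(self, x, y):
        self.x = x          # instance attributes
        self.y = y

    def norm1(self):        # method defined by the class
        return abs(self.x) + abs(self.y)

p = Point(3, -4)

# dir() returns a sorted list of names; drop the ones starting with "_"
# to leave only the "public" attributes and methods.
public_names = [name for name in dir(p) if not name.startswith("_")]
```

Here ``public_names`` contains both the instance attributes (``x``, ``y``) and
the method inherited from the class (``norm1``), in alphabetical order.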


How can my code discover the name of an object?
-----------------------------------------------

Generally speaking, it can't, because objects don't really have names.
Essentially, assignment always binds a name to a value; the same is true of
``def`` and ``class`` statements, but in that case the value is a
callable.  Consider the following code::

   >>> class A:
   ...     pass
   ...
   >>> B = A
   >>> a = B()
   >>> b = a
   >>> print b
   <__main__.A instance at 0x16D07CC>
   >>> print a
   <__main__.A instance at 0x16D07CC>

Arguably the class has a name: even though it is bound to two names and invoked
through the name B, the created instance is still reported as an instance of
class A.  However, it is impossible to say whether the instance's name is a or
b, since both names are bound to the same value.

Generally speaking it should not be necessary for your code to "know the names"
of particular values.  Unless you are deliberately writing introspective
programs, this is usually an indication that a change of approach might be
beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to
this question:

   The same way as you get the name of that cat you found on your porch: the cat
   (object) itself cannot tell you its name, and it doesn't really care -- so
   the only way to find out what it's called is to ask all your neighbours
   (namespaces) if it's their cat (object)...

   ....and don't be surprised if you'll find that it's known by many names, or
   no name at all!


What's up with the comma operator's precedence?
-----------------------------------------------

Comma is not an operator in Python.
Consider this session::

    >>> "a" in "b", "a"
    (False, 'a')

Since the comma is not an operator, but a separator between expressions, the
above is evaluated as if you had entered::

    ("a" in "b"), "a"

not::

    "a" in ("b", "a")

The same is true of the various assignment operators (``=``, ``+=`` etc).  They
are not truly operators but syntactic delimiters in assignment statements.


Is there an equivalent of C's "?:" ternary operator?
----------------------------------------------------

Yes, this feature was added in Python 2.5.  The syntax is as follows::

   [on_true] if [expression] else [on_false]

   x, y = 50, 25

   small = x if x < y else y

For versions previous to 2.5 the answer would be 'No'.


Is it possible to write obfuscated one-liners in Python?
--------------------------------------------------------

Yes.  Usually this is done by nesting :keyword:`lambda` within
:keyword:`lambda`.
See the following three examples, due to Ulf Bartelt::

   # Primes < 1000
   print filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
   map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))

   # First 10 Fibonacci numbers
   print map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1: f(x,f),
   range(10))

   # Mandelbrot set
   print (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
   Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
   Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
   i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
   >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
   64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
   ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24)
   #    \___ ___/  \___ ___/  |   |   |__ lines on screen
   #        V          V      |   |______ columns on screen
   #        |          |      |__________ maximum of "iterations"
   #        |          |_________________ range on y axis
   #        |____________________________ range on x axis

Don't try this at home, kids!


Numbers and strings
===================

How do I specify hexadecimal and octal integers?
------------------------------------------------

To specify an octal integer, precede the octal value with a zero, and then a
lower or uppercase "o".  For example, to set the variable "a" to the octal value
"10" (8 in decimal), type::

   >>> a = 0o10
   >>> a
   8

Hexadecimal is just as easy.  Simply precede the hexadecimal number with a zero,
and then a lower or uppercase "x".  Hexadecimal digits can be specified in lower
or uppercase.  For example, in the Python interpreter::

   >>> a = 0xa5
   >>> a
   165
   >>> b = 0XB2
   >>> b
   178


Why does -22 // 10 return -3?
-----------------------------

It's primarily driven by the desire that ``i % j`` have the same sign as ``j``.
If you want that, and also want::

    i == (i // j) * j + (i % j)

then integer division has to return the floor.  C also requires that identity to
hold, and then compilers that truncate ``i // j`` need to make ``i % j`` have
the same sign as ``i``.

There are few real use cases for ``i % j`` when ``j`` is negative.  When ``j``
is positive, there are many, and in virtually all of them it's more useful for
``i % j`` to be ``>= 0``.  If the clock says 10 now, what did it say 200 hours
ago?  ``-190 % 12 == 2`` is useful; ``-190 % 12 == -10`` is a bug waiting to
bite.

.. note::

   On Python 2, ``a / b`` returns the same as ``a // b`` if
   ``__future__.division`` is not in effect.  This is also known as "classic"
   division.


How do I convert a string to a number?
--------------------------------------

For integers, use the built-in :func:`int` type constructor, e.g. ``int('144')
== 144``.  Similarly, :func:`float` converts to floating-point,
e.g. ``float('144') == 144.0``.

By default, these interpret the number as decimal, so that ``int('0144') ==
144`` and ``int('0x144')`` raises :exc:`ValueError`. ``int(string, base)`` takes
the base to convert from as a second optional argument, so ``int('0x144', 16) ==
324``.  If the base is specified as 0, the number is interpreted using Python's
rules: a leading '0' indicates octal, and '0x' indicates a hex number.

Do not use the built-in function :func:`eval` if all you need is to convert
strings to numbers.  :func:`eval` will be significantly slower and it presents a
security risk: someone could pass you a Python expression that might have
unwanted side effects.
For example, someone could pass
``__import__('os').system("rm -rf $HOME")`` which would erase your home
directory.

:func:`eval` also has the effect of interpreting numbers as Python expressions,
so that e.g. ``eval('09')`` gives a syntax error because Python regards numbers
starting with '0' as octal (base 8).


How do I convert a number to a string?
--------------------------------------

To convert, e.g., the number 144 to the string '144', use the built-in type
constructor :func:`str`.  If you want a hexadecimal or octal representation, use
the built-in functions :func:`hex` or :func:`oct`.  For fancy formatting, see
the :ref:`formatstrings` section, e.g. ``"{:04d}".format(144)`` yields
``'0144'`` and ``"{:.3f}".format(1/3)`` yields ``'0.333'``.  You may also use
:ref:`the % operator <string-formatting>` on strings.  See the library reference
manual for details.


How do I modify a string in place?
----------------------------------

You can't, because strings are immutable.  If you need an object with this
ability, try converting the string to a list or use the array module::

   >>> s = "Hello, world"
   >>> a = list(s)
   >>> print a
   ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
   >>> a[7:] = list("there!")
   >>> ''.join(a)
   'Hello, there!'

   >>> import array
   >>> a = array.array('c', s)
   >>> print a
   array('c', 'Hello, world')
   >>> a[0] = 'y'; print a
   array('c', 'yello, world')
   >>> a.tostring()
   'yello, world'


How do I use strings to call functions/methods?
-----------------------------------------------

There are various techniques.

* The best is to use a dictionary that maps strings to functions.  The primary
  advantage of this technique is that the strings do not need to match the names
  of the functions.  This is also the primary technique used to emulate a case
  construct::

     def a():
         pass

     def b():
         pass

     dispatch = {'go': a, 'stop': b}  # Note lack of parens for funcs

     dispatch[get_input()]()  # Note trailing parens to call function

* Use the built-in function :func:`getattr`::

     import foo
     getattr(foo, 'bar')()

  Note that :func:`getattr` works on any object, including classes, class
  instances, modules, and so on.

  This is used in several places in the standard library, like this::

     class Foo:
         def do_foo(self):
             ...

         def do_bar(self):
             ...

     f = getattr(foo_instance, 'do_' + opname)
     f()


* Use :func:`locals` or :func:`eval` to resolve the function name::

     def myFunc():
         print "hello"

     fname = "myFunc"

     f = locals()[fname]
     f()

     f = eval(fname)
     f()

  Note: Using :func:`eval` is slow and dangerous.  If you don't have absolute
  control over the contents of the string, someone could pass a string that
  resulted in an arbitrary function being executed.

Is there an equivalent to Perl's chomp() for removing trailing newlines from strings?
-------------------------------------------------------------------------------------

Starting with Python 2.2, you can use ``S.rstrip("\r\n")`` to remove all
occurrences of any line terminator from the end of the string ``S`` without
removing other trailing whitespace.  If the string ``S`` represents more than
one line, with several empty lines at the end, the line terminators for all the
blank lines will be removed::

   >>> lines = ("line 1 \r\n"
   ...          "\r\n"
   ...          "\r\n")
   >>> lines.rstrip("\n\r")
   'line 1 '

Since this is typically only desired when reading text one line at a time, using
``S.rstrip()`` this way works well.
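
The one-line-at-a-time case can be sketched as follows.  The ``raw_lines``
list is a made-up stand-in for lines read from a file, so treat it as an
assumption of the example rather than real input:

```python
# Hypothetical input: lines as they might arrive from a file opened in
# binary mode, with mixed line terminators and one unterminated last line.
raw_lines = ["first \r\n", "second\n", "\n", "last"]

# rstrip("\r\n") strips only trailing CR/LF characters, so the trailing
# space in "first " is preserved and the blank line becomes ''.
cleaned = [line.rstrip("\r\n") for line in raw_lines]
print(cleaned)    # ['first ', 'second', '', 'last']
```

Calling ``rstrip()`` with no argument would also strip the trailing space
from the first line.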

For older versions of Python, there are two partial substitutes:

- If you want to remove all trailing whitespace, use the ``rstrip()`` method of
  string objects.  This removes all trailing whitespace, not just a single
  newline.

- Otherwise, if there is only one line in the string ``S``, use
  ``S.splitlines()[0]``.


Is there a scanf() or sscanf() equivalent?
------------------------------------------

Not as such.

For simple input parsing, the easiest approach is usually to split the line into
whitespace-delimited words using the :meth:`~str.split` method of string objects
and then convert decimal strings to numeric values using :func:`int` or
:func:`float`.  ``split()`` supports an optional "sep" parameter which is useful
if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful
than C's :c:func:`sscanf` and better suited for the task.


What does 'UnicodeError: ASCII [decoding,encoding] error: ordinal not in range(128)' mean?
------------------------------------------------------------------------------------------

This error indicates that your Python installation can handle only 7-bit ASCII
strings.  There are a couple of ways to fix or work around the problem.

If your programs must handle data in arbitrary character set encodings, the
environment the application runs in will generally identify the encoding of the
data it is handing you.  You need to convert the input to Unicode data using
that encoding.  For example, a program that handles email or web input will
typically find character set encoding information in Content-Type headers.  This
can then be used to properly convert input data to Unicode.
Assuming the string
referred to by ``value`` is encoded as UTF-8::

   value = unicode(value, "utf-8")

will return a Unicode object.  If the data is not correctly encoded as UTF-8,
the above call will raise a :exc:`UnicodeError` exception.

If you only want strings converted to Unicode which have non-ASCII data, you can
try converting them first assuming an ASCII encoding, and then generate Unicode
objects if that fails::

   try:
       x = unicode(value, "ascii")
   except UnicodeError:
       value = unicode(value, "utf-8")
   else:
       # value was valid ASCII data
       pass

It's possible to set a default encoding in a file called ``sitecustomize.py``
that's part of the Python library.  However, this isn't recommended because
changing the Python-wide default encoding may cause third-party extension
modules to fail.

Note that on Windows, there is an encoding known as "mbcs", which uses an
encoding specific to your current locale.  In many cases, and particularly when
working with COM, this may be an appropriate default encoding to use.


Sequences (Tuples/Lists)
========================

How do I convert between tuples and lists?
------------------------------------------

The type constructor ``tuple(seq)`` converts any sequence (actually, any
iterable) into a tuple with the same items in the same order.

For example, ``tuple([1, 2, 3])`` yields ``(1, 2, 3)`` and ``tuple('abc')``
yields ``('a', 'b', 'c')``.  If the argument is a tuple, it does not make a copy
but returns the same object, so it is cheap to call :func:`tuple` when you
aren't sure that an object is already a tuple.

The type constructor ``list(seq)`` converts any sequence or iterable into a list
with the same items in the same order.
For example, ``list((1, 2, 3))`` yields
``[1, 2, 3]`` and ``list('abc')`` yields ``['a', 'b', 'c']``.  If the argument
is a list, it makes a copy just like ``seq[:]`` would.


What's a negative index?
------------------------

Python sequences are indexed with positive numbers and negative numbers.  For
positive numbers, 0 is the first index, 1 is the second index, and so forth.
For negative indices, -1 is the last index, -2 is the penultimate (next to
last) index, and so forth.  Think of ``seq[-n]`` as the same as
``seq[len(seq)-n]``.

Using negative indices can be very convenient.  For example ``S[:-1]`` is all of
the string except for its last character, which is useful for removing the
trailing newline from a string.


How do I iterate over a sequence in reverse order?
--------------------------------------------------

Use the :func:`reversed` built-in function, which is new in Python 2.4::

   for x in reversed(sequence):
       ...  # do something with x ...

This won't touch your original sequence; :func:`reversed` returns an iterator
that walks over it in reverse order.

With Python 2.3, you can use an extended slice syntax, which builds a reversed
copy of the sequence::

   for x in sequence[::-1]:
       ...  # do something with x ...


How do you remove duplicates from a list?
-----------------------------------------

See the Python Cookbook for a long discussion of many ways to do this:

   https://code.activestate.com/recipes/52560/

If you don't mind reordering the list, sort it and then scan from the end of the
list, deleting duplicates as you go::

   if mylist:
       mylist.sort()
       last = mylist[-1]
       for i in range(len(mylist)-2, -1, -1):
           if last == mylist[i]:
               del mylist[i]
           else:
               last = mylist[i]

If all elements of the list may be used as dictionary keys (i.e.
they are all
hashable) this is often faster ::

   d = {}
   for x in mylist:
       d[x] = 1
   mylist = list(d.keys())

In Python 2.5 and later, the following is possible instead::

   mylist = list(set(mylist))

This converts the list into a set, thereby removing duplicates, and then back
into a list.


How do you make an array in Python?
-----------------------------------

Use a list::

   ["this", 1, "is", "an", "array"]

Lists are equivalent to C or Pascal arrays in their time complexity; the primary
difference is that a Python list can contain objects of many different types.

The ``array`` module also provides methods for creating arrays of fixed types
with compact representations, but they are slower to index than lists.  Also
note that the Numeric extensions and others define array-like structures with
various characteristics as well.

To get Lisp-style linked lists, you can emulate cons cells using tuples::

   lisp_list = ("like",  ("this",  ("example", None) ) )

If mutability is desired, you could use lists instead of tuples.  Here the
analogue of lisp car is ``lisp_list[0]`` and the analogue of cdr is
``lisp_list[1]``.  Only do this if you're sure you really need to, because it's
usually a lot slower than using Python lists.


.. _faq-multidimensional-list:

How do I create a multidimensional list?
----------------------------------------

You probably tried to make a multidimensional array like this::

   >>> A = [[None] * 2] * 3

This looks correct if you print it::

   >>> A
   [[None, None], [None, None], [None, None]]

But when you assign a value, it shows up in multiple places::

   >>> A[0][0] = 5
   >>> A
   [[5, None], [5, None], [5, None]]

The reason is that replicating a list with ``*`` doesn't create copies, it only
creates references to the existing objects.  The ``*3`` creates a list
containing 3 references to the same list of length two.  Changes to one row will
show in all rows, which is almost certainly not what you want.

The suggested approach is to create a list of the desired length first and then
fill in each element with a newly created list::

   A = [None] * 3
   for i in range(3):
       A[i] = [None] * 2

This generates a list containing 3 different lists of length two.  You can also
use a list comprehension::

   w, h = 2, 3
   A = [[None] * w for i in range(h)]

Or, you can use an extension that provides a matrix datatype; `NumPy
<http://www.numpy.org/>`_ is the best known.


How do I apply a method to a sequence of objects?
-------------------------------------------------

Use a list comprehension::

   result = [obj.method() for obj in mylist]

More generically, you can try the following function::

   def method_map(objects, method, arguments):
       """method_map([a,b], "meth", (1,2)) gives [a.meth(1,2), b.meth(1,2)]"""
       nobjects = len(objects)
       methods = map(getattr, objects, [method]*nobjects)
       return map(apply, methods, [arguments]*nobjects)


Why does a_tuple[i] += ['item'] raise an exception when the addition works?
---------------------------------------------------------------------------

This is because of a combination of the fact that augmented assignment
operators are *assignment* operators, and the difference between mutable and
immutable objects in Python.

This discussion applies in general when augmented assignment operators are
applied to elements of a tuple that point to mutable objects, but we'll use
a ``list`` and ``+=`` as our exemplar.

If you wrote::

   >>> a_tuple = (1, 2)
   >>> a_tuple[0] += 1
   Traceback (most recent call last):
      ...
   TypeError: 'tuple' object does not support item assignment

The reason for the exception should be immediately clear: ``1`` is added to the
object ``a_tuple[0]`` points to (``1``), producing the result object, ``2``,
but when we attempt to assign the result of the computation, ``2``, to element
``0`` of the tuple, we get an error because we can't change what an element of
a tuple points to.

Under the covers, what this augmented assignment statement is doing is
approximately this::

   >>> result = a_tuple[0] + 1
   >>> a_tuple[0] = result
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

It is the assignment part of the operation that produces the error, since a
tuple is immutable.

When you write something like::

   >>> a_tuple = (['foo'], 'bar')
   >>> a_tuple[0] += ['item']
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

The exception is a bit more surprising, and even more surprising is the fact
that even though there was an error, the append worked::

    >>> a_tuple[0]
    ['foo', 'item']

To see why this happens, you need to know that (a) if an object implements an
``__iadd__`` magic method, it gets called when the ``+=`` augmented assignment
is executed, and its return value is what gets used in the assignment statement;
and (b) for lists, ``__iadd__`` is equivalent to calling ``extend`` on the list
and returning the list.  That's why we say that for lists, ``+=`` is a
"shorthand" for ``list.extend``::

    >>> a_list = []
    >>> a_list += [1]
    >>> a_list
    [1]

This is equivalent to::

    >>> result = a_list.__iadd__([1])
    >>> a_list = result

The object pointed to by a_list has been mutated, and the pointer to the
mutated object is assigned back to ``a_list``.  The end result of the
assignment is a no-op, since it is a pointer to the same object that ``a_list``
was previously pointing to, but the assignment still happens.

Thus, in our tuple example what is happening is equivalent to::

   >>> result = a_tuple[0].__iadd__(['item'])
   >>> a_tuple[0] = result
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

The ``__iadd__`` succeeds, and thus the list is extended, but even though
``result`` points to the same object that ``a_tuple[0]`` already points to,
that final assignment still results in an error, because tuples are immutable.


Dictionaries
============

How can I get a dictionary to display its keys in a consistent order?
---------------------------------------------------------------------

You can't.
Dictionaries store their keys in an unpredictable order, so the
display order of a dictionary's elements will be similarly unpredictable.

This can be frustrating if you want to save a printable version to a file, make
some changes and then compare it with some other printed dictionary.  In this
case, use the ``pprint`` module to pretty-print the dictionary; the items will
be presented in order sorted by the key.

A more complicated solution is to subclass ``dict`` to create a
``SortedDict`` class that prints itself in a predictable order.  Here's one
simpleminded implementation of such a class::

   class SortedDict(dict):
       def __repr__(self):
           keys = sorted(self.keys())
           result = ("{!r}: {!r}".format(k, self[k]) for k in keys)
           return "{{{}}}".format(", ".join(result))

       __str__ = __repr__

This will work for many common situations you might encounter, though it's far
from a perfect solution. The largest flaw is that if some values in the
dictionary are also dictionaries, their values won't be presented in any
particular order.


I want to do a complicated sort: can you do a Schwartzian Transform in Python?
------------------------------------------------------------------------------

The technique, attributed to Randal Schwartz of the Perl community, sorts the
elements of a list by a metric which maps each element to its "sort value". In
Python, use the ``key`` argument of the :meth:`list.sort` method::

   Isorted = L[:]
   Isorted.sort(key=lambda s: int(s[10:15]))


How can I sort one list by values from another list?
----------------------------------------------------

Merge them into a single list of tuples, sort the resulting list, and then pick
out the element you want.
::

   >>> list1 = ["what", "I'm", "sorting", "by"]
   >>> list2 = ["something", "else", "to", "sort"]
   >>> pairs = zip(list1, list2)
   >>> pairs
   [('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
   >>> pairs.sort()
   >>> result = [ x[1] for x in pairs ]
   >>> result
   ['else', 'sort', 'to', 'something']

An alternative for the last step is::

   >>> result = []
   >>> for p in pairs: result.append(p[1])

If you find this more legible, you might prefer to use this instead of the final
list comprehension.  However, it is almost twice as slow for long lists.  Why?
First, the ``append()`` operation has to reallocate memory, and while it uses
some tricks to avoid doing that each time, it still has to do it occasionally,
and that costs quite a bit.  Second, the expression "result.append" requires an
extra attribute lookup, and third, there's a speed reduction from having to make
all those function calls.


Objects
=======

What is a class?
----------------

A class is the particular object type created by executing a class statement.
Class objects are used as templates to create instance objects, which embody
both the data (attributes) and code (methods) specific to a datatype.

A class can be based on one or more other classes, called its base class(es). It
then inherits the attributes and methods of its base classes. This allows an
object model to be successively refined by inheritance.  You might have a
generic ``Mailbox`` class that provides basic accessor methods for a mailbox,
and subclasses such as ``MboxMailbox``, ``MaildirMailbox``, ``OutlookMailbox``
that handle various specific mailbox formats.


What is a method?
-----------------

A method is a function on some object ``x`` that you normally call as
``x.name(arguments...)``.
Methods are defined as functions inside the class
definition::

   class C:
       def meth(self, arg):
           return arg * 2 + self.attribute


What is self?
-------------

Self is merely a conventional name for the first argument of a method.  A method
defined as ``meth(self, a, b, c)`` should be called as ``x.meth(a, b, c)`` for
some instance ``x`` of the class in which the definition occurs; the called
method will think it is called as ``meth(x, a, b, c)``.

See also :ref:`why-self`.


How do I check if an object is an instance of a given class or of a subclass of it?
-----------------------------------------------------------------------------------

Use the built-in function ``isinstance(obj, cls)``.  You can check if an object
is an instance of any of a number of classes by providing a tuple instead of a
single class, e.g. ``isinstance(obj, (class1, class2, ...))``, and can also
check whether an object is one of Python's built-in types, e.g.
``isinstance(obj, str)`` or ``isinstance(obj, (int, long, float, complex))``.

Note that most programs do not use :func:`isinstance` on user-defined classes
very often.  If you are developing the classes yourself, a more proper
object-oriented style is to define methods on the classes that encapsulate a
particular behaviour, instead of checking the object's class and doing a
different thing based on what class it is.  For example, if you have a function
that does something::

   def search(obj):
       if isinstance(obj, Mailbox):
           ...  # code to search a mailbox
       elif isinstance(obj, Document):
           ...  # code to search a document
       elif ...

A better approach is to define a ``search()`` method on all the classes and just
call it::

   class Mailbox:
       def search(self):
           ...  # code to search a mailbox

   class Document:
       def search(self):
           ...  # code to search a document

   obj.search()


What is delegation?
-------------------

Delegation is an object oriented technique (also called a design pattern).
Let's say you have an object ``x`` and want to change the behaviour of just one
of its methods.  You can create a new class that provides a new implementation
of the method you're interested in changing and delegates all other methods to
the corresponding method of ``x``.

Python programmers can easily implement delegation.  For example, the following
class behaves like a file but converts all written data to uppercase::

   class UpperOut:

       def __init__(self, outfile):
           self._outfile = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           return getattr(self._outfile, name)

Here the ``UpperOut`` class redefines the ``write()`` method to convert the
argument string to uppercase before calling the underlying
``self._outfile.write()`` method.  All other methods are delegated to the
underlying ``self._outfile`` object.  The delegation is accomplished via the
``__getattr__`` method; consult :ref:`the language reference <attribute-access>`
for more information about controlling attribute access.

Note that for more general cases delegation can get trickier. When attributes
must be set as well as retrieved, the class must define a :meth:`__setattr__`
method too, and it must do so carefully.  The basic implementation of
:meth:`__setattr__` is roughly equivalent to the following::

   class X:
       ...
       def __setattr__(self, name, value):
           self.__dict__[name] = value
       ...

Most :meth:`__setattr__` implementations must modify ``self.__dict__`` to store
local state for self without causing an infinite recursion.


How do I call a method defined in a base class from a derived class that overrides it?
--------------------------------------------------------------------------------------

If you're using new-style classes, use the built-in :func:`super` function::

   class Derived(Base):
       def meth(self):
           super(Derived, self).meth()

If you're using classic classes: For a class definition such as ``class
Derived(Base): ...`` you can call method ``meth()`` defined in ``Base`` (or one
of ``Base``'s base classes) as ``Base.meth(self, arguments...)``.  Here,
``Base.meth`` is an unbound method, so you need to provide the ``self``
argument.


How can I organize my code to make it easier to change the base class?
----------------------------------------------------------------------

You could define an alias for the base class, assign the real base class to it
before your class definition, and use the alias throughout your class.  Then all
you have to change is the value assigned to the alias.  Incidentally, this trick
is also handy if you want to decide dynamically (e.g. depending on availability
of resources) which base class to use.  Example::

   BaseAlias = <real base class>

   class Derived(BaseAlias):
       def meth(self):
           BaseAlias.meth(self)
           ...


How do I create static class data and static class methods?
-----------------------------------------------------------

Both static data and static methods (in the sense of C++ or Java) are supported
in Python.

For static data, simply define a class attribute.
To assign a new value to the
attribute, you have to explicitly use the class name in the assignment::

   class C:
       count = 0   # number of times C.__init__ called

       def __init__(self):
           C.count = C.count + 1

       def getcount(self):
           return C.count  # or return self.count

``c.count`` also refers to ``C.count`` for any ``c`` such that ``isinstance(c,
C)`` holds, unless overridden by ``c`` itself or by some class on the base-class
search path from ``c.__class__`` back to ``C``.

Caution: within a method of C, an assignment like ``self.count = 42`` creates a
new and unrelated instance attribute named ``count`` in ``self``'s own dict.
Rebinding of a class-static data name must always specify the class, whether
inside a method or not::

   C.count = 314

Static methods are possible since Python 2.2::

   class C:
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...
       static = staticmethod(static)

With Python 2.4's decorators, this can also be written as ::

   class C:
       @staticmethod
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...

However, a far more straightforward way to get the effect of a static method is
via a simple module-level function::

   def getcount():
       return C.count

If your code is structured so as to define one class (or tightly related class
hierarchy) per module, this supplies the desired encapsulation.


How can I overload constructors (or methods) in Python?
-------------------------------------------------------

This answer actually applies to all methods, but the question usually comes up
first in the context of constructors.

In C++ you'd write

.. code-block:: c++

    class C {
        C() { cout << "No arguments\n"; }
        C(int i) { cout << "Argument is " << i << "\n"; }
    };

In Python you have to write a single constructor that catches all cases using
default arguments. For example::

   class C:
       def __init__(self, i=None):
           if i is None:
               print "No arguments"
           else:
               print "Argument is", i

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g. ::

   def __init__(self, *args):
       ...

The same approach works for all method definitions.


I try to use __spam and I get an error about _SomeClassName__spam.
------------------------------------------------------------------

Variable names with double leading underscores are "mangled" to provide a simple
but effective way to define class private variables. Any identifier of the form
``__spam`` (at least two leading underscores, at most one trailing underscore)
is textually replaced with ``_classname__spam``, where ``classname`` is the
current class name with any leading underscores stripped.

This doesn't guarantee privacy: an outside user can still deliberately access
the ``_classname__spam`` attribute, and private values are visible in the
object's ``__dict__``. Many Python programmers never bother to use private
variable names at all.


My class defines __del__ but it is not called when I delete the object.
-----------------------------------------------------------------------

There are several possible reasons for this.

The ``del`` statement does not necessarily call :meth:`__del__` -- it simply
decrements the object's reference count, and if this reaches zero
:meth:`__del__` is called.

If your data structures contain circular links (e.g.
a tree where each child has
a parent reference and each parent has a list of children) the reference counts
will never go back to zero. Once in a while Python runs an algorithm to detect
such cycles, but the garbage collector might run some time after the last
reference to your data structure vanishes, so your :meth:`__del__` method may be
called at an inconvenient and random time. This is inconvenient if you're trying
to reproduce a problem. Worse, the order in which objects' :meth:`__del__`
methods are executed is arbitrary. You can run :func:`gc.collect` to force a
collection, but there *are* pathological cases where objects will never be
collected.

Despite the cycle collector, it's still a good idea to define an explicit
``close()`` method on objects to be called whenever you're done with them. The
``close()`` method can then remove attributes that refer to subobjects. Don't
call :meth:`__del__` directly -- :meth:`__del__` should call ``close()`` and
``close()`` should make sure that it can be called more than once for the same
object.

Another way to avoid cyclical references is to use the :mod:`weakref` module,
which allows you to point to objects without incrementing their reference count.
Tree data structures, for instance, should use weak references for their parent
and sibling references (if they need them!).

If the object has ever been a local variable in a function that caught an
exception in an except clause, chances are that a reference to the object still
exists in that function's stack frame as contained in the stack trace.
Normally, calling :func:`sys.exc_clear` will take care of this by clearing the
last recorded exception.

Finally, if your :meth:`__del__` method raises an exception, a warning message
is printed to :data:`sys.stderr`.
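
The advice in this answer can be sketched as follows. ``Node`` is a
hypothetical class written for this illustration, not part of any library: it
keeps strong references to its children, refers to its parent only through
:mod:`weakref`, and has a ``close()`` method that may safely be called more
than once.

```python
import weakref


class Node(object):
    """Hypothetical tree node: strong child links, weak parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._closed = False
        # A weak reference does not keep the parent alive, so no
        # parent <-> child reference cycle is created.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None if there is no parent or it has been collected.
        return self._parent() if self._parent is not None else None

    def close(self):
        if self._closed:          # idempotent: a second call does nothing
            return
        self._closed = True
        for child in self.children:
            child.close()
        self.children = []        # drop references to subobjects

    def __del__(self):
        # __del__ delegates to close(); callers use close(), never __del__.
        self.close()


root = Node('root')
leaf = Node('leaf', root)
print(leaf.parent is root)

root.close()
root.close()                      # harmless second call
print(root.children)
```

Because the parent link is weak, ``leaf.parent`` simply becomes ``None`` once
the root is gone; code that follows parent links has to be prepared for that.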


How do I get a list of all instances of a given class?
------------------------------------------------------

Python does not keep track of all instances of a class (or of a built-in type).
You can program the class's constructor to keep track of all instances by
keeping a list of weak references to each instance.


Why does the result of ``id()`` appear to be not unique?
--------------------------------------------------------

The :func:`id` builtin returns an integer that is guaranteed to be unique during
the lifetime of the object. Since in CPython, this is the object's memory
address, it happens frequently that after an object is deleted from memory, the
next freshly created object is allocated at the same position in memory. This
is illustrated by this example:

>>> id(1000)
13901272
>>> id(2000)
13901272

The two ids belong to different integer objects that are created before, and
deleted immediately after execution of the ``id()`` call. To be sure that
objects whose id you want to examine are still alive, create another reference
to the object:

>>> a = 1000; b = 2000
>>> id(a)
13901272
>>> id(b)
13891296


Modules
=======

How do I create a .pyc file?
----------------------------

When a module is imported for the first time (or when the source is more recent
than the current compiled file) a ``.pyc`` file containing the compiled code
should be created in the same directory as the ``.py`` file.

One reason that a ``.pyc`` file may not be created is permissions problems with
the directory. This can happen, for example, if you develop as one user but run
as another, such as if you are testing with a web server.
Creation of a ``.pyc``
file is automatic if you're importing a module and Python has the ability
(permissions, free space, etc.) to write the compiled module back to the
directory.

Running Python on a top level script is not considered an import and no
``.pyc`` will be created. For example, if you have a top-level module
``foo.py`` that imports another module ``xyz.py``, when you run ``foo``,
``xyz.pyc`` will be created since ``xyz`` is imported, but no ``foo.pyc`` file
will be created since ``foo.py`` isn't being imported.

If you need to create ``foo.pyc`` -- that is, to create a ``.pyc`` file for a module
that is not imported -- you can, using the :mod:`py_compile` and
:mod:`compileall` modules.

The :mod:`py_compile` module can manually compile any module. One way is to use
the ``compile()`` function in that module interactively::

   >>> import py_compile
   >>> py_compile.compile('foo.py')                 # doctest: +SKIP

This will write the ``.pyc`` to the same location as ``foo.py`` (or you can
override that with the optional parameter ``cfile``).

You can also automatically compile all files in a directory or directories using
the :mod:`compileall` module. You can do it from the shell prompt by running
the module with ``python -m compileall`` and providing the path of a directory
containing Python files to compile::

   python -m compileall .


How do I find the current module name?
--------------------------------------

A module can find out its own module name by looking at the predefined global
variable ``__name__``. If this has the value ``'__main__'``, the program is
running as a script. Many modules that are usually used by importing them also
provide a command-line interface or a self-test, and only execute this code
after checking ``__name__``::

   def main():
       print 'Running test...'
       ...

   if __name__ == '__main__':
       main()


How can I have modules that mutually import each other?
-------------------------------------------------------

Suppose you have the following modules:

foo.py::

   from bar import bar_var
   foo_var = 1

bar.py::

   from foo import foo_var
   bar_var = 2

The problem is that the interpreter will perform the following steps:

* main imports foo
* Empty globals for foo are created
* foo is compiled and starts executing
* foo imports bar
* Empty globals for bar are created
* bar is compiled and starts executing
* bar imports foo (which is a no-op since there already is a module named foo)
* bar.foo_var = foo.foo_var

The last step fails, because Python isn't done with interpreting ``foo`` yet and
the global symbol dictionary for ``foo`` is still empty.

The same thing happens when you use ``import foo``, and then try to access
``foo.foo_var`` in global code.

There are (at least) three possible workarounds for this problem.

Guido van Rossum recommends avoiding all uses of ``from <module> import ...``,
and placing all code inside functions. Initializations of global variables and
class variables should use constants or built-in functions only. This means
everything from an imported module is referenced as ``<module>.<name>``.

Jim Roskind suggests performing steps in the following order in each module:

* exports (globals, functions, and classes that don't need imported base
  classes)
* ``import`` statements
* active code (including globals that are initialized from imported values).

Van Rossum doesn't like this approach much because the imports appear in a
strange place, but it does work.
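
Roskind's ordering can be tried out with a small, self-contained sketch. The
module names ``faq_foo`` and ``faq_bar`` are hypothetical; the script writes
them to a temporary directory only so that the example runs on its own. Each
module defines its exports first, then imports, then runs code that depends on
the imported values.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical modules that import each other, laid out in the order
# "exports, imports, active code".
FOO_SOURCE = textwrap.dedent("""\
    # 1. exports that need nothing from other modules
    foo_var = 1

    # 2. import statements
    from faq_bar import bar_var

    # 3. active code that uses imported values
    foo_total = foo_var + bar_var
""")

BAR_SOURCE = textwrap.dedent("""\
    # 1. exports
    bar_var = 2

    # 2. import statements
    from faq_foo import foo_var

    # 3. active code
    bar_total = bar_var + foo_var
""")

tmpdir = tempfile.mkdtemp()
for name, source in (("faq_foo", FOO_SOURCE), ("faq_bar", BAR_SOURCE)):
    with open(os.path.join(tmpdir, name + ".py"), "w") as handle:
        handle.write(source)

sys.path.insert(0, tmpdir)
try:
    # By the time faq_bar asks for faq_foo.foo_var, it is already defined.
    faq_foo = importlib.import_module("faq_foo")
    faq_bar = importlib.import_module("faq_bar")
finally:
    sys.path.remove(tmpdir)

print(faq_foo.foo_total)
print(faq_bar.bar_total)
```

Swapping steps 1 and 2 in ``FOO_SOURCE`` reproduces the failure described
above, since ``faq_bar`` would then look up ``foo_var`` before it exists.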

Matthias Urlichs recommends restructuring your code so that the recursive import
is not necessary in the first place.

These solutions are not mutually exclusive.


__import__('x.y.z') returns <module 'x'>; how do I get z?
---------------------------------------------------------

Consider using the convenience function :func:`~importlib.import_module` from
:mod:`importlib` instead::

   import importlib

   z = importlib.import_module('x.y.z')


When I edit an imported module and reimport it, the changes don't show up. Why does this happen?
-------------------------------------------------------------------------------------------------

For reasons of efficiency as well as consistency, Python only reads the module
file the first time a module is imported. If it didn't, in a program
consisting of many modules where each one imports the same basic module, the
basic module would be parsed and re-parsed many times. To force rereading of a
changed module, do this::

   import modname
   reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules
containing statements like ::

   from modname import some_objects

will continue to work with the old version of the imported objects. If the
module contains class definitions, existing class instances will *not* be
updated to use the new class definition. This can result in the following
paradoxical behaviour:

>>> import cls
>>> c = cls.C()                # Create an instance of C
>>> reload(cls)
<module 'cls' from 'cls.pyc'>
>>> isinstance(c, cls.C)       # isinstance is false?!?
False

The nature of the problem is made clear if you print out the class objects:

>>> c.__class__
<class cls.C at 0x7352a0>
>>> cls.C
<class cls.C at 0x4198d0>
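
The effect can be reproduced with a self-contained sketch. The module name
``faq_cls`` is hypothetical and is written to a temporary directory so that the
example runs on its own. One assumption beyond the text above: on Python 3
``reload`` is no longer a builtin and is imported from :mod:`importlib`, so the
sketch tries that first and otherwise falls back to the builtin.

```python
import importlib
import os
import sys
import tempfile

try:
    from importlib import reload   # Python 3
except ImportError:
    pass                           # Python 2: reload() is a builtin

# Write a tiny hypothetical module containing one class.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "faq_cls.py"), "w") as handle:
    handle.write("class C(object):\n    pass\n")

sys.path.insert(0, tmpdir)
try:
    faq_cls = importlib.import_module("faq_cls")
    old_class = faq_cls.C
    c = faq_cls.C()                # instance of the class as first loaded

    reload(faq_cls)                # re-executes the module, rebinding C
finally:
    sys.path.remove(tmpdir)

# The module now holds a new class object; the instance still points
# at the old one.
print(faq_cls.C is old_class)      # False
print(isinstance(c, faq_cls.C))    # False
print(isinstance(c, old_class))    # True
```

Instances created after the reload belong to the new class, so recreating the
objects you care about is the simplest way to pick up the new definition.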