.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (:term:`>>>` and :term:`...`): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   spam = 1  # and this is the second comment
             # ... and now a third!
   text = "# This is not a comment because it's inside quotes."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses (``()``) can be used for grouping.
For example::

   >>> 2 + 2
   4
   >>> 50 - 5*6
   20
   >>> (50 - 5.0*6) / 4
   5.0
   >>> 8 / 5.0
   1.6

The integer numbers (e.g. ``2``, ``4``, ``20``) have type :class:`int`,
the ones with a fractional part (e.g. ``5.0``, ``1.6``) have type
:class:`float`. We will see more about numeric types later in the tutorial.

The return type of a division (``/``) operation depends on its operands. If
both operands are of type :class:`int`, :term:`floor division` is performed
and an :class:`int` is returned. If either operand is a :class:`float`,
classic division is performed and a :class:`float` is returned. The ``//``
operator is also provided for doing floor division no matter what the
operands are. The remainder can be calculated with the ``%`` operator::

   >>> 17 / 3  # int / int -> int
   5
   >>> 17 / 3.0  # int / float -> float
   5.666666666666667
   >>> 17 // 3.0  # explicit floor division discards the fractional part
   5.0
   >>> 17 % 3  # the % operator returns the remainder of the division
   2
   >>> 5 * 3 + 2  # result * divisor + remainder
   17

With Python, it is possible to use the ``**`` operator to calculate powers [#]_::

   >>> 5 ** 2  # 5 squared
   25
   >>> 2 ** 7  # 2 to the power of 7
   128

The equal sign (``=``) is used to assign a value to a variable.
Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5 * 9
   >>> width * height
   900

If a variable is not "defined" (assigned a value), trying to use it will
give you an error::

   >>> n  # try to access an undefined variable
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.

In addition to :class:`int` and :class:`float`, Python supports other types of
numbers, such as :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
Python also has built-in support for :ref:`complex numbers <typesnumeric>`,
and uses the ``j`` or ``J`` suffix to indicate the imaginary part
(e.g. ``3+5j``).


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed
in several ways. They can be enclosed in single quotes (``'...'``) or
double quotes (``"..."``) with the same result [#]_. ``\`` can be used
to escape quotes::

   >>> 'spam eggs'  # single quotes
   'spam eggs'
   >>> 'doesn\'t'  # use \' to escape the single quote...
   "doesn't"
   >>> "doesn't"  # ...or use double quotes instead
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

In the interactive interpreter, the output string is enclosed in quotes and
special characters are escaped with backslashes. While this might sometimes
look different from the input (the enclosing quotes could change), the two
strings are equivalent. The string is enclosed in double quotes if
the string contains a single quote and no double quotes, otherwise it is
enclosed in single quotes. The :keyword:`print` statement produces a more
readable output, by omitting the enclosing quotes and by printing escaped
and special characters::

   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'
   >>> print '"Isn\'t," she said.'
   "Isn't," she said.
   >>> s = 'First line.\nSecond line.'  # \n means newline
   >>> s  # without print, \n is included in the output
   'First line.\nSecond line.'
   >>> print s  # with print, \n produces a new line
   First line.
   Second line.

If you don't want characters prefaced by ``\`` to be interpreted as
special characters, you can use *raw strings* by adding an ``r`` before
the first quote::

   >>> print 'C:\some\name'  # here \n means newline!
   C:\some
   ame
   >>> print r'C:\some\name'  # note the r before the quote
   C:\some\name

String literals can span multiple lines. One way is using triple-quotes:
``"""..."""`` or ``'''...'''``. End of lines are automatically
included in the string, but it's possible to prevent this by adding a ``\`` at
the end of the line. The following example::

   print """\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output (note that the initial newline is not included):

.. code-block:: text

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> # 3 times 'un', followed by 'ium'
   >>> 3 * 'un' + 'ium'
   'unununium'

Two or more *string literals* (i.e. the ones enclosed between quotes) next
to each other are automatically concatenated. ::

   >>> 'Py' 'thon'
   'Python'

This only works with two literals though, not with variables or expressions::

   >>> prefix = 'Py'
   >>> prefix 'thon'  # can't concatenate a variable and a string literal
     ...
   SyntaxError: invalid syntax
   >>> ('un' * 3) 'ium'
     ...
   SyntaxError: invalid syntax

If you want to concatenate variables or a variable and a literal, use ``+``::

   >>> prefix + 'thon'
   'Python'

This feature is particularly useful when you want to break long strings::

   >>> text = ('Put several strings within parentheses '
   ...         'to have them joined together.')
   >>> text
   'Put several strings within parentheses to have them joined together.'

Strings can be *indexed* (subscripted), with the first character having index 0.
There is no separate character type; a character is simply a string of size
one::

   >>> word = 'Python'
   >>> word[0]  # character in position 0
   'P'
   >>> word[5]  # character in position 5
   'n'

Indices may also be negative numbers, to start counting from the right::

   >>> word[-1]  # last character
   'n'
   >>> word[-2]  # second-last character
   'o'
   >>> word[-6]
   'P'

Note that since -0 is the same as 0, negative indices start from -1.

In addition to indexing, *slicing* is also supported.
While indexing is used
to obtain individual characters, *slicing* allows you to obtain a substring::

   >>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
   'Py'
   >>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
   'tho'

Note how the start is always included, and the end always excluded. This
makes sure that ``s[:i] + s[i:]`` is always equal to ``s``::

   >>> word[:2] + word[2:]
   'Python'
   >>> word[:4] + word[4:]
   'Python'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]  # characters from the beginning to position 2 (excluded)
   'Py'
   >>> word[4:]  # characters from position 4 (included) to the end
   'on'
   >>> word[-2:]  # characters from the second-last (included) to the end
   'on'

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+---+
    | P | y | t | h | o | n |
    +---+---+---+---+---+---+
    0   1   2   3   4   5   6
   -6  -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...6 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]``
is 2.
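
The slicing rules above can be checked with a few assertions. This is only an
illustrative sketch: ``word`` mirrors the tutorial's example, while ``n`` and
the loop variable ``i`` are names chosen here for convenience.

```python
# Illustrative check of the slicing rules; `word` mirrors the tutorial's
# example, and `n` and `i` are arbitrary helper names.
word = 'Python'
n = len(word)

# With both indices in bounds, the slice from i to j has length j - i.
assert len(word[1:3]) == 3 - 1

# Splitting at any index and rejoining gives back the original string.
for i in range(n + 1):
    assert word[:i] + word[i:] == word

# A negative index -i names the same character as the index n - i.
for i in range(1, n + 1):
    assert word[-i] == word[n - i]
```

If any of the rules did not hold, the corresponding ``assert`` would raise an
``AssertionError``; the script finishes silently when they all pass.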

Attempting to use an index that is too large will result in an error::

   >>> word[42]  # the word only has 6 characters
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

However, out-of-range indices are handled gracefully when used for
slicing::

   >>> word[4:42]
   'on'
   >>> word[42:]
   ''

Python strings cannot be changed --- they are :term:`immutable`.
Therefore, assigning to an indexed position in the string results in an error::

   >>> word[0] = 'J'
     ...
   TypeError: 'str' object does not support item assignment
   >>> word[2:] = 'py'
     ...
   TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one::

   >>> 'J' + word[1:]
   'Jython'
   >>> word[:2] + 'py'
   'Pypy'

The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`formatstrings`
      Information about string formatting with :meth:`str.format`.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0 a new data type for storing text data is available to
the programmer: the Unicode object.
It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` indicates to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with 'ur' to have Python use the
*Raw-Unicode-Escape* encoding.
It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
'u'. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the better-known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :func:`encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred.
::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'


.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. Lists might contain
items of different types, but usually the items all have the same type. ::

   >>> squares = [1, 4, 9, 16, 25]
   >>> squares
   [1, 4, 9, 16, 25]

Like strings (and all other built-in :term:`sequence` types), lists can be
indexed and sliced::

   >>> squares[0]  # indexing returns the item
   1
   >>> squares[-1]
   25
   >>> squares[-3:]  # slicing returns a new list
   [9, 16, 25]

All slice operations return a new list containing the requested elements. This
means that the following slice returns a new (shallow) copy of the list::

   >>> squares[:]
   [1, 4, 9, 16, 25]

Lists also support operations like concatenation::

   >>> squares + [36, 49, 64, 81, 100]
   [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unlike strings, which are :term:`immutable`, lists are a :term:`mutable`
type, i.e. it is possible to change their content::

   >>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
   >>> 4 ** 3  # the cube of 4 is 64, not 65!
   64
   >>> cubes[3] = 64  # replace the wrong value
   >>> cubes
   [1, 8, 27, 64, 125]

You can also add new items at the end of the list, by using
the :meth:`~list.append` *method* (we will see more about methods later)::

   >>> cubes.append(216)  # add the cube of 6
   >>> cubes.append(7 ** 3)  # and the cube of 7
   >>> cubes
   [1, 8, 27, 64, 125, 216, 343]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> letters
   ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> # replace some values
   >>> letters[2:5] = ['C', 'D', 'E']
   >>> letters
   ['a', 'b', 'C', 'D', 'E', 'f', 'g']
   >>> # now remove them
   >>> letters[2:5] = []
   >>> letters
   ['a', 'b', 'f', 'g']
   >>> # clear the list by replacing all the elements with an empty list
   >>> letters[:] = []
   >>> letters
   []

The built-in function :func:`len` also applies to lists::

   >>> letters = ['a', 'b', 'c', 'd']
   >>> len(letters)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> a = ['a', 'b', 'c']
   >>> n = [1, 2, 3]
   >>> x = [a, n]
   >>> x
   [['a', 'b', 'c'], [1, 2, 3]]
   >>> x[0]
   ['a', 'b', 'c']
   >>> x[0][1]
   'b'

.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, like in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. At the interactive prompt, you have to type a tab or space(s) for
  each indented line. In practice you will prepare more complicated input
  for Python with a text editor; all decent text editors have an auto-indent
  facility. When a compound statement is entered interactively, it must be
  followed by a blank line to indicate completion (since the parser cannot
  guess when you have typed the last line). Note that each line within a basic
  block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings.
  Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.

.. rubric:: Footnotes

.. [#] Since ``**`` has higher precedence than ``-``, ``-3**2`` will be
   interpreted as ``-(3**2)`` and thus result in ``-9``. To avoid this
   and get ``9``, you can use ``(-3)**2``.

.. [#] Unlike other languages, special characters such as ``\n`` have the
   same meaning with both single (``'...'``) and double (``"..."``) quotes.
   The only difference between the two is that within single quotes you don't
   need to escape ``"`` (but you have to escape ``\'``) and vice versa.