      1 .. _tut-informal:
      2 
      3 **********************************
      4 An Informal Introduction to Python
      5 **********************************
      6 
      7 In the following examples, input and output are distinguished by the presence or
      8 absence of prompts (:term:`>>>` and :term:`...`): to repeat the example, you must type
      9 everything after the prompt, when the prompt appears; lines that do not begin
     10 with a prompt are output from the interpreter. Note that a secondary prompt on a
     11 line by itself in an example means you must type a blank line; this is used to
     12 end a multi-line command.
     13 
     14 Many of the examples in this manual, even those entered at the interactive
     15 prompt, include comments.  Comments in Python start with the hash character,
     16 ``#``, and extend to the end of the physical line.  A comment may appear at the
     17 start of a line or following whitespace or code, but not within a string
     18 literal.  A hash character within a string literal is just a hash character.
     19 Since comments are to clarify code and are not interpreted by Python, they may
     20 be omitted when typing in examples.
     21 
     22 Some examples::
     23 
     24    # this is the first comment
     25    spam = 1  # and this is the second comment
     26              # ... and now a third!
     27    text = "# This is not a comment because it's inside quotes."
     28 
     29 
     30 .. _tut-calculator:
     31 
     32 Using Python as a Calculator
     33 ============================
     34 
     35 Let's try some simple Python commands.  Start the interpreter and wait for the
     36 primary prompt, ``>>>``.  (It shouldn't take long.)
     37 
     38 
     39 .. _tut-numbers:
     40 
     41 Numbers
     42 -------
     43 
     44 The interpreter acts as a simple calculator: you can type an expression at it
     45 and it will write the value.  Expression syntax is straightforward: the
     46 operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
     47 (for example, Pascal or C); parentheses (``()``) can be used for grouping.
     48 For example::
     49 
     50    >>> 2 + 2
     51    4
     52    >>> 50 - 5*6
     53    20
     54    >>> (50 - 5.0*6) / 4
     55    5.0
     56    >>> 8 / 5.0
     57    1.6
     58 
     59 The integer numbers (e.g. ``2``, ``4``, ``20``) have type :class:`int`,
     60 the ones with a fractional part (e.g. ``5.0``, ``1.6``) have type
     61 :class:`float`.  We will see more about numeric types later in the tutorial.
     62 
     63 The return type of a division (``/``) operation depends on its operands.  If
     64 both operands are of type :class:`int`, :term:`floor division` is performed
     65 and an :class:`int` is returned.  If either operand is a :class:`float`,
     66 classic division is performed and a :class:`float` is returned.  The ``//``
     67 operator is also provided for doing floor division no matter what the
     68 operands are.  The remainder can be calculated with the ``%`` operator::
     69 
     70    >>> 17 / 3  # int / int -> int
     71    5
     72    >>> 17 / 3.0  # int / float -> float
     73    5.666666666666667
     74    >>> 17 // 3.0  # explicit floor division discards the fractional part
     75    5.0
     76    >>> 17 % 3  # the % operator returns the remainder of the division
     77    2
     78    >>> 5 * 3 + 2  # result * divisor + remainder
     79    17
     80 
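The relationship between floor division and the remainder can be checked in a
short script.  The following is a minimal sketch rather than part of the
session above; the names ``dividend`` and ``divisor`` are chosen here for
illustration, and ``//`` is used instead of ``/`` so that the result does not
depend on the operand types.

```python
# Illustrative names; any integers work as long as divisor is non-zero.
dividend, divisor = 17, 3

quotient = dividend // divisor    # floor division: 5
remainder = dividend % divisor    # remainder: 2

# The built-in divmod() returns both values as a pair.
pair = divmod(dividend, divisor)

# The identity from the example above: result * divisor + remainder.
rebuilt = quotient * divisor + remainder

# Floor division rounds toward negative infinity, not toward zero.
negative_quotient = -17 // 3      # -6
negative_remainder = -17 % 3      # 1
```

The last two lines show that the identity still holds for negative operands:
``-6 * 3 + 1`` is ``-17``.
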
     81 With Python, it is possible to use the ``**`` operator to calculate powers [#]_::
     82 
     83    >>> 5 ** 2  # 5 squared
     84    25
     85    >>> 2 ** 7  # 2 to the power of 7
     86    128
     87 
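A few further properties of ``**`` are easy to confirm.  This is a small
sketch with variable names invented for illustration; it uses only the
operator itself and the built-in :func:`pow`.

```python
square = 5 ** 2                  # 25
same_square = pow(5, 2)          # pow() is the function form of **

# ** binds more tightly than a unary minus on its left.
negated_square = -3 ** 2         # -9, parsed as -(3 ** 2)
square_of_negative = (-3) ** 2   # 9

# ** groups from right to left: 2 ** 3 ** 2 means 2 ** (3 ** 2).
tower = 2 ** 3 ** 2              # 512
```
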
     88 The equal sign (``=``) is used to assign a value to a variable. Afterwards, no
     89 result is displayed before the next interactive prompt::
     90 
     91    >>> width = 20
     92    >>> height = 5 * 9
     93    >>> width * height
     94    900
     95 
     96 If a variable is not "defined" (assigned a value), trying to use it will
     97 give you an error::
     98 
     99    >>> n  # try to access an undefined variable
    100    Traceback (most recent call last):
    101      File "<stdin>", line 1, in <module>
    102    NameError: name 'n' is not defined
    103 
    104 There is full support for floating point; operators with mixed type operands
    105 convert the integer operand to floating point::
    106 
    107    >>> 3 * 3.75 / 1.5
    108    7.5
    109    >>> 7.0 / 2
    110    3.5
    111 
    112 In interactive mode, the last printed expression is assigned to the variable
    113 ``_``.  This means that when you are using Python as a desk calculator, it is
    114 somewhat easier to continue calculations, for example::
    115 
    116    >>> tax = 12.5 / 100
    117    >>> price = 100.50
    118    >>> price * tax
    119    12.5625
    120    >>> price + _
    121    113.0625
    122    >>> round(_, 2)
    123    113.06
    124 
    125 This variable should be treated as read-only by the user.  Don't explicitly
    126 assign a value to it --- you would create an independent local variable with the
    127 same name masking the built-in variable with its magic behavior.
    128 
    129 In addition to :class:`int` and :class:`float`, Python supports other types of
    130 numbers, such as :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
    131 Python also has built-in support for :ref:`complex numbers <typesnumeric>`,
    132 and uses the ``j`` or ``J`` suffix to indicate the imaginary part
    133 (e.g. ``3+5j``).
    134 
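The additional numeric types can be tried out as follows.  This is only a
sketch to show why they exist; the variable names are illustrative, and the
modules used (:mod:`decimal` and :mod:`fractions`) are part of the standard
library.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so this comparison fails.
float_sum_is_exact = (0.1 + 0.2 == 0.3)           # False

# Decimal values built from strings keep exact decimal digits.
decimal_sum = Decimal('0.1') + Decimal('0.2')     # Decimal('0.3')

# Fraction keeps an exact numerator and denominator.
third = Fraction(1, 3)
fraction_sum = third + third + third              # Fraction(1, 1)

# Complex numbers expose their real and imaginary parts as floats.
z = 3 + 5j
parts = (z.real, z.imag)                          # (3.0, 5.0)
```
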
    135 
    136 .. _tut-strings:
    137 
    138 Strings
    139 -------
    140 
    141 Besides numbers, Python can also manipulate strings, which can be expressed
    142 in several ways.  They can be enclosed in single quotes (``'...'``) or
    143 double quotes (``"..."``) with the same result [#]_.  ``\`` can be used
    144 to escape quotes::
    145 
    146    >>> 'spam eggs'  # single quotes
    147    'spam eggs'
    148    >>> 'doesn\'t'  # use \' to escape the single quote...
    149    "doesn't"
    150    >>> "doesn't"  # ...or use double quotes instead
    151    "doesn't"
    152    >>> '"Yes," he said.'
    153    '"Yes," he said.'
    154    >>> "\"Yes,\" he said."
    155    '"Yes," he said.'
    156    >>> '"Isn\'t," she said.'
    157    '"Isn\'t," she said.'
    158 
    159 In the interactive interpreter, the output string is enclosed in quotes and
    160 special characters are escaped with backslashes.  While this might sometimes
    161 look different from the input (the enclosing quotes could change), the two
    162 strings are equivalent.  The string is enclosed in double quotes if
    163 the string contains a single quote and no double quotes, otherwise it is
    164 enclosed in single quotes.  The :keyword:`print` statement produces a more
    165 readable output, by omitting the enclosing quotes and by printing escaped
    166 and special characters::
    167 
    168    >>> '"Isn\'t," she said.'
    169    '"Isn\'t," she said.'
    170    >>> print '"Isn\'t," she said.'
    171    "Isn't," she said.
    172    >>> s = 'First line.\nSecond line.'  # \n means newline
    173    >>> s  # without print, \n is included in the output
    174    'First line.\nSecond line.'
    175    >>> print s  # with print, \n produces a new line
    176    First line.
    177    Second line.
    178 
    179 If you don't want characters prefaced by ``\`` to be interpreted as
    180 special characters, you can use *raw strings* by adding an ``r`` before
    181 the first quote::
    182 
    183    >>> print 'C:\some\name'  # here \n means newline!
    184    C:\some
    185    ame
    186    >>> print r'C:\some\name'  # note the r before the quote
    187    C:\some\name
    188 
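The difference between an ordinary and a raw literal can also be seen without
printing anything.  The sketch below uses a made-up string and illustrative
variable names; it compares lengths and line splits of the two spellings.

```python
plain = 'first\nsecond'     # \n is one newline character
raw = r'first\nsecond'      # backslash and n are two separate characters

plain_length = len(plain)   # 12
raw_length = len(raw)       # 13

# A raw literal is just another spelling; the same string can be written
# with a doubled backslash in an ordinary literal.
same_string = (raw == 'first\\nsecond')   # True

# Only the ordinary literal contains an actual line break.
plain_lines = plain.splitlines()   # ['first', 'second']
raw_lines = raw.splitlines()       # one item, the whole string
```
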
    189 String literals can span multiple lines.  One way is using triple-quotes:
    190 ``"""..."""`` or ``'''...'''``.  End of lines are automatically
    191 included in the string, but it's possible to prevent this by adding a ``\`` at
    192 the end of the line.  The following example::
    193 
    194    print """\
    195    Usage: thingy [OPTIONS]
    196         -h                        Display this usage message
    197         -H hostname               Hostname to connect to
    198    """
    199 
    200 produces the following output (note that the initial newline is not included):
    201 
    202 .. code-block:: text
    203 
    204    Usage: thingy [OPTIONS]
    205         -h                        Display this usage message
    206         -H hostname               Hostname to connect to
    207 
    208 Strings can be concatenated (glued together) with the ``+`` operator, and
    209 repeated with ``*``::
    210 
    211    >>> # 3 times 'un', followed by 'ium'
    212    >>> 3 * 'un' + 'ium'
    213    'unununium'
    214 
    215 Two or more *string literals* (i.e. the ones enclosed between quotes) next
    216 to each other are automatically concatenated. ::
    217 
    218    >>> 'Py' 'thon'
    219    'Python'
    220 
    221 This only works with two literals though, not with variables or expressions::
    222 
    223    >>> prefix = 'Py'
    224    >>> prefix 'thon'  # can't concatenate a variable and a string literal
    225      ...
    226    SyntaxError: invalid syntax
    227    >>> ('un' * 3) 'ium'
    228      ...
    229    SyntaxError: invalid syntax
    230 
    231 If you want to concatenate variables or a variable and a literal, use ``+``::
    232 
    233    >>> prefix + 'thon'
    234    'Python'
    235 
    236 This feature is particularly useful when you want to break long strings::
    237 
    238    >>> text = ('Put several strings within parentheses '
    239    ...         'to have them joined together.')
    240    >>> text
    241    'Put several strings within parentheses to have them joined together.'
    242 
    243 Strings can be *indexed* (subscripted), with the first character having index 0.
    244 There is no separate character type; a character is simply a string of size
    245 one::
    246 
    247    >>> word = 'Python'
    248    >>> word[0]  # character in position 0
    249    'P'
    250    >>> word[5]  # character in position 5
    251    'n'
    252 
    253 Indices may also be negative numbers, to start counting from the right::
    254 
    255    >>> word[-1]  # last character
    256    'n'
    257    >>> word[-2]  # second-last character
    258    'o'
    259    >>> word[-6]
    260    'P'
    261 
    262 Note that since -0 is the same as 0, negative indices start from -1.
    263 
    264 In addition to indexing, *slicing* is also supported.  While indexing is used
    265 to obtain individual characters, *slicing* allows you to obtain a substring::
    266 
    267    >>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
    268    'Py'
    269    >>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
    270    'tho'
    271 
    272 Note how the start is always included, and the end always excluded.  This
    273 makes sure that ``s[:i] + s[i:]`` is always equal to ``s``::
    274 
    275    >>> word[:2] + word[2:]
    276    'Python'
    277    >>> word[:4] + word[4:]
    278    'Python'
    279 
    280 Slice indices have useful defaults; an omitted first index defaults to zero, an
    281 omitted second index defaults to the size of the string being sliced. ::
    282 
   >>> word[:2]   # characters from the beginning to position 2 (excluded)
   'Py'
   >>> word[4:]   # characters from position 4 (included) to the end
   'on'
   >>> word[-2:]  # characters from the second-last (included) to the end
   'on'
    289 
    290 One way to remember how slices work is to think of the indices as pointing
    291 *between* characters, with the left edge of the first character numbered 0.
    292 Then the right edge of the last character of a string of *n* characters has
    293 index *n*, for example::
    294 
    295     +---+---+---+---+---+---+
    296     | P | y | t | h | o | n |
    297     +---+---+---+---+---+---+
    298     0   1   2   3   4   5   6
    299    -6  -5  -4  -3  -2  -1
    300 
    301 The first row of numbers gives the position of the indices 0...6 in the string;
    302 the second row gives the corresponding negative indices. The slice from *i* to
    303 *j* consists of all characters between the edges labeled *i* and *j*,
    304 respectively.
    305 
    306 For non-negative indices, the length of a slice is the difference of the
    307 indices, if both are within bounds.  For example, the length of ``word[1:3]`` is
    308 2.
    309 
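The slicing rules described so far can be checked mechanically.  The following
sketch reuses ``word`` from the examples; the other names are illustrative.

```python
word = 'Python'

# s[:i] + s[i:] gives back s for every i, even far outside the string.
always_rebuilds = all(word[:i] + word[i:] == word for i in range(-10, 11))

# For in-bounds, non-negative indices the slice length is the difference.
slice_length = len(word[1:3])                        # 2

# A negative index i refers to the same position as len(word) + i.
same_character = (word[-2] == word[len(word) - 2])   # True, both are 'o'
```
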
    310 Attempting to use an index that is too large will result in an error::
    311 
    312    >>> word[42]  # the word only has 6 characters
    313    Traceback (most recent call last):
    314      File "<stdin>", line 1, in <module>
    315    IndexError: string index out of range
    316 
However, out-of-range indices are handled gracefully when used for
slicing::
    319 
    320    >>> word[4:42]
    321    'on'
    322    >>> word[42:]
    323    ''
    324 
    325 Python strings cannot be changed --- they are :term:`immutable`.
    326 Therefore, assigning to an indexed position in the string results in an error::
    327 
    328    >>> word[0] = 'J'
    329      ...
    330    TypeError: 'str' object does not support item assignment
    331    >>> word[2:] = 'py'
    332      ...
    333    TypeError: 'str' object does not support item assignment
    334 
    335 If you need a different string, you should create a new one::
    336 
    337    >>> 'J' + word[1:]
    338    'Jython'
    339    >>> word[:2] + 'py'
    340    'Pypy'
    341 
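In a script, the failed assignment can be caught and the original string shown
to be unchanged.  This is a minimal sketch with illustrative variable names;
:meth:`str.replace` is used as one more way of deriving a new string.

```python
word = 'Python'

# Item assignment on a string raises TypeError.
try:
    word[0] = 'J'
except TypeError:
    assignment_failed = True
else:
    assignment_failed = False

# Building a new string leaves the original untouched.
new_word = 'J' + word[1:]            # 'Jython'

# str.replace() also returns a new string rather than changing word.
replaced = word.replace('P', 'J')    # 'Jython'
```
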
    342 The built-in function :func:`len` returns the length of a string::
    343 
    344    >>> s = 'supercalifragilisticexpialidocious'
    345    >>> len(s)
    346    34
    347 
    348 
    349 .. seealso::
    350 
    351    :ref:`typesseq`
    352       Strings, and the Unicode strings described in the next section, are
    353       examples of *sequence types*, and support the common operations supported
    354       by such types.
    355 
    356    :ref:`string-methods`
    357       Both strings and Unicode strings support a large number of methods for
    358       basic transformations and searching.
    359 
    360    :ref:`formatstrings`
    361       Information about string formatting with :meth:`str.format`.
    362 
    363    :ref:`string-formatting`
    364       The old formatting operations invoked when strings and Unicode strings are
    365       the left operand of the ``%`` operator are described in more detail here.
    366 
    367 
    368 .. _tut-unicodestrings:
    369 
    370 Unicode Strings
    371 ---------------
    372 
.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>
    374 
    375 
    376 Starting with Python 2.0 a new data type for storing text data is available to
    377 the programmer: the Unicode object. It can be used to store and manipulate
    378 Unicode data (see http://www.unicode.org/) and integrates well with the existing
    379 string objects, providing auto-conversions where necessary.
    380 
Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software.  Unicode solves
these problems by defining one code page for all scripts.
    388 
    389 Creating Unicode strings in Python is just as simple as creating normal
    390 strings::
    391 
    392    >>> u'Hello World !'
    393    u'Hello World !'
    394 
    395 The small ``'u'`` in front of the quote indicates that a Unicode string is
    396 supposed to be created. If you want to include special characters in the string,
    397 you can do so by using the Python *Unicode-Escape* encoding. The following
    398 example shows how::
    399 
    400    >>> u'Hello\u0020World !'
    401    u'Hello World !'
    402 
The escape sequence ``\u0020`` tells Python to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.
    405 
    406 Other characters are interpreted by using their respective ordinal values
    407 directly as Unicode ordinals.  If you have literal strings in the standard
    408 Latin-1 encoding that is used in many Western countries, you will find it
    409 convenient that the lower 256 characters of Unicode are the same as the 256
    410 characters of Latin-1.
    411 
For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with ``ur`` to have Python use the
*Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
``u``. ::
    417 
    418    >>> ur'Hello\u0020World !'
    419    u'Hello World !'
    420    >>> ur'Hello\\u0020World !'
    421    u'Hello\\\\u0020World !'
    422 
    423 The raw mode is most useful when you have to enter lots of backslashes, as can
    424 be necessary in regular expressions.
    425 
    426 Apart from these standard encodings, Python provides a whole set of other ways
    427 of creating Unicode strings on the basis of a known encoding.
    428 
    429 .. index:: builtin: unicode
    430 
The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the better-known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::
    439 
   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
    450 
    451 To convert a Unicode string into an 8-bit string using a specific encoding,
    452 Unicode objects provide an :func:`encode` method that takes one argument, the
    453 name of the encoding.  Lowercase names for encodings are preferred. ::
    454 
   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'
    457 
    458 If you have data in a specific encoding and want to produce a corresponding
    459 Unicode string from it, you can use the :func:`unicode` function with the
    460 encoding name as the second argument. ::
    461 
    462    >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
    463    u'\xe4\xf6\xfc'
    464 
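Encoding and decoding are inverse operations, which the following sketch
checks as a round trip.  The variable names are illustrative.  Note one
deliberate substitution: the sketch calls the ``decode`` method of the encoded
string instead of :func:`unicode`, so that it also runs on interpreters that
lack the :func:`unicode` built-in.

```python
# The three characters from the examples above, written with escapes so
# that the source file itself stays pure ASCII.
text = u'\xe4\xf6\xfc'

utf8_bytes = text.encode('utf-8')        # two bytes per character here
latin1_bytes = text.encode('latin-1')    # one byte per character

# decode() reverses encode() when the same encoding name is used.
round_trip = utf8_bytes.decode('utf-8')

# ASCII cannot represent these characters, so encoding fails.
try:
    text.encode('ascii')
except UnicodeEncodeError:
    ascii_failed = True
else:
    ascii_failed = False
```
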
    465 
    466 .. _tut-lists:
    467 
    468 Lists
    469 -----
    470 
    471 Python knows a number of *compound* data types, used to group together other
    472 values.  The most versatile is the *list*, which can be written as a list of
    473 comma-separated values (items) between square brackets.  Lists might contain
    474 items of different types, but usually the items all have the same type. ::
    475 
    476    >>> squares = [1, 4, 9, 16, 25]
    477    >>> squares
    478    [1, 4, 9, 16, 25]
    479 
Like strings (and all other built-in :term:`sequence` types), lists can be
indexed and sliced::
    482 
    483    >>> squares[0]  # indexing returns the item
    484    1
    485    >>> squares[-1]
    486    25
    487    >>> squares[-3:]  # slicing returns a new list
    488    [9, 16, 25]
    489 
    490 All slice operations return a new list containing the requested elements.  This
    491 means that the following slice returns a new (shallow) copy of the list::
    492 
    493    >>> squares[:]
    494    [1, 4, 9, 16, 25]
    495 
Lists also support operations like concatenation::
    497 
    498    >>> squares + [36, 49, 64, 81, 100]
    499    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
    500 
    501 Unlike strings, which are :term:`immutable`, lists are a :term:`mutable`
    502 type, i.e. it is possible to change their content::
    503 
   >>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
   >>> 4 ** 3  # the cube of 4 is 64, not 65!
   64
   >>> cubes[3] = 64  # replace the wrong value
   >>> cubes
   [1, 8, 27, 64, 125]
    510 
    511 You can also add new items at the end of the list, by using
    512 the :meth:`~list.append` *method* (we will see more about methods later)::
    513 
    514    >>> cubes.append(216)  # add the cube of 6
    515    >>> cubes.append(7 ** 3)  # and the cube of 7
    516    >>> cubes
    517    [1, 8, 27, 64, 125, 216, 343]
    518 
    519 Assignment to slices is also possible, and this can even change the size of the
    520 list or clear it entirely::
    521 
    522    >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    523    >>> letters
    524    ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    525    >>> # replace some values
    526    >>> letters[2:5] = ['C', 'D', 'E']
    527    >>> letters
    528    ['a', 'b', 'C', 'D', 'E', 'f', 'g']
    529    >>> # now remove them
    530    >>> letters[2:5] = []
    531    >>> letters
    532    ['a', 'b', 'f', 'g']
    533    >>> # clear the list by replacing all the elements with an empty list
    534    >>> letters[:] = []
    535    >>> letters
    536    []
    537 
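Slice assignment does not require the replacement to have the same length as
the slice it replaces.  The sketch below, with illustrative names, records the
list after each step so the effect on its size is visible.

```python
letters = ['a', 'b', 'c', 'd', 'e']

# Replace two items with three: the list grows by one.
letters[1:3] = ['X', 'Y', 'Z']
after_replace = list(letters)    # ['a', 'X', 'Y', 'Z', 'd', 'e']

# An empty slice inserts items without removing anything.
letters[1:1] = ['new']
after_insert = list(letters)     # ['a', 'new', 'X', 'Y', 'Z', 'd', 'e']

# Assigning an empty list removes the selected items.
letters[-2:] = []
after_delete = list(letters)     # ['a', 'new', 'X', 'Y', 'Z']
```
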
    538 The built-in function :func:`len` also applies to lists::
    539 
    540    >>> letters = ['a', 'b', 'c', 'd']
    541    >>> len(letters)
    542    4
    543 
    544 It is possible to nest lists (create lists containing other lists), for
    545 example::
    546 
    547    >>> a = ['a', 'b', 'c']
    548    >>> n = [1, 2, 3]
    549    >>> x = [a, n]
    550    >>> x
    551    [['a', 'b', 'c'], [1, 2, 3]]
    552    >>> x[0]
    553    ['a', 'b', 'c']
    554    >>> x[0][1]
    555    'b'
    556 
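A nested list stores references to its inner lists, not copies of them.  The
following sketch reuses ``a``, ``n`` and ``x`` from the example; the remaining
names are illustrative.  It also shows what "shallow" means for a slice copy.

```python
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]

# x refers to the same list object as a, so the change is visible via x.
a.append('d')
seen_through_x = x[0]                 # ['a', 'b', 'c', 'd']

# A slice copy is shallow: the outer list is new, the inner lists are shared.
y = x[:]
y[1][0] = 'changed'
inner_shared = (n[0] == 'changed')    # True

# Replacing an item of the copy does not affect the original outer list.
y[0] = []
outer_independent = (x[0] is a)       # True
```
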
    557 .. _tut-firststeps:
    558 
    559 First Steps Towards Programming
    560 ===============================
    561 
    562 Of course, we can use Python for more complicated tasks than adding two and two
    563 together.  For instance, we can write an initial sub-sequence of the *Fibonacci*
    564 series as follows::
    565 
    566    >>> # Fibonacci series:
    567    ... # the sum of two elements defines the next
    568    ... a, b = 0, 1
    569    >>> while b < 10:
    570    ...     print b
    571    ...     a, b = b, a+b
    572    ...
    573    1
    574    1
    575    2
    576    3
    577    5
    578    8
    579 
    580 This example introduces several new features.
    581 
* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1.  On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place.  The right-hand side expressions
  are evaluated from the left to the right.
    587 
* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true.  In Python, as in C, any non-zero integer value is true; zero is
  false.  The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false.  The test
  used in the example is a simple comparison.  The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).
    596 
    597 * The *body* of the loop is *indented*: indentation is Python's way of grouping
    598   statements.  At the interactive prompt, you have to type a tab or space(s) for
    599   each indented line.  In practice you will prepare more complicated input
    600   for Python with a text editor; all decent text editors have an auto-indent
    601   facility.  When a compound statement is entered interactively, it must be
    602   followed by a blank line to indicate completion (since the parser cannot
    603   guess when you have typed the last line).  Note that each line within a basic
    604   block must be indented by the same amount.
    605 
    606 * The :keyword:`print` statement writes the value of the expression(s) it is
    607   given.  It differs from just writing the expression you want to write (as we did
    608   earlier in the calculator examples) in the way it handles multiple expressions
    609   and strings.  Strings are printed without quotes, and a space is inserted
    610   between items, so you can format things nicely, like this::
    611 
    612      >>> i = 256*256
    613      >>> print 'The value of i is', i
    614      The value of i is 65536
    615 
    616   A trailing comma avoids the newline after the output::
    617 
    618      >>> a, b = 0, 1
    619      >>> while b < 1000:
    620      ...     print b,
    621      ...     a, b = b, a+b
    622      ...
    623      1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
    624 
    625   Note that the interpreter inserts a newline before it prints the next prompt if
    626   the last line was not completed.
    627 
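The points about multiple assignment and truth values can be confirmed in a
script that collects results instead of printing them.  This is a sketch, not
part of the session above; ``terms``, ``left``, ``right`` and ``truth_values``
are names invented for illustration.

```python
# Collect the Fibonacci numbers below 10 in a list.
terms = []
a, b = 0, 1
while b < 10:
    terms.append(b)
    # Both right-hand expressions are evaluated before either name is
    # rebound, so b still has its old value when a + b is computed.
    a, b = b, a + b

# The same rule makes swapping two variables a single statement.
left, right = 'L', 'R'
left, right = right, left

# Empty sequences are false; non-empty ones are true, whatever they contain.
truth_values = [bool(''), bool('x'), bool([]), bool([0])]
```

Writing the update as two separate statements, ``a = b`` followed by
``b = a + b``, would give a different series, because ``a`` would already hold
the new value when the sum is computed.
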
    628 .. rubric:: Footnotes
    629 
    630 .. [#] Since ``**`` has higher precedence than ``-``, ``-3**2`` will be
    631    interpreted as ``-(3**2)`` and thus result in ``-9``.  To avoid this
    632    and get ``9``, you can use ``(-3)**2``.
    633 
    634 .. [#] Unlike other languages, special characters such as ``\n`` have the
    635    same meaning with both single (``'...'``) and double (``"..."``) quotes.
    636    The only difference between the two is that within single quotes you don't
    637    need to escape ``"`` (but you have to escape ``\'``) and vice versa.
    638