.. _tut-modules:

*******
Modules
*******

If you quit from the Python interpreter and enter it again, the definitions you
have made (functions and variables) are lost. Therefore, if you want to write a
somewhat longer program, you are better off using a text editor to prepare the
input for the interpreter and running it with that file as input instead.  This
is known as creating a *script*.  As your program gets longer, you may want to
split it into several files for easier maintenance.  You may also want to use a
handy function that you've written in several programs without copying its
definition into each program.

To support this, Python has a way to put definitions in a file and use them in a
script or in an interactive instance of the interpreter. Such a file is called a
*module*; definitions from a module can be *imported* into other modules or into
the *main* module (the collection of variables that you have access to in a
script executed at the top level and in calculator mode).

A module is a file containing Python definitions and statements.  The file name
is the module name with the suffix :file:`.py` appended.  Within a module, the
module's name (as a string) is available as the value of the global variable
``__name__``.  For instance, use your favorite text editor to create a file
called :file:`fibo.py` in the current directory with the following contents::

   # Fibonacci numbers module

   def fib(n):    # write Fibonacci series up to n
       a, b = 0, 1
       while a < n:
           print(a, end=' ')
           a, b = b, a+b
       print()

   def fib2(n):   # return Fibonacci series up to n
       result = []
       a, b = 0, 1
       while a < n:
           result.append(a)
           a, b = b, a+b
       return result

Now enter the Python interpreter and import this module with the following
command::

   >>> import fibo

This does not enter the names of the functions defined in ``fibo`` directly in
the current symbol table; it only enters the module name ``fibo`` there. Using
the module name you can access the functions::

   >>> fibo.fib(1000)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
   >>> fibo.fib2(100)
   [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
   >>> fibo.__name__
   'fibo'

If you intend to use a function often you can assign it to a local name::

   >>> fib = fibo.fib
   >>> fib(500)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377


.. _tut-moremodules:

More on Modules
===============

A module can contain executable statements as well as function definitions.
These statements are intended to initialize the module. They are executed only
the *first* time the module name is encountered in an import statement. [#]_
(They are also run if the file is executed as a script.)

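The run-once behaviour can be checked with a small experiment.  The sketch
below is illustrative only: it writes a throwaway module (the name
``init_once_demo`` is made up for this example) to a temporary directory,
imports it twice with :func:`importlib.import_module`, and counts how often the
module's top-level ``print`` call runs.

.. code-block:: python

   import contextlib
   import importlib
   import io
   import sys
   import tempfile
   from pathlib import Path

   MODULE_NAME = "init_once_demo"  # hypothetical name, used only in this sketch

   with tempfile.TemporaryDirectory() as tmp:
       # The module prints a message whenever its top-level statements execute.
       Path(tmp, MODULE_NAME + ".py").write_text(
           'print("initializing")\nVALUE = 42\n'
       )
       sys.path.insert(0, tmp)
       sys.modules.pop(MODULE_NAME, None)  # make sure we start from a clean state
       importlib.invalidate_caches()
       captured = io.StringIO()
       try:
           with contextlib.redirect_stdout(captured):
               first = importlib.import_module(MODULE_NAME)   # executes the module
               second = importlib.import_module(MODULE_NAME)  # reuses sys.modules
       finally:
           sys.path.remove(tmp)

   init_runs = captured.getvalue().count("initializing")
   same_object = first is second
   print(init_runs, same_object)  # prints: 1 True

The second import finds the module already recorded in ``sys.modules`` and
returns that same object, so the initialization message appears only once.
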
Each module has its own private symbol table, which is used as the global symbol
table by all functions defined in the module. Thus, the author of a module can
use global variables in the module without worrying about accidental clashes
with a user's global variables. On the other hand, if you know what you are
doing you can touch a module's global variables with the same notation used to
refer to its functions, ``modname.itemname``.

Modules can import other modules.  It is customary but not required to place all
:keyword:`import` statements at the beginning of a module (or script, for that
matter).  The imported module names are placed in the importing module's global
symbol table.

There is a variant of the :keyword:`import` statement that imports names from a
module directly into the importing module's symbol table.  For example::

   >>> from fibo import fib, fib2
   >>> fib(500)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

This does not introduce the module name from which the imports are taken in the
local symbol table (so in the example, ``fibo`` is not defined).

There is even a variant to import all names that a module defines::

   >>> from fibo import *
   >>> fib(500)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

This imports all names except those beginning with an underscore (``_``).
In most cases Python programmers do not use this facility since it introduces
an unknown set of names into the interpreter, possibly hiding some things
you have already defined.

Note that in general the practice of importing ``*`` from a module or package is
frowned upon, since it often causes poorly readable code. However, it is okay to
use it to save typing in interactive sessions.

If the module name is followed by :keyword:`!as`, then the name
following :keyword:`!as` is bound directly to the imported module.

::

   >>> import fibo as fib
   >>> fib.fib(500)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

This imports the module in the same way that ``import fibo`` does; the only
difference is that the module is available under the name ``fib``.

The :keyword:`!as` clause can also be used with :keyword:`from`, with a similar
effect::

   >>> from fibo import fib as fibonacci
   >>> fibonacci(500)
   0 1 1 2 3 5 8 13 21 34 55 89 144 233 377


.. note::

   For efficiency reasons, each module is only imported once per interpreter
   session.  Therefore, if you change your modules, you must restart the
   interpreter -- or, if it's just one module you want to test interactively,
   use :func:`importlib.reload`, e.g. ``import importlib;
   importlib.reload(modulename)``.


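As a rough illustration of the note above, the following sketch edits a
throwaway module after it has been imported and then reloads it.  The module
name ``reload_demo`` and its contents are invented for this example; only
:func:`importlib.import_module` and :func:`importlib.reload` are real library
calls.

.. code-block:: python

   import importlib
   import sys
   import tempfile
   from pathlib import Path

   MODULE_NAME = "reload_demo"  # hypothetical name, used only in this sketch

   with tempfile.TemporaryDirectory() as tmp:
       source = Path(tmp, MODULE_NAME + ".py")
       source.write_text("VALUE = 1\n")
       sys.path.insert(0, tmp)
       sys.modules.pop(MODULE_NAME, None)
       importlib.invalidate_caches()
       try:
           module = importlib.import_module(MODULE_NAME)
           before = module.VALUE

           # Edit the source.  The new text has a different length, so the
           # cached bytecode is treated as stale even if the timestamps match.
           source.write_text("VALUE = 'edited'\n")

           # A second import does not notice the edit: it reuses sys.modules.
           still_cached = importlib.import_module(MODULE_NAME).VALUE

           # reload() re-executes the source in the existing module object.
           importlib.reload(module)
           after = module.VALUE
       finally:
           sys.path.remove(tmp)

   print(before, still_cached, after)  # prints: 1 1 edited

Because :func:`importlib.reload` updates the existing module object in place,
names that were copied out earlier with ``from module import name`` keep
referring to the old objects.
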
.. _tut-modulesasscripts:

Executing modules as scripts
----------------------------

When you run a Python module with ::

   python fibo.py <arguments>

the code in the module will be executed, just as if you imported it, but with
the ``__name__`` set to ``"__main__"``.  That means that by adding this code at
the end of your module::

   if __name__ == "__main__":
       import sys
       fib(int(sys.argv[1]))

you can make the file usable as a script as well as an importable module,
because the code that parses the command line only runs if the module is
executed as the "main" file:

.. code-block:: shell-session

   $ python fibo.py 50
   0 1 1 2 3 5 8 13 21 34

If the module is imported, the code is not run::

   >>> import fibo
   >>>

This is often used either to provide a convenient user interface to a module, or
for testing purposes (running the module as a script executes a test suite).


.. _tut-searchpath:

The Module Search Path
----------------------

.. index:: triple: module; search; path

When a module named :mod:`spam` is imported, the interpreter first searches for
a built-in module with that name. If not found, it then searches for a file
named :file:`spam.py` in a list of directories given by the variable
:data:`sys.path`.  :data:`sys.path` is initialized from these locations:

* The directory containing the input script (or the current directory when no
  file is specified).
* :envvar:`PYTHONPATH` (a list of directory names, with the same syntax as the
  shell variable :envvar:`PATH`).
* The installation-dependent default.

.. note::
   On file systems which support symlinks, the directory containing the input
   script is calculated after the symlink is followed. In other words the
   directory containing the symlink is **not** added to the module search path.

After initialization, Python programs can modify :data:`sys.path`.  The
directory containing the script being run is placed at the beginning of the
search path, ahead of the standard library path. This means that scripts in that
directory will be loaded instead of modules of the same name in the library
directory. This is an error unless the replacement is intended.  See section
:ref:`tut-standardmodules` for more information.

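To see where a particular module would come from, the search can be queried
without importing anything.  The sketch below is one possible approach, not
part of the tutorial's own material: ``locate`` is a hypothetical helper built
on :func:`importlib.util.find_spec`, and it is only meant for top-level module
names (for dotted names, ``find_spec`` imports the parent package first).

.. code-block:: python

   import importlib.util
   import sys

   def locate(module_name):
       """Report where a top-level module would be loaded from."""
       spec = importlib.util.find_spec(module_name)
       return None if spec is None else spec.origin

   # The first entries of sys.path are searched first.
   for entry in sys.path[:3]:
       print("search path entry:", repr(entry))

   builtin_origin = locate("sys")    # 'built-in': found before sys.path is used
   library_origin = locate("json")   # a file path inside the standard library
   missing_origin = locate("no_such_module_for_this_demo")  # None: not found

   print(builtin_origin, library_origin, missing_origin)

Calling ``locate`` with the name of one of your own scripts is a quick way to
check whether it shadows a standard library module of the same name.
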
.. %
    Do we need stuff on zip files etc. ? DUBOIS

"Compiled" Python files
-----------------------

To speed up loading modules, Python caches the compiled version of each module
in the ``__pycache__`` directory under the name :file:`module.{version}.pyc`,
where the version encodes the format of the compiled file; it generally contains
the Python version number.  For example, in CPython release 3.3 the compiled
version of :file:`spam.py` would be cached as
``__pycache__/spam.cpython-33.pyc``.  This naming convention allows compiled
modules from different releases and different versions of Python to coexist.

Python checks the modification date of the source against the compiled version
to see if it's out of date and needs to be recompiled.  This is a completely
automatic process.  Also, the compiled modules are platform-independent, so the
same library can be shared among systems with different architectures.

Python does not check the cache in two circumstances.  First, it always
recompiles and does not store the result for the module that's loaded directly
from the command line.  Second, it does not check the cache if there is no
source module.  To support a non-source (compiled only) distribution, the
compiled module must be in the source directory, and there must not be a source
module.

Some tips for experts:

* You can use the :option:`-O` or :option:`-OO` switches on the Python command
  to reduce the size of a compiled module.  The ``-O`` switch removes assert
  statements; the ``-OO`` switch removes both assert statements and ``__doc__``
  strings.  Since some programs may rely on having these available, you should
  only use this option if you know what you're doing.  "Optimized" modules have
  an ``opt-`` tag and are usually smaller.  Future releases may
  change the effects of optimization.

* A program doesn't run any faster when it is read from a ``.pyc``
  file than when it is read from a ``.py`` file; the only thing that's faster
  about ``.pyc`` files is the speed with which they are loaded.

* The module :mod:`compileall` can create ``.pyc`` files for all modules in a
  directory.

* There is more detail on this process, including a flow chart of the
  decisions, in :pep:`3147`.


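The naming scheme described in this section can be inspected from Python
itself.  The following is a minimal sketch, assuming a throwaway
:file:`spam.py` written to a temporary directory; it uses
:func:`importlib.util.cache_from_source` to compute the cache file name and
:func:`py_compile.compile` to write the file explicitly rather than waiting
for an import to do it.

.. code-block:: python

   import importlib.util
   import py_compile
   import sys
   import tempfile
   from pathlib import Path

   with tempfile.TemporaryDirectory() as tmp:
       source = Path(tmp, "spam.py")
       source.write_text("ANSWER = 42\n")

       # Where the import system expects the cached bytecode for this file.
       expected = importlib.util.cache_from_source(str(source))

       # Compile explicitly; py_compile returns the path it wrote to.
       written = py_compile.compile(str(source), doraise=True)
       cache_exists = Path(written).is_file()

   # Only the names are computed here; no optimized file is created.
   plain_name = Path(expected).name
   optimized_name = Path(
       importlib.util.cache_from_source("spam.py", optimization=1)
   ).name

   print(sys.implementation.cache_tag)  # the version tag, e.g. cpython-312
   print(plain_name)                    # e.g. spam.cpython-312.pyc
   print(optimized_name)                # e.g. spam.cpython-312.opt-1.pyc

The exact tag depends on the interpreter that runs the code, which is what lets
cache files from several Python versions sit side by side in the same
``__pycache__`` directory.
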
.. _tut-standardmodules:

Standard Modules
================

.. index:: module: sys

Python comes with a library of standard modules, described in a separate
document, the Python Library Reference ("Library Reference" hereafter).  Some
modules are built into the interpreter; these provide access to operations that
are not part of the core of the language but are nevertheless built in, either
for efficiency or to provide access to operating system primitives such as
system calls.  The set of such modules is a configuration option which also
depends on the underlying platform.  For example, the :mod:`winreg` module is only
provided on Windows systems. One particular module deserves some attention:
:mod:`sys`, which is built into every Python interpreter.  The variables
``sys.ps1`` and ``sys.ps2`` define the strings used as primary and secondary
prompts::

   >>> import sys
   >>> sys.ps1
   '>>> '
   >>> sys.ps2
   '... '
   >>> sys.ps1 = 'C> '
   C> print('Yuck!')
   Yuck!
   C>


These two variables are only defined if the interpreter is in interactive mode.

The variable ``sys.path`` is a list of strings that determines the interpreter's
search path for modules. It is initialized to a default path taken from the
environment variable :envvar:`PYTHONPATH`, or from a built-in default if
:envvar:`PYTHONPATH` is not set.  You can modify it using standard list
operations::

   >>> import sys
   >>> sys.path.append('/ufs/guido/lib/python')


.. _tut-dir:

The :func:`dir` Function
========================

The built-in function :func:`dir` is used to find out which names a module
defines.  It returns a sorted list of strings::

   >>> import fibo, sys
   >>> dir(fibo)
   ['__name__', 'fib', 'fib2']
   >>> dir(sys)  # doctest: +NORMALIZE_WHITESPACE
   ['__displayhook__', '__doc__', '__excepthook__', '__loader__', '__name__',
    '__package__', '__stderr__', '__stdin__', '__stdout__',
    '_clear_type_cache', '_current_frames', '_debugmallocstats', '_getframe',
    '_home', '_mercurial', '_xoptions', 'abiflags', 'api_version', 'argv',
    'base_exec_prefix', 'base_prefix', 'builtin_module_names', 'byteorder',
    'call_tracing', 'callstats', 'copyright', 'displayhook',
    'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix',
    'executable', 'exit', 'flags', 'float_info', 'float_repr_style',
    'getcheckinterval', 'getdefaultencoding', 'getdlopenflags',
    'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit',
    'getrefcount', 'getsizeof', 'getswitchinterval', 'gettotalrefcount',
    'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info',
    'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
    'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1',
    'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit',
    'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout',
    'thread_info', 'version', 'version_info', 'warnoptions']

Without arguments, :func:`dir` lists the names you have defined currently::

   >>> a = [1, 2, 3, 4, 5]
   >>> import fibo
   >>> fib = fibo.fib
   >>> dir()
   ['__builtins__', '__name__', 'a', 'fib', 'fibo', 'sys']

Note that it lists all types of names: variables, modules, functions, etc.

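Since :func:`dir` returns an ordinary list of strings, it can be filtered like
any other list.  As a small illustration, the hypothetical helper
``public_names`` below keeps only the names that do not start with an
underscore, which is often what you want when exploring an unfamiliar module.

.. code-block:: python

   import math

   def public_names(obj):
       """Return the names reported by dir() that have no leading underscore."""
       return [name for name in dir(obj) if not name.startswith("_")]

   names = public_names(math)
   print(names[:5])        # the first few public names, in sorted order
   print("sqrt" in names)  # prints: True

The same helper works on any object, not just modules, because :func:`dir`
accepts arbitrary objects.
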
.. index:: module: builtins

:func:`dir` does not list the names of built-in functions and variables.  If you
want a list of those, they are defined in the standard module
:mod:`builtins`::

   >>> import builtins
   >>> dir(builtins)  # doctest: +NORMALIZE_WHITESPACE
   ['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException',
    'BlockingIOError', 'BrokenPipeError', 'BufferError', 'BytesWarning',
    'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError',
    'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning',
    'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False',
    'FileExistsError', 'FileNotFoundError', 'FloatingPointError',
    'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError',
    'ImportWarning', 'IndentationError', 'IndexError', 'InterruptedError',
    'IsADirectoryError', 'KeyError', 'KeyboardInterrupt', 'LookupError',
    'MemoryError', 'NameError', 'None', 'NotADirectoryError', 'NotImplemented',
    'NotImplementedError', 'OSError', 'OverflowError',
    'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError',
    'ReferenceError', 'ResourceWarning', 'RuntimeError', 'RuntimeWarning',
    'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError',
    'SystemExit', 'TabError', 'TimeoutError', 'True', 'TypeError',
    'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError',
    'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning',
    'ValueError', 'Warning', 'ZeroDivisionError', '_', '__build_class__',
    '__debug__', '__doc__', '__import__', '__name__', '__package__', 'abs',
    'all', 'any', 'ascii', 'bin', 'bool', 'bytearray', 'bytes', 'callable',
    'chr', 'classmethod', 'compile', 'complex', 'copyright', 'credits',
    'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'exec', 'exit',
    'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr',
    'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass',
    'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview',
    'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property',
    'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice',
    'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars',
    'zip']

.. _tut-packages:

Packages
========

Packages are a way of structuring Python's module namespace by using "dotted
module names".  For example, the module name :mod:`A.B` designates a submodule
named ``B`` in a package named ``A``.  Just like the use of modules saves the
authors of different modules from having to worry about each other's global
variable names, the use of dotted module names saves the authors of multi-module
packages like NumPy or Pillow from having to worry about
each other's module names.

Suppose you want to design a collection of modules (a "package") for the uniform
handling of sound files and sound data.  There are many different sound file
formats (usually recognized by their extension, for example: :file:`.wav`,
:file:`.aiff`, :file:`.au`), so you may need to create and maintain a growing
collection of modules for the conversion between the various file formats.
There are also many different operations you might want to perform on sound data
(such as mixing, adding echo, applying an equalizer function, creating an
artificial stereo effect), so in addition you will be writing a never-ending
stream of modules to perform these operations.  Here's a possible structure for
your package (expressed in terms of a hierarchical filesystem):

.. code-block:: text

   sound/                          Top-level package
         __init__.py               Initialize the sound package
         formats/                  Subpackage for file format conversions
                 __init__.py
                 wavread.py
                 wavwrite.py
                 aiffread.py
                 aiffwrite.py
                 auread.py
                 auwrite.py
                 ...
         effects/                  Subpackage for sound effects
                 __init__.py
                 echo.py
                 surround.py
                 reverse.py
                 ...
         filters/                  Subpackage for filters
                 __init__.py
                 equalizer.py
                 vocoder.py
                 karaoke.py
                 ...

When importing the package, Python searches through the directories on
``sys.path`` looking for the package subdirectory.

The :file:`__init__.py` files are required to make Python treat the directories
as containing packages; this is done to prevent directories with a common name,
such as ``string``, from unintentionally hiding valid modules that occur later
on the module search path. In the simplest case, :file:`__init__.py` can just be
an empty file, but it can also execute initialization code for the package or
set the ``__all__`` variable, described later.

Users of the package can import individual modules from the package, for
example::

   import sound.effects.echo

This loads the submodule :mod:`sound.effects.echo`.  It must be referenced with
its full name. ::

   sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)

An alternative way of importing the submodule is::

   from sound.effects import echo

This also loads the submodule :mod:`echo`, and makes it available without its
package prefix, so it can be used as follows::

   echo.echofilter(input, output, delay=0.7, atten=4)

Yet another variation is to import the desired function or variable directly::

   from sound.effects.echo import echofilter

Again, this loads the submodule :mod:`echo`, but this makes its function
:func:`echofilter` directly available::

   echofilter(input, output, delay=0.7, atten=4)

Note that when using ``from package import item``, the item can be either a
submodule (or subpackage) of the package, or some other name defined in the
package, like a function, class or variable.  The ``import`` statement first
tests whether the item is defined in the package; if not, it assumes it is a
module and attempts to load it.  If it fails to find it, an :exc:`ImportError`
exception is raised.

Contrarily, when using syntax like ``import item.subitem.subsubitem``, each item
except for the last must be a package; the last item can be a module or a
package but can't be a class or function or variable defined in the previous
item.


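The three import forms and the restriction on plain ``import`` can be tried out
on a small package built on the fly.  The sketch below is only an illustration:
the package name ``demo_sound`` and its ``echofilter`` function are invented
stand-ins for the ``sound`` package described in this section.

.. code-block:: python

   import importlib
   import sys
   import tempfile
   from pathlib import Path

   with tempfile.TemporaryDirectory() as tmp:
       # Hypothetical package "demo_sound" with one subpackage and one module.
       effects = Path(tmp, "demo_sound", "effects")
       effects.mkdir(parents=True)
       Path(tmp, "demo_sound", "__init__.py").write_text("")
       (effects / "__init__.py").write_text("")
       (effects / "echo.py").write_text(
           "def echofilter(samples, delay=0.7):\n"
           "    return list(samples) + [delay]\n"
       )
       sys.path.insert(0, tmp)
       importlib.invalidate_caches()
       try:
           import demo_sound.effects.echo                  # binds "demo_sound"
           from demo_sound.effects import echo             # binds "echo"
           from demo_sound.effects.echo import echofilter  # binds "echofilter"

           # All three spellings reach the same function object.
           same_function = (
               demo_sound.effects.echo.echofilter is echo.echofilter is echofilter
           )
           result = echofilter([1, 2])

           # With plain "import a.b.c", the last component must be a module or
           # package, so naming the function there fails.
           try:
               import demo_sound.effects.echo.echofilter
           except ImportError as exc:
               error_name = type(exc).__name__
           else:
               error_name = None
       finally:
           sys.path.remove(tmp)

   print(same_function, result)  # prints: True [1, 2, 0.7]
   print(error_name)             # prints: ModuleNotFoundError

:exc:`ModuleNotFoundError` is a subclass of :exc:`ImportError`, so code that
catches :exc:`ImportError` handles this case as well.
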
.. _tut-pkg-import-star:

Importing \* From a Package
---------------------------

.. index:: single: __all__

Now what happens when the user writes ``from sound.effects import *``?  Ideally,
one would hope that this somehow goes out to the filesystem, finds which
submodules are present in the package, and imports them all.  This could take a
long time and importing sub-modules might have unwanted side-effects that should
only happen when the sub-module is explicitly imported.

The only solution is for the package author to provide an explicit index of the
package.  The :keyword:`import` statement uses the following convention: if a package's
:file:`__init__.py` code defines a list named ``__all__``, it is taken to be the
list of module names that should be imported when ``from package import *`` is
encountered.  It is up to the package author to keep this list up-to-date when a
new version of the package is released.  Package authors may also decide not to
support it, if they don't see a use for importing \* from their package.  For
example, the file :file:`sound/effects/__init__.py` could contain the following
code::

   __all__ = ["echo", "surround", "reverse"]

This would mean that ``from sound.effects import *`` would import the three
named submodules of the :mod:`sound.effects` package.

If ``__all__`` is not defined, the statement ``from sound.effects import *``
does *not* import all submodules from the package :mod:`sound.effects` into the
current namespace; it only ensures that the package :mod:`sound.effects` has
been imported (possibly running any initialization code in :file:`__init__.py`)
and then imports whatever names are defined in the package.  This includes any
names defined (and submodules explicitly loaded) by :file:`__init__.py`.  It
also includes any submodules of the package that were explicitly loaded by
previous :keyword:`import` statements.  Consider this code::

   import sound.effects.echo
   import sound.effects.surround
   from sound.effects import *

In this example, the :mod:`echo` and :mod:`surround` modules are imported in the
current namespace because they are defined in the :mod:`sound.effects` package
when the ``from...import`` statement is executed.  (This also works when
``__all__`` is defined.)

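The difference between a package with and without ``__all__`` can be observed
directly.  The following sketch is an illustration under stated assumptions:
the package names ``demo_with_all`` and ``demo_without_all`` and the helpers
``make_package`` and ``star_import`` are all made up for this example.  The
star import is run through :func:`exec` in a fresh namespace so that the names
it binds can be listed.

.. code-block:: python

   import importlib
   import sys
   import tempfile
   from pathlib import Path

   SUBMODULES = ["echo", "surround", "reverse"]

   def make_package(root, name, init_source):
       """Create a throwaway package with three trivial submodules."""
       package_dir = Path(root, name)
       package_dir.mkdir()
       (package_dir / "__init__.py").write_text(init_source)
       for submodule in SUBMODULES:
           (package_dir / f"{submodule}.py").write_text(f"NAME = {submodule!r}\n")

   def star_import(package):
       """Run 'from package import *' and list the public names it binds."""
       namespace = {}
       exec(f"from {package} import *", namespace)
       return sorted(name for name in namespace if not name.startswith("__"))

   with tempfile.TemporaryDirectory() as tmp:
       make_package(tmp, "demo_with_all", '__all__ = ["echo", "surround"]\n')
       make_package(tmp, "demo_without_all", "")
       sys.path.insert(0, tmp)
       importlib.invalidate_caches()
       try:
           # __all__ names two submodules, so exactly those are imported.
           with_all = star_import("demo_with_all")

           # Without __all__, only submodules that were already loaded show up.
           importlib.import_module("demo_without_all.echo")
           without_all = star_import("demo_without_all")
       finally:
           sys.path.remove(tmp)

   print(with_all)     # prints: ['echo', 'surround']
   print(without_all)  # prints: ['echo']

In neither case is ``reverse`` imported: it is not listed in ``__all__`` in the
first package, and nothing loaded it beforehand in the second.
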
Although certain modules are designed to export only names that follow certain
patterns when you use ``import *``, it is still considered bad practice in
production code.

Remember, there is nothing wrong with using ``from Package import
specific_submodule``!  In fact, this is the recommended notation unless the
importing module needs to use submodules with the same name from different
packages.


Intra-package References
------------------------

When packages are structured into subpackages (as with the :mod:`sound` package
in the example), you can use absolute imports to refer to submodules of sibling
packages.  For example, if the module :mod:`sound.filters.vocoder` needs to use
the :mod:`echo` module in the :mod:`sound.effects` package, it can use ``from
sound.effects import echo``.

You can also write relative imports, with the ``from module import name`` form
of import statement.  These imports use leading dots to indicate the current and
parent packages involved in the relative import.  From the :mod:`surround`
module for example, you might use::

   from . import echo
   from .. import formats
   from ..filters import equalizer

Note that relative imports are based on the name of the current module.  Since
the name of the main module is always ``"__main__"``, modules intended for use
as the main module of a Python application must always use absolute imports.


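The way the leading dots are resolved can be checked on a scratch copy of the
layout.  This is a minimal sketch, assuming a made-up package ``demo_rel`` that
mirrors the ``sound`` example; the last step runs the same relative import in a
namespace whose ``__name__`` is ``"__main__"`` to show why it fails there.

.. code-block:: python

   import importlib
   import sys
   import tempfile
   from pathlib import Path

   # Hypothetical layout mirroring the sound package; all files but one are empty.
   FILES = {
       "demo_rel/__init__.py": "",
       "demo_rel/formats/__init__.py": "",
       "demo_rel/filters/__init__.py": "",
       "demo_rel/filters/equalizer.py": "",
       "demo_rel/effects/__init__.py": "",
       "demo_rel/effects/echo.py": "",
       "demo_rel/effects/surround.py": (
           "from . import echo\n"
           "from .. import formats\n"
           "from ..filters import equalizer\n"
       ),
   }

   main_error = None
   with tempfile.TemporaryDirectory() as tmp:
       for relative_path, source in FILES.items():
           target = Path(tmp, relative_path)
           target.parent.mkdir(parents=True, exist_ok=True)
           target.write_text(source)
       sys.path.insert(0, tmp)
       importlib.invalidate_caches()
       try:
           surround = importlib.import_module("demo_rel.effects.surround")
           resolved = [
               surround.echo.__name__,
               surround.formats.__name__,
               surround.equalizer.__name__,
           ]
           # A main module has no parent package, so the dots cannot be resolved.
           try:
               exec("from . import echo", {"__name__": "__main__"})
           except ImportError as exc:
               main_error = str(exc)
       finally:
           sys.path.remove(tmp)

   print(resolved)
   # prints: ['demo_rel.effects.echo', 'demo_rel.formats', 'demo_rel.filters.equalizer']
   print(main_error)

Each leading dot is resolved against the importing module's own package, which
is why the same statement works inside ``surround`` but not in a main module.
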
Packages in Multiple Directories
--------------------------------

Packages support one more special attribute, :attr:`__path__`.  This is
initialized to be a list containing the name of the directory holding the
package's :file:`__init__.py` before the code in that file is executed.  This
variable can be modified; doing so affects future searches for modules and
subpackages contained in the package.

While this feature is not often needed, it can be used to extend the set of
modules found in a package.


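As a rough sketch of how this can be used, the example below appends a second
directory to a package's ``__path__`` so that a module stored outside the
package directory becomes importable as a submodule.  The names
``demo_path_pkg`` and ``plugin`` are hypothetical and exist only for this
illustration.

.. code-block:: python

   import importlib
   import sys
   import tempfile
   from pathlib import Path

   with tempfile.TemporaryDirectory() as tmp:
       site_dir = Path(tmp, "site")
       package_dir = site_dir / "demo_path_pkg"
       package_dir.mkdir(parents=True)
       (package_dir / "__init__.py").write_text("")

       # A module that lives outside the package directory.
       extra_dir = Path(tmp, "extra")
       extra_dir.mkdir()
       (extra_dir / "plugin.py").write_text("LOADED_FROM = 'extra directory'\n")

       sys.path.insert(0, str(site_dir))
       importlib.invalidate_caches()
       try:
           package = importlib.import_module("demo_path_pkg")
           initial_path = list(package.__path__)  # just the package directory

           try:
               importlib.import_module("demo_path_pkg.plugin")
               found_before = True
           except ModuleNotFoundError:
               found_before = False

           # Extend the search path used for this package's submodules.
           package.__path__.append(str(extra_dir))
           importlib.invalidate_caches()
           plugin = importlib.import_module("demo_path_pkg.plugin")
       finally:
           sys.path.remove(str(site_dir))

   print(found_before, plugin.LOADED_FROM)  # prints: False extra directory

Only searches made after the change are affected; the failed lookup before the
``append`` call is simply retried against the longer list.
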
.. rubric:: Footnotes

.. [#] In fact function definitions are also 'statements' that are 'executed'; the
   execution of a module-level function definition enters the function name in
   the module's global symbol table.