:mod:`modulegraph.modulegraph` --- Find modules used by a script
================================================================

.. module:: modulegraph.modulegraph
   :synopsis: Find modules used by a script

This module defines :class:`ModuleGraph`, which is used to find
the dependencies of scripts using bytecode analysis.

A number of APIs in this module refer to filesystem paths. Those paths can refer to
files inside zipfiles (for example when there are zipped egg files on :data:`sys.path`).
Filenames referring to entries in a zipfile are not marked in any way: if ``"somepath.zip"``
refers to a zipfile, then ``"somepath.zip/embedded/file"`` refers to
``embedded/file`` inside the zipfile.
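
The path convention above can be illustrated with a small standalone sketch. The helper ``split_zip_path`` is hypothetical and is not part of modulegraph; it only shows one way a caller might separate the archive from the embedded name, assuming the archive exists on disk.

```python
import os
import tempfile
import zipfile


def split_zip_path(path):
    """Split *path* into (archive, inner_path) if a leading part is a zipfile.

    Returns (path, None) when no proper prefix of *path* is a zip archive.
    Hypothetical helper for illustration; not part of modulegraph.
    """
    parts = path.replace(os.sep, "/").split("/")
    for i in range(1, len(parts)):
        candidate = "/".join(parts[:i])
        if candidate and os.path.isfile(candidate) and zipfile.is_zipfile(candidate):
            return candidate, "/".join(parts[i:])
    return path, None


def demo():
    """Build a throwaway archive and read an entry through a combined path."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "somepath.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("embedded/file", "data")
        found, inner = split_zip_path(archive + "/embedded/file")
        with zipfile.ZipFile(found) as zf:
            return inner, zf.read(inner).decode()
```

Because nothing in the path marks the archive boundary, the sketch has to probe each prefix on disk; that is a design choice of this example, not a statement about how modulegraph resolves such paths.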

The actual graph
----------------

.. class:: ModuleGraph([path[, excludes[, replace_paths[, implies[, graph[, debug]]]]]])

   Create a new ModuleGraph object. Use the :meth:`run_script` method to add scripts
   and their dependencies to the graph.

   :param path: Python search path to use, defaults to :data:`sys.path`
   :param excludes: Iterable with module names that should not be included as a dependency
   :param replace_paths: List of pathname rewrites ``(old, new)``. When this argument is
     supplied the ``co_filename`` attributes of code objects get rewritten before scanning
     them for dependencies.
   :param implies: Implied module dependencies, a mapping from a module name to the list
     of modules it depends on. Use this to tell modulegraph about dependencies that cannot
     be found by code inspection (such as imports from C code or using the :func:`__import__`
     function).
   :param graph: A precreated :class:`Graph <altgraph.Graph.Graph>` object to use; the
     default is to create a new one.
   :param debug: The :class:`ObjectGraph <altgraph.ObjectGraph.ObjectGraph>` debug level.


.. method:: run_script(pathname[, caller])

   Create and return a node by path (not module name). The *pathname* should
   refer to a Python source file and will be scanned for dependencies.

   The optional argument *caller* is the node that calls this script,
   and is used to add a reference in the graph.

.. method:: import_hook(name[, caller[, fromlist[, level[, attr]]]])

   Import a module and analyse its dependencies.

   :arg name:     The module name
   :arg caller:   The node that caused the import to happen
   :arg fromlist: The list of names to import, this is an empty list for
      ``import name`` and a list of names for ``from name import a, b, c``.
   :arg level:    The import level. The value should be ``-1`` for classical Python 2
     imports, ``0`` for absolute imports and a positive number for relative imports
     (where the value is the number of leading dots in the imported name).
   :arg attr:     Attributes for the graph edge.

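The meaning of *level* can be checked with the standard library alone. The sketch below does not call modulegraph; it uses :mod:`ast` to show which level Python itself assigns to a few ``from`` imports. The helper name ``import_levels`` is made up for this example. Python 3 never reports ``-1``, which only existed for implicit relative imports in Python 2.

```python
import ast


def import_levels(source):
    """Return (module, names, level) for every ``from ... import`` in *source*."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            names = [alias.name for alias in node.names]
            found.append((node.module, names, node.level))
    return found


SOURCE = (
    "from os import path\n"      # absolute import, level 0
    "from . import sibling\n"    # one leading dot, level 1
    "from ..pkg import a, b\n"   # two leading dots, level 2
)
levels = import_levels(SOURCE)
```

The ``names`` list in each tuple corresponds to what the *fromlist* argument describes: the names listed after ``import`` in a ``from`` import.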

.. method:: implyNodeReference(node, other, edgeData=None)

   Explicitly mark that *node* depends on *other*. *other* is either
   a :class:`node <Node>` or the name of a module that will be
   searched for as if it were an absolute import.


.. method:: createReference(fromnode, tonode[, edge_data])

   Create a reference from *fromnode* to *tonode*, with optional edge data.

   The default for *edge_data* is ``"direct"``.

.. method:: getReferences(fromnode)

   Yield all nodes that *fromnode* refers to. That is, all modules imported
   by *fromnode*.

   Node :data:`None` is the root of the graph, and refers to all nodes that were
   explicitly imported by :meth:`run_script` or :meth:`import_hook`, unless you use
   an explicit parent with those methods.

   .. versionadded:: 0.11

.. method:: getReferers(tonode, collapse_missing_modules=True)

   Yield all nodes that refer to *tonode*. That is, all modules that import
   *tonode*.

   If *collapse_missing_modules* is false this includes references from
   :class:`MissingModule` nodes, otherwise :class:`MissingModule` nodes
   are replaced by the "real" nodes that reference this missing node.

   .. versionadded:: 0.12

.. method:: foldReferences(pkgnode)

   Hide all submodule nodes for package *pkgnode* and add ingoing and outgoing
   edges to *pkgnode* based on the edges from the submodule nodes.

   This can be used to simplify a module graph: after folding 'email' all
   references to modules in the 'email' package are references to the package.

   .. versionadded:: 0.11
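
The effect of folding can be pictured on a toy graph. The sketch below is not modulegraph's implementation: it models the graph as a plain dict of sets, and the function ``fold_package`` is hypothetical. Dropping the self-references that folding creates is an assumption of this example.

```python
def fold_package(edges, package):
    """Return a copy of *edges* with all ``package.*`` nodes folded into *package*.

    *edges* maps a module name to the set of module names it imports.
    Toy model for illustration; not the modulegraph data structure.
    """
    prefix = package + "."

    def fold(name):
        return package if name.startswith(prefix) else name

    folded = {}
    for source, targets in edges.items():
        bucket = folded.setdefault(fold(source), set())
        for target in targets:
            # Assumption: imports inside the package become self-references
            # after folding, so they are dropped here.
            if fold(target) != fold(source):
                bucket.add(fold(target))
    return folded


edges = {
    "script": {"email.parser"},
    "email": set(),
    "email.parser": {"email.feedparser", "re"},
    "email.feedparser": {"email"},
    "re": set(),
}
folded = fold_package(edges, "email")
```

After folding, ``script`` refers to ``email`` instead of ``email.parser``, and the outgoing edge to ``re`` moves from the submodule to the package.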

.. method:: findNode(name)

   Find a node by identifier.  If a node by that identifier exists, it will be returned.

   If a lazy node exists by that identifier with no dependencies (excluded), it will be
   instantiated and returned.

   If a lazy node exists by that identifier with dependencies, it and its
   dependencies will be instantiated and scanned for additional dependencies.



.. method:: create_xref([out])

   Write an HTML file to the *out* stream (defaulting to :data:`sys.stdout`).

   The HTML file contains a textual description of the dependency graph.



.. method:: graphreport([fileobj[, flatpackages]])

   .. todo:: To be documented



.. method:: report()

   Print a report to stdout, listing the found modules with their
   paths, as well as modules that are missing, or seem to be missing.


Mostly internal methods
.......................

The methods in this section should be considered as methods for subclassing at best.
Please let us know if you need these methods in your code, as they are on track to be
made private methods before the 1.0 release.

.. warning:: The methods in this section will be refactored in a future release;
   the current architecture makes it unnecessarily hard to write proper tests.

.. method:: determine_parent(caller)

   Returns the node of the package root for *caller*. If *caller* is a package
   this is the node itself; if *caller* is a module in a package this is the
   node for that package; otherwise *caller* is not part of a package and
   the result is :data:`None`.

.. method:: find_head_package(parent, name[, level])

   .. todo:: To be documented


.. method:: load_tail(mod, tail)

   This method is called to load the rest of a dotted name after loading the root
   of a package. This will import all intermediate modules as well (using
   :meth:`import_module`), and returns the module :class:`node <Node>` for the
   requested node.

   .. note:: When *tail* is empty this will just return *mod*.

   :arg mod:   A start module (instance of :class:`Node`)
   :arg tail:  The rest of a dotted name, can be empty
   :raise ImportError: When the requested module (or one of its parents) cannot be found
   :returns: the requested module



.. method:: ensure_fromlist(m, fromlist)

   Yield all submodules that would be imported when importing *fromlist*
   from *m* (using ``from m import fromlist...``).

   *m* must be a package and not a regular module.

.. method:: find_all_submodules(m)

   Yield the filenames for submodules in the same package as *m*.



.. method:: import_module(partname, fqname, parent)

   Perform import of the module with basename *partname* (``path``) and
   full name *fqname* (``os.path``). Import is performed by *parent*.

   This will create a reference from the parent node to the
   module node and will load the module node when it is not already
   loaded.



.. method:: load_module(fqname, fp, pathname, (suffix, mode, type))

   Load the module named *fqname* from the given *pathname*. The
   argument *fp* is either :data:`None`, or a stream where the
   code for the Python module can be loaded (either byte-code or
   the source code). The *(suffix, mode, type)* tuple contains the
   suffix of the source file, the open mode for the file and the
   type of module.

   Creates a node of the right class and processes the dependencies
   of the :class:`node <Node>` by scanning the byte-code for the node.

   Returns the resulting :class:`node <Node>`.



.. method:: scan_code(code, m)

   Scan the *code* object for module *m* and update the dependencies of
   *m* using the import statements found in the code.

   This will automatically scan the code for nested functions, generator
   expressions and list comprehensions as well.

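The general idea of scanning byte-code for imports, including nested code objects, can be sketched with the standard :mod:`dis` module. This is an illustration of the technique only, not the code modulegraph runs; the function name ``scan_imports`` is made up for this example, and it only collects module names rather than updating a graph node.

```python
import dis
import types


def scan_imports(code):
    """Return the module names used by IMPORT_NAME instructions in *code*.

    Nested code objects (functions, comprehensions, ...) stored in
    ``co_consts`` are scanned recursively.
    """
    names = []
    for instruction in dis.get_instructions(code):
        if instruction.opname == "IMPORT_NAME":
            names.append(instruction.argval)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.extend(scan_imports(const))
    return names


SOURCE = (
    "import os\n"
    "def helper():\n"
    "    import json\n"
    "    return json\n"
)
found = scan_imports(compile(SOURCE, "<example>", "exec"))
```

The import inside ``helper`` is only found because the function body is a separate code object in ``co_consts``; a scan of the top-level instructions alone would miss it.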


.. method:: load_package(fqname, pathname)

   Load a package directory.



.. method:: find_module(name, path[, parent])

   Locates a module named *name* that is not yet part of the
   graph. This method will raise :exc:`ImportError` when
   the module cannot be found or when it is already part
   of the graph. The *name* cannot be a dotted name.

   The *path* is the search path used, or :data:`None` to
   use the default path.

   When the *parent* is specified *name* refers to a
   subpackage of *parent*, and *path* should be the
   search path of the parent.

   Returns the result of the global function
   :func:`find_module <modulegraph.modulegraph.find_module>`.


.. method:: itergraphreport([name[, flatpackages]])

   .. todo:: To be documented



.. method:: replace_paths_in_code(co)

   Replace the filenames in code object *co* using the *replace_paths* value that
   was passed to the constructor. Returns the rewritten code object.

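Rewriting ``co_filename`` can be sketched with ``CodeType.replace`` (Python 3.8 and later). The function below is a standalone illustration, not modulegraph's method: the name ``rewrite_filenames`` is hypothetical, and treating each ``(old, new)`` pair as a prefix substitution where the first match wins is an assumption of this example.

```python
import types


def rewrite_filenames(code, replace_paths):
    """Return a copy of *code* with ``co_filename`` rewritten.

    *replace_paths* is a list of ``(old, new)`` pairs. Assumption: a pair
    applies when the filename starts with *old*, and the first match wins.
    Nested code objects are rewritten recursively.
    """
    filename = code.co_filename
    for old, new in replace_paths:
        if filename.startswith(old):
            filename = new + filename[len(old):]
            break
    consts = tuple(
        rewrite_filenames(const, replace_paths)
        if isinstance(const, types.CodeType) else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=consts)


original = compile("def f():\n    return 1\n", "/build/tmp/mod.py", "exec")
rewritten = rewrite_filenames(original, [("/build/tmp", "/usr/lib/app")])
nested = [c for c in rewritten.co_consts if isinstance(c, types.CodeType)]
```

Code objects are immutable, so the sketch builds new ones bottom-up; the original object is left unchanged.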


.. method:: calc_setuptools_nspackages()

   Returns a mapping from package name to a list of paths where that package
   can be found in ``--single-version-externally-managed`` form.

   This method is used to be able to find those packages: these use
   a magic ``.pth`` file to ensure that the package is added to :data:`sys.path`,
   as they do not contain an ``__init__.py`` file.

   Packages in this form are used by system packages and the "pip"
   installer.


Graph nodes
-----------

The :class:`ModuleGraph` contains nodes that represent the various types of modules.

.. class:: Alias(value)

   This is a subclass of string that is used to mark module aliases.



.. class:: Node(identifier)

   Base class for nodes, which provides the common functionality.

   Nodes can be used as mappings for storing arbitrary data in the node.

   Nodes are compared by comparing their *identifier*.

.. data:: debug

   Debug level (integer)

.. data:: graphident

   The node identifier, this is the value of the *identifier* argument
   to the constructor.

.. data:: identifier

   The node identifier, this is the value of the *identifier* argument
   to the constructor.

.. data:: filename

   The filename associated with this node.

.. data:: packagepath

   The value of ``__path__`` for this node.

.. data:: code

   The :class:`code object <types.CodeType>` associated with this node.

.. data:: globalnames

   The set of global names that are assigned to in this module. This
   includes those names imported through star imports of Python modules.

.. data:: starimports

   The set of star imports this module did that could not be resolved,
   i.e. a star import from a non-Python module.


.. method:: __contains__(name)

   Return whether there is a value associated with *name*.

   This method is usually accessed as ``name in aNode``.

.. method:: __setitem__(name, value)

   Set the value of *name* to *value*.

   This method is usually accessed as ``aNode[name] = value``.

.. method:: __getitem__(name)

   Returns the value of *name*, raises :exc:`KeyError` when
   it cannot be found.

   This method is usually accessed as ``value = aNode[name]``.

.. method:: get(name[, default])

   Returns the value of *name*, or the default value when it
   cannot be found. The *default* is :data:`None` when not specified.

.. method:: infoTuple()

   Returns a tuple with information used in the :func:`repr`
   output for the node. Subclasses can add additional information
   to the result.


.. class:: AliasNode(name, node)

   A node that represents an alias from a name to another node.

   The value of attribute *graphident* for this node will be the
   value of *name*, the other :class:`Node` attributes are
   references to those attributes in *node*.

.. class:: BadModule(identifier)

   Base class for nodes that should be ignored for some reason.

.. class:: ExcludedModule(identifier)

   A module that is explicitly excluded.

.. class:: MissingModule(identifier)

   A module that is imported but cannot be located.



.. class:: Script(filename)

   A Python script.

   .. data:: filename

      The filename for the script

.. class:: BaseModule(name[, filename[, path]])

    The base class for actual modules. The *name* is
    the possibly dotted module name, *filename* is the
    filesystem path to the module and *path* is the
    value of ``__path__`` for the module.

.. data:: graphident

   The name of the module

.. data:: filename

   The filesystem path to the module.

.. data:: path

   The value of ``__path__`` for this module.

.. class:: BuiltinModule(name)

   A built-in module (one in :data:`sys.builtin_module_names`).

.. class:: SourceModule(name)

   A module for which the Python source code is available.

.. class:: InvalidSourceModule(name)

   A module for which the Python source code is available, but where
   that source code cannot be compiled (due to syntax errors).

   This is a subclass of :class:`SourceModule`.

   .. versionadded:: 0.12

.. class:: CompiledModule(name)

   A module for which only byte-code is available.

.. class:: Package(name)

   Represents a Python package.

.. class:: NamespacePackage(name)

   Represents a Python namespace package.

   This is a subclass of :class:`Package`.

.. class:: Extension(name)

   A native extension.


.. warning:: A number of other node types are defined in the module. Those node types aren't
   used by modulegraph and will be removed in a future version.


Edge data
---------

The edges in a module graph by default contain information about the edge, represented
by an instance of :class:`DependencyInfo`.

.. class:: DependencyInfo(conditional, function, tryexcept, fromlist)

   This class is a :func:`namedtuple <collections.namedtuple>` for representing
   the information on a dependency between two modules.

   All attributes can be used to deduce if a dependency is essential or not, and
   are particularly useful when reporting on missing modules (dependencies on
   :class:`MissingModule`).

   .. data:: fromlist

      A boolean that is true iff the target of the edge is named in the "import"
      list of a "from" import ("from package import module").

      When the target module is imported multiple times this attribute is false
      unless all imports are in the "import" list of a "from" import.

   .. data:: function

      A boolean that is true iff the import is done inside a function definition,
      and is false for imports in module scope (or class scope for classes that
      aren't defined in a function).

   .. data:: tryexcept

      A boolean that is true iff the import is done in the "try" or "except"
      block of a try statement (but not in the "else" block).

   .. data:: conditional

      A boolean that is true iff the import is done in either block of an "if"
      statement.

   When the target of the edge is imported multiple times the :data:`function`,
   :data:`tryexcept` and :data:`conditional` attributes of all imports are
   merged: when there is an import where all these attributes are false the
   attributes are false, otherwise each attribute is set to true if it is
   true for at least one of the imports.

   For example, when a module is imported both in a try-except statement and
   furthermore is imported in a function (in two separate statements),
   both :data:`tryexcept` and :data:`function` will be true.  But if there
   is a third unconditional toplevel import for that module as well, all
   three attributes are false.

   .. warning::

      All attributes but :data:`fromlist` will be false when the source of
      a dependency is scanned from a byte-compiled module instead of a Python
      source file. The :data:`fromlist` attribute will still be set correctly.
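
The merging rule described above can be written out as a short sketch. This re-creates a namedtuple with the documented field order and is not imported from modulegraph; the function ``merge_info`` is a hypothetical name used only to illustrate the rule.

```python
from collections import namedtuple

# Stand-in with the field order of the documented constructor signature.
DependencyInfo = namedtuple(
    "DependencyInfo", ["conditional", "function", "tryexcept", "fromlist"]
)


def merge_info(first, second):
    """Merge the edge data of two imports of the same target module."""
    def unconditional(info):
        return not (info.conditional or info.function or info.tryexcept)

    if unconditional(first) or unconditional(second):
        # One plain toplevel import makes the dependency unconditional.
        conditional = function = tryexcept = False
    else:
        conditional = first.conditional or second.conditional
        function = first.function or second.function
        tryexcept = first.tryexcept or second.tryexcept
    # fromlist stays true only when every import is a "from" import.
    fromlist = first.fromlist and second.fromlist
    return DependencyInfo(conditional, function, tryexcept, fromlist)


in_try = DependencyInfo(False, False, True, False)
in_function = DependencyInfo(False, True, False, False)
toplevel = DependencyInfo(False, False, False, False)

merged = merge_info(in_try, in_function)
merged_all = merge_info(merged, toplevel)
```

Here ``merged`` has both ``tryexcept`` and ``function`` set, and ``merged_all`` has all three flags cleared, matching the example in the text.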

Utility functions
-----------------

.. function:: find_module(name[, path])

   A version of :func:`imp.find_module` that works with zipped packages (and other
   :pep:`302` importers).

.. function:: moduleInfoForPath(path)

   Return the module name, readmode and type for the file at *path*, or
   None if it doesn't seem to be a valid module (based on its name).

.. function:: addPackagePath(packagename, path)

   Add *path* to the value of ``__path__`` for the package named *packagename*.

.. function:: replacePackage(oldname, newname)

   Rename *oldname* to *newname* when it is found by the module finder. This
   is used as a workaround for the hack that the ``_xmlplus`` package uses
   to inject itself in the ``xml`` namespace.
    532