:mod:`modulegraph.modulegraph` --- Find modules used by a script
================================================================

.. module:: modulegraph.modulegraph
   :synopsis: Find modules used by a script

This module defines :class:`ModuleGraph`, which is used to find
the dependencies of scripts using bytecode analysis.

A number of APIs in this module refer to filesystem paths. Those paths can refer to
files inside zipfiles (for example when there are zipped egg files on :data:`sys.path`).
Filenames referring to entries in a zipfile are not marked in any way: if ``"somepath.zip"``
refers to a zipfile, then ``"somepath.zip/embedded/file"`` refers to
``embedded/file`` inside that zipfile.

The actual graph
----------------

.. class:: ModuleGraph([path[, excludes[, replace_paths[, implies[, graph[, debug]]]]]])

   Create a new ModuleGraph object. Use the :meth:`run_script` method to add scripts
   and their dependencies to the graph.

   :param path: Python search path to use, defaults to :data:`sys.path`
   :param excludes: Iterable with module names that should not be included as a dependency
   :param replace_paths: List of pathname rewrites ``(old, new)``. When this argument is
     supplied the ``co_filename`` attributes of code objects get rewritten before scanning
     them for dependencies.
   :param implies: Implied module dependencies, a mapping from a module name to the list
     of modules it depends on. Use this to tell modulegraph about dependencies that cannot
     be found by code inspection (such as imports from C code or using the :func:`__import__`
     function).
   :param graph: A precreated :class:`Graph <altgraph.Graph.Graph>` object to use; the
     default is to create a new one.
   :param debug: The :class:`ObjectGraph <altgraph.ObjectGraph.ObjectGraph>` debug level.


   .. method:: run_script(pathname[, caller])

      Create, and return, a node by path (not module name).
      The *pathname* should refer to a Python source file and will be scanned
      for dependencies.

      The optional argument *caller* is the node that calls this script,
      and is used to add a reference in the graph.

   .. method:: import_hook(name[, caller[, fromlist[, level[, attr]]]])

      Import a module and analyse its dependencies.

      :arg name: The module name
      :arg caller: The node that caused the import to happen
      :arg fromlist: The list of names to import; this is an empty list for
        ``import name`` and a list of names for ``from name import a, b, c``.
      :arg level: The import level. The value should be ``-1`` for classical Python 2
        imports, ``0`` for absolute imports and a positive number for relative imports
        (where the value is the number of leading dots in the imported name).
      :arg attr: Attributes for the graph edge.


   .. method:: implyNodeReference(node, other, edgeData=None)

      Explicitly mark that *node* depends on *other*. The *other* argument is either
      a :class:`node <Node>` or the name of a module that will be
      searched for as if it were an absolute import.


   .. method:: createReference(fromnode, tonode[, edge_data])

      Create a reference from *fromnode* to *tonode*, with optional edge data.

      The default for *edge_data* is ``"direct"``.

   .. method:: getReferences(fromnode)

      Yield all nodes that *fromnode* refers to. That is, all modules imported
      by *fromnode*.

      Node :data:`None` is the root of the graph, and refers to all nodes that were
      explicitly imported by :meth:`run_script` or :meth:`import_hook`, unless you use
      an explicit parent with those methods.

      .. versionadded:: 0.11

   .. method:: getReferers(tonode, collapse_missing_modules=True)

      Yield all nodes that refer to *tonode*. That is, all modules that import
      *tonode*.

      If *collapse_missing_modules* is false this includes references from
      :class:`MissingModule` nodes; otherwise :class:`MissingModule` nodes
      are replaced by the "real" nodes that reference this missing node.

      .. versionadded:: 0.12

   .. method:: foldReferences(pkgnode)

      Hide all submodule nodes for package *pkgnode* and add ingoing and outgoing
      edges to *pkgnode* based on the edges from the submodule nodes.

      This can be used to simplify a module graph: after folding 'email' all
      references to modules in the 'email' package are references to the package.

      .. versionadded:: 0.11

   .. method:: findNode(name)

      Find a node by identifier. If a node by that identifier exists, it will be returned.

      If a lazy node exists by that identifier with no dependencies (excluded), it will be
      instantiated and returned.

      If a lazy node exists by that identifier with dependencies, it and its
      dependencies will be instantiated and scanned for additional dependencies.



   .. method:: create_xref([out])

      Write an HTML file to the *out* stream (defaulting to :data:`sys.stdout`).

      The HTML file contains a textual description of the dependency graph.



   .. method:: graphreport([fileobj[, flatpackages]])

      .. todo:: To be documented



   .. method:: report()

      Print a report to stdout, listing the found modules with their
      paths, as well as modules that are missing, or seem to be missing.


Mostly internal methods
.......................

The methods in this section should be considered as methods for subclassing at best.
Please let us know if you need these methods in your code, as they are on track to be
made private methods before the 1.0 release.

.. warning:: The methods in this section will be refactored in a future release;
   the current architecture makes it unnecessarily hard to write proper tests.

.. method:: determine_parent(caller)

   Returns the node of the package root for *caller*. If *caller* is a package
   this is the node itself; if *caller* is a module in a package this is the
   node for the package; otherwise *caller* is not part of a package and
   the result is :data:`None`.

.. method:: find_head_package(parent, name[, level])

   .. todo:: To be documented


.. method:: load_tail(mod, tail)

   This method is called to load the rest of a dotted name after loading the root
   of a package. This will import all intermediate modules as well (using
   :meth:`import_module`), and returns the module :class:`node <Node>` for the
   requested node.

   .. note:: When *tail* is empty this will just return *mod*.

   :arg mod: A start module (instance of :class:`Node`)
   :arg tail: The rest of a dotted name, can be empty
   :raise ImportError: When the requested module (or one of its parents) cannot be found
   :returns: the requested module



.. method:: ensure_fromlist(m, fromlist)

   Yield all submodules that would be imported when importing *fromlist*
   from *m* (using ``from m import fromlist...``).

   *m* must be a package and not a regular module.

.. method:: find_all_submodules(m)

   Yield the filenames for submodules in the same package as *m*.



.. method:: import_module(partname, fqname, parent)

   Perform import of the module with basename *partname* (``path``) and
   full name *fqname* (``os.path``). Import is performed by *parent*.

   This will create a reference from the parent node to the
   module node and will load the module node when it is not already
   loaded.



.. method:: load_module(fqname, fp, pathname, (suffix, mode, type))

   Load the module named *fqname* from the given *pathname*.
   The argument *fp* is either :data:`None`, or a stream where the
   code for the Python module can be loaded (either byte-code or
   the source code). The *(suffix, mode, type)* tuple contains the
   suffix of the source file, the open mode for the file and the
   type of module.

   Creates a node of the right class and processes the dependencies
   of the :class:`node <Node>` by scanning the byte-code for the node.

   Returns the resulting :class:`node <Node>`.



.. method:: scan_code(code, m)

   Scan the *code* object for module *m* and update the dependencies of
   *m* using the import statements found in the code.

   This will automatically scan the code for nested functions, generator
   expressions and list comprehensions as well.



.. method:: load_package(fqname, pathname)

   Load a package directory.



.. method:: find_module(name, path[, parent])

   Locates a module named *name* that is not yet part of the
   graph. This method will raise :exc:`ImportError` when
   the module cannot be found or when it is already part
   of the graph. The *name* cannot be a dotted name.

   The *path* is the search path used, or :data:`None` to
   use the default path.

   When the *parent* is specified *name* refers to a
   subpackage of *parent*, and *path* should be the
   search path of the parent.

   Returns the result of the global function
   :func:`find_module <modulegraph.modulegraph.find_module>`.


.. method:: itergraphreport([name[, flatpackages]])

   .. todo:: To be documented



.. method:: replace_paths_in_code(co)

   Replace the filenames in code object *co* using the *replace_paths* value that
   was passed to the constructor. Returns the rewritten code object.



.. method:: calc_setuptools_nspackages()

   Returns a mapping from package name to a list of paths where that package
   can be found in ``--single-version-externally-managed`` form.

   This method is used to be able to find those packages: these use
   a magic ``.pth`` file to ensure that the package is added to :data:`sys.path`,
   as they do not contain an ``__init__.py`` file.

   Packages in this form are used by system packages and the "pip"
   installer.


Graph nodes
-----------

The :class:`ModuleGraph` contains nodes that represent the various types of modules.

.. class:: Alias(value)

   This is a subclass of string that is used to mark module aliases.



.. class:: Node(identifier)

   Base class for nodes, which provides the common functionality.

   Nodes can be used as mappings for storing arbitrary data in the node.

   Nodes are compared by comparing their *identifier*.

   .. data:: debug

      Debug level (integer)

   .. data:: graphident

      The node identifier; this is the value of the *identifier* argument
      to the constructor.

   .. data:: identifier

      The node identifier; this is the value of the *identifier* argument
      to the constructor.

   .. data:: filename

      The filename associated with this node.

   .. data:: packagepath

      The value of ``__path__`` for this node.

   .. data:: code

      The :class:`code object <types.CodeType>` associated with this node

   .. data:: globalnames

      The set of global names that are assigned to in this module. This
      includes those names imported through star imports of Python modules.

   .. data:: starimports

      The set of star imports this module did that could not be resolved,
      i.e. a star import from a non-Python module.


   .. method:: __contains__(name)

      Return whether there is a value associated with *name*.

      This method is usually accessed as ``name in aNode``.

   .. method:: __setitem__(name, value)

      Set the value of *name* to *value*.

      This method is usually accessed as ``aNode[name] = value``.

   .. method:: __getitem__(name)

      Returns the value of *name*, raises :exc:`KeyError` when
      it cannot be found.

      This method is usually accessed as ``value = aNode[name]``.

   .. method:: get(name[, default])

      Returns the value of *name*, or the default value when it
      cannot be found. The *default* is :data:`None` when not specified.

   .. method:: infoTuple()

      Returns a tuple with information used in the :func:`repr`
      output for the node. Subclasses can add additional information
      to the result.


.. class:: AliasNode(name, node)

   A node that represents an alias from a name to another node.

   The value of attribute *graphident* for this node will be the
   value of *name*; the other :class:`Node` attributes are
   references to those attributes in *node*.

.. class:: BadModule(identifier)

   Base class for nodes that should be ignored for some reason.

.. class:: ExcludedModule(identifier)

   A module that is explicitly excluded.

.. class:: MissingModule(identifier)

   A module that is imported but cannot be located.



.. class:: Script(filename)

   A Python script.

   .. data:: filename

      The filename for the script

.. class:: BaseModule(name[, filename[, path]])

   The base class for actual modules. The *name* is
   the possibly dotted module name, *filename* is the
   filesystem path to the module and *path* is the
   value of ``__path__`` for the module.

   .. data:: graphident

      The name of the module

   .. data:: filename

      The filesystem path to the module.

   .. data:: path

      The value of ``__path__`` for this module.

.. class:: BuiltinModule(name)

   A built-in module (one in :data:`sys.builtin_module_names`).

.. class:: SourceModule(name)

   A module for which the Python source code is available.

.. class:: InvalidSourceModule(name)

   A module for which the Python source code is available, but where
   that source code cannot be compiled (due to syntax errors).

   This is a subclass of :class:`SourceModule`.

   .. versionadded:: 0.12

.. class:: CompiledModule(name)

   A module for which only byte-code is available.

.. class:: Package(name)

   Represents a Python package.

.. class:: NamespacePackage(name)

   Represents a Python namespace package.

   This is a subclass of :class:`Package`.

.. class:: Extension(name)

   A native extension


.. warning:: A number of other node types are defined in the module. Those node types aren't
   used by modulegraph and will be removed in a future version.


Edge data
---------

The edges in a module graph by default contain information about the edge, represented
by an instance of :class:`DependencyInfo`.

.. class:: DependencyInfo(conditional, function, tryexcept, fromlist)

   This class is a :func:`namedtuple <collections.namedtuple>` for representing
   the information on a dependency between two modules.

   All attributes can be used to deduce if a dependency is essential or not, and
   are particularly useful when reporting on missing modules (dependencies on
   :class:`MissingModule`).

   .. data:: fromlist

      A boolean that is true iff the target of the edge is named in the "import"
      list of a "from" import ("from package import module").

      When the target module is imported multiple times this attribute is false
      unless all imports are in the "import" list of a "from" import.

   .. data:: function

      A boolean that is true iff the import is done inside a function definition,
      and is false for imports in module scope (or class scope for classes that
      aren't defined in a function).

   .. data:: tryexcept

      A boolean that is true iff the import is done in the "try" or "except"
      block of a try statement (but not in the "else" block).

   .. data:: conditional

      A boolean that is true iff the import is done in either block of an "if"
      statement.

   When the target of the edge is imported multiple times the :data:`function`,
   :data:`tryexcept` and :data:`conditional` attributes of all imports are
   merged: when there is an import where all these attributes are false the
   attributes are false, otherwise each attribute is set to true if it is
   true for at least one of the imports.

   For example, when a module is imported both in a try-except statement and
   in a function (in two separate statements),
   both :data:`tryexcept` and :data:`function` will be true. But if there
   is a third unconditional toplevel import for that module as well, all
   three attributes are false.

   .. warning::

      All attributes but :data:`fromlist` will be false when the source of
      a dependency is scanned from a byte-compiled module instead of a Python
      source file. The :data:`fromlist` attribute will still be set correctly.

Utility functions
-----------------

.. function:: find_module(name[, path])

   A version of :func:`imp.find_module` that works with zipped packages (and other
   :pep:`302` importers).

.. function:: moduleInfoForPath(path)

   Return the module name, readmode and type for the file at *path*, or
   :data:`None` if it doesn't seem to be a valid module (based on its name).

.. function:: addPackagePath(packagename, path)

   Add *path* to the value of ``__path__`` for the package named *packagename*.

.. function:: replacePackage(oldname, newname)

   Rename *oldname* to *newname* when it is found by the module finder. This
   is used as a workaround for the hack that the ``_xmlplus`` package uses
   to inject itself in the ``xml`` namespace.
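
The merge rule for :class:`DependencyInfo` described under *Edge data* can be
illustrated with a small standalone sketch. It does not import modulegraph and
is not the library's implementation: ``EdgeInfo`` and ``merge_edge_info`` are
hypothetical names introduced only for this example, and the sketch assumes
that merging two records at a time is equivalent to the rule as documented.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as the documented
# DependencyInfo; it is NOT imported from modulegraph.
EdgeInfo = namedtuple("EdgeInfo", ["conditional", "function", "tryexcept", "fromlist"])


def merge_edge_info(first, second):
    """Combine the records for two imports of the same target module."""
    # An import with all three flags false is an unconditional module-level
    # import, so it forces the three flags of the merged record to false.
    unconditional = any(
        not (info.conditional or info.function or info.tryexcept)
        for info in (first, second)
    )
    return EdgeInfo(
        conditional=not unconditional and (first.conditional or second.conditional),
        function=not unconditional and (first.function or second.function),
        tryexcept=not unconditional and (first.tryexcept or second.tryexcept),
        # fromlist stays true only if every import named the target in the
        # "import" list of a "from" import.
        fromlist=first.fromlist and second.fromlist,
    )


in_try = EdgeInfo(conditional=False, function=False, tryexcept=True, fromlist=False)
in_function = EdgeInfo(conditional=False, function=True, tryexcept=False, fromlist=False)
toplevel = EdgeInfo(conditional=False, function=False, tryexcept=False, fromlist=False)

# Imported in a try block and, separately, inside a function.
merged = merge_edge_info(in_try, in_function)
print(merged.tryexcept, merged.function)  # True True

# A third, unconditional toplevel import clears all three flags.
print(merge_edge_info(merged, toplevel))
```

A record like this is what makes it possible to tell an optional dependency
(imported only in a ``try`` block or a function) from an essential one when
reporting on :class:`MissingModule` nodes.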