Nick Coghlan committed 38577d5 Draft

Address the rest of my review comments

  • Parent commits 5b655e6
  • Branches importdocs

Files changed (2)

 
    meta path finder
       A finder returned by a search of :data:`sys.meta_path`.  Meta path
-      finders are related to, but different from :term:`sys path finders <sys
-      path finder>`.
+      finders are related to, but different from :term:`path entry finders
+      <path entry finder>`.
 
    metaclass
       The class of a class.  Class definitions create a class name, a class
    namespace package
       A :pep:`420` :term:`package` which serves only as a container for
       subpackages.  Namespace packages may have no physical representation,
-      and specifically are not like a :term:`regular package` because they
-      have no ``__init__.py`` file.
+      and specifically are not like an :term:`initialized package` because
+      they have no ``__init__.py`` file and thus no code is executed when
+      they are first imported.
 
    nested scope
       The ability to refer to a variable in an enclosing definition.  For
 
    package
       A Python module which can contain submodules or recursively,
-      subpackages.  Technically, a package is a Python module with an
+      subpackages.  Technically, a package is any Python module with a
       ``__path__`` attribute.
 
    path importer
       A built-in :term:`finder` / :term:`loader` that knows how to find and
-      load modules from the file system.
+      load modules from path entries.
 
    portion
       A set of files in a single directory (possibly stored in a zip file)
       :func:`~sys.getrefcount` function that programmers can call to return the
       reference count for a particular object.
 
-   regular package
-      A traditional :term:`package`, such as a directory containing an
-      ``__init__.py`` file.
+   initialized package
+      A traditional :term:`package` that may execute Python code when first
+      imported, such as a directory containing an ``__init__.py`` file.
 
    __slots__
       A declaration inside a class that saves memory by pre-declaring space for
       :meth:`~collections.somenamedtuple._asdict`. Examples of struct sequences
       include :data:`sys.float_info` and the return value of :func:`os.stat`.
 
-   sys path finder
-      A finder returned by a search of :data:`sys.path` by the :term:`path
-      importer`.  Sys path finders are related to, but different from
+   path entry finder
+      A finder returned by a search of :data:`sys.path` or a package
+      ``__path__`` attribute by the :term:`path importer`.  Path entry
+      finders are related to, but different from
       :term:`meta path finders <meta path finder>`.
 
    triple-quoted string

Doc/reference/import_system.rst

 Python code in one :term:`module` gains access to the code in another module
 by the process of :term:`importing` it.  Most commonly, the :keyword:`import`
 statement is used to invoke the import system, but it can also be invoked
-by calling the built-in :func:`__import__` function.
+by calling the built-in :func:`__import__` function. An easier-to-use
+dynamic interface to the import system is also provided as
+:func:`importlib.import_module`.
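As a minimal sketch of the dynamic interface mentioned in the added lines, the following imports a module by its dotted name. The helper name ``load_by_name`` is made up for this illustration; only :func:`importlib.import_module` comes from the text.

```python
import importlib
import sys


def load_by_name(dotted_name):
    """Import a module given its fully qualified name as a string."""
    return importlib.import_module(dotted_name)


# Importing a submodule also imports its parent packages as a side effect.
text_module = load_by_name("email.mime.text")
print(text_module.__name__)
print("email.mime" in sys.modules)
```

Unlike :func:`__import__`, which returns the top level package for a dotted name unless a ``fromlist`` is given, this call returns the named submodule itself.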
 
 The :keyword:`import` statement combines two operations; it searches for the
 named module, then it binds the results of that search to a name in the local
 there are no special side effects (e.g. name binding) associated with
 :func:`__import__`.
 
+When calling :func:`__import__` as part of an import statement, the
+import system first checks the module global namespace for a function by
+that name. If it is not found, then the standard builtin :func:`__import__`
+is called. Other mechanisms for invoking the import system (such as
+:func:`importlib.import_module`) do not perform this check and will always
+use the standard import system.
+
 When a module is first imported, Python searches for the module and if found,
 it creates a module object, initializing it.  If the named module cannot be
 found, an :exc:`ImportError` is raised.  Python implements various strategies
 to search for the named module when the import system is invoked.  These
 strategies can be modified and extended by using various hooks described in
-the sections below.  The entire import system itself can be overridden by
-replacing built-in :func:`__import__`.
+the sections below.
 
 
 Packages
 concept of :term:`packages <package>`.  It's important to keep in mind that
 all packages are modules, but not all modules are packages.  Or put another
 way, packages are just a special kind of module.  Although usually
-unnecessary, introspection of various module object attributes can determine
-whether a module is a package or not.
+unnecessary, it is possible to distinguish packages from other modules by
+introspection: packages are modules with a ``__path__`` attribute (the rules
+for valid path attributes are described :ref:`below <package-path-rules>`).
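The introspection rule stated in the added lines can be sketched as a one-line check. The helper name ``is_package`` is hypothetical; the rule itself (a package is a module with a ``__path__`` attribute) is taken from the text.

```python
import email  # a package in the standard library
import os     # a plain module in the standard library


def is_package(module):
    """Return True if the module object carries a ``__path__`` attribute."""
    return hasattr(module, "__path__")


print(is_package(email))
print(is_package(os))
```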
 
 Packages can contain other packages and modules, while modules generally do
 not contain other modules or packages.  You can think of packages as the
 subpackage called :mod:`email.mime.text`.
 
 
-Regular packages
-----------------
+Initialized packages
+--------------------
 
 .. index::
-    pair: package; regular
+    pair: package; initialized
 
-Python defines two types of packages, :term:`regular packages <regular
-package>` and :term:`namespace packages <namespace package>`.  Regular
-packages are traditional packages as they existed in Python 3.2 and earlier.
-A regular package is typically implemented as a directory containing an
-``__init__.py`` file.  When a regular package is imported, this
-``__init__.py`` file is implicitly imported, and the objects it defines are
-bound to names in the package's namespace.  The ``__init__.py`` file can
+Python defines two types of packages, :term:`initialized packages
+<initialized package>` and :term:`namespace packages <namespace package>`.
+Initialized packages are traditional packages as they existed in Python 3.2
+and earlier. An initialized package is typically implemented as a directory
+containing an ``__init__.py`` file.  When an initialized package is imported,
+this ``__init__.py`` file is implicitly executed, and the objects it defines
+are bound to names in the package's namespace.  The ``__init__.py`` file can
 contain the same Python code that any other module can contain, and Python
 will add some additional attributes to the module when it is imported.
 
+Initialized packages may also act as a kind of namespace package, by setting
+their ``__path__`` attribute appropriately during initialization.
+:func:`pkgutil.extend_path` is one mechanism for achieving that effect.
 
 Namespace packages
 ------------------
 objects on the file system; they may be virtual modules that have no concrete
 representation.
 
+Namespace packages do not use an ordinary list for their ``__path__``
+attribute. They instead use a custom iterable type which will automatically
+perform a new search for package portions on the next import attempt within
+that package if the path of their parent package (or :data:`sys.path` for a
+top level package) changes.
+
+Package Layout Example
+----------------------
+
 For example, the following file system layout defines a top level ``parent``
 package with three subpackages::
 
         three/
             __init__.py
 
-Importing ``parent.one`` will implicitly import ``parent/__init__.py`` and
+Importing ``parent.one`` will implicitly execute ``parent/__init__.py`` and
 ``parent/one/__init__.py``.  Subsequent imports of ``parent.two`` or
 ``parent.three`` will import ``parent/two/__init__.py`` and
 ``parent/three/__init__.py`` respectively.
 first tries to import ``foo``, then ``foo.bar``, and finally ``foo.bar.baz``.
 If any of the intermediate imports fail, an :exc:`ImportError` is raised.
 
-
 The module cache
 ----------------
 
 
 There are actually two types of finders, and two different but related APIs
 for finders, depending on whether it is a :term:`meta path finder` or a
-:term:`sys path finder`.  Meta path processing occurs at the beginning of
-import processing, while sys path processing happens later, by the :term:`path
-importer`.
+:term:`path entry finder`.  Meta path processing occurs at the beginning of
+import processing, while path processing is invoked later, by the :term:`path
+importer` (which, by default, is installed as the last entry in the meta
+path).
 
 The following sections describe the protocol for finders and loaders in more
 detail, including how you can create and register new ones to extend the
 hooks* and *path hooks*.
 
 Meta hooks are called at the start of import processing, before any other
-import processing has occurred.  This allows meta hooks to override
-:data:`sys.path` processing, frozen modules, or even built-in modules.  Meta
-hooks are registered by adding new finder objects to :data:`sys.meta_path`, as
-described below.
+import processing (other than checking the module cache) has occurred.
+This allows meta hooks to override :data:`sys.path` processing,
+frozen modules, or even built-in modules.  Meta hooks are registered
+by adding new finder objects to :data:`sys.meta_path`, as described below.
 
 Path hooks are called as part of :data:`sys.path` (or ``package.__path__``)
 processing, at the point where their associated path item is encountered.
-Path hooks are registered by adding new callables to :data:`sys.path_hooks` as
-described below.
+Path hooks are registered by adding new callables to :data:`sys.path_hooks`
+as described below.
 
 
 The meta path
 searches :data:`sys.meta_path`, which contains a list of meta path finder
 objects.  These finders are queried in order to see if they know how to handle
 the named module.  Meta path finders must implement a method called
-:meth:`find_module()` which takes two arguments, a name and a path.  The meta
-path finder can use any strategy it wants to determine whether it can handle
-the named module or not.
+:meth:`find_module()` which takes two arguments, a name and an import search
+path. The meta path finder can use any strategy it wants to determine whether
+it can handle the named module or not.
 
 If the meta path finder knows how to handle the named module, it returns a
 loader object.  If it cannot handle the named module, it returns ``None``.  If
 
 The :meth:`find_module()` method of meta path finders is called with two
 arguments.  The first is the fully qualified name of the module being
-imported, for example ``foo.bar.baz``.  The second argument is the relative
-path for the module search.  For top-level modules, the second argument is
-``None``, but for submodules or subpackages, the second argument is the value
-of the parent package's ``__path__`` attribute, which must exist or an
+imported, for example ``foo.bar.baz``.  The second argument is the path
+entries to use for the module search.  For top-level modules, the second
+argument is ``None``, but for submodules or subpackages, the second
+argument is the value of the parent package's ``__path__`` attribute. If
+the appropriate ``__path__`` attribute cannot be accessed, an
 :exc:`ImportError` is raised.
 
+The meta path may be traversed multiple times for a single import request.
+For example, assuming none of the modules involved has already been cached,
+importing ``foo.bar.baz`` will first perform a top level import, calling
+``.find_module("foo", None)`` on each meta path finder. After ``foo`` has
+been imported, ``foo.bar`` will be imported by traversing the meta path a
+second time, calling ``.find_module("foo.bar", foo.__path__)``. Once
+``foo.bar`` has been imported, the final traversal will call
+``.find_module("foo.bar.baz", foo.bar.__path__)``.
+
+Some meta path finders only support top level imports. These importers will
+always return ``None`` when anything other than ``None`` is passed as the
+second argument.
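The repeated traversal described in the added lines can be observed with a finder that records each query and never claims a module. This is a sketch under stated assumptions: the text describes ``find_module()``, while recent Python releases query meta path finders through ``find_spec()`` with the same name and path arguments, so the sketch implements that method. The package name ``hypothetical_foo`` and the helper ``trace_import`` are made up for illustration.

```python
import importlib
import os
import sys
import tempfile


class RecordingFinder:
    """Meta path finder that logs every query and always declines."""

    def __init__(self):
        self.calls = []

    def find_spec(self, name, path=None, target=None):
        # Copy the path so later changes to ``__path__`` do not alter the log.
        self.calls.append((name, None if path is None else list(path)))
        return None  # let the default finders handle the import


def trace_import(dotted_name):
    """Create empty nested packages for dotted_name and log the lookups."""
    top = dotted_name.split(".")[0]
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = root
        for part in dotted_name.split("."):
            pkg_dir = os.path.join(pkg_dir, part)
            os.makedirs(pkg_dir)
            open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        finder = RecordingFinder()
        sys.meta_path.insert(0, finder)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(dotted_name)
        finally:
            sys.meta_path.remove(finder)
            sys.path.remove(root)
            for name in list(sys.modules):
                if name == top or name.startswith(top + "."):
                    del sys.modules[name]
    # Keep only the queries that belong to the temporary package tree.
    return [c for c in finder.calls
            if c[0] == top or c[0].startswith(top + ".")]


for name, path in trace_import("hypothetical_foo.bar.baz"):
    print(name, path)
```

The top level lookup receives ``None`` as its path, while each nested lookup receives the ``__path__`` of the package imported in the previous step.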
+
 Python's default :data:`sys.meta_path` has three meta path finders, one that
 knows how to import built-in modules, one that knows how to import frozen
-modules, and one that knows how to import modules from the file system
-(i.e. the :term:`path importer`).
+modules, and one that knows how to import modules based on path entries
+(i.e. the :term:`path importer`). As the meta path is processed in order,
+new finders inserted at the beginning of the list will be accessed before
+the default machinery, while those appended to the end of the list will only
+be accessed if a module is not found by the standard lookup process.
 
 
-Meta path loaders
------------------
+Module loaders
+--------------
 
 Once a loader is found via a meta path finder, the loader's
 :meth:`load_module()` method is called, with a single argument, the fully
 qualified name of the module being imported.  This method has several
-responsibilities, and should return the module object it has loaded [#fn1]_.
+responsibilities, and should return the module object it has loaded.
 If it cannot load the module, it should raise an :exc:`ImportError`, although
 any other exception raised during :meth:`load_module()` will be propagated.
 
-In many cases, the meta path finder and loader can be the same object,
+While loaders are expected to return the module they created, the import
+system will actually retrieve the object to ultimately be returned from the
+import operation from the module cache. The reason for this is to ensure
+that the same object is returned for both the initial and subsequent imports
+when the module code overwrites the module's own entry in the module cache.
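The cache behaviour described in the added paragraph can be illustrated with a module that overwrites its own ``sys.modules`` entry while it executes. This is a sketch only: the module name ``hypothetical_self_replacing`` and the ``Replacement`` class are invented for the example, and the module source is written to a temporary directory.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "hypothetical_self_replacing"  # made-up name for this sketch
SOURCE = (
    "import sys\n"
    "class Replacement:\n"
    "    marker = 'replaced'\n"
    "sys.modules[__name__] = Replacement()\n"
)


def import_self_replacing_module():
    """Import the module twice and return both results."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, MODULE_NAME + ".py"), "w") as f:
            f.write(SOURCE)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            first = importlib.import_module(MODULE_NAME)
            second = importlib.import_module(MODULE_NAME)
        finally:
            sys.path.remove(root)
            sys.modules.pop(MODULE_NAME, None)
    return first, second


first, second = import_self_replacing_module()
print(first is second, first.marker)
```

Both imports yield the object the module stored in the cache, not the module object the loader originally created.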
+
+.. Re previous paragraph: Isn't process global state fun?
+
+In many cases, a meta path finder may also be a module loader,
 e.g. :meth:`finder.find_module()` would just return ``self``.
 
 Loaders must satisfy the following requirements:
    beforehand prevents unbounded recursion in the worst case and multiple
    loading in the best.
 
-   If the load fails, the loader needs to remove any modules it may have
-   inserted into ``sys.modules``.  If the module was already in
+   If the load fails, the loader needs to remove any modules it inserted
+   directly into ``sys.modules`` (other modules that were successfully
+   imported as a side effect may remain cached).  If a module was already in
    ``sys.modules`` then the loader should leave it alone.
 
  * The loader may set the ``__file__`` attribute of the module.  If set, this
 
 Here are the exact rules used:
 
- * If the module has an ``__loader__`` and that loader has a
+ * If the module has a ``__loader__`` and that loader has a
    :meth:`module_repr()` method, call it with a single argument, which is the
    module object.  The value returned is used as the module's repr.
 
             return "<module '{}' (namespace)>".format(module.__name__)
 
 
+.. _package-path-rules:
+
 module.__path__
 ---------------
 
 However, ``__path__`` is typically much more constrained than
 :data:`sys.path`.
 
-``__path__`` must be a list, but it may be empty.  The same rules used for
-:data:`sys.path` also apply to a package's ``__path__``, and
-:data:`sys.path_hooks` (described below) are consulted when traversing a
-package's ``__path__``.
+``__path__`` must be an iterable sequence of strings, but it may be empty.
+The same rules used for :data:`sys.path` also apply to a package's
+``__path__``, and :data:`sys.path_hooks` (described below) are
+consulted when traversing a package's ``__path__``.
 
 A package's ``__init__.py`` file may set or alter the package's ``__path__``
 attribute, and this was typically the way namespace packages were implemented
     single: path importer
 
 As mentioned previously, Python comes with several default meta path finders.
-One of these, called the :term:`path importer`, knows how to provide
-traditional file system imports.  It implements all the semantics for finding
+One of these, called the :term:`path importer`, is designed to handle the
+path entries found in ``sys.path`` and package ``__path__`` attributes.
+
+The path importer itself doesn't know how to import anything. Instead, it
+traverses the individual path entries, associating each of them with a
+path entry finder that knows how to handle that particular kind of path.
+
+The default set of path entry finders implement all the semantics for finding
 modules on the file system, handling special file types such as Python source
 code (``.py`` files), Python byte code (``.pyc`` and ``.pyo`` files) and
-shared libraries (e.g. ``.so`` files).
+shared libraries (e.g. ``.so`` files). When supported by the standard
+library, path entry finders also handle loading all of these file types
+(other than shared libraries) from zipfiles.
 
 In addition to being able to find such modules, there is built-in support for
 loading these modules.  To accomplish these two related tasks, additional
 
 A word of warning: this section and the previous both use the term *finder*,
 distinguishing between them by using the terms :term:`meta path finder` and
-:term:`sys path finder`.  Meta path finders and sys path finders are very
+:term:`path entry finder`.  Meta path finders and path entry finders are very
 similar, support similar protocols, and function in similar ways during the
 import process, but it's important to keep in mind that they are subtly
 different.  In particular, meta path finders operate at the beginning of the
 import process, as keyed off the :data:`sys.meta_path` traversal.
 
-On the other hand, sys path finders are in a sense an implementation detail of
-the path importer, and in fact, if the path importer were to be removed from
-:data:`sys.meta_path`, none of the sys path finder semantics would be invoked.
+On the other hand, path entry finders are in a sense an implementation detail of
+the path importer. If the path importer is removed from :data:`sys.meta_path`,
+none of the path entry finder semantics are invoked.
 
 
-sys path finders
-----------------
+path entry finders
+------------------
 
 .. index::
     single: sys.path
     single: PYTHONPATH
 
 The path importer is responsible for finding and loading Python modules and
-packages from the file system.  As a meta path finder, it implements the
+packages from path entries. As a meta path finder, it implements the
 :meth:`find_module()` protocol previously described, however it exposes
 additional hooks that can be used to customize how modules are found and
 loaded from the file system.
 
-Three variables are used during file system import, :data:`sys.path`,
-:data:`sys.path_hooks` and :data:`sys.path_importer_cache`.  These provide
-additional ways that the import system can be customized, in this case
-specifically during file system path import.
+Three additional variables are used for path imports, :data:`sys.path`,
+:data:`sys.path_hooks` and :data:`sys.path_importer_cache`.
+These provide additional ways that the import system can be customized, in
+this case specifically during file system path import.
 
 :data:`sys.path` contains a list of strings providing search locations for
-modules and packages.  It is initialized from the :data:`PYTHONPATH`
+top level modules and packages.  It is initialized from the :data:`PYTHONPATH`
 environment variable and various other installation- and
 implementation-specific defaults.  Entries in :data:`sys.path` can name
 directories on the file system, zip files, and potentially other "locations"
 that should be searched for modules.
 
-The path importer is a meta path finder, so the import system begins file
-system search by calling the path importer's :meth:`find_module()` method as
+The path importer is a meta path finder, so the import system begins path
+entry search by calling the path importer's :meth:`find_module()` method as
 described previously.  When the ``path`` argument to :meth:`find_module()` is
-given, it will be a list of string paths to traverse.  If not,
-:data:`sys.path` is used.
+given, it will be a list of string paths to traverse for an import within a
+package.  If not given (or ``None``), :data:`sys.path` is used to perform
+a top level import.
 
 The path importer iterates over every entry in the search path, and for each
-of these, searches for an appropriate sys path finder for the path entry.
+of these, searches for an appropriate path entry finder for the path entry.
 Because this can be an expensive operation (e.g. there are ``stat()`` call
 overheads for this search), the path importer maintains a cache mapping path
-entries to sys path finders.  This cache is maintained in
+entries to path entry finders.  This cache is maintained in
 :data:`sys.path_importer_cache`.  In this way, the expensive search for a
-particular path location's sys path finder need only be done once.  User code
+particular path location's path entry finder need only be done once.  User code
 is free to remove cache entries from :data:`sys.path_importer_cache` forcing
-the path importer to perform the path search again.
+the path importer to perform the path search again. The helper function
+:func:`importlib.invalidate_caches` clears this cache, as well as other
+internal caches within the various finder and loader implementations.
+
+If the path entry is present in the cache, but refers to ``None``, it
+indicates that no finder is able to handle that path entry. To
+maintain backwards compatibility, :class:`imp.NullImporter` also indicates
+that there is no finder able to handle that path entry. (In versions of
+Python prior to 3.3, an entry of ``None`` indicated "fall back to the
+implicit default import system". This is no longer needed, as the
+entire import system now operates using the standard import protocols.)
 
 If the path entry is not present in the cache, the path importer iterates over
 every callable in :data:`sys.path_hooks`.  Each entry in this list is called
 with a single argument, the path entry being searched.  This callable may
-either return a sys path finder that can handle the path entry, or it may
+either return a path entry finder that can handle the path entry, or it may
 raise :exc:`ImportError`.  An :exc:`ImportError` is used by the path importer
-to signal that the hook cannot find a sys path finder for that path entry.
+to signal that the hook cannot find a path entry finder for that path entry.
 The exception is ignored and :data:`sys.path_hooks` iteration continues.
 
-If :data:`sys.path_hooks` iteration ends with no sys path finder being
-returned then the path importer's :meth:`find_module()` method will return
-``None`` and an :exc:`ImportError` will be raised.
+If :data:`sys.path_hooks` iteration ends with no path entry finder being
+returned then ``None`` will be stored in the path importer cache for that
+path entry, the path importer's :meth:`find_module()` method will return
+``None`` and meta path processing will continue.
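The hook and cache interaction described above can be sketched with a path hook that declines every entry. The entry string, the module name and the helper ``probe`` are hypothetical, and the sketch assumes a recent Python release in which the path importer is exposed as ``importlib.machinery.PathFinder`` with a ``find_spec()`` class method.

```python
import sys
from importlib.machinery import PathFinder

ENTRY = "hypothetical://demo-entry"  # not a directory or zip file


def probe():
    """Look up a missing module twice on a path entry no hook accepts."""
    seen = []

    def declining_hook(path_entry):
        # Raising ImportError tells the path importer to try the next hook.
        seen.append(path_entry)
        raise ImportError("no finder for this entry")

    sys.path_hooks.append(declining_hook)
    sys.path_importer_cache.pop(ENTRY, None)
    try:
        first = PathFinder.find_spec("hypothetical_missing_module", [ENTRY])
        second = PathFinder.find_spec("hypothetical_missing_module", [ENTRY])
        cached = sys.path_importer_cache.get(ENTRY, "absent")
    finally:
        sys.path_hooks.remove(declining_hook)
        sys.path_importer_cache.pop(ENTRY, None)
    return first, second, cached, seen


first, second, cached, seen = probe()
print(first, second, cached, seen)
```

The hook is consulted only on the first lookup. After that, the ``None`` stored in :data:`sys.path_importer_cache` short-circuits the search for that entry.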
 
-If a sys path finder *is* returned by one of the callables on
-:data:`sys.path_hooks`, then the following protocol is used to ask the sys
-path finder for a module loader.  If a loader results from this step, it is
-used to load the module as previously described (i.e. its
+If a path entry finder *is* returned by one of the callables on
+:data:`sys.path_hooks`, then the following protocol is used to ask the
+path entry finder for a module loader.  If a loader results from this step,
+it is used to load the module as previously described (i.e. its
 :meth:`load_module()` method is called).
 
 
-sys path finder protocol
-------------------------
+path entry finder protocol
+--------------------------
 
-sys path finders support the same, traditional :meth:`find_module()` method
-that meta path finders support, however sys path finder :meth:`find_module()`
-methods are never called with a ``path`` argument.
-
-The :meth:`find_module()` method on sys path finders is deprecated though, and
-instead sys path finders should implement the :meth:`find_loader()` method.
-If it exists on the sys path finder, :meth:`find_loader()` will always be
-called instead of :meth:`find_module()`.
+In order to support imports of modules and initialized packages and also to
+contribute portions to namespace packages, path entry finders must implement
+the :meth:`find_loader()` method.
 
 :meth:`find_loader()` takes one argument, the fully qualified name of the
 module being imported.  :meth:`find_loader()` returns a 2-tuple where the
 first item is the loader and the second item is a namespace :term:`portion`.
 When the first item (i.e. the loader) is ``None``, this means that while the
-sys path finder does not have a loader for the named module, it knows that the
+path entry finder does not have a loader for the named module, it knows that the
 path entry contributes to a namespace portion for the named module.  This will
 almost always be the case where Python is asked to import a namespace package
-that has no physical presence on the file system.  When a sys path finder
+that has no physical presence on the file system.  When a path entry finder
 returns ``None`` for the loader, the second item of the 2-tuple return value
 must be a sequence, although it can be empty.
 
 If :meth:`find_loader()` returns a non-``None`` loader value, the portion is
 ignored and the loader is returned from the path importer, terminating the
-:data:`sys.path` search.
+search through the path entries.
+
+For backwards compatibility, path entry finders also support the same,
+traditional :meth:`find_module()` method that meta path finders support,
+however path entry finder :meth:`find_module()` methods are never called
+with a ``path`` argument (they are expected to record the appropriate
+path information from the initial call to the path hook).
+
+The :meth:`find_module()` method on path entry finders is deprecated, however,
+as it does not allow the path entry finder to contribute portions to
+namespace packages. Instead, path entry finders should implement the
+:meth:`find_loader()` method as described above. If it exists on the path
+entry finder, :meth:`find_loader()` will always be called in preference to
+:meth:`find_module()`.
+
+
+Replacing the standard import system
+====================================
+
+The most reliable mechanism for replacing the entire import system is to
+delete the default contents of :data:`sys.meta_path`, replacing them
+entirely with a custom meta path hook.
+
+If it is acceptable to only alter the behaviour of import statements
+without affecting other APIs that access the import system, then replacing
+the builtin :func:`__import__` function may be sufficient. This may also
+be employed at the module level to only alter the behaviour of import
+statements within that module.
+
+To selectively prevent import of some modules from a hook early on the
+meta path (rather than disabling the standard import system entirely),
+it is sufficient to raise :exc:`ImportError` directly from
+:meth:`find_module` instead of returning ``None`` to indicate that the meta
+path search should continue.
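The selective blocking approach from the last added paragraph might look like the following sketch. The text refers to ``find_module()``; recent Python releases call ``find_spec()`` on meta path finders instead, so the sketch implements that method. ``BlockingFinder``, ``try_blocked_import`` and the module name are made up for the example.

```python
import importlib
import sys


class BlockingFinder:
    """Meta path finder that refuses a fixed set of module names."""

    def __init__(self, blocked_names):
        self.blocked_names = frozenset(blocked_names)

    def find_spec(self, name, path=None, target=None):
        if name in self.blocked_names:
            # Raising stops the meta path search instead of continuing it.
            raise ImportError(
                "import of {!r} is blocked".format(name), name=name)
        return None  # defer to the remaining meta path finders


def try_blocked_import(name):
    """Attempt an import with the blocker installed; return the error text."""
    finder = BlockingFinder([name])
    sys.meta_path.insert(0, finder)
    try:
        importlib.import_module(name)
    except ImportError as exc:
        return str(exc)
    finally:
        sys.meta_path.remove(finder)
    return None


print(try_blocked_import("hypothetical_blocked_module"))
```

Because the finder sits at the front of :data:`sys.meta_path`, the default finders are never asked about the blocked name.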
 
 
 Open issues
 ===========
 
-XXX What to say about `imp.NullImporter` when it's found in
-:data:`sys.path_importer_cache`?
-
 XXX It would be really nice to have a diagram.
 
-.. [#fn1] The importlib implementation appears not to use the return value
-   directly. Instead, it gets the module object by looking the module name up
-   in ``sys.modules``.)
-
 
 References
 ==========
 without ``__init__.py`` files in Python 3.3.  :pep:`420` also introduced the
 :meth:`find_loader` protocol as an alternative to :meth:`find_module`.
 
+:pep:`328` describes the elimination of implicit relative imports and
+introduction of explicit relative imports.
+
 :pep:`366` describes the addition of the ``__package__`` attribute for
 explicit relative imports in main modules.