================
Developers guide
================

This document describes in detail the classes implemented for PEP 376. It is
aimed at developers who want to implement their own packaging system.

.. contents::

Classes
=======

The API is organized in five classes that work with directories and Zip files,
so it also works with files included in Zip files (see PEP 273 for more
details [#pep273]_).

- ``Distribution``: manages an `.egg-info` directory.
- ``ZippedDistribution``: manages an `.egg-info` directory contained in a zip
  file.
- ``DistributionDir``: manages a directory that contains some `.egg-info`
  directories.
- ``ZippedDistributionDir``: manages a zipped directory that contains
  some `.egg-info` directories.
- ``DistributionDirMap``: manages ``DistributionDir`` instances.

Distribution class
------------------

A new class called ``Distribution`` is created with the path of the
`.egg-info` directory provided to the constructor. It reads the metadata
contained in `PKG-INFO` when it is instantiated.

``Distribution(path)`` -> instance

  Creates a ``Distribution`` instance for the given ``path``.

``Distribution`` provides the following attributes:

- ``name``: The name of the distribution.

- ``metadata``: A ``DistributionMetadata`` instance loaded with the
  distribution's PKG-INFO file.

And the following methods:

- ``get_installed_files(local=False)`` -> iterator of (path, md5, size)

  Iterates over the `RECORD` entries and returns a tuple ``(path, md5, size)``
  for each line. If ``local`` is ``True``, the path is transformed into a
  local absolute path. Otherwise the raw value from `RECORD` is returned.

  A local absolute path is an absolute path in which occurrences of '/'
  have been replaced by the system separator given by ``os.sep``.

- ``uses(path)`` -> Boolean

  Returns ``True`` if ``path`` is listed in `RECORD`. ``path``
  can be a local absolute path or a relative '/'-separated path.

- ``get_egginfo_file(path, binary=False)`` -> file object

  Returns a ``file`` instance for the file pointed to by ``path``, which must
  be located under the `.egg-info` directory.

  ``path`` has to be a '/'-separated path relative to the `.egg-info`
  directory or an absolute path.

  If ``path`` is an absolute path and doesn't start with the `.egg-info`
  directory path, a ``DistutilsError`` is raised.

  If ``binary`` is ``True``, opens the file in read-only binary mode (``rb``),
  otherwise opens it in read-only mode (``r``).

- ``get_egginfo_files(local=False)`` -> iterator of paths

  Iterates over the `RECORD` entries and returns the path of each entry that
  points to a file located in the `.egg-info` directory or one of its
  subdirectories.

  If ``local`` is ``True``, each path is transformed into a
  local absolute path. Otherwise the raw value from `RECORD` is returned.
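
The path handling described above can be sketched as follows. This is only an
illustration, not the PEP 376 implementation: the helper names
``to_local_path``, ``iter_record`` and ``uses`` and the ``base`` argument are
hypothetical, and the sketch assumes that `RECORD` is a comma-separated file
whose relative paths are resolved against ``base``::

    import csv
    import os


    def to_local_path(path, base):
        """Turn a '/'-separated RECORD path into a local absolute path."""
        local = path.replace('/', os.sep)
        # Assumption: relative paths are resolved against ``base``.
        if not os.path.isabs(local):
            local = os.path.join(base, local)
        return os.path.normpath(local)


    def iter_record(lines, base, local=False):
        """Yield (path, md5, size) for each RECORD line."""
        for row in csv.reader(lines):
            if not row:
                continue
            path, md5, size = row
            if local:
                path = to_local_path(path, base)
            # Empty fields (such as RECORD's own entry) become None.
            yield path, md5 or None, int(size) if size else None


    def uses(lines, base, path):
        """Return True if ``path`` is listed in the RECORD lines."""
        wanted = to_local_path(path, base)
        return any(entry == wanted
                   for entry, _, _ in iter_record(lines, base, local=True))

Normalizing both the queried path and the recorded paths to the same local
form is what lets ``uses`` accept either a relative '/'-separated path or a
local absolute path.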

ZippedDistribution class
------------------------

A ``ZippedDistribution`` class is provided. It subclasses ``Distribution``
and overrides its methods so they work with an `.egg-info` directory located
in a zip file.

``ZippedDistribution(zipfile, path)`` -> instance

  Creates a ``ZippedDistribution`` instance for the given relative ``path``
  located in the ``zipfile`` file.

Other public methods and attributes are similar to those of ``Distribution``.

DistributionDir class
---------------------

A new class called ``DistributionDir`` is created with a path
corresponding to a directory. For each `.egg-info` directory found in
``path``, the class creates a corresponding ``Distribution``.

The class is a ``set`` of ``Distribution`` instances. ``DistributionDir``
provides a ``path`` attribute corresponding to the path it was created with.

``DistributionDir(path)`` -> instance

  Creates a ``DistributionDir`` instance for the given ``path``.

It also provides one extra method besides the ones from ``set``:

- ``get_file_users(path)`` -> Iterator of ``Distribution``.

  Returns all ``Distribution`` instances that use ``path``, by calling
  ``Distribution.uses(path)`` on every ``Distribution`` in the set.
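
A ``set`` subclass with this extra method could look like the sketch below.
Both class names are hypothetical: ``FakeDistribution`` is a stand-in that only
exposes ``name`` and ``uses()``, and ``DistributionDirSketch`` receives its
distributions directly instead of scanning a directory::

    class FakeDistribution(object):
        """Hypothetical stand-in exposing only ``name`` and ``uses()``."""

        def __init__(self, name, files):
            self.name = name
            self._files = frozenset(files)

        def uses(self, path):
            return path in self._files


    class DistributionDirSketch(set):
        """A set of distributions that remembers the path it was built for."""

        def __init__(self, path, distributions=()):
            super(DistributionDirSketch, self).__init__(distributions)
            self.path = path

        def get_file_users(self, path):
            # Delegate to uses() on every distribution in the set.
            for dist in self:
                if dist.uses(path):
                    yield dist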


ZippedDistributionDir class
---------------------------

A ``ZippedDistributionDir`` class is provided. It subclasses
``DistributionDir`` and overrides its methods so they work with a Zip file.

``ZippedDistributionDir(path)`` -> instance

  Creates a ``ZippedDistributionDir`` instance for the given ``path``.

Other public methods and attributes are similar to those of ``DistributionDir``.


DistributionDirMap class
------------------------

A new class called ``DistributionDirMap`` is created. It's a collection of
``DistributionDir`` and ``ZippedDistributionDir`` instances.

``DistributionDirMap(paths=None, use_cache=True)`` -> instance

  If ``paths`` is not ``None``, it's a sequence of paths that the constructor
  loads into the instance.

  The constructor also takes an optional ``use_cache`` argument.
  When it's ``True``, ``DistributionDirMap`` will use a global
  cache to reduce the number of I/O accesses and speed up the lookups.

  The cache is a global mapping containing ``DistributionDir`` and
  ``ZippedDistributionDir`` instances. When a
  ``DistributionDirMap`` object is created, it will use the cache to
  add an entry for each path it visits, or reuse existing entries. The
  cache usage can be disabled at any time with the ``use_cache`` attribute.

  The cache can also be emptied with the global ``purge_cache`` function.

The class is a ``dict`` where the values are ``DistributionDir``
and ``ZippedDistributionDir`` instances and the keys are their path
attributes.

``DistributionDirMap`` also provides the following methods besides the ones
from ``dict``:

- ``load(*paths)``

  Creates and adds ``DistributionDir`` (or
  ``ZippedDistributionDir``) instances corresponding to ``paths``.

- ``reload()``

  Reloads existing entries.

- ``get_distributions()`` -> Iterator of ``Distribution`` (or
  ``ZippedDistribution``) instances.

  Iterates over all ``Distribution`` and ``ZippedDistribution`` contained
  in ``DistributionDir`` and ``ZippedDistributionDir`` instances.

- ``get_distribution(dist_name)`` -> ``Distribution`` (or
  ``ZippedDistribution``) or None.

  Returns a ``Distribution`` (or ``ZippedDistribution``) instance for the
  given distribution name. If not found, returns None.

- ``get_file_users(path)`` -> Iterator of ``Distribution`` (or
  ``ZippedDistribution``) instances.

  Iterates over all distributions to find out which distributions use the file.
  Returns ``Distribution`` (or ``ZippedDistribution``) instances.
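
The caching behaviour can be illustrated with a small ``dict`` subclass. This
is a sketch under stated assumptions, not the actual class: the names
``_CACHE``, ``_empty_dir`` and ``DistributionDirMapSketch`` are hypothetical,
the ``factory`` argument stands in for the ``DistributionDir`` and
``ZippedDistributionDir`` constructors, and ``reload()`` is assumed to discard
the cached entries before rebuilding them::

    _CACHE = {}  # hypothetical global cache: path -> directory object


    def purge_cache():
        """Empty the global cache."""
        _CACHE.clear()


    def _empty_dir(path):
        """Placeholder factory; a real one would scan ``path``."""
        return set()


    class DistributionDirMapSketch(dict):
        """Maps each path to the directory object built for it."""

        def __init__(self, paths=None, use_cache=True, factory=_empty_dir):
            super(DistributionDirMapSketch, self).__init__()
            self.use_cache = use_cache
            self._factory = factory
            if paths is not None:
                self.load(*paths)

        def load(self, *paths):
            for path in paths:
                if self.use_cache and path in _CACHE:
                    self[path] = _CACHE[path]  # reuse, no new I/O
                    continue
                self[path] = entry = self._factory(path)
                if self.use_cache:
                    _CACHE[path] = entry

        def reload(self):
            paths = list(self)
            for path in paths:
                _CACHE.pop(path, None)
            self.clear()
            self.load(*paths)

        def get_distributions(self):
            for directory in self.values():
                for dist in directory:
                    yield dist

Because the cache is keyed by path, two maps created with caching enabled
share the same directory object for a given path.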

Query Functions
===============

The new functions added in ``pkgutil`` to query installed distributions are:

- ``get_distributions()`` -> iterator of ``Distribution`` (or
  ``ZippedDistribution``) instances.

  Provides an iterator that looks for ``.egg-info`` directories in ``sys.path``
  and returns ``Distribution`` (or ``ZippedDistribution``) instances for
  each one of them.

- ``get_distribution(name)`` -> ``Distribution`` (or ``ZippedDistribution``)
  or None.

  Scans all elements in ``sys.path`` and looks for all directories ending with
  ``.egg-info``. Returns a ``Distribution`` (or ``ZippedDistribution``)
  corresponding to the ``.egg-info`` directory whose PKG-INFO file has a
  ``name`` metadata field equal to ``name``.

  Notice that there should be at most one match. If several are found, the
  first one is returned. If no matching directory is found, returns None.

- ``get_file_users(path)`` -> iterator of ``Distribution`` (or
  ``ZippedDistribution``) instances.

  Iterates over all distributions to find out which distributions use ``path``.
  ``path`` can be a local absolute path or a relative '/'-separated path.

All these functions share the same global instance of ``DistributionDirMap``
so that they benefit from the cache. Notice that the cache is never emptied
explicitly.
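
The directory scan these functions rely on can be sketched as follows. The
function name ``find_egg_info_dirs`` is hypothetical, and the sketch only
handles plain directories: entries that are Zip files or that do not exist
are skipped::

    import os
    import sys


    def find_egg_info_dirs(paths=None):
        """Yield the path of every ``.egg-info`` directory in ``paths``."""
        if paths is None:
            paths = sys.path
        for entry in paths:
            # An empty entry in sys.path means the current directory.
            entry = entry or os.curdir
            if not os.path.isdir(entry):
                continue  # Zip files and missing entries are skipped here
            for name in sorted(os.listdir(entry)):
                candidate = os.path.join(entry, name)
                if name.endswith('.egg-info') and os.path.isdir(candidate):
                    yield candidate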

Example
-------

Let's use some of the new APIs with our `docutils` example::

    >>> from pkgutil import get_distribution, get_file_users
    >>> dist = get_distribution('docutils')
    >>> dist.name
    'docutils'
    >>> dist.metadata.version
    '0.5'

    >>> for path, md5, size in dist.get_installed_files():
    ...     print '%s %s %s' % (path, md5, size)
    ...
    docutils/__init__.py b690274f621402dda63bf11ba5373bf2 9544
    docutils/core.py 9c4b84aff68aa55f2e9bf70481b94333 66188
    roman.py a4b84aff68aa55f2e9bf70481b943D3 234
    /usr/local/bin/rst2html.py a4b84aff68aa55f2e9bf70481b943D3 234
    docutils-0.5-py2.6.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
    docutils-0.5-py2.6.egg-info/RECORD None None

    >>> dist.uses('docutils/core.py')
    True

    >>> dist.uses('/usr/local/bin/rst2html.py')
    True

    >>> dist.get_egginfo_file('PKG-INFO')
    <open file at ...>

Uninstall function
==================

Distutils already provides a very basic way to install a distribution, which
is running the `install` command over the `setup.py` script of the
distribution.

Distutils will provide a very basic ``uninstall`` function, that will be added
in ``distutils.util`` and will take the name of the distribution to uninstall
as its argument. ``uninstall`` will use the APIs described earlier and remove
all files that are not used by another distribution, as long as their hash has
not changed. Then it will remove empty directories left behind.

``uninstall`` will return a list of uninstalled files::

    >>> from distutils.util import uninstall
    >>> uninstall('docutils')
    ['/opt/local/lib/python2.6/site-packages/docutils/core.py',
     ...
     '/opt/local/lib/python2.6/site-packages/docutils/__init__.py']

If the distribution is not found, a ``DistutilsUninstallError`` will be raised.
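
The removal rule can be sketched as below. This is a simplified illustration
and not the Distutils implementation: the names ``md5_of`` and
``remove_unchanged`` are hypothetical, entries are assumed to be
``(absolute path, md5)`` pairs, the set of files shared with other
distributions is passed in by the caller, and only the directories that
directly contained a removed file are checked for emptiness::

    import hashlib
    import os


    def md5_of(path):
        """Return the hexadecimal MD5 digest of a file's content."""
        with open(path, 'rb') as stream:
            return hashlib.md5(stream.read()).hexdigest()


    def remove_unchanged(entries, shared=()):
        """Remove recorded files and return the list of removed paths."""
        removed = []
        for path, md5 in entries:
            if path in shared or not os.path.isfile(path):
                continue  # used by another distribution, or already gone
            if md5 is not None and md5_of(path) != md5:
                continue  # modified since installation: leave it alone
            os.remove(path)
            removed.append(path)
        # Remove directories left empty, deepest first.
        parents = set(os.path.dirname(path) for path in removed)
        for directory in sorted(parents, key=len, reverse=True):
            if not os.listdir(directory):
                os.rmdir(directory)
        return removed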

Filtering
---------

To make it a reference API for third-party projects that wish to control
how ``uninstall`` works, a callable can be passed as a second argument. It
will be called for each file that is about to be removed. If the callable
returns ``True``, the file will be removed. If it returns ``False``, it will be
left alone.

Examples::

    >>> import logging
    >>> def _remove_and_log(path):
    ...     logging.info('Removing %s' % path)
    ...     return True
    ...
    >>> uninstall('docutils', _remove_and_log)

    >>> def _dry_run(path):
    ...     logging.info('Removing %s (dry run)' % path)
    ...     return False
    ...
    >>> uninstall('docutils', _dry_run)

Of course, a third-party tool can use ``pkgutil`` APIs to implement
its own uninstall feature.

Installer marker
----------------

As explained in PEP 376, the `install` command adds an `INSTALLER`
file in the `.egg-info` directory with the name of the installer.

To avoid removing distributions that were installed by another packaging system,
the ``uninstall`` function takes an extra argument ``installer`` which defaults
to ``distutils``.

When called, ``uninstall`` will check that the ``INSTALLER`` file matches
this argument. If not, it will raise a ``DistutilsUninstallError``::

    >>> uninstall('docutils')
    Traceback (most recent call last):
    ...
    DistutilsUninstallError: docutils was installed by 'cool-pkg-manager'

    >>> uninstall('docutils', installer='cool-pkg-manager')

This allows a third-party application to use the ``uninstall`` function
and make sure it's the only program that can remove a distribution it has
previously installed. This is useful when a third-party program that relies
on Distutils APIs performs extra steps on the system at installation time
that it has to undo at uninstallation time.
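
The installer check described above could be implemented along these lines.
This is a minimal sketch: ``UninstallError`` is a hypothetical stand-in for
``DistutilsUninstallError``, ``check_installer`` is a hypothetical helper, and
the `INSTALLER` file is assumed to contain nothing but the installer name::

    import os


    class UninstallError(Exception):
        """Hypothetical stand-in for ``DistutilsUninstallError``."""


    def check_installer(egginfo_dir, name, installer='distutils'):
        """Raise UninstallError if another installer owns the distribution."""
        marker = os.path.join(egginfo_dir, 'INSTALLER')
        with open(marker) as stream:
            found = stream.read().strip()
        if found != installer:
            raise UninstallError('%s was installed by %r' % (name, found))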