Commits

Anonymous committed fbf7c9b

latest revision

Files changed (1)

 PEP: 376
 Title: Changing the .egg-info structure
-Version: $Revision: 72780 $
-Last-Modified: $Date: 2009-05-19 14:43:34 +0200 (Mar, 19 mai 2009) $
+Version: $Revision: 72911 $
+Last-Modified: $Date: 2009-05-25 12:22:46 +0200 (Lun, 25 mai 2009) $
 Author: Tarek Ziadé <tarek@ziade.org>
 Status: Draft
 Type: Standards Track
 
 - A new format for the .egg-info structure.
 - Some APIs to read the meta-data of a project
+- Replace PEP 262
+- An uninstall feature
 
 Definitions
 ===========
 
 A **project** is a Python application composed of one or several files, which can
-be Python modules, extensions or data. It is distributed using a `setup.py` script 
-with Distutils and/or Setuptools. The `setup.py` script indicates where each 
+be Python modules, extensions or data. It is distributed using a `setup.py` script
+with Distutils and/or Setuptools. The `setup.py` script indicates where each
 element should be installed.
 
 Once installed, the elements are located in various places in the system, like:
 
-- in Python's site-packages (Python modules, Python modules organized into packages, 
+- in Python's site-packages (Python modules, Python modules organized into packages,
   Extensions, etc.)
 - in Python's `include` directory.
 - in Python's `bin` or `Scripts` directory.
 How projects are installed
 --------------------------
 
-Right now, when a project is installed in Python, every elements its contains 
-is installed in various directories. 
+Right now, when a project is installed in Python, every element it contains
+is installed in various directories.
 
 The pure Python code for instance is installed in the `purelib` directory,
 which is located in the Python installation in `lib/python2.6/site-packages`
-for example under unix-like systems or Mac OS X, and in `Lib/site-packages` 
+for example under unix-like systems or Mac OS X, and in `Lib/site-packages`
 under Windows. This is done with the Distutils `install` command, which calls
 various subcommands.
 
-The `install_egg_info` subcommand is called during this process, in order to 
+The `install_egg_info` subcommand is called during this process, in order to
 create an `.egg-info` file in the `purelib` directory.
 
 For example, if the `zlib` project (which contains one package) is installed,
 Where `zlib` is a Python package, and `zlib-2.5.2-py2.4.egg-info` is
 a file containing the project metadata as described in PEP 314 [#pep314]_.
 
-This file corresponds to the file called `PKG-INFO`, built by 
+This file corresponds to the file called `PKG-INFO`, built by
 the `sdist` command.
 
-The problem is that many people use `easy_install` (setuptools [#setuptools]_) 
-or `pip` [#pip]_ to install their packages, and these third-party tools do not 
+The problem is that many people use `easy_install` (setuptools [#setuptools]_)
+or `pip` [#pip]_ to install their packages, and these third-party tools do not
 install packages in the same way that Distutils does:
 
-- `easy_install` creates an `EGG-INFO` directory inside an `.egg` directory, 
-  and adds a `PKG-INFO` file inside this directory. The `.egg` directory 
+- `easy_install` creates an `EGG-INFO` directory inside an `.egg` directory,
+  and adds a `PKG-INFO` file inside this directory. The `.egg` directory
   contains in that case all the elements of the project that are supposed to
-  be installed in `site-packages`, and is placed in the `site-packages` 
+  be installed in `site-packages`, and is placed in the `site-packages`
   directory.
 
 - `pip` creates an `.egg-info` directory inside the `site-packages` directory
 And the process differs, depending on the tools you have used to install the
 project, and if the project's `setup.py` uses Distutils or Setuptools.
 
-Under some circumstances, you might not be able to know for sure that you 
+Under some circumstances, you might not be able to know for sure that you
 have removed everything, or that you didn't break another project by
-removing a file that was shared among the two projects.
+removing a file that was shared among several projects.
 
-But there's common behavior: when you install a project, files are copied 
-in your system. And there's a way to keep track of theses files, so to remove 
+But there's a common behavior: when you install a project, files are copied
+to your system. And there's a way to keep track of these files, so that they
 can be removed.
 
 What this PEP proposes
 
 To address those issues, this PEP proposes a few changes:
 
-- a new `.egg-info` structure using a directory;
-- a list of elements this directory holds;
-- new functions in `pkgutil` to be able to query the information
-  of installed projects.
+- a new `.egg-info` structure using a directory, based on the `EggFormats`
+  standard from `setuptools` [#eggformats]_.
+- new APIs in `pkgutil` to be able to query the information of installed
+  projects. 
+- a de-facto replacement for PEP 262
+- an uninstall function in Distutils.
+
 
 .egg-info becomes a directory
 =============================
 
 The first change would be to make `.egg-info` a directory and let it
-hold the `PKG-INFO` file built by the `write_pkg_file` method of 
+hold the `PKG-INFO` file built by the `write_pkg_file` method of
 the `Distribution` class in Distutils.
 
+Notice that this change is based on the standard proposed by `EggFormats`.
+You may refer to its documentation for more information.
+
 This change will not impact Python itself, because `egg-info` files are not
-used anywhere yet in the standard library besides Distutils. 
+used anywhere yet in the standard library besides Distutils.
 
-Although it will impact the `setuptools` and `pip` projects, but given 
-the fact that they already work with a directory that contains a `PKG-INFO` 
+It will impact the `setuptools` and `pip` projects, but given
+the fact that they already work with a directory that contains a `PKG-INFO`
 file, the change will have no deep consequences.
 
 For example, if the `zlib` package is installed, the elements that
     - zlib-2.5.2.egg-info/
         PKG-INFO
 
-The Python version will also be removed from the `.egg-info` directory
-name.
+The syntax of the egg-info directory name is as follows::
 
-Adding a RECORD in the .egg-info directory
-==========================================
+    name + '-' + version + '.egg-info'
+
+The egg-info directory name is created using a new function called
+``egg_info_dirname(name, version)`` added to ``pkgutil``. ``name`` is
+converted to a standard distribution name: any runs of non-alphanumeric
+characters are replaced with a single '-'. ``version`` is converted
+to a standard version string: spaces become dots, and all other
+non-alphanumeric characters become dashes, with runs of multiple dashes
+condensed to a single dash. Both attributes are then converted into their
+filename-escaped form: any '-' characters are currently replaced with '_'.
+
+Examples::
+
+    >>> egg_info_dirname('zlib', '2.5.2')
+    'zlib-2.5.2.egg-info'
+
+    >>> egg_info_dirname('python-ldap', '2.5')
+    'python_ldap-2.5.egg-info'
+
+    >>> egg_info_dirname('python-ldap', '2.5 a---5')
+    'python_ldap-2.5.a_5.egg-info'
+
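The naming rules above can be sketched in a few lines of Python. This is only an illustration of the described conversions, not the proposed ``pkgutil`` implementation. The helper names ``_safe_name``, ``_safe_version`` and ``_to_filename`` are hypothetical, and the sketch assumes that dots are preserved along with alphanumeric characters, since the examples keep them in ``2.5.2``.

```python
import re


def _safe_name(name):
    # Hypothetical helper: runs of characters other than letters, digits
    # and dots collapse into a single '-'. Keeping dots is an assumption.
    return re.sub(r'[^A-Za-z0-9.]+', '-', name)


def _safe_version(version):
    # Hypothetical helper: spaces become dots, then every other run of
    # non-alphanumeric characters collapses into a single '-'.
    version = version.replace(' ', '.')
    return re.sub(r'[^A-Za-z0-9.]+', '-', version)


def _to_filename(value):
    # Filename-escaped form: '-' is replaced with '_'.
    return value.replace('-', '_')


def egg_info_dirname(name, version):
    # name + '-' + version + '.egg-info', after both parts are normalized.
    return '%s-%s.egg-info' % (_to_filename(_safe_name(name)),
                               _to_filename(_safe_version(version)))
```

With these assumptions the sketch reproduces the three directory names listed in the examples above.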
+Adding a RECORD file in the .egg-info directory
+===============================================
 
 A `RECORD` file will be added inside the `.egg-info` directory at installation
-time. The `RECORD` file will hold the list of installed files. These correspond 
-to the files listed by the `record` option of the `install` command, and will 
-always be generated. This will allow uninstallation, as explained later in this 
+time. The `RECORD` file will hold the list of installed files. These correspond
+to the files listed by the `record` option of the `install` command, and will
+always be generated. This will allow uninstallation, as explained later in this
 PEP. This RECORD file is inspired by PEP 262 FILES [#pep262]_.
 
 The RECORD format
 -----------------
 
-The `RECORD` file is composed of records, one line per installed file.
-Each record is composed of three elements separated by a <tab> character:
+The `RECORD` file is a CSV-like file, composed of records, one line per
+installed file. Each record is composed of three elements.
 
 - the file's full **path**
 
  - if the installed file is located in the directory where the .egg-info
-   directory of the package is located, it will be a '/'-separated relative 
-   path, no matter what is the target system. This makes this information 
+   directory of the package is located, it will be a '/'-separated relative
+   path, no matter what the target system is. This makes this information
    cross-compatible and allows simple installation to be relocatable.
 
- - if the installed file is located elsewhere in the system, a 
+ - if the installed file is located elsewhere in the system, a
    '/'-separated absolute path is used.
 
 - the **MD5** hash of the file, encoded in hex. Notice that `pyc` and `pyo`
 
 - the file's size in bytes
 
+The ``csv`` module with its default options will be used to generate this file,
+so the field separator will be ",". Any field that contains a "," character
+will be quoted automatically by ``csv``.
+
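As a rough illustration of the format described above, the following Python 3 sketch writes and reads records with the ``csv`` module's default options. The function names ``make_entry``, ``write_record`` and ``read_record`` are hypothetical and are not part of the proposal. The sketch also assumes that the line for the `RECORD` file itself carries only a path, and that a reader reports missing fields as ``None``.

```python
import csv
import hashlib
import io


def make_entry(path, content):
    # Build one (path, md5, size) record from file content given as bytes.
    return (path, hashlib.md5(content).hexdigest(), len(content))


def write_record(stream, entries, record_path):
    # Default csv options: ',' separator, fields containing ',' are quoted.
    writer = csv.writer(stream)
    for entry in entries:
        writer.writerow(entry)
    # RECORD cannot hold a hash of itself, so only its path is written.
    writer.writerow([record_path])


def read_record(stream):
    # Yield (path, md5, size); missing fields are reported as None.
    for row in csv.reader(stream):
        md5 = row[1] if len(row) > 1 else None
        size = int(row[2]) if len(row) > 2 else None
        yield row[0], md5, size


buf = io.StringIO()
write_record(buf, [make_entry('pkg/a,b.txt', b'hello')],
             'pkg-1.0.egg-info/RECORD')
```

A path that contains a comma, as in the usage lines at the end, is wrapped in double quotes on output and restored unchanged by ``read_record``.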
 Example
 -------
 
 
 And the RECORD file will contain::
 
-    zlib/include/zconf.h    b690274f621402dda63bf11ba5373bf2    9544
-    zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333    66188
-    zlib/lib/libz.a e6d43fb94292411909404b07d0692d46    91128
-    zlib/share/man/man3/zlib.3  785dc03452f0508ff0678fba2457e0ba    4486
-    zlib-2.5.2.egg-info/PKG-INFO    6fe57de576d749536082d8e205b77748    195
+    zlib/include/zconf.h,b690274f621402dda63bf11ba5373bf2,9544
+    zlib/include/zlib.h,9c4b84aff68aa55f2e9bf70481b94333,66188
+    zlib/lib/libz.a,e6d43fb94292411909404b07d0692d46,91128
+    zlib/share/man/man3/zlib.3,785dc03452f0508ff0678fba2457e0ba,4486
+    zlib-2.5.2.egg-info/PKG-INFO,6fe57de576d749536082d8e205b77748,195
     zlib-2.5.2.egg-info/RECORD
 
 Notice that:
 
 - the `RECORD` file can't contain a hash of itself and is just mentioned here
-- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file 
+- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file
   paths are relative to it.
 
-New functions in pkgutil
-========================
+New APIs in pkgutil
+===================
 
-To use the `.egg-info` directory content, we need to add in the standard 
+To use the `.egg-info` directory content, we need to add in the standard
 library a set of APIs. The best place to put these APIs seems to be `pkgutil`.
 
-The new functions added in the package are :
+EggInfo class
+-------------
 
-- get_projects() -> iterator
+A new class called ``EggInfo`` is created, which provides the following
+attributes:
 
-  Provides an iterator that will return (name, path) tuples, where `name`
-  is the name of a registered project and `path` the path to its `egg-info`
-  directory.
+- ``name``: The name of the project
 
-- get_egg_info(project_name) -> path or None
+- ``metadata``: A ``DistributionMetadata`` instance loaded with the project's
+  PKG-INFO file
 
-  Scans all elements in `sys.path` and looks for all directories ending with
-  `.egg-info`. Returns the directory path that contains a PKG-INFO that matches
-  `project_name` for the `name` metadata. Notice that there should be at most 
-  one result. The first result founded will be returned.
+The following methods are provided:
 
-  If the directory is not found, returns None.
+- ``get_installed_files(local=False)`` -> iterator of (path, md5, size)
 
-  XXX The implementation of `get_egg_info` will focus on minimizing the I/O 
-  accesses.
+  Iterates over the `RECORD` entries and returns a tuple ``(path, md5, size)``
+  for each line. If ``local`` is ``True``, the path is transformed into a 
+  local absolute path. Otherwise the raw value from `RECORD` is returned.
 
-- get_metadata(project_name) -> DistributionMetadata or None
+- ``uses(path)`` -> Boolean
 
-  Uses `get_egg_info` to get the `PKG-INFO` file, and returns a 
-  `DistributionMetadata` instance that contains the metadata.
+  Returns ``True`` if ``path`` is listed in `RECORD`. ``path``
+  can be a local absolute path or a relative '/'-separated path.
 
-- get_files(project_name, local=False) -> iterator of (path, hash, size, 
-                                                       other_projects)
+- ``owns(path)`` -> Boolean
 
-  Uses `get_egg_info` to get the `RECORD` file, and returns an iterator.
+  Returns ``True`` if ``path`` is owned by the project.
+  Owned means that the path is used only by this project and is not used
+  by any other project. ``path`` can be a local absolute path or a relative
+  '/'-separated path.
 
-  Each returned element is a tuple `(path, hash, size, other_projects)` where
-  ``path``, ``hash``, ``size`` are the values found in the RECORD file.
+- ``get_file(path, binary=False)`` -> file object
 
-  `path` is the raw value founded in the RECORD file. If `local` is 
-  set to True, `path` will be translated to its real absolute path, using
-  the local path separator.
+  Returns a ``file`` instance for the file pointed to by ``path``. ``path`` can be
+  a local absolute path or a relative '/'-separated path. If ``binary`` is 
+  ``True``, opens the file in binary mode.
 
-  `other_projects` is a tuple containing the name of the projects that are 
-  also referring to this file in their own RECORD file (same path).
+.egg-info functions
+-------------------
 
-  If `other_projects` is empty, it means that the file is only referred by the
-  current project. In other words, it can be removed if the project is removed.
+The new functions added to ``pkgutil`` are:
 
-- get_egg_info_file(project_name, path, binary=False) -> file object or None
+- ``get_egg_infos()`` -> iterator
 
-  Uses `get_egg_info` and gets any element inside the directory,
-  pointed by its relative path. `get_egg_info_file` will perform
-  an `os.path.join` on `get_egg_info(project_name)` and `path` to build the 
-  whole path. 
+  Provides an iterator that looks for ``.egg-info`` directories in ``sys.path``
+  and returns ``EggInfo`` instances for each one of them.
 
-  `path` can be a '/'-separated path or can use the local separator. 
-  `get_egg_info_file` will automatically convert it using the platform path 
-  separator, to look for the file.
+- ``get_egg_info(project_name)`` -> ``EggInfo`` or None
 
-  If `binary` is set True, the file will be opened using the binary mode.
+  Scans all elements in ``sys.path`` and looks for all directories ending with
+  ``.egg-info``. Returns an ``EggInfo`` corresponding to the ``.egg-info``
+  directory that contains a PKG-INFO that matches `project_name` for the `name`
+  metadata.
 
-Let's use it with our `zlib` example::
+  Notice that there should be at most one result. The first result found
+  will be returned. If the directory is not found, returns None.
 
-    >>> from pkgutil import (get_egg_info, get_metadata, get_egg_info_file, 
-    ...                      get_files)
-    >>> get_egg_info('zlib')
+- ``get_file_users(path)`` -> iterator of ``EggInfo`` instances.
+
+  Iterates over all projects to find out which project uses ``path``.
+  ``path`` can be a local absolute path or a relative '/'-separated path.
+
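The difference between ``uses``, ``owns`` and ``get_file_users`` can be illustrated with a small in-memory model. This Python 3 sketch is not the proposed API: ``EggInfoSketch`` and its ``registry`` argument are hypothetical stand-ins for the real ``EggInfo`` class and the ``sys.path`` scan, and only relative '/'-separated paths are handled.

```python
class EggInfoSketch:
    """Hypothetical stand-in for EggInfo, backed by a set of paths."""

    def __init__(self, name, paths, registry):
        self.name = name
        self._paths = set(paths)
        # The registry plays the role of "all installed projects".
        self._registry = registry
        registry.append(self)

    def uses(self, path):
        # True if the path is listed in this project's records.
        return path in self._paths

    def owns(self, path):
        # True if this project uses the path and no other project does.
        if not self.uses(path):
            return False
        return all(other is self or not other.uses(path)
                   for other in self._registry)


def get_file_users(registry, path):
    # Iterate over all projects and yield those that use the path.
    for project in registry:
        if project.uses(path):
            yield project


registry = []
first = EggInfoSketch('first', ['shared/util.py', 'first/core.py'], registry)
second = EggInfoSketch('second', ['shared/util.py'], registry)
```

In this model ``first`` uses ``shared/util.py`` but does not own it, because ``second`` lists the same path, while ``first/core.py`` is owned by ``first``.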
+Cache functions
+---------------
+
+The functions from the previous section work with a global memory cache to
+reduce the number of I/O accesses and speed up the lookups.
+
+The cache can be managed with these functions:
+
+- ``purge_cache``: removes all entries from cache.
+- ``cache_enabled``: returns ``True`` if the cache is enabled.
+- ``enable_cache``: enables the cache.
+- ``disable_cache``: disables the cache.
+
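One possible shape for such a cache is a module-level dictionary guarded by an on/off flag. The Python 3 sketch below is an assumption about how the four functions might fit together, not the proposed implementation. ``cached_lookup`` and its ``loader`` argument are hypothetical names for whatever performs the actual I/O.

```python
_cache = {}
_cache_enabled = True


def purge_cache():
    # Remove all entries from the cache.
    _cache.clear()


def cache_enabled():
    # Return True if the cache is enabled.
    return _cache_enabled


def enable_cache():
    global _cache_enabled
    _cache_enabled = True


def disable_cache():
    global _cache_enabled
    _cache_enabled = False


def cached_lookup(key, loader):
    # Hypothetical helper: call loader(key) only on a cache miss,
    # or on every call when the cache is disabled.
    if not _cache_enabled:
        return loader(key)
    if key not in _cache:
        _cache[key] = loader(key)
    return _cache[key]
```

In this sketch, disabling the cache bypasses stored entries without discarding them. Whether the real functions should also purge on disable is left open here.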
+Example
+-------
+
+Let's use some of the new APIs with our `zlib` example::
+
+    >>> from pkgutil import get_egg_info, get_file_users
+    >>> egg_info = get_egg_info('zlib')
+    >>> egg_info.name
+    'zlib'
+    >>> egg_info.metadata.version
+    '2.5.2'
+
     '/opt/local/lib/python2.6/site-packages/zlib-2.5.2.egg-info'
     >>> metadata = get_metadata('zlib')
     >>> metadata.version
     '2.5.2'
-    >>> get_egg_info_file('zlib', 'PKG-INFO').read()
-    some
-    ...
-    files
-    >>> for path, hash, size, other_projects in get_files('zlib'):
-    ...     print '%s %s %d %s' % (path, hash, size, ','.join(other_projects))
+
+    >>> for path, hash, size in egg_info.get_installed_files():
+    ...     print '%s %s %s' % (path, hash, size)
     ...
     zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544
     zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188
-    zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128 
-    zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486 
-    zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195 
-    zlib-2.5.2.egg-info/RECORD None None 
+    zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128
+    zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486
+    zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
+    zlib-2.5.2.egg-info/RECORD None None
 
+    >>> egg_info.uses('zlib/include/zlib.h')
+    True
+    >>> egg_info.owns('zlib/include/zlib.h')
+    True
+
+    >>> egg_info.get_file('zlib/include/zlib.h')
+    <open file at ...>
+
+PEP 262 replacement
+===================
+
+In the past an attempt was made to create an installation database (see PEP 262
+[#pep262]_).
+
+Extract from PEP 262 Requirements:
+
+    " We need a way to figure out what distributions, and what versions of
+    those distributions, are installed on a system..."
+
+
+Since the APIs proposed in the current PEP provide everything needed to meet
+this requirement, PEP 376 will replace PEP 262 and will become the official
+`installation database` standard.
+
+The new version of PEP 345 (XXX work in progress) will extend the Metadata
+standard and will fulfill the requirements described in PEP 262, like the
+`REQUIRES` section.
 
 Adding an Uninstall function
 ============================
 
-Distutils provides a very basic way to install a project, which is running
+Distutils already provides a very basic way to install a project, which is running
 the `install` command over the `setup.py` script of the distribution.
 
-Distutils will provide a very basic ``uninstall`` function, that will be added 
-in ``distutils.util`` and will take the name of the project to uninstall as 
-its argument. ``uninstall`` will use ``pkgutil.get_files`` and remove all 
+Distutils will provide a very basic ``uninstall`` function, that will be added
+in ``distutils.util`` and will take the name of the project to uninstall as
+its argument. ``uninstall`` will use the APIs described earlier and remove all
 unique files, as long as their hash didn't change. Then it will remove
-directories where it removed the last elements.
+empty directories left behind.
 
 ``uninstall`` will return a list of uninstalled files::
 
 
 If the project is not found, a ``DistutilsUninstallError`` will be raised.
 
-To make it a reference API for third-party projects that wish to control 
-how `uninstall` works, a second callable argument can be used. It will be 
-called for each file that is removed. If the callable returns `True`, the 
+To make it a reference API for third-party projects that wish to control
+how `uninstall` works, a second callable argument can be used. It will be
+called for each file that is about to be removed. If the callable returns `True`, the
 file will be removed. If it returns False, it will be left alone.
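
The removal rules described in this section can be sketched as follows, in Python 3. This is a simplified illustration under several assumptions, not the proposed ``distutils.util.uninstall``: the function name ``uninstall_files`` and the ``is_shared`` predicate are hypothetical, records without a hash are skipped, the filter callable receives the local absolute path, and empty directories are not cleaned up.

```python
import hashlib
import os


def uninstall_files(records, base_dir, is_shared=lambda path: False,
                    file_filter=None):
    """Remove unshared, unmodified files and return the removed paths."""
    removed = []
    for path, md5, size in records:
        # Assumption: entries without a hash, and shared files, are skipped.
        if md5 is None or is_shared(path):
            continue
        full = os.path.join(base_dir, *path.split('/'))
        if not os.path.isfile(full):
            continue
        with open(full, 'rb') as stream:
            current = hashlib.md5(stream.read()).hexdigest()
        # Leave files whose content changed since installation.
        if current != md5:
            continue
        # The optional callable decides whether the file is removed.
        if file_filter is not None and not file_filter(full):
            continue
        os.remove(full)
        removed.append(full)
    return removed
```

Passing a ``file_filter`` that always returns ``False`` gives a dry run: nothing is deleted and the returned list is empty.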
 
 Examples::
     ...
     >>> uninstall('zlib', _dry_run)
 
-Of course, a third-party tool can use ``pkgutil.get_files``, to implement 
+Of course, a third-party tool can use ``pkgutil`` APIs to implement
 its own uninstall feature.
 
 Backward compatibility and roadmap
 .. [#pip]
    http://pypi.python.org/pypi/pip
 
+.. [#eggformats]
+   http://peak.telecommunity.com/DevCenter/EggFormats
+
Acknowledgments
===============