
Proposed Changes to Configuration Files

Author: Éric Araujo <merwok@netwok.org>
Credits: Carl Meyer, folks at PyCon 2010, people in #distutils
Revision: 1.0


This document is not part of the documentation of Distutils2. It is a design/discussion document that serves to explain directions, collect feedback and votes, and will ultimately be rewritten as proper documentation (without all the explanations about choices) and moved into the relevant doc files.

One goal of Distutils2 is to put all the information required to build and install a distribution into a static configuration file (:file:`setup.cfg`) instead of Python code (:file:`setup.py`). This information (i.e. the arguments to distutils.core.setup) is split into metadata, files and customization hooks.

In the olden days, Distutils configuration files were used only to give options to commands. They were also designed to be extensible: Third-party tools relying on Distutils or providing new commands could tell their users to add a section in the distribution’s :file:`setup.cfg` file or in their user config file to set options. There is a simple API to get these options merged from all configuration files (which will be even simpler in Distutils2).

Moving the distribution configuration from a script to a static file makes it easier for tools to get the information without having to run code (the :file:`setup.py` script). It will also allow a variety of tools written in any language to work from the same information.

The new sections are different from usual sections that give options to commands because they make no sense in system or user configuration files. While specifying install or sdist options in a system or user configuration file is useful, options like author name or scripts to include in a distribution have to be in the project’s config file only.

In another discussion, it may be good to think about precedence rules for configuration files; e.g. if a user specifies an installation directory in their own config file, why is the distribution’s file able to override that choice?


The first kind of :func:`setup` arguments that should be supported in :file:`setup.cfg` are the ones that give metadata. Fields are defined in PEP 345 and progress is tracked on Python #8252. These fields are uncontroversial:

name = RestingParrot
version = 0.6.4
author = Carl Meyer
author-email = carl@oddbird.net
summary = A sample project demonstrating distutils2 packaging
classifiers =
  Development Status :: 4 - Beta
  Environment :: Console (Text Based)
  Environment :: X11 Applications :: GTK; python_version < '3'
  License :: OSI Approved :: MIT License
  Programming Language :: Python
  Programming Language :: Python :: 2
  Programming Language :: Python :: 3
requires-dist =
  MichaelPalin (> 1.1)
  pywin32; sys.platform == 'win32'
  pysqlite2; python_version < '2.5'
  inotify (0.0.1); sys.platform == 'linux2'
requires-external = libxml2
provides-dist = distutils2-sample-project (0.2)
project-url =
  Main repository, http://bitbucket.org/carljm/sample-distutils2-project
  Fork in progress, http://bitbucket.org/Merwok/sample-distutils2-project

Multi-value fields use newline-separated values, since the values themselves may contain spaces. The first (or only) value may be on the same line as the key or on the following one.

The config file parser automatically supports case variation and underscores in place of hyphen in field names. Our own documentation should be consistent and use only lower case and hyphens, for simplicity and non-ugliness.

The PEP 345 environment markers used here will be passed to the DistributionMetadata instance without processing, as intended: The class knows which fields are allowed to use a marker and how to interpret them.

Some fields don’t have a specified format or can be improved. Proposals follow.



The encoding of the config file is UTF-8. This encoding enables using Unicode characters in string fields, and is also a superset of ASCII.

Avoid Metadata-Version


This field does not have to be in the file, since the DistributionMetadata class detects the right version from the fields that are present.

Use CSV for Keywords and Requires-Python


PEP 345 only says that the Keywords field is “a list”; the example uses space-delimited values, but distutils and distutils2 print out comma-separated values, which allows having keywords with spaces in them (e.g. “version control”). The Requires-Python field is described as comma-separated values.

Since keywords and supported versions are typically much shorter than classifiers or dependencies, I propose that :file:`setup.cfg` use a comma-separated list of values, with leading and trailing spaces removed for user convenience (i.e. keywords = version control, packaging, testing, unit testing will give the list ['version control', 'packaging', 'testing', 'unit testing']). More examples:

requires-python = 2.6
requires-python = >=2.4, <=3.0

Alternatively, if it is deemed confusing to have two ways of giving multi-value fields, the field can be newline-separated like other fields already defined. Consistency would win over convenience.
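The trimming rule for the comma-separated form is small enough to sketch in Python. The helper name ``split_csv`` is hypothetical and not part of Distutils2, and dropping empty items (e.g. after a trailing comma) is an assumption that the proposal does not spell out.

```python
def split_csv(value):
    """Split a comma-separated field value into a list of items.

    Leading and trailing spaces around each item are removed.
    Assumption: empty items (e.g. from a trailing comma) are dropped.
    """
    return [item.strip() for item in value.split(",") if item.strip()]


keywords = split_csv("version control, packaging, testing, unit testing")
versions = split_csv(">=2.4, <=3.0")
```

Spaces inside an item are preserved, so a keyword such as "version control" remains a single value.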

Merge author and author-email


Merge name and email in a single field for author (and maintainer):

author = Carl Meyer <carl@oddbird.net>
maintainer = Éric Araujo <merwok@netwok.org>

It is a common format and easy to parse (we do not support every valid RFC 2822 address form, only name <email>). PEP 345 :file:`METADATA` files separate author name and email, but for user-written :file:`setup.cfg` this format is nicer.
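To illustrate how little parsing the merged form needs, here is a minimal sketch. The function name ``split_author`` and the exact regular expression are assumptions made for this example; returning ``None`` for a missing email is also an assumption, since the proposal does not say whether the email part is optional.

```python
import re

# Matches exactly "name <email>"; deliberately not a full RFC 2822 parser.
_AUTHOR_RE = re.compile(r"^\s*(?P<name>[^<>]+?)\s*<(?P<email>[^<>\s]+)>\s*$")


def split_author(value):
    """Return (name, email) for a merged author or maintainer field.

    Assumption: a value without "<email>" is a bare name, and None is
    returned in place of the email.
    """
    match = _AUTHOR_RE.match(value)
    if match is None:
        return value.strip(), None
    return match.group("name"), match.group("email")


author = split_author("Carl Meyer <carl@oddbird.net>")
```

A tool that writes PEP 345 :file:`METADATA` could use the two parts to fill the separate author and author-email fields.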

Get description from a file


Use the contents of a file as value for description. Long descriptions typically contain blank lines, which are stripped by our config file parser in Python < 3.2, so including the long description directly in the config file is a non-starter. People often already have the description in a :file:`README` file or equivalent, which they can edit and check with reST-enabled tools. Thus this proposal replaces a very common idiom in setup scripts, prevents duplication and desynchronisation, and avoids the need for me to touch regular expressions to tweak the parser in unholy ways. The value is a path relative to the directory containing the :file:`setup.cfg` (.. disallowed). Examples:

description = README
description = lib/python/unicorn/README.rst

The value can be a list of files, to be concatenated in order and used as description. Now that Distutils2 and PyPI allow uploading documentation and adding arbitrary links in a project page, the need for overly long description values is reduced, but some users would still want to concatenate e.g. :file:`README` and :file:`NEWS`. Thus, this field should allow a newline-separated list of files (it allows paths with spaces and complies with the already-defined way of giving multi-value fields).

Files listed in that field are automatically added to distributions.
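The file-based description can be sketched as follows (Python 3). The name ``read_description`` is hypothetical; joining the files with a single newline and reading them as UTF-8 are assumptions, as the proposal only says that files are concatenated in order.

```python
import os


def read_description(config_dir, value):
    """Build the description from a newline-separated list of file paths.

    Paths are relative to the directory containing setup.cfg; absolute
    paths and ".." components are rejected.
    Assumptions: files are UTF-8 and are joined with a single newline.
    """
    parts = []
    for line in value.strip().splitlines():
        relpath = line.strip()
        if not relpath:
            continue
        components = relpath.split("/")
        if os.path.isabs(relpath) or ".." in components:
            raise ValueError("invalid description path: %r" % relpath)
        filename = os.path.join(config_dir, *components)
        with open(filename, encoding="utf-8") as stream:
            parts.append(stream.read())
    return "\n".join(parts)
```

Calling ``read_description(".", "README")`` covers the single-file case, since a single value is just a list of one line.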

Fix misnamed fields

The things listed in requires-dist and friends have a name and a version (optional) but no particular distribution format, therefore they’re not distributions but releases (or arguably projects). Editing accepted PEPs is hard, but it’s better to do it now before our tools and terminology are used in the wild. This item is listed only for completeness, not to call a vote; Alexis is the owner of this request, and the field names in :file:`setup.cfg` will follow the latest version of PEP 345.


Most other :func:`setup` arguments have to do with files. Arguments list Python modules, extension modules (written in C or C++), packages, scripts, files related to a package, other files. As specified in Python #8253, a new section is introduced, files; nothing else is already defined.

Listing modules and packages


For the new format, it is proposed that the three kinds of modules (Python modules, extension modules and packages) be merged into a single option. Given this file structure:

$ tree
├── haven.py
├── pirate.c
├── setup.cfg
└── ship
    ├── cabins
    │   ├── captain.py
    │   └── __init__.py
    ├── hull.c
    └── __init__.py

This configuration is enough to include the two modules, the package, its subpackage and all submodules in the distribution and have them processed by the relevant commands (build_*, sdist, etc.):

modules = haven

(See below for the rationale to separate with newlines instead of arbitrary whitespace. See appendix for implementation details explaining how this merged list will be easily parsed.)

It is not possible to have a module and a package with the same name. In addition to being documented, this restriction could also be a runtime warning.


Calling packages “modules” may be confusing to some people, e.g. beginners, even if it’s technically correct. Other proposed names include “source” (too vague) and “importables” (unused in the documentation and ugly).


Alternatively, modules and packages could be listed separately. Since it appears that people tend to use either one or the other in their projects, there would be no cognitive overload in defining two fields instead of one:

modules = haven
packages = ship
exclude-packages = ship.hull

This may also prevent future problems when namespace packages are supported by Python, or maybe not; I have to read PEP 382 closely to get a better understanding of file layout and possible detection (esp. recursion) issues.



Package listing is recursive; real-world use of :func:`setuptools.find_packages` shows that this is a useful feature. There is a way to control it:

modules = ship
exclude-modules = ship.hull

Additionally, a boolean option could control all-or-nothing recursion:

modules = ship
recursive-modules = 0

It is not clear that this would solve problems, e.g. for the mx project which is developed in one tree but packaged as separate PyPI projects; maybe it’s best not to define this option right now, try to convert complex projects and then revise this proposal.

Replacing package_dir


Instead of replacing the package_dir argument with a field of the same name, it is proposed to merge it with the packages (or modules) values:

packages =
exclude-packages =

This translates this file structure:

$ tree .
├── ship
│   └── __init__.py
├── src
│   └── parrot
│       ├── __init__.py
│       └── tests
│           └── __init__.py
└── src2
    └── thing
        └── __init__.py

Note that these semantics are different from setup(packages=['thing'], package_dir={'thing': 'src2'}): in the new proposal, src2:thing does not mean that the :file:`src2` directory is to be renamed :file:`thing` in the build directory (reference), but that the directory :file:`src2` contains another directory named :file:`thing` (with its :file:`__init__.py` file and other submodules). The changed semantics are more intuitive.

Using : as a separator forbids using it in directory names, which would not be very sane anyway. Using it instead of a space or a slash also allows putting packages in a deep subdirectory:

packages = client/bindings/python:parrotlib

This source directory syntax is also available for modules (in case the proposal to merge them with packages is rejected):

modules = ham/lib/python:ham
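A minimal sketch of how a tool might split the source directory specifier and locate the package on disk. Both function names are hypothetical, and the use of POSIX-style paths is an assumption chosen to match the examples in this document.

```python
import posixpath


def split_source_dir(spec):
    """Split "srcdir:dotted.name" into (srcdir, name).

    srcdir is an empty string when the value has no ":" separator.
    """
    srcdir, _, name = spec.rpartition(":")
    return srcdir.strip(), name.strip()


def package_path(spec):
    """Return the directory expected to contain the package.

    Under the proposed semantics, "src2:thing" means that the src2
    directory contains a directory named thing.
    """
    srcdir, name = split_source_dir(spec)
    return posixpath.join(srcdir, *name.split("."))


path = package_path("client/bindings/python:parrotlib")
```

Splitting on the last colon keeps the function simple, since neither directory names nor dotted names may contain the separator.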

If a study of setup scripts in projects distributed on the Cheeseshop reveals that an overwhelming majority uses only one package_dir or none, this alternate, simpler proposal would be enough:

packages = parrot
exclude-packages = parrot.tests
package-dir = src

Replacing conditionals


We can define the packages or modules field to be newline-separated and accept PEP 345 environment markers to support source distributions that contain e.g. code for both 2.x and 3.x, like httplib2 does:

packages =
  python2:httplib2; python_version < '3'
  python3:httplib2; python_version > '2'

(Form using the alternate package_dir proposal:

packages = httplib2
package-dir = python2; python_version < '3'
              python3; python_version > '2'

Using a multi-line value for this field is kind of ugly, though.)

Since there is no else, each condition has to be written twice (once in normal form, once in reverse, which can be tricky and/or tedious), and the values available as EXPR in environment markers (Python version, OS name, etc.) do not provide all that is required. One example from Mercurial that is not trivial to reverse (or maybe I’m just bad at boolean logic):

if sys.platform == 'win32' and sys.version_info < (2, 5, 0, 'final'):

Example that can’t be translated:

if sys.platform == 'linux2' and os.uname()[2] > '2.6':
  # The inotify extension is only usable with Linux 2.6 kernels.

For such cases, the solution seems to be to use a pre-build hook to edit the lists of modules and packages. For trivial cases, environment markers provide a solution that does not require writing any code, so they’re still useful in the files section.
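Separating a value from its environment marker is straightforward, as this sketch suggests. The function name ``split_marker`` is hypothetical; the sketch only splits on the first semicolon and does not evaluate the marker, which is left to the code that already implements PEP 345.

```python
def split_marker(value):
    """Split "item; marker" into (item, marker).

    The marker is None when the value has no semicolon. The marker
    expression is returned as text and is not evaluated here.
    """
    item, _, marker = value.partition(";")
    return item.strip(), (marker.strip() or None)


entries = [
    split_marker(line)
    for line in ["python2:httplib2; python_version < '3'",
                 "python3:httplib2; python_version > '2'"]
]
```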

Extension modules


A new section family is proposed to describe extension modules, to replace instantiation of :class:`Extension` objects with the right options in :file:`setup.py`. Each extension module has to be listed in the files field and described in its own section:

modules = ship.pirate

[extension: ship.pirate]
sources = ship/pirate.c
headers = Python.h pirate.h
include-dirs = include
optional = 1

The section name is the string extension: followed by optional whitespace and the full name of the module. Field names are taken directly from :class:`Extension` arguments, and values are simple adaptations: string arguments are single values, string lists are multi-value fields (whitespace-separated or newline-separated, to be decided), and booleans are :mod:`ConfigParser` booleans.
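The section family can already be read with the standard config parser. The sketch below uses the Python 3 :mod:`configparser` module; the function name ``extension_sections`` is hypothetical, and placing the modules field in a ``[files]`` section follows the section name given earlier in this document.

```python
import configparser

PREFIX = "extension:"

SAMPLE = """\
[files]
modules = ship.pirate

[extension: ship.pirate]
sources = ship/pirate.c
headers = Python.h pirate.h
include-dirs = include
optional = 1
"""


def extension_sections(parser):
    """Map each extension module name to the raw fields of its section."""
    result = {}
    for section in parser.sections():
        if section.startswith(PREFIX):
            # Optional whitespace after the prefix is ignored.
            name = section[len(PREFIX):].strip()
            result[name] = dict(parser.items(section))
    return result


parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)
extensions = extension_sections(parser)
is_optional = parser.getboolean("extension: ship.pirate", "optional")
```

Values are returned as plain strings; splitting list fields such as headers depends on the separator that is finally chosen.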


If deemed useful, simple variables could be added to these sections; see definition.


An alternate proposal that requires only one section but may prove more difficult to write and to parse is derived from the older Setup format, deprecated in Distutils and removed in Distutils2 (see :file:`{python3.2}/Lib/distutils/tests/Setup.sample`):

pirate = pirate.c
ship.hull = ship/hull.c

The format is module = source files [arguments to the compiler]. More involved example from SDL:

_camera = src/_camera.c src/camera_v4l2.c src/camera_v4l.c $SDL $DEBUG
_numericsurfarray = src/_numericsurfarray.c $SDL $DEBUG
font = src/font.c $SDL $FONT $DEBUG
scrap = src/scrap.c $SDL $SCRAP $DEBUG

This example introduces variables, which can be any string. A simple proposal for the assignment syntax:

$GFX = src/SDL_gfx/SDL_gfxPrimitives.c
$SDL = -I/usr/include/SDL -D_REENTRANT -lSDL
$FONT = -lSDL_ttf
$SCRAP = -lX11

In other words, every key starting with a dollar sign is a variable that will be usable in the regular fields of the same section.
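Variable expansion of this kind could be sketched with :class:`string.Template` from the standard library. The function name ``expand_variables`` is hypothetical, and leaving undefined variables (such as $DEBUG here) untouched rather than raising an error is an assumption.

```python
from string import Template


def expand_variables(fields):
    """Expand $NAME references in the regular fields of one section.

    Keys starting with a dollar sign define variables and are removed
    from the result. Assumption: undefined variables are left as-is.
    """
    variables = {key[1:]: value
                 for key, value in fields.items() if key.startswith("$")}
    return {key: Template(value).safe_substitute(variables)
            for key, value in fields.items() if not key.startswith("$")}


section = {
    "$SDL": "-I/usr/include/SDL -D_REENTRANT -lSDL",
    "$FONT": "-lSDL_ttf",
    "font": "src/font.c $SDL $FONT $DEBUG",
}
expanded = expand_variables(section)
```

When reading such a section with :mod:`configparser`, the parser's ``optionxform`` would need to be set to ``str`` so that variable names keep their case.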


If useful, expanding already defined variables in other variable definitions could be allowed.


If useful, environment markers could be allowed in fields of this section.

Listing scripts


The Python list naturally translates to a multi-value field:

scripts =

Paths are relative, with .. component forbidden. Using multi-line instead of whitespace or comma-separated allows directory names with spaces and is consistent with other multi-value fields.


PEP 345 markers can be useful to filter the list of scripts according to the build environment (sdist would still ship all scripts):

scripts =
  unit2.py; sys.platform == 'win32'
  unit2; sys.platform != 'win32'
  unit2-gui; sys.platform == 'linux2' and python_version < '3'

Full control over the scripts is possible thanks to pre-build hooks but allowing environment markers covers common needs without requiring user code.

On the other hand, disallowing environment markers may be the best thing to do in the short term. There are a number of feature requests (regarding platform-dependent handling in particular) in the Python bug tracker that need discussion and testing; since hooks provide an easy way to experiment, the features could be easily implemented outside of Distutils2 and eventually incorporated into the core if met with positive feedback.

A new field implementing a feature similar to :mod:`pkg_resources` console_scripts may also render scripts as we know them obsolete.

People have also wanted a way to install into $exec_prefix/sbin instead of $exec_prefix/bin. There is no proposal about that now, although it would not be hard to define, since it is believed that the future resources tagging will supersede the scripts section and address this feature request in a generic way.

Therefore, rejecting environment markers in scripts may be the best choice right now.

Additional files


Until the PEP on resources is implemented, :file:`setup.cfg` will grow new fields that merely translate the :file:`setup.py` form without any added semantics or features. Example:

package_data =
  cheese = data/templates/*

Semantics for the paths are described in the documentation for package_data. Environment markers are not supported. The same rules apply for other data files:

data_files =
  bitmaps = bm/b1.gif, bm/b2.gif
  config = cfg/data.cfg
  /etc/init.d = init-script

Comma-separated values allow paths with spaces and avoid the need to parse multi-line values in a multi-line value.
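A possible parser for these two fields, as a sketch. The function name ``parse_file_mapping`` is hypothetical; it takes the raw multi-line value as returned by the config parser and returns a dictionary from key to list of paths or patterns.

```python
def parse_file_mapping(value):
    """Parse lines of the form "key = path, path" into a dictionary.

    Each line maps a package name or target directory to a
    comma-separated list of paths; spaces around items are removed.
    """
    mapping = {}
    for line in value.strip().splitlines():
        if not line.strip():
            continue
        key, sep, paths = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in line: %r" % line)
        mapping[key.strip()] = [p.strip() for p in paths.split(",") if p.strip()]
    return mapping


data_files = parse_file_mapping("""
bitmaps = bm/b1.gif, bm/b2.gif
config = cfg/data.cfg
/etc/init.d = init-script
""")
```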

Replacing MANIFEST.in


The design document for resources states the goal to remove redundant listing of files in favor of :file:`setup.cfg`. Presently, files listed as modules, packages, scripts, source files for extension modules, package data, extra data files or description file will automatically be included in the manifest object used by the distribution. Until resources tagging is implemented, we could either still support the :file:`MANIFEST.in` file or move its contents to :file:`setup.cfg`, retaining strict syntactic and semantic compatibility:

sdist_extra =
  recursive-include examples *.txt *.py
  prune examples/sample?/build

Psychic Mode


This proposal takes advantage of convention over configuration, following the lead of DistUtilsExtra.auto. The idea is to automatically detect specific file patterns. Examples taken from its documentation (somewhat edited):

This mode would be opt-in, e.g.

autodetect = 1

Other ideas not considered include less used or platform-specific file formats, which could be autodetected after the resources proposal is implemented. Automatic Requires-Dist and Provides-Dist from import statements seem brittle and are not considered. Automatic :file:`POTFILES.in` depends on consensus that i18n-related tasks are in the scope of Distutils, which is not reached right now.


Alternative: Require that people run mkpkg to trigger detection and have the configuration file created or updated. Then they can check that the update is right thanks to $vcs diff setup.cfg or editor setup.cfg. See appendix.


A PEP needs to be written. See the design document.

Customization hooks


(This is not related to pre/post-command hooks, which will probably be set in the relevant command sections or in a new one.)

The third kind of :func:`setup` arguments are customization hooks.


distclass
  Specify a class to use instead of :class:`distutils2.dist.Distribution`.

cmdclass
  Mapping of command names to classes, to replace existing commands or provide new ones, e.g.

  setup(..., cmdclass={'build_py': build_py_2to3, 'lint': LintCommand})

script_name
  Usually sys.argv[0], used to generate error messages with the correct script name in case it is not :file:`setup.py`.

script_args
  List of arguments to use instead of sys.argv[1:].

The last two arguments are not needed in :file:`setup.cfg`, whereas the first two are useful and can use this simple syntax:

distclass = shop.cheese.HamDistribution
cmdclass =
  build_py = distutils2.build_py.build_py_2to3; python_version >= '3'
  test = lib:_buildhelper.TestCommand

As you can see, environment markers and source directory specifiers are allowed. The fields are located in the global section, alongside command-packages.
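Resolving a dotted name to a class can be sketched with :mod:`importlib`. The function name ``resolve_name`` is hypothetical; the sketch assumes the last component is an attribute of the module named by the preceding components, and it does not handle the source directory specifier (a caller would have to make that directory importable first).

```python
import importlib


def resolve_name(dotted):
    """Import and return the object designated by "package.module.attr".

    Assumption: everything before the last dot is an importable module.
    """
    module_name, sep, attribute = dotted.rpartition(".")
    if not sep:
        raise ValueError("not a dotted name: %r" % dotted)
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


ordered_dict_class = resolve_name("collections.OrderedDict")
```

A missing module raises ImportError and a missing attribute raises AttributeError, which gives the user a reasonably precise error message.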


Additional proposal: Give the field a clearer name: command-classes or commands (and change it in the Python code too). If there is a good reason to keep it short, it should at least be a plural form, i.e. cmdclasses.

Appendix: Making things simple for users

Distutils2 will ship with a little program called mkpkg (which will soon get a better name) that generates a :file:`setup.cfg` file from questions asked to the user (what is the project name, its version, etc.). As much as possible, the program will propose answers (e.g. using the :func:`find_packages` function to get the list of packages, mocking sys.modules['distutils'] to run :file:`setup.py` scripts in a sandbox and get information from them, etc.) so that the user just has to press :kbd:`Enter` to validate, or write the correct value and validate.

The program will also help people do the right thing, e.g. use a version number compliant with PEP 386, fill the license field only when there is no suitable Trove classifier for the chosen license, in other words give useful hints for people that don’t read PEPs or documentation.

Some values could be specified in the user configuration file:

author = John Smith <john@example.org>
project-url-template =
  Code repository, http://example.org/hacking/projects/{name}
  Documentation, http://packages.python.org/{name}

For people wanting to upgrade progressively, Distutils2 includes a lib2to3-based converter to rewrite imports (:mod:`distutils2` provides a :func:`setup` function with a signature compatible with the one from :mod:`distutils` and a :func:`find_packages` function similar to the one from :mod:`setuptools`), to allow projects to keep using a setup script while they transition their practices and installation documentation.

Appendix: Implementation details

Multi-value fields

The config file parser strips leading and trailing whitespace for free; we just have to handle the case of the first line being empty (in the line spam =\nham, the config file format considers there is an empty line after the equals sign). Handling that case is as simple as value.strip().splitlines().
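The following sketch shows that expression at work with the Python 3 :mod:`configparser` module. The helper name ``get_multi`` and the ``[metadata]`` section name are assumptions made for the example.

```python
import configparser

SAMPLE = """\
[metadata]
classifiers =
  Development Status :: 4 - Beta
  Programming Language :: Python
requires-external = libxml2
"""


def get_multi(parser, section, option):
    """Return a multi-value field as a list of strings.

    The parser already strips each continuation line; strip() removes
    the empty first line left when values start on the following line.
    """
    return parser.get(section, option).strip().splitlines()


parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)
classifiers = get_multi(parser, "metadata", "classifiers")
externals = get_multi(parser, "metadata", "requires-external")
```

The same helper handles a value given on the same line as the key, which yields a list with a single item.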

Support code

Helper functions to split the source directory specifier, resolve a dotted name and split an environment marker will be provided in :mod:`util` and :mod:`config` for use by third-party tools. :mod:`config` will also provide higher-level functions to get a :class:`DistributionMetadata` instance from a :file:`setup.cfg` file, a list of Python modules, a list of Python packages filtered according to the environment, access a config section defined by a Distutils2 extension, and so on.

If the proposal to merge the lists of modules and packages is accepted, Distutils2 code will have to sort this list into the three lists used by Distribution, following these simple rules:

  1. If the name has an extension section (or if it is listed in the extensions section, depending on the proposal that gets accepted), an instance of :class:`Extension` is created and added to distribution.ext_modules;
  2. If the name corresponds to an existing directory which contains an __init__.py file, it is added to distribution.packages;
  3. The name is added to distribution.py_modules.
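The three rules above can be sketched as a single function. The name ``sort_modules`` is hypothetical, and the sketch returns plain names for extension modules instead of creating :class:`Extension` instances, to stay independent of the Distutils2 API.

```python
import os


def sort_modules(names, extension_names, root="."):
    """Sort a merged list of names into (ext_modules, packages, py_modules).

    extension_names is the set of names that have an extension section.
    Assumption: dotted names map to directories below root.
    """
    ext_modules, packages, py_modules = [], [], []
    for name in names:
        directory = os.path.join(root, *name.split("."))
        if name in extension_names:
            # Rule 1: the name is described as an extension module.
            ext_modules.append(name)
        elif os.path.isfile(os.path.join(directory, "__init__.py")):
            # Rule 2: an existing directory containing __init__.py.
            packages.append(name)
        else:
            # Rule 3: everything else is a pure Python module.
            py_modules.append(name)
    return ext_modules, packages, py_modules
```

Checking the extension rule first matters because an extension module and a package are both allowed to live under the same parent package.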

Appendix: Mapping from arguments to fields

==============================  ========================================
Argument in :file:`setup.py`    Field in :file:`setup.cfg`
==============================  ========================================
description                     summary
long_description                description (changed meaning)
author                          author
author_email                    merged with author
maintainer                      maintainer
maintainer_email                merged with maintainer
url                             home-page
N/A                             project-url
every other metadata field      unchanged
packages                        packages or modules
py_modules                      modules
ext_modules                     modules (+ extension(s) section)
ext_package                     unsupported
distclass                       distclass
cmdclass                        cmdclasses
script_name                     N/A
script_args                     N/A
options                         fields in sections named after commands
==============================  ========================================

Appendix: Rejected ideas

Get metadata from hooks

Some users would like to specify callback functions instead of writing some values, to avoid repetition. Let’s take version as example. It is very common to have it stored as a tag in version control, and there are a number of helper functions to get this information. We could have this kind of field:

get-version = _buildhelper.get_hg_version

This proposal has to be rejected, since it strongly conflicts with the point of having static metadata. If the metadata is fully defined by a file format, then any tool in any language can follow the specification to implement a parser and do useful things with the values, without depending on Distutils2, setup scripts or Python at all. People who really cannot write the version number in :file:`setup.cfg` for some reason can still use a :file:`setup.py` and benefit from fixes and features in Distutils2, but they won’t be static metadata-compliant.

Furthermore, the version example is not a strong argument. When doing a release, updating the version number in :file:`setup.cfg` is but a minor and quick step. Documentation needs to be checked, translations built, the version number has to be updated in :file:`README`, :file:`NEWS` or :file:`CHANGES`, source code, so using a hook to set version in :file:`setup.cfg` would remove only one tidbit of work.

Other fields may be duplicated in documentation files and :file:`setup.cfg`, i.e. author, summary and project URIs, but this duplication has a very small cost. Dependencies, classifiers and keywords are only in :file:`setup.cfg`, so wouldn’t benefit from hooks at all.

In conclusion, other solutions can be explored. Since the version number is easily retrieved from :file:`setup.cfg`, a trivial shell function can be written to create the VCS tag from the static metadata. People who love automation typically write a small script or makefile to do all operations related to a release (adjust version numbers in relevant files, run lint tools, run i18n tools, etc.), then check if the result looks good (not always trusting automated tools is sane), commit, tag, push, send announcements, register the release in catalogs and so on.