# Proposed Changes to Configuration Files

Author: Éric Araujo Carl Meyer, folks at PyCon 2010, people in #distutils PSF |version|

Warning

This document is not part of the documentation of Distutils2. It is a design/discussion document that serves to explain directions, collect feedback and votes, and will ultimately be rewritten as proper documentation (without all the explanations about choices) and moved into the relevant doc files.

Don’t want to read? Here is an example using all the new fields. If there’s something you don’t like or understand, use the table of contents to jump to the rationale and alternate proposals.

[metadata]
name = RestingParrot
version = 0.6.4
author = Carl Meyer
author-email = carl@oddbird.net
maintainer = Éric Araujo
maintainer-email = merwok@netwok.org
summary = A sample project demonstrating distutils2 packaging
keywords = distutils2, packaging, sample project

classifiers =
Development Status :: 4 - Beta
Environment :: Console (Text Based)
Environment :: X11 Applications :: GTK; python_version < '3'
Programming Language :: Python
Programming Language :: Python :: 2
Programming Language :: Python :: 3

requires-python = >=2.4, <3.2
requires-dist =
PetShoppe
MichaelPalin (> 1.1)
pywin32; sys.platform == 'win32'
pysqlite2; python_version < '2.5'
inotify (0.0.1); sys.platform == 'linux2'
requires-external = libxml2
provides-dist = distutils2-sample-project (0.2)
unittest2-sample-project

project-url =
Main repository, http://bitbucket.org/carljm/sample-distutils2-project
Fork in progress, http://bitbucket.org/Merwok/sample-distutils2-project

[global]
# Auto-detect modules, packages, scripts, files; probably better to make
# the user run mkpkg and check the result
autodetect = 1
# Customization hooks
distclass = shop.cheese.HamDistribution
commands =
build_py = distutils2.build_py.build_py_2to3; python_version >= '3'
test = lib:_buildhelper.TestCommand

[files]
# Modules are Python modules, extension modules or Python packages
modules = haven
exclude-modules = haven.ship.hull
modules-dir = src

modules = httplib2
modules-dir = python2; python_version < '3'
python3; python_version > '2'

scripts =
detect-witch
scripts/find-coconuts
bin/taunt

package_data =
cheese = data/templates/*

data_files =
bitmaps = bm/b1.gif, bm/b2.gif
config = cfg/data.cfg
/etc/init.d = init-script

# Replaces MANIFEST.in
sdist_extra =
include THANKS HACKING
recursive-include examples *.txt *.py
prune examples/sample?/build

# Extension module
[extension: haven.ship.pirate]
sources = haven/ship/pirate.c
include-dirs = include
optional = 1

$DEBUG =
$GFX = src/SDL_gfx/SDL_gfxPrimitives.c
$SDL = -I/usr/include/SDL -D_REENTRANT -lSDL
$FONT = -lSDL_ttf
$SCRAP = -lX11

[resources]
# This needs a PEP.

## Introduction

One goal of Distutils2 is to put all the information required to build and install a distribution into a static configuration file (:file:setup.cfg) instead of Python code (:file:setup.py). This information (i.e. the arguments to distutils.core.setup) is split into metadata, files and customization hooks.

In the olden days, Distutils configuration files were used only to give options to commands. They were also designed to be extensible: Third-party tools relying on Distutils or providing new commands could tell their users to add a section in the distribution’s :file:setup.cfg file or in their user config file to set options. There is a simple API to get these options merged from all configuration files (which will be even simpler in Distutils2).

Moving the distribution configuration from a script to a static file makes it easier for tools to get the information without having to run code (the :file:setup.py script). It will also allow a variety of tools written in any language to work from the same information.

The encoding of the config file is UTF-8.

The new sections are different from usual sections that give options to commands because they make no sense in system or user configuration files. While specifying install or sdist options in a system or user configuration file is useful, options like author name or scripts to include in a distribution have to be in the project’s config file only.

In another discussion, it may be good to think about configuration files precedence rules; e.g. if a user specifies an installation directory in their own config file, why is the distribution’s file able to override that choice?

## Metadata

The first kind of :func:setup arguments that should be supported in :file:setup.cfg are the ones that give metadata. Fields are defined in PEP 345 and progress is tracked on Python #8252.

These fields are uncontroversial:

[metadata]
name = RestingParrot
version = 0.6.4
author = Carl Meyer
author-email = carl@oddbird.net
summary = A sample project demonstrating distutils2 packaging

classifiers =
Development Status :: 4 - Beta
Environment :: Console (Text Based)
Environment :: X11 Applications :: GTK; python_version < '3'
License :: OSI Approved :: MIT License
Programming Language :: Python
Programming Language :: Python :: 2
Programming Language :: Python :: 3

requires-dist =
PetShoppe
MichaelPalin (> 1.1)
pywin32; sys.platform == 'win32'
pysqlite2; python_version < '2.5'
inotify (0.0.1); sys.platform == 'linux2'
requires-external = libxml2
provides-dist = distutils2-sample-project (0.2)
unittest2-sample-project

project-url =
Main repository, http://bitbucket.org/carljm/sample-distutils2-project
Fork in progress, http://bitbucket.org/Merwok/sample-distutils2-project

Multi-value fields use newline-separated values, since the values themselves may contain spaces. The first (or only) value may be on the same line as the key or on the following one.

The config file parser automatically supports case variation and underscores in place of hyphen in field names. Our own documentation should be consistent and use only lower case and hyphens, for simplicity and non-ugliness.

The PEP 345 environment markers used here will be passed to the DistributionMetadata instance without processing, as intended: The class knows which fields are allowed to use a marker and how to interpret them.

Some fields don’t have a specified format or can be improved. Proposals follow.

### Avoid Metadata-Version

codename: no-metadata-version

This field does not have to be in the file, since the DistributionMetadata class detects the right version from the fields that are present.

### Use CSV for Keywords and Requires-Python

codename: keywords-csv

PEP 345 only says that the Keywords field is “a list”; the example uses space-delimited values, but distutils and distutils2 print out comma-separated values, which allows having keywords with spaces in them (e.g. “version control”). The Requires-Python field is described as comma-separated values.

Since keywords and supported versions are typically much shorter than classifiers or dependencies, I propose that :file:setup.cfg use a comma-separated list of values, with leading and trailing spaces removed for user convenience (i.e. keywords = version control, packaging, testing, unit testing will give the list ['version control', 'packaging', 'testing', 'unit testing']). More examples:

requires-python = 2.6

requires-python = >=2.4, <=3.0

codename: keywords-no-csv

Alternatively, if it is deemed confusing to have two ways of giving multi-value fields, the field can be newline-separated like other fields already defined. Consistency would win over convenience.

### Get description from a file

codename: desc-from-file

Use the contents of a file as value for description. Long descriptions typically contain blank lines, which are stripped by our config file parser in Python < 3.2, so including the long description directly in the config file is a non-starter. People often already have the description in a :file:README file or equivalent, which they can edit and check with reST-enabled tools. Thus this proposal replaces a very common idiom in setup scripts, prevents duplication and desynchronisation, and avoids the need for me to touch regular expressions to tweak the parser in unholy ways.

The value is a path relative to the directory containing the :file:setup.cfg (.. disallowed). Examples:

description-file = README

description-file = lib/python/unicorn/README.rst

The value can be a list of files, to be concatenated in order and used as description.

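
As an illustration of the description-file rules (relative paths, no .. component, files concatenated in order), here is a minimal Python sketch. The function name read_description, the one-path-per-line input and the choice to join files with a newline are assumptions made for this example, not part of the proposal.

```python
import os


def read_description(base_dir, value):
    """Concatenate the files named in a description-file value.

    `value` is the raw field value, assumed to hold one relative path per line.
    """
    parts = []
    for line in value.strip().splitlines():
        path = line.strip()
        if not path:
            continue
        # Reject absolute paths and any ".." component, as the proposal requires.
        if os.path.isabs(path) or '..' in path.replace('\\', '/').split('/'):
            raise ValueError('invalid description-file path: %r' % path)
        with open(os.path.join(base_dir, path), encoding='utf-8') as f:
            parts.append(f.read())
    # Joining with a newline is an assumption; the proposal only says "concatenated".
    return '\n'.join(parts)
```

A tool would call it with the directory containing :file:setup.cfg, for example read_description('.', 'README').
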
Now that Distutils2 and PyPI allow uploading documentation and adding arbitrary links in a project page, the need for overly long descriptions is reduced, but some users would still want to concatenate e.g. :file:README and :file:NEWS. Thus, this field should allow a newline-separated list of files (it allows paths with spaces and complies with the already defined way of giving multi-value fields).

The name is not description since it would have conflicting semantics with PEP 345, but not description-files so as not to encourage multi-file values.

Files listed in that field are automatically added to distributions.

### Fix misnamed fields

The things listed in requires-dist and friends have a name and a version (optional) but no particular distribution format, therefore they’re not distributions but releases (or arguably projects). Editing accepted PEPs is hard, but it’s better to do it now before our tools and terminology are used in the wild.

This item is listed only for completeness, not to call a vote; Alexis is the owner of this request, and the field names in :file:setup.cfg will follow the latest version of PEP 345.

## Files

Most other :func:setup arguments have to do with files. Arguments list Python modules, extension modules (written in C or C++), packages, scripts, files related to a package, other files. As specified in Python #8253, a new section is introduced, files; nothing else is already defined.

### Listing modules and packages

codename: merge-mod-pkg

For the new format, it is proposed that the three kinds of modules (Python modules, extension modules and packages) be merged into a single option. Given this file structure:

$ tree
.
├── haven.py
├── pirate.c
├── setup.cfg
└── ship
    ├── cabins
    │   ├── captain.py
    │   └── __init__.py
    ├── hull.c
    └── __init__.py


This configuration is enough to include the two modules, the package, its subpackage and all submodules in the distribution and have them processed by the relevant commands (build_*, sdist, etc.):

[files]
modules = haven
pirate
ship


(See below for the rationale to separate with newlines instead of arbitrary whitespace. See appendix for implementation details explaining how this merged list will be easily parsed.)

It is not possible to have a module and a package with the same name. In addition to being documented, this restriction could also be a runtime warning.

#### Naming

Calling packages “modules” may be confusing to some people, e.g. beginners, even if it’s technically correct. Other proposed names include “source” (too vague) and “importables” (unused in the documentation and ugly).

codename: no-merge-mod-pkg

Alternatively, modules and packages could be listed separately. Since it appears that people tend to use either one or the other in their projects, there would be no cognitive overload in defining two fields instead of one:

modules = haven
pirate

packages = ship
exclude-packages = ship.hull


This may also prevent future problems when namespace packages are supported by Python, or maybe not; I have to read PEP 382 closely to get a better understanding of file layout and possible detection (esp. recursion) issues.

#### Recursion

codename: pkg-recursion

Package listing is recursive; real-world use of :func:setuptools.find_packages shows that this is a useful feature. There is a way to control it:

modules = ship
exclude-modules = ship.hull
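
A rough Python sketch of recursive package listing with exclusions follows. The helper name list_packages is hypothetical, it only looks at packages (directories with an :file:__init__.py file), and treating an excluded package as also excluding its subpackages is an assumption of this example.

```python
import os


def list_packages(root, name, exclude=()):
    """Return `name` and all its subpackages found under `root`, minus exclusions."""
    found = []
    path = os.path.join(root, *name.split('.'))
    # Assumption: excluding a package also prunes everything below it.
    if name in exclude or not os.path.isfile(os.path.join(path, '__init__.py')):
        return found
    found.append(name)
    for entry in sorted(os.listdir(path)):
        if os.path.isdir(os.path.join(path, entry)):
            found.extend(list_packages(root, name + '.' + entry, exclude))
    return found
```
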

codename: [pkg-recursion-boolean]

Additionally, a boolean option could control all-or-nothing recursion:

modules = ship
recursive-modules = 0


It is not clear that this would solve problems, e.g. for the mx project which is developed in one tree but packaged as separate PyPI projects; maybe it’s best not to define this option right now, try to convert complex projects and then revise this proposal.

#### Replacing package_dir

codename: pkg-dir-prefix

Instead of replacing the package_dir argument with a field of the same name, it is proposed to merge it with the packages (or modules) values:

packages =
ship
src:parrot
src2:thing
exclude-packages =
src:parrot.tests


This describes the following file structure:

$ tree
.
├── ship
│   └── __init__.py
├── src
│   └── parrot
│       ├── __init__.py
│       └── tests
│           └── __init__.py
└── src2
    └── thing
        └── __init__.py

Note that these semantics are different from setup(packages=['thing'], package_dir={'thing': 'src2'}): In the new proposal, src2:thing does not mean that the :file:src2 directory is to be renamed :file:thing in the build directory (reference), but that the directory :file:src2 contains another directory named :file:thing (with its :file:__init__.py file and other submodules). The changed semantics are more intuitive.

Using : as a separator forbids using it in directory names, which would not be very sane anyway. Using it instead of a space or a slash also allows putting packages in a deep subdirectory:

packages = client/bindings/python:parrotlib

This source directory syntax is also available for modules (in case the proposal to merge them with packages is rejected):

modules = ham/lib/python:ham
cheese/lib/python:cheese

codename: pkg-dir-no-prefix

If a study of setup scripts in projects distributed on the Cheeseshop reveals that an overwhelming majority uses only one package_dir or none, this alternate, simpler proposal would be enough:

packages = parrot
exclude-packages = parrot.tests
package-dir = src

#### Replacing conditionals

codename: env-markers-for-files

We can define the packages or modules field to be newline-separated and accept PEP 345 environment markers to support source distributions that contain e.g. code for both 2.x and 3.x, like httplib2 does:

packages = python2:httplib2; python_version < '3'
python3:httplib2; python_version > '2'

(Form using the alternate package_dir proposal:

packages = httplib2
package-dir = python2; python_version < '3'
python3; python_version > '2'

Using a multi-line value for this field is kind of ugly, though.)

Since there is no else, each condition has to be written twice (once in normal form, once in reverse, which can be tricky and/or tedious), and the values available as EXPR in environment markers (Python version, OS name, etc.) do not provide all that is required. One example from Mercurial that is not trivial to reverse (or maybe I’m just bad at boolean logic):

if sys.platform == 'win32' and sys.version_info < (2, 5, 0, 'final'):
    pymodules.append('mercurial.pure.osutil')

Example that can’t be translated:

if sys.platform == 'linux2' and os.uname()[2] > '2.6':
    # The inotify extension is only usable with Linux 2.6 kernels.
    ...

For such cases, the solution seems to be to use a pre-build hook to edit the lists of modules and packages. For trivial cases, environment markers provide a solution that does not require writing any code, so they’re still useful in the files section.

distutils-sig a year ago talked about adding the full uname tuple to the available objects. There was also a proposal that didn’t put conditionals in the fields but used sections, which may be more readable.

### Extension modules

codename: extensions-section

A new section family is proposed to describe extension modules, to replace instantiation of :class:Extension objects with the right options in :file:setup.py. Each extension module has to be listed in the files field and described in its own section:

[files]
modules = ship.pirate

[extension: ship.pirate]
sources = ship/pirate.c
headers = Python.h pirate.h
include-dirs = include
optional = 1

The section name is the string extension: followed by optional whitespace and the full name of the module, field names are directly taken from :class:Extension arguments, values are simple adaptations (string arguments are single values, string lists are multi-value fields (whitespace-separated or newline-separated, to be decided), booleans are :mod:ConfigParser booleans).

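
To show how the section naming rule could be consumed, here is a small sketch using the standard configparser module (the Python 3 name of :mod:ConfigParser). The function name read_extensions and the returned dictionary shape are assumptions for illustration; the real code would presumably build :class:Extension instances.

```python
import configparser

PREFIX = 'extension:'


def read_extensions(text):
    """Map each extension module name to the raw fields of its section."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    extensions = {}
    for section in parser.sections():
        if section.startswith(PREFIX):
            # Optional whitespace may follow the colon.
            name = section[len(PREFIX):].strip()
            extensions[name] = dict(parser.items(section))
    return extensions
```

Booleans such as optional can then be read with the parser's getboolean method, which accepts 1/0, yes/no, true/false and on/off.
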
codename: [vars-in-extmod]

If deemed useful, simple variables could be added to these sections; see definition.

codename: extensions-section-flat

An alternate proposal that requires only one section but may prove more difficult to write and to parse is derived from the older Setup format, deprecated in Distutils and removed in Distutils2 (see :file:{python3.2}/Lib/distutils/tests/Setup.sample):

[extensions]
pirate = pirate.c
ship.hull = ship/hull.c

The format is module = source files [arguments to the compiler]. More involved example from SDL:

[extensions]
_camera = src/_camera.c src/camera_v4l2.c src/camera_v4l.c $SDL $DEBUG
_numericsurfarray = src/_numericsurfarray.c $SDL $DEBUG
font = src/font.c $SDL $FONT $DEBUG
scrap = src/scrap.c $SDL $SCRAP $DEBUG

This example introduces variables, which can be any string. A simple proposal for the assignment syntax:

$DEBUG =
$GFX = src/SDL_gfx/SDL_gfxPrimitives.c
$SDL = -I/usr/include/SDL -D_REENTRANT -lSDL
$FONT = -lSDL_ttf
$SCRAP = -lX11


In other words, every key starting with a dollar sign is a variable that will be usable in the regular fields of the same section.
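
The expansion rule could look like the following sketch. It assumes the section has already been read into a dictionary with key case preserved (the stock parser lower-cases keys by default), and the name expand_section is hypothetical. Variables referencing other variables are not handled here, since that is a separate optional proposal.

```python
import re

VAR = re.compile(r'\$(\w+)')


def expand_section(items):
    """Expand $NAME references in the regular fields of one section.

    `items` maps keys to raw values; keys starting with "$" define variables.
    """
    variables = {k[1:]: v for k, v in items.items() if k.startswith('$')}
    expanded = {}
    for key, value in items.items():
        if key.startswith('$'):
            continue
        # Unknown variables raise KeyError; a real tool may prefer a clearer error.
        expanded[key] = VAR.sub(lambda m: variables[m.group(1)], value)
    return expanded
```
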

codename: [vars-in-vars-for-extmod]

If useful, expanding already defined variables in other variable definitions could be allowed.

codename: [env-markers-for-extmod]

If useful, environment markers could be allowed in fields of this section.

### Listing scripts

codename: scripts

The Python list naturally translates to a multi-value field:

[files]
scripts =
detect-witch
scripts/find-coconuts
bin/taunt


Paths are relative, with .. component forbidden. Using multi-line instead of whitespace or comma-separated allows directory names with spaces and is consistent with other multi-value fields.

codename: [env-markers-for-scripts]

PEP 345 markers can be useful to filter the list of scripts according to the build environment (sdist would still ship all scripts):

scripts =
unit2.py; sys.platform == 'win32'
unit2; sys.platform != 'win32'
unit2-gui; sys.platform == 'linux2' and python_version < '3'


Full control over the scripts is possible thanks to pre-build hooks but allowing environment markers covers common needs without requiring user code.

On the other hand, disallowing environment markers may be the best thing to do in the short term. There are a number of feature requests (regarding platform-dependent handling in particular) in the Python bug tracker that need discussion and testing; since hooks provide an easy way to experiment, the features could be easily implemented outside of Distutils2 and eventually incorporated into the core if met with positive feedback.

A new field implementing a feature similar to the console_scripts entry points of :mod:pkg_resources may also render scripts as we know them obsolete.

People have also wanted a way to install into $exec_prefix/sbin instead of $exec_prefix/bin. There is no proposal about that now, although it would not be hard to define, since it is believed that the future resources tagging will supersede the scripts section and address this feature request in a generic way.

Therefore, rejecting environment markers in scripts may be the best choice right now.

codename: data-fields

Until the PEP on resources is implemented, :file:setup.cfg will grow new fields that merely translate the :file:setup.py form without any added semantics or features. Example:

[files]
package_data =
cheese = data/templates/*


Semantics for the paths are described in the documentation for package_data. Environment markers are not supported. The same rules apply for other data files:

[files]
data_files =
bitmaps = bm/b1.gif, bm/b2.gif
config = cfg/data.cfg
/etc/init.d = init-script


Comma-separated values allow paths with spaces and avoid the need to parse multi-line values in a multi-line value.
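
A sketch of how such a value might be parsed: one key = paths entry per line, with paths split on commas. The function name parse_data_files is hypothetical, and the sketch assumes that target names never contain an equals sign.

```python
def parse_data_files(value):
    """Turn a data_files (or package_data) value into a dict of path lists."""
    result = {}
    for line in value.strip().splitlines():
        if not line.strip():
            continue
        # Split on the first "=" only; everything after it is the path list.
        key, _, paths = line.partition('=')
        result[key.strip()] = [p.strip() for p in paths.split(',') if p.strip()]
    return result
```
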

### Replacing MANIFEST.in

codename: remove-manifest.in

The design document for resources states the goal to remove redundant listing of files in favor of :file:setup.cfg. Presently, files listed as modules, packages, scripts, source files for extension modules, package data, extra data files or description file will automatically be included in the manifest object used by the distribution. Until resources tagging is implemented, we could either still support the :file:MANIFEST.in file or move its contents to :file:setup.cfg, retaining strict syntactic and semantic compatibility:

[files]
sdist_extra =
include THANKS HACKING
recursive-include examples *.txt *.py
prune examples/sample?/build


### Psychic Mode

codename: auto-detection-option

This proposal takes advantage of convention over configuration, following the lead of DistUtilsExtra.auto. The idea is to automatically detect specific file patterns. Examples taken from its documentation (somewhat edited):

This mode would be opt-in, e.g.

[global]
autodetect = 1


Other ideas not considered include less used or platform-specific file formats, which could be autodetected after the resources proposal is implemented. Automatic Requires-Dist and Provides-Dist from import statements seem brittle and are not considered. Automatic :file:POTFILES.in depends on consensus that i18n-related tasks are in the scope of Distutils, which has not been reached yet.

codename: auto-detection-script

Alternative: Use the mkpkg script to detect things and create or update the configuration file (after user confirmation for each field). See appendix.

### Resources

A PEP needs to be written. See the design document.

## Customization hooks

codename: distclass-cmdclass

(This is not related to pre/post-command hooks, which will probably be set in the relevant command sections or in a new one.)

The third kind of :func:setup arguments are customization hooks.

distclass: Specify a class to use instead of :class:distutils2.dist.Distribution

cmdclass: Mapping of command names to classes, to replace existing commands or provide new ones, e.g. setup(..., cmdclass={'build_py': build_py_2to3, 'lint': LintCommand})

script_name: Usually sys.argv[0], used to generate error messages with the correct script name in case it is not :file:setup.py.

script_args: List of arguments to use instead of sys.argv[1:]

The last two arguments are not needed in :file:setup.cfg, whereas the first two are useful and can use this simple syntax:

[global]
distclass = shop.cheese.HamDistribution
cmdclass =
build_py = distutils2.build_py.build_py_2to3; python_version >= '3'
test = lib:_buildhelper.TestCommand


As you can see, environment markers and source directory specifiers are allowed. The fields are located in the global section, alongside command-packages.

codename: [rename-cmdclass]

Additional proposal: Give the field a clearer name, such as command-classes or commands (and change it in the Python code too). If there is a good reason to keep it short, it should at least be a plural form, i.e. cmdclasses.

## Appendix: Making things simple for users

Distutils2 will ship with a little program called mkpkg (which will soon get a better name) that generates a :file:setup.cfg file by asking the user questions (what is the project name, its version, etc.). As much as possible, the program will propose answers (e.g. using the :func:find_packages function to get the list of packages, mocking sys.modules['distutils'] to run :file:setup.py scripts in a sandbox and get information from them, etc.) so that the user just has to press :kbd:Enter to validate, or write the correct value and validate.

The program will also help people do the right thing, e.g. use a version number compliant with PEP 386, or fill the license field only when there is no suitable Trove classifier for the chosen license; in other words, it will give useful hints to people who don’t read PEPs or documentation.

A new section in the user configuration file can provide defaults (even templates, with {variables} replaced by values taken from the metadata of the project):

[mkpkg-defaults]
author = John Smith <john@example.org>
project-url-template =
Code repository, http://example.org/hacking/projects/{name}
Documentation, http://packages.python.org/{name}
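
The template substitution could be as simple as this sketch, which relies on str.format. The function name fill_template and the metadata dictionary are assumptions made for the example.

```python
def fill_template(template, metadata):
    """Replace {variables} in each line of a mkpkg default template."""
    return [line.strip().format(**metadata)
            for line in template.strip().splitlines()]
```

With the section above and a project named RestingParrot, the first line would become "Code repository, http://example.org/hacking/projects/RestingParrot".
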


For people wanting to upgrade progressively, Distutils2 includes a lib2to3-based converter to rewrite imports (:mod:distutils2 provides a :func:setup function with a signature compatible with the one from :mod:distutils and a :func:find_packages function similar to the one from :mod:setuptools), to allow projects to keep using a setup script while they transition their practices and installation documentation.

## Appendix: Implementation details

### Multi-value fields

The config file parser strips leading and trailing whitespace for free; we just have to handle the case of the first line being empty (in the value spam =\nham, the config file format considers that there is an empty line after the equals sign). Handling that case is as simple as value.strip().splitlines().
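
Wrapped in a function, the idiom looks like this. The name split_multi_value is hypothetical, and the extra per-line strip is only a safety net for values that did not come through the parser.

```python
def split_multi_value(value):
    """Split a multi-value field into a list, one item per line."""
    # The outer strip() drops the empty first line left by "key =\nvalue".
    return [line.strip() for line in value.strip().splitlines()]
```
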

### Support code

Helper functions to split the source directory specifier, resolve a dotted name and split an environment marker will be provided in :mod:util and :mod:config for use by third-party tools. :mod:config will also provide higher-level functions to get a :class:DistributionMetadata instance from a :file:setup.cfg file, a list of Python modules, a list of Python packages filtered according to the environment, access a config section defined by a Distutils2 extension, and so on.
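
To make the three helpers more concrete, here is one possible shape for them. The names split_marker, split_source_dir and resolve_name are placeholders chosen for this sketch; the actual names and signatures are not defined by this document. Marker evaluation is deliberately left out.

```python
import importlib


def split_marker(value):
    """Split 'value; marker' into (value, marker), marker being None if absent."""
    value, sep, marker = value.partition(';')
    return value.strip(), (marker.strip() if sep else None)


def split_source_dir(value):
    """Split 'dir:name' into (dir, name), dir being None if absent."""
    directory, sep, name = value.rpartition(':')
    return (directory if sep else None), name


def resolve_name(dotted):
    """Import and return the object named by a dotted path."""
    parts = dotted.split('.')
    # Try the longest importable module prefix, then walk the attributes.
    for cut in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module('.'.join(parts[:cut]))
        except ImportError:
            continue
        for attr in parts[cut:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError('cannot resolve %r' % dotted)
```
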

If the proposal to merge the lists of modules and packages is accepted, Distutils2 code will have to sort this list into the three lists used by Distribution, following these simple rules:

1. If the name has an extension section (or if it is listed in the extensions section, depending on the proposal that gets accepted), an instance of :class:Extension is created and added to distribution.ext_modules;
2. If the name corresponds to an existing directory which contains an __init__.py file, it is added to distribution.packages;
3. Otherwise, the name is added to distribution.py_modules.
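
The three rules above can be sketched as follows. The function name sort_modules is hypothetical, and for brevity the sketch collects extension names instead of creating :class:Extension instances.

```python
import os


def sort_modules(names, extension_names, root='.'):
    """Sort a merged modules list into (ext_modules, packages, py_modules)."""
    ext_modules, packages, py_modules = [], [], []
    for name in names:
        path = os.path.join(root, *name.split('.'))
        if name in extension_names:
            # Rule 1: the name has an extension section.
            ext_modules.append(name)
        elif os.path.isfile(os.path.join(path, '__init__.py')):
            # Rule 2: an existing directory containing __init__.py.
            packages.append(name)
        else:
            # Rule 3: everything else is a plain Python module.
            py_modules.append(name)
    return ext_modules, packages, py_modules
```
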

## Appendix: Mapping from arguments to fields

| Argument in :file:setup.py | Field in :file:setup.cfg |
| --- | --- |
| description | summary |
| long_description | description-file (changed meaning) |
| url | home-page |
| N/A | project-url |
| packages | packages or modules |
| py_modules | modules |
| ext_modules | modules (+ extension(s) section(s)) |
| ext_package | unsupported |
| distclass | distclass |
| cmdclass | cmdclasses |
| script_name | N/A |
| script_args | N/A |
| options | fields in sections named after commands |

## Appendix: Rejected ideas

### Merge author and author-email

It was proposed to merge name and email in a single field for author (and maintainer):

author = Carl Meyer <carl@oddbird.net>
maintainer = Éric Araujo <merwok@netwok.org>


It is a common format, easy to parse (it’s not any valid RFC 2822 email field, just specifically name <email>). The rationale was that while machine-generated and consumed PEP 345 :file:METADATA files were defined with separate author name and email fields, the user-written :file:setup.cfg could have some additional niceties. The Distutils maintainer voted this down for the sake of clarity: Field names in :file:setup.cfg are directly translated from PEP 345 without any complication. There is no real advantage in changing the format here.

Some users would like to specify callback functions instead of writing some values, to avoid repetition. Let’s take version as example. It is very common to have it stored as a tag in version control, and there are a number of helper functions to get this information. We could have this kind of field:

[metadata]
get-version = _buildhelper.get_hg_version


This proposal has to be rejected, since it strongly conflicts with the point of having static metadata. If the metadata is fully defined by a file format, then any tool in any language can follow the specification to implement a parser and do useful things with the values, without depending on Distutils2, setup scripts or Python at all. People who really cannot write the version number in :file:setup.cfg for some reason can still use a :file:setup.py and benefit from fixes and features in Distutils2, but they won’t be static metadata-compliant.

Furthermore, the version example is not a strong argument. When doing a release, updating the version number in :file:setup.cfg is but a minor and quick step. Documentation needs to be checked, translations built, the version number has to be updated in :file:README, :file:NEWS or :file:CHANGES, source code, so using a hook to set version in :file:setup.cfg would remove only one tidbit of work.

Other fields may be duplicated in documentation files and :file:setup.cfg, i.e. author, summary and project URIs, but this duplication has a very small cost. Dependencies, classifiers and keywords are only in :file:setup.cfg, so wouldn’t benefit from hooks at all.

In conclusion, other solutions can be explored. Since the version number is easily retrieved from :file:setup.cfg, a trivial shell function can be written to create the VCS tag from the static metadata. People who love automation typically write a small script or makefile to do all operations related to a release (adjust version numbers in relevant files, run lint tools, run i18n tools, etc.), then check whether the result looks good (not always trusting automated tools is sane), commit, tag, push, send announcements, register the release in catalogs and so on. People can also maintain another file, named e.g. :file:setup.cfg.in, which contains special fields that get translated by a tool to create :file:setup.cfg; Distutils2, however, will neither define nor support those fields.
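
As a rough illustration of retrieving the version from the static metadata, here is a Python equivalent of the shell function mentioned above. The function name tag_command is hypothetical, the Mercurial command is only an example, and the sketch builds the command without running it.

```python
import configparser


def tag_command(cfg_text):
    """Build (without running) a VCS tag command from static metadata."""
    parser = configparser.RawConfigParser()
    parser.read_string(cfg_text)
    version = parser.get('metadata', 'version')
    # "hg tag" is an assumption; substitute the command of your own VCS.
    return ['hg', 'tag', version]
```

The resulting list could be passed to subprocess.check_call once the release has been checked by hand.
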