Source

python-peps / pep-0386.txt

The default branch has multiple heads

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
PEP: 386
Title: Changing the version comparison module in Distutils
Version: $Revision$
Last-Modified: $Date$
Author: Tarek Ziadé <tarek@ziade.org>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 4-June-2009


Abstract
========

This PEP proposes a new version comparison schema system in Distutils.


Motivation
==========

In Python there are no real restrictions yet on how a project should manage its
versions, and how they should be incremented.

Distutils provides a `version` distribution meta-data field but it is freeform and
current users, such as PyPI usually consider the latest version pushed as the
`latest` one, regardless of the expected semantics.

Distutils will soon extend its capabilities to allow distributions to express a
dependency on other distributions through the ``Requires-Dist`` metadata field
(see PEP 345) and it will optionally allow use of that field to
restrict the dependency to a set of compatible versions. Notice that this field
is replacing ``Requires`` that was expressing dependencies on modules and packages.

The ``Requires-Dist`` field will allow a distribution to define a dependency on
another package and optionally restrict this dependency to a set of
compatible versions, so one may write::

    Requires-Dist: zope.interface (>3.5.0)

This means that the distribution requires ``zope.interface`` with a version
greater than ``3.5.0``.

This also means that Python projects will need to follow the same convention
as the tool that will be used to install them, so they are able to compare
versions.

That is why this PEP proposes, for the sake of interoperability, a standard
schema to express version information and its comparison semantics.

Furthermore, this will make OS packagers' work easier when repackaging standards
compliant distributions, because as of now it can be difficult to decide how two
distribution versions compare.


Requisites and current status
=============================

It is not in the scope of this PEP to provide a universal versioning schema
intended to support all or even most of existing versioning schemas. There
will always be competing grammars, either mandated by distro or project
policies or by historical reasons that we cannot expect to change.

The proposed schema should be able to express the usual versioning semantics,
so it's possible to parse any alternative versioning schema and transform it
into a compliant one. This is how OS packagers usually deal with the existing
version schemas and is a preferable alternative than supporting an arbitrary
set of versioning schemas.

Conformance to usual practice and conventions, as well as a simplicity are a
plus, to ease frictionless adoption and painless transition. Practicality beats
purity, sometimes.

Projects have very different versioning needs, but the following are widely
considered important semantics:

1. it should be possible to express more than one versioning level
   (usually this is expressed as major and minor revision and, sometimes,
   also a micro revision).
2. a significant number of projects need special meaning versions for
   "pre-releases" (such as "alpha", "beta", "rc"), and these have widely
   used aliases ("a" stands for "alpha", "b" for "beta" and "c" for "rc").
   And these pre-release versions make it impossible to use a simple
   alphanumerical ordering of the version string components.
   (Example: 3.1a1 < 3.1)
3. some projects also need "post-releases" of regular versions,
   mainly for installer work which can't be clearly expressed otherwise.
4. development versions allow packagers of unreleased work to avoid version
   clash with later regular releases.

For people that want to go further and use a tool to manage their version
numbers, the two major ones are:

- The current Distutils system [#distutils]_
- Setuptools [#setuptools]_

Distutils
---------

Distutils currently provides a ``StrictVersion`` and a ``LooseVersion`` class
that can be used to manage versions.

The ``LooseVersion`` class is quite lax. From Distutils doc::

    Version numbering for anarchists and software realists.
    Implements the standard interface for version number classes as
    described above.  A version number consists of a series of numbers,
    separated by either periods or strings of letters.  When comparing
    version numbers, the numeric components will be compared
    numerically, and the alphabetic components lexically.  The following
    are all valid version numbers, in no particular order:

        1.5.1
        1.5.2b2
        161
        3.10a
        8.02
        3.4j
        1996.07.12
        3.2.pl0
        3.1.1.6
        2g6
        11g
        0.960923
        2.2beta29
        1.13++
        5.5.kw
        2.0b1pl0

    In fact, there is no such thing as an invalid version number under
    this scheme; the rules for comparison are simple and predictable,
    but may not always give the results you want (for some definition
    of "want").

This class makes any version string valid, and provides an algorithm to sort
them numerically then lexically. It means that anything can be used to version
your project::

    >>> from distutils.version import LooseVersion as V
    >>> v1 = V('FunkyVersion')
    >>> v2 = V('GroovieVersion')
    >>> v1 > v2
    False

The problem with this is that while it allows expressing any
nesting level it doesn't allow giving special meaning to versions
(pre and post-releases as well as development versions), as expressed in
requisites 2, 3 and 4.

The ``StrictVersion`` class is more strict. From the doc::

    Version numbering for meticulous retentive and software idealists.
    Implements the standard interface for version number classes as
    described above.  A version number consists of two or three
    dot-separated numeric components, with an optional "pre-release" tag
    on the end.  The pre-release tag consists of the letter 'a' or 'b'
    followed by a number.  If the numeric components of two version
    numbers are equal, then one with a pre-release tag will always
    be deemed earlier (lesser) than one without.

    The following are valid version numbers (shown in the order that
    would be obtained by sorting according to the supplied cmp function):

        0.4       0.4.0  (these two are equivalent)
        0.4.1
        0.5a1
        0.5b3
        0.5
        0.9.6
        1.0
        1.0.4a3
        1.0.4b1
        1.0.4

    The following are examples of invalid version numbers:

        1
        2.7.2.2
        1.3.a4
        1.3pl1
        1.3c4

This class enforces a few rules, and makes a decent tool to work with version
numbers::

    >>> from distutils.version import StrictVersion as V
    >>> v2 = V('GroovieVersion')
    Traceback (most recent call last):
    ...
    ValueError: invalid version number 'GroovieVersion'
    >>> v2 = V('1.1')
    >>> v3 = V('1.3')
    >>> v2 < v3
    True

It adds pre-release versions, and some structure, but lacks a few semantic
elements to make it usable, such as development releases or post-release tags,
as expressed in requisites 3 and 4.

Also, note that Distutils version classes have been present for years
but are not really used in the community.


Setuptools
----------

Setuptools provides another version comparison tool [#setuptools-version]_
which does not enforce any rules for the version, but tries to provide a better
algorithm to convert the strings to sortable keys, with a ``parse_version``
function.

From the doc::

    Convert a version string to a chronologically-sortable key

    This is a rough cross between Distutils' StrictVersion and LooseVersion;
    if you give it versions that would work with StrictVersion, then it behaves
    the same; otherwise it acts like a slightly-smarter LooseVersion. It is
    *possible* to create pathological version coding schemes that will fool
    this parser, but they should be very rare in practice.

    The returned value will be a tuple of strings.  Numeric portions of the
    version are padded to 8 digits so they will compare numerically, but
    without relying on how numbers compare relative to strings.  Dots are
    dropped, but dashes are retained.  Trailing zeros between alpha segments
    or dashes are suppressed, so that e.g. "2.4.0" is considered the same as
    "2.4". Alphanumeric parts are lower-cased.

    The algorithm assumes that strings like "-" and any alpha string that
    alphabetically follows "final"  represents a "patch level".  So, "2.4-1"
    is assumed to be a branch or patch of "2.4", and therefore "2.4.1" is
    considered newer than "2.4-1", which in turn is newer than "2.4".

    Strings like "a", "b", "c", "alpha", "beta", "candidate" and so on (that
    come before "final" alphabetically) are assumed to be pre-release versions,
    so that the version "2.4" is considered newer than "2.4a1".

    Finally, to handle miscellaneous cases, the strings "pre", "preview", and
    "rc" are treated as if they were "c", i.e. as though they were release
    candidates, and therefore are not as new as a version string that does not
    contain them, and "dev" is replaced with an '@' so that it sorts lower than
    than any other pre-release tag.

In other words, ``parse_version`` will return a tuple for each version string,
that is compatible with ``StrictVersion`` but also accept arbitrary version and
deal with them so they can be compared::

    >>> from pkg_resources import parse_version as V
    >>> V('1.2')
    ('00000001', '00000002', '*final')
    >>> V('1.2b2')
    ('00000001', '00000002', '*b', '00000002', '*final')
    >>> V('FunkyVersion')
    ('*funkyversion', '*final')

In this schema practicality takes priority over purity, but as a result it
doesn't enforce any policy and leads to very complex semantics due to the lack
of a clear standard. It just tries to adapt to widely used conventions.

Caveats of existing systems
---------------------------

The major problem with the described version comparison tools is that they are
too permissive and, at the same time, aren't capable of expressing some of the
required semantics. Many of the versions on PyPI [#pypi]_ are obviously not
useful versions, which makes it difficult for users to grok the versioning that
a particular package was using and to provide tools on top of PyPI.

Distutils classes are not really used in Python projects, but the
Setuptools function is quite widespread because it's used by tools like
`easy_install` [#ezinstall]_, `pip` [#pip]_ or `zc.buildout` [#zc.buildout]_
to install dependencies of a given project.

While Setuptools *does* provide a mechanism for comparing/sorting versions,
it is much preferable if the versioning spec is such that a human can make a
reasonable attempt at that sorting without having to run it against some code.

Also there's a problem with the use of dates at the "major" version number
(e.g. a version string "20090421") with RPMs: it means that any attempt to
switch to a more typical "major.minor..." version scheme is problematic because
it will always sort less than "20090421".

Last, the meaning of ``-`` is specific to Setuptools, while it is avoided in
some packaging systems like the one used by Debian or Ubuntu.

The new versioning algorithm
============================

During Pycon, members of the Python, Ubuntu and Fedora community worked on
a version standard that would be acceptable for everyone.

It's currently called `verlib` and a prototype lives at [#prototype]_.

The pseudo-format supported is::

    N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]

The real regular expression is::

    expr = r"""^
    (?P<version>\d+\.\d+)         # minimum 'N.N'
    (?P<extraversion>(?:\.\d+)*)  # any number of extra '.N' segments
    (?:
        (?P<prerel>[abc]|rc)         # 'a' = alpha, 'b' = beta
                                     # 'c' or 'rc' = release candidate
        (?P<prerelversion>\d+(?:\.\d+)*)
    )?
    (?P<postdev>(\.post(?P<post>\d+))?(\.dev(?P<dev>\d+))?)?
    $"""

Some examples probably make it clearer::

    >>> from verlib import NormalizedVersion as V
    >>> (V('1.0a1')
    ...  < V('1.0a2.dev456')
    ...  < V('1.0a2')
    ...  < V('1.0a2.1.dev456')
    ...  < V('1.0a2.1')
    ...  < V('1.0b1.dev456')
    ...  < V('1.0b2')
    ...  < V('1.0b2.post345')
    ...  < V('1.0c1.dev456')
    ...  < V('1.0c1')
    ...  < V('1.0.dev456')
    ...  < V('1.0')
    ...  < V('1.0.post456.dev34')
    ...  < V('1.0.post456'))
    True

The trailing ``.dev123`` is for pre-releases. The ``.post123`` is for
post-releases -- which apparently are used by a number of projects out there
(e.g. Twisted [#twisted]_). For example *after* a ``1.2.0`` release there might
be a ``1.2.0-r678`` release. We used ``post`` instead of ``r`` because the
``r`` is ambiguous as to whether it indicates a pre- or post-release.

``.post456.dev34`` indicates a dev marker for a post release, that sorts
before a ``.post456`` marker. This can be used to do development versions
of post releases.

Pre-releases can use ``a`` for "alpha", ``b`` for "beta" and ``c`` for
"release candidate". ``rc`` is an alternative notation for "release candidate"
that is added to make the version scheme compatible with Python's own version
scheme. ``rc`` sorts after ``c``::

    >>> from verlib import NormalizedVersion as V
    >>> (V('1.0a1')
    ...  < V('1.0a2')
    ...  < V('1.0b3')
    ...  < V('1.0c1')
    ...  < V('1.0rc2')
    ...  < V('1.0'))
    True

Note that ``c`` is the preferred marker for third party projects.

``verlib`` provides a ``NormalizedVersion`` class and a
``suggest_normalized_version`` function.

NormalizedVersion
-----------------

The `NormalizedVersion` class is used to hold a version and to compare it with
others. It takes a string as an argument, that contains the representation of
the version::

    >>> from verlib import NormalizedVersion
    >>> version = NormalizedVersion('1.0')

The version can be represented as a string::

    >>> str(version)
    '1.0'

Or compared with others::

    >>> NormalizedVersion('1.0') > NormalizedVersion('0.9')
    True
    >>> NormalizedVersion('1.0') < NormalizedVersion('1.1')
    True

A class method called ``from_parts`` is available if you want to create an
instance by providing the parts that composes the version.

Examples ::

    >>> version = NormalizedVersion.from_parts((1, 0))
    >>> str(version)
    '1.0'

    >>> version = NormalizedVersion.from_parts((1, 0), ('c', 4))
    >>> str(version)
    '1.0c4'

    >>> version = NormalizedVersion.from_parts((1, 0), ('c', 4), ('dev', 34))
    >>> str(version)
    '1.0c4.dev34'


suggest_normalized_version
--------------------------

``suggest_normalized_version`` is a function that suggests a normalized version
close to the given version string. If you have a version string that isn't
normalized (i.e. ``NormalizedVersion`` doesn't like it) then you might be able
to get an equivalent (or close) normalized version from this function.

This does a number of simple normalizations to the given string, based
on an observation of versions currently in use on PyPI.

Given a dump of those versions on January 6th 2010, the function has given those
results out of the 8821 distributions PyPI had:

- 7822 (88.67%) already match ``NormalizedVersion`` without any change
- 717 (8.13%) match when using this suggestion method
- 282 (3.20%) don't match at all.

The 3.20% of projects that are incompatible with ``NormalizedVersion``
and cannot be transformed into a compatible form, are for most of them date-based
version schemes, versions with custom markers, or dummy versions. Examples:

- working proof of concept
- 1 (first draft)
- unreleased.unofficialdev
- 0.1.alphadev
- 2008-03-29_r219
- etc.

When a tool needs to work with versions, a strategy is to use
``suggest_normalized_version`` on the versions string. If this function returns
``None``, it means that the provided version is not close enough to the
standard scheme. If it returns a version that slightly differs from
the original version, it's a suggested normalized version. Last, if it
returns the same string, it means that the version matches the scheme.

Here's an example of usage::

    >>> from verlib import suggest_normalized_version, NormalizedVersion
    >>> import warnings
    >>> def validate_version(version):
    ...     rversion = suggest_normalized_version(version)
    ...     if rversion is None:
    ...         raise ValueError('Cannot work with "%s"' % version)
    ...     if rversion != version:
    ...         warnings.warn('"%s" is not a normalized version.\n'
    ...                       'It has been transformed into "%s" '
    ...                       'for interoperability.' % (version, rversion))
    ...     return NormalizedVersion(rversion)
    ...

    >>> validate_version('2.4-rc1')
    __main__:8: UserWarning: "2.4-rc1" is not a normalized version.
    It has been transformed into "2.4c1" for interoperability.
    NormalizedVersion('2.4c1')

    >>> validate_version('2.4c1')
    NormalizedVersion('2.4c1')

    >>> validate_version('foo')
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "<stdin>", line 4, in validate_version
    ValueError: Cannot work with "foo"

Roadmap
=======

Distutils will deprecate its existing versions class in favor of
``NormalizedVersion``. The ``verlib`` module presented in this PEP will be
renamed to ``version`` and placed into the ``distutils`` package.

References
==========

.. [#distutils]
   http://docs.python.org/distutils

.. [#setuptools]
   http://peak.telecommunity.com/DevCenter/setuptools

.. [#setuptools-version]
   http://peak.telecommunity.com/DevCenter/setuptools#specifying-your-project-s-version

.. [#pypi]
   http://pypi.python.org/pypi

.. [#pip]
   http://pypi.python.org/pypi/pip

.. [#ezinstall]
   http://peak.telecommunity.com/DevCenter/EasyInstall

.. [#zc.buildout]
   http://pypi.python.org/pypi/zc.buildout

.. [#twisted]
   http://twistedmatrix.com/trac/

.. [#requires]
   http://peak.telecommunity.com/DevCenter/setuptools

.. [#prototype]
   http://bitbucket.org/tarek/distutilsversion/

Acknowledgments
===============

Trent Mick, Matthias Klose, Phillip Eby, David Lyon, and many people at Pycon
and Distutils-SIG.

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: