PEP: 374
Title: Choosing a distributed VCS for the Python project
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett@python.org>,
        Stephen J. Turnbull <stephen@xemacs.org>,
        Alexandre Vassalotti <alexandre@peadrop.com>,
        Barry Warsaw <barry@python.org>,
        Dirkjan Ochtman <dirkjan@ochtman.nl>
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 07-Nov-2008
Post-History: 07-Nov-2008
              22-Jan-2009


Rationale
=========

Python has been using a centralized version control system (VCS;
first CVS, now Subversion) for years to great effect. Having a master
copy of the official version of Python provides people with a single
place to always get the official Python source code. It has also
allowed for the storage of the history of the language, mostly for
help with development, but also for posterity. And of course the V in
VCS is very helpful when developing.

But a centralized version control system has its drawbacks. First and
foremost, in order to have the benefits of version control with
Python in a seamless fashion, one must be a "core developer" (i.e.
someone with commit privileges on the master copy of Python). People
who are not core developers but who wish to work with Python's
revision tree, e.g. anyone writing a patch for Python or creating a
custom version, do not have direct tool support for revisions. This
can be quite a limitation, since these non-core developers cannot
easily do basic tasks such as reverting changes to a previously
saved state, creating branches, publishing one's changes with full
revision history, etc. For non-core developers, the last safe tree
state is one the Python developers happen to set, and this prevents
safe development. This second-class citizenship is a hindrance to
people who wish to contribute to Python with a patch of any
complexity and want a way to incrementally save their progress to
make their development lives easier.

There is also the issue of having to be online to be able to commit
one's work. Because centralized VCSs keep a central copy that stores
all revisions, one must have Internet access in order for their
revisions to be stored; no Net, no commit. This can be annoying if
you happen to be traveling and lack any Internet. There is also the
situation of someone wishing to contribute to Python but having a
bad Internet connection where committing is time-consuming and
expensive and it might work out better to do it in a single step.

Another drawback to a centralized VCS is that a common use case is
for a developer to revise patches in response to review comments.
This is more difficult with a centralized model because there's no
place to contain intermediate work. It's either all checked in or
none of it is checked in. In the centralized VCS, it's also very
difficult to track changes to the trunk as they are committed, while
you're working on your feature or bug fix branch. This increases
the risk that such branches will grow stale or outdated, or that
merging them into the trunk will generate too many conflicts to be
easily resolved.

Lastly, there is the issue of maintenance of Python. At any one time
there is at least one major version of Python under development (at
the time of this writing there are two). For each major version of
Python under development there is at least the maintenance version
of the last minor version and the in-development minor version (e.g.
with 2.6 just released, that means that both 2.6 and 2.7 are being
worked on). Once a release is done, a branch is created between the
code bases where changes in one version do not (but could) belong in
the other version. As of right now there is no natural support for
this branch in time in central VCSs; you must use tools that
simulate the branching. Tracking merges is similarly painful for
developers, as revisions often need to be merged between four active
branches (e.g. 2.6 maintenance, 3.0 maintenance, 2.7 development,
3.1 development). In this case, VCSs such as Subversion only handle
this through arcane third-party tools.

Distributed VCSs (DVCSs) solve all of these problems. While one can
keep a master copy of a revision tree, anyone is free to copy that
tree for their own use. This gives everyone the power to commit
changes to their copy, online or offline. It also more naturally
ties into the idea of branching in the history of a revision tree
for maintenance and the development of new features bound for
Python. DVCSs also provide a great many additional features that
centralized VCSs don't or can't provide.

This PEP explores the possibility of changing Python's use of Subversion
to any of the currently popular DVCSs, in order to gain
the benefits outlined above. This PEP does not guarantee that a switch
to a DVCS will occur at the conclusion of this PEP. It is quite
possible that no clear winner will be found and that svn will continue
to be used. If this happens, this PEP will be revisited and revised in
the future as the state of DVCSs evolves.


Terminology
===========

Agreeing on a common terminology is surprisingly difficult,
primarily because each VCS uses these terms when describing subtly
different tasks, objects, and concepts. Where possible, we try to
provide a generic definition of the concepts, but you should consult
the individual systems' glossaries for details. The following are
some basic web-based references for the terminology used by each
VCS:

* Subversion : http://svnbook.red-bean.com/en/1.5/svn.basic.html
* Bazaar : http://bazaar-vcs.org/BzrGlossary
* Mercurial : http://www.selenic.com/mercurial/wiki/index.cgi/UnderstandingMercurial
* git : http://book.git-scm.com/1_the_git_object_model.html


branch
    A line of development; a collection of revisions, ordered by
    time.

checkout/working copy/working tree
    A tree of code the developer can edit, linked to a branch.

index
    A "staging area" where a revision is built (unique to git).

repository
    A collection of revisions, organized into branches.

clone
    A complete copy of a branch or repository.

commit
    To record a revision in a repository.

merge
    Applying all the changes and history from one branch/repository
    to another.

pull
    To update a checkout/clone from the original branch/repository,
    which can be remote or local.

push/publish
    To copy a revision, and all revisions it depends on, from one
    repository to another.

cherry-pick
    To merge one or more specific revisions from one branch to
    another, possibly in a different repository, possibly without its
    dependent revisions.

rebase
    To "detach" a branch, and move it to a new branch point; move
    commits to the beginning of a branch instead of where they
    happened in time.
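
To make the *cherry-pick* definition concrete, here is a minimal
sketch using git (chosen only for illustration; the other contenders
have their own equivalents). The repository, file names, and identity
are hypothetical. Only the newest revision of a ``feature`` branch is
copied onto ``master``, without the revision that preceded it.

```shell
set -eu
work="$(mktemp -d)"                      # throwaway scratch directory
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master  # pin the branch name for this sketch
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base revision"

# A feature branch holding two independent revisions.
git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "Add one.txt"
echo two > two.txt
git add two.txt
git commit -q -m "Add two.txt"

# Copy only the tip of "feature" onto master; "Add one.txt" stays behind.
git checkout -q master
git cherry-pick feature >/dev/null
```

A merge of ``feature`` would instead have brought both revisions and
their history along.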


Typical Workflow
================

At the moment, the typical workflow for a Python core developer is:


* Edit code in a checkout until it is stable enough to commit/push.
* Commit to the master repository.

It is a rather simple workflow, but it has drawbacks. For one,
because any work that involves the repository is slowed down by the
network, commits/pushes tend to be less atomic than they could be.
There is also not necessarily a cheap way to create new checkouts
beyond a recursive copy of the checkout directory.

A DVCS would lead to a workflow more like this:

* Branch off of a local clone of the master repository.
* Edit code, committing in atomic pieces.
* Merge the branch into the mainline, and
* Push all commits to the master repository.

While there are more possible steps, the workflow is much more
independent of the master repository than is currently possible. By
being able to commit locally at the speed of your disk, a core
developer is able to do atomic commits much more frequently,
minimizing having commits that do multiple things to the code. Also
by using a branch, the changes are isolated (if desired) from other
changes being made by other developers. Because branches are cheap,
it is easy to create and maintain many smaller branches that address
one specific issue, e.g. one bug or one new feature. More
sophisticated features of DVCSs allow the developer to more easily
track long running development branches as the official mainline
progresses.
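
The four DVCS steps above can be sketched as follows. This is only an
illustration under stated assumptions: it uses git (this PEP does not
presume a winner), and the "master repository" is a hypothetical bare
repository on local disk so that the sketch can run offline.

```shell
set -eu
work="$(mktemp -d)"                       # throwaway scratch directory

# Build a tiny stand-in for the master repository (hypothetical).
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this sketch
git config user.name "Core Dev"
git config user.email "core@example.com"
echo "print('hello')" > hello.py
git add hello.py
git commit -q -m "Initial import"
git clone -q --bare "$work/seed" "$work/master.git"

# 1. Branch off of a local clone of the master repository.
git clone -q "$work/master.git" "$work/clone"
cd "$work/clone"
git config user.name "Core Dev"
git config user.email "core@example.com"
git checkout -q -b issue0000

# 2. Edit code, committing in atomic pieces (local, no network needed).
echo "print('fix one')" >> hello.py
git commit -q -a -m "Fix one thing"
echo "print('fix two')" >> hello.py
git commit -q -a -m "Fix another thing"

# 3. Merge the branch into the mainline (a fast-forward here).
git checkout -q master
git merge -q issue0000

# 4. Push all commits to the master repository.
git push -q origin master
```

Only the final push touches the master repository; everything before
it runs at the speed of the local disk.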


Contenders
==========

========== ========== ======= =================================== ==========================================
Name       Short Name Version 2.x Trunk Mirror                    3.x Trunk Mirror
========== ========== ======= =================================== ==========================================
Bazaar_    bzr        1.12    http://code.python.org/python/trunk http://code.python.org/python/3.0
Mercurial_ hg         1.2.0   http://code.python.org/hg/trunk/    http://code.python.org/hg/branches/py3k/
git_       N/A        1.6.1   git://code.python.org/python/trunk  git://code.python.org/python/branches/py3k
========== ========== ======= =================================== ==========================================

.. _Bazaar: http://bazaar-vcs.org/
.. _Mercurial: http://www.selenic.com/mercurial/
.. _git: http://www.git-scm.com/

This PEP does not consider darcs, arch, or monotone. The main
problem with these DVCSs is that they are simply not popular enough
to bother supporting when they do not provide some very compelling
features that the other DVCSs provide. Arch and darcs also have
significant performance problems which seem unlikely to be addressed
in the near future.


Interoperability
================

For those who have already decided which DVCSs they want to use, and
are willing to maintain local mirrors themselves, all three DVCSs
support interchange via the git "fast-import" changeset format.  git
does so natively, of course, and native support for Bazaar is under
active development, and getting good early reviews as of mid-February
2009.  Mercurial has idiosyncratic support for importing via its *hg
convert* command, and `third-party fast-import support`_ is available
for exporting.  Also, the Tailor_ tool supports automatic maintenance
of mirrors based on an official repository in any of the candidate
formats with a local mirror in any format.

.. _third-party fast-import support: http://repo.or.cz/r/fast-export.git/.git/description
.. _Tailor: http://progetti.arstecnica.it/tailor/
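
As a rough illustration of the "fast-import" interchange format, the
sketch below serializes a small git history to a stream and replays it
into a second, empty repository. Both ends are git here, which is an
assumption made so the example is self-contained; feeding the stream
to Bazaar or Mercurial relies on the tools mentioned above. All paths
and names are hypothetical.

```shell
set -eu
work="$(mktemp -d)"                       # throwaway scratch directory
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this sketch
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
echo "spam" > README
git add README
git commit -q -m "Initial revision"

# Serialize the whole history as a fast-import stream (plain text).
git fast-export --all > "$work/history.fi"

# Replay the stream into a fresh, empty repository.
git init -q "$work/dst"
git -C "$work/dst" fast-import --quiet < "$work/history.fi"
```

The import only creates objects and branch references; it does not
populate a working tree in the destination.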


Usage Scenarios
===============

Probably the best way to help decide on whether/which DVCS should
replace Subversion is to see what it takes to perform some
real-world usage scenarios that developers (core and non-core) have
to work with. Each usage scenario outlines what it is, a bullet list
of what the basic steps are (which can vary slightly per VCS), and
how to perform the usage scenario in the various VCSs
(including Subversion).

Each VCS had a single author in charge of writing implementations
for each scenario (unless otherwise noted).

========= ===
Name      VCS
========= ===
Brett     svn
Barry     bzr
Alexandre hg
Stephen   git
========= ===


Initial Setup
-------------

Some DVCSs have some perks if you do some initial setup upfront.
This section covers what can be done before any of the usage
scenarios are run in order to take better advantage of the tools.

All of the DVCSs support configuring your project identification.
Unlike the centralized systems, they use your email address to
identify your commits. (Access control is generally done by
mechanisms external to the DVCS, such as ssh or console login).
This identity may be associated with a full name.

All of the DVCSs will query the system to get some approximation to
this information, but that may not be what you want. They also
support setting this information on a per-user basis, and on a
per-project basis. Convenience commands to set these attributes vary,
but all allow direct editing of configuration files.

Some VCSs support end-of-line (EOL) conversions on checkout/checkin.


svn
'''

None required, but it is recommended you follow the
`guidelines <http://www.python.org/dev/faq/#what-configuration-settings-should-i-use>`_
in the dev FAQ.


bzr
'''

No setup is required, but for much quicker and space-efficient local
branching, you should create a shared repository to hold all your
Python branches. A shared repository is really just a parent
directory containing a .bzr directory. When bzr commits a revision,
it searches from the local directory on up the file system for a .bzr
directory to hold the revision. By sharing revisions across multiple
branches, you cut down on the amount of disk space used. Do this::

  cd ~/projects
  bzr init-repo python
  cd python

Now, all your Python branches should be created inside of
``~/projects/python``.

There are also some settings you can put in your
``~/.bazaar/bazaar.conf`` and ``~/.bazaar/locations.conf`` files to set
up defaults for interacting
with Python code. None of them are required, although some are
recommended. E.g. I would suggest gpg signing all commits, but that
might be too high a barrier for developers. Also, you can set up
default push locations depending on where you want to push branches
by default. If you have write access to the master branches, that
push location could be code.python.org. Otherwise, it might be a
free Bazaar code hosting service such as Launchpad. If Bazaar is
chosen, we should decide what the policies and recommendations are.

At a minimum, I would set up your email address::

  bzr whoami "Firstname Lastname <email.address@example.com>"

As with hg and git below, there are ways to set your email address (or
really, just about any parameter) on a per-repository basis. You do
this with settings in your ``$HOME/.bazaar/locations.conf`` file, which
has an ini-style format, as do the configuration files of the other
DVCSs. See the Bazaar documentation for details, which mostly aren't
relevant for this discussion.


hg
''

Minimally, you should set your user name. To do so, create the file
``.hgrc`` in your home directory and add the following::

  [ui]
  username = Firstname Lastname <email.address@example.com>

If you are using Windows and your tools do not support Unix-style newlines,
you can enable automatic newline translation by adding to your configuration::

  [extensions]
  win32text =

These options can also be set locally to a given repository by
customizing ``<repo>/.hg/hgrc``, instead of ``~/.hgrc``.


git
'''

None needed. However, git supports a number of features that can
smooth your work, with a little preparation. git supports setting
defaults at the workspace, user, and system levels. The system
level is out of scope of this PEP. The user configuration file is
``$HOME/.gitconfig`` on Unix-like systems, and the workspace
configuration file is ``$REPOSITORY/.git/config``.

You can use the ``git-config`` tool to set preferences for user.name and
user.email either globally (for your system login account) or
locally (to a given git working copy), or you can edit the
configuration files (which have the same format as shown in the
Mercurial section above).::

  # my full name doesn't change
  # note "--global" flag means per user
  # (system-wide configuration is set with "--system")
  git config --global user.name 'Firstname Lastname'
  # but use my Pythonic email address
  cd /path/to/python/repository
  git config user.email email.address@python.example.com

If you are using Windows, you probably want to set the core.autocrlf
and core.safecrlf preferences to true using ``git-config``.::

  # check out files with CRLF line endings rather than Unix-style LF only
  git config --global core.autocrlf true
  # scream if a transformation would be ambiguous
  # (e.g., a working file contains both naked LF and CRLF)
  # and check them back in with the reverse transformation
  git config --global core.safecrlf true

Although the repository will usually contain a .gitignore file
specifying file names that rarely if ever should be registered in the
VCS, you may have personal conventions (e.g., always editing log
messages in a temporary file named ".msg") that you may wish to
specify.::

  # tell git where my personal ignores are
  git config --global core.excludesfile ~/.gitignore
  # I use .msg for my long commit logs, and Emacs makes backups in
  # files ending with ~
  # these are globs, not regular expressions
  echo '*~' >> ~/.gitignore
  echo '.msg' >> ~/.gitignore

If you use multiple branches, as with the other VCSes, you can save a
lot of space by putting all objects in a common object store. This
also can save download time, if the origins of the branches were in
different repositories, because objects are shared across branches in
your repository even if they were not present in the upstream
repositories.  git is very space- and time-efficient and applies a
number of optimizations automatically, so this configuration is
optional.  (Examples are omitted.)
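
For readers who want a starting point anyway, one possible approach
(an assumption on our part, not a recommendation of this PEP) is a
``--shared`` clone, which borrows objects from an existing local
repository instead of copying them. The directory names are
hypothetical.

```shell
set -eu
work="$(mktemp -d)"                       # throwaway scratch directory
git init -q "$work/trunk"
cd "$work/trunk"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this sketch
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
echo "spam" > README
git add README
git commit -q -m "Initial revision"

# Second working copy that reuses trunk's object store via "alternates".
git clone -q --shared "$work/trunk" "$work/issue0000"

# The borrowed object directory is recorded in this file.
cat "$work/issue0000/.git/objects/info/alternates"
```

A shared clone depends on the source repository keeping its objects,
so pruning the source can break the clone.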


One-Off Checkout
----------------

As a non-core developer, I want to create and publish a one-off patch
that fixes a bug, so that a core developer can review it for
inclusion in the mainline.

* Checkout/branch/clone trunk.
* Edit some code.
* Generate a patch (based on what is best supported by the VCS, e.g.
  branch history).
* Receive reviewer comments and address the issues.
* Generate a second patch for the core developer to commit.


svn
'''
::

  svn checkout http://svn.python.org/projects/python/trunk
  cd trunk
  # Edit some code.
  echo "The cake is a lie!" > README
  # Since svn lacks support for local commits, we fake it with patches.
  svn diff >> commit-1.diff
  svn diff >> patch-1.diff
  # Upload the patch-1 to bugs.python.org.
  # Receive reviewer comments.
  # Edit some code.
  echo "The cake is real!" > README
  # Since svn lacks support for local commits, we fake it with patches.
  svn diff >> commit-2.diff
  svn diff >> patch-2.diff
  # Upload patch-2 to bugs.python.org


bzr
'''
::

  bzr branch http://code.python.org/python/trunk
  cd trunk
  # Edit some code.
  bzr commit -m 'Stuff I did'
  bzr send -o bundle
  # Upload bundle to bugs.python.org
  # Receive reviewer comments
  # Edit some code
  bzr commit -m 'Respond to reviewer comments'
  bzr send -o bundle
  # Upload updated bundle to bugs.python.org

The ``bundle`` file is like a super-patch.  It can be read by ``patch(1)`` but
it contains additional metadata so that it can be fed to ``bzr merge`` to
produce a fully usable branch complete with history.  See the `Patch Review`_
section below.


hg
''
::

  hg clone http://code.python.org/hg/trunk
  cd trunk
  # Edit some code.
  hg commit -m "Stuff I did"
  hg outgoing -p > fixes.patch
  # Upload patch to bugs.python.org
  # Receive reviewer comments
  # Edit some code
  hg commit -m "Address reviewer comments."
  hg outgoing -p > additional-fixes.patch
  # Upload patch to bugs.python.org

While ``hg outgoing`` does not have the flag for it, most Mercurial
commands support git's extended patch format through a ``--git``
option. This can be set in one's ``.hgrc`` file so that all commands
that generate a patch use the extended format.
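
A minimal sketch of that setting follows. To avoid touching a real
configuration, the lines are appended to a scratch file that stands in
for ``~/.hgrc``; the scratch file is our assumption, while the
``[diff]`` section with ``git = True`` is the setting itself.

```shell
set -eu
# Scratch file standing in for ~/.hgrc (hypothetical location).
hgrc="$(mktemp)"

# Ask Mercurial to emit git's extended patch format by default.
cat >> "$hgrc" <<'EOF'
[diff]
git = True
EOF

cat "$hgrc"
```

In practice the two configuration lines would go into ``~/.hgrc`` or
into a repository's ``.hg/hgrc``.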


git
'''

The patches could be created with
``git diff master > stuff-i-did.patch``, too, but
``git format-patch | git am`` knows some tricks
(empty files, renames, etc.) that ordinary patch can't handle. git
grabs "Stuff I did" out of the commit message to create the file
name 0001-Stuff-I-did.patch. See Patch Review below for a
description of the git-format-patch format.
::

  # Get the mainline code.
  git clone git://code.python.org/python/trunk
  cd trunk
  # Edit some code.
  git commit -a -m 'Stuff I did.'
  # Create patch for my changes (i.e., relative to master).
  git format-patch master
  git tag stuff-v1
  # Upload 0001-Stuff-I-did.patch to bugs.python.org.
  # Time passes ... receive reviewer comments.
  # Edit more code.
  git commit -a -m 'Address reviewer comments.'
  # Make an add-on patch to apply on top of the original.
  git format-patch stuff-v1
  # Upload 0001-Address-reviewer-comments.patch to bugs.python.org.


Backing Out Changes
-------------------

As a core developer, I want to undo a change that was not ready for
inclusion in the mainline.

* Back out the unwanted change.
* Push patch to server.


svn
'''
::

  # Assume the change to revert is in revision 40
  svn merge -c -40 .
  # Resolve conflicts, if any.
  svn commit -m "Reverted revision 40"


bzr
'''
::

  # Assume the change to revert is in revision 40
  bzr merge -r 40..39
  # Resolve conflicts, if any.
  bzr commit -m "Reverted revision 40"

Note that if the change you want to revert is the last one that was
made, you can just use ``bzr uncommit``.


hg
''
::

  # Assume the change to revert is in revision 9150dd9c6d30
  hg backout --merge -r 9150dd9c6d30
  # Resolve conflicts, if any.
  hg commit -m "Reverted changeset 9150dd9c6d30"
  hg push

Note, you can use "hg rollback" and "hg strip" to revert changes you committed
in your local repository, but did not yet push to other repositories.

git
'''
::

  # Assume the change to revert is the grandfather of a revision tagged "newhotness".
  git revert newhotness~2
  # Resolve conflicts if any.  If there are no conflicts, the commit
  # will be done automatically by "git revert", which prompts for a log.
  git commit -m "Reverted changeset 9150dd9c6d30."
  git push
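
The effect of ``git revert`` can be tried out safely in a throwaway
repository. The sketch below is an illustration only: the repository,
file contents, and identity are hypothetical, and ``--no-edit`` is
used so that no editor is opened for the log message. It backs out
the middle of three commits and leaves the later commit untouched.

```shell
set -eu
work="$(mktemp -d)"                       # throwaway scratch directory
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this sketch
git config user.name "Core Dev"           # hypothetical identity
git config user.email "core@example.com"

echo "The cake is real!" > README
git add README
git commit -q -m "Good change"

echo "The cake is a lie!" > README
git commit -q -a -m "Unwanted change"

echo "notes" > NOTES
git add NOTES
git commit -q -m "Later, unrelated change"

# Back out the parent of HEAD by recording a new, inverse commit.
git revert --no-edit HEAD~1 >/dev/null
```

History is not rewritten: the unwanted commit stays in the log, and a
fourth commit undoes its effect.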


Patch Review
------------

As a core developer, I want to review patches submitted by other
people, so that I can make sure that only approved changes are added
to Python.

Core developers have to review patches as submitted by other people.
This requires applying the patch, testing it, and then tossing away
the changes. The assumption can be made that a core developer already
has a checkout/branch/clone of the trunk.

* Branch off of trunk.
* Apply patch w/o any comments as generated by the patch submitter.
* Push patch to server.
* Delete now-useless branch.


svn
'''

Subversion does not exactly fit into this development style very well
as there is no such thing as a "branch" as has been defined in this
PEP. Instead a developer either needs to create another checkout for
testing a patch or create a branch on the server. Up to this point,
core developers have not taken the "branch on the server" approach to
dealing with individual patches. For this scenario the assumption
will be the developer creates a local checkout of the trunk to work
with.::

    cp -r trunk issue0000
    cd issue0000
    patch -p0 < __patch__
    # Review patch.
    svn commit -m "Some patch."
    cd ..
    rm -r issue0000

Another option is to only have a single checkout running at any one
time and use ``svn diff`` along with ``svn revert -R`` to store away
independent changes you may have made.


bzr
'''
::

    bzr branch trunk issueNNNN
    cd issueNNNN
    # Download `patch` bundle from Roundup
    bzr merge patch
    # Review patch
    bzr commit -m'Patch NNNN by So N. So' --fixes python:NNNN
    bzr push bzr+ssh://me@code.python.org/trunk
    rm -rf ../issueNNNN

Alternatively, since you're probably going to commit these changes to
the trunk, you could just do a checkout. That would give you a local
working tree while the branch (i.e. all revisions) would continue to
live on the server. This is similar to the svn model and might allow
you to more quickly review the patch. There's no need for the push
in this case.::

    bzr checkout trunk issueNNNN
    cd issueNNNN
    # Download `patch` bundle from Roundup
    bzr merge patch
    # Review patch
    bzr commit -m'Patch NNNN by So N. So' --fixes python:NNNN
    rm -rf ../issueNNNN


hg
''
::

    hg clone trunk issue0000
    cd issue0000
    # If the patch was generated using hg export, the user name of the
    # submitter is automatically recorded. Otherwise,
    # use hg import --no-commit submitted.diff and commit with
    # hg commit -u "Firstname Lastname <email.address@example.com>"
    hg import submitted.diff
    # Review patch.
    hg push ssh://alexandre@code.python.org/hg/trunk/


git
'''
We assume a patch created by git-format-patch. This is a Unix mbox
file containing one or more patches, each formatted as an RFC 2822
message. git-am interprets each message as a commit as follows. The
author of the patch is taken from the From: header, the date from the
Date header. The commit log is created by concatenating the content
of the subject line, a blank line, and the message body up to the
start of the patch.::

    cd trunk
    # Create a branch in case we don't like the patch.
    # This checkout takes zero time, since the workspace is left in
    # the same state as the master branch.
    git checkout -b patch-review
    # Download patch from bugs.python.org to submitted.patch.
    git am < submitted.patch
    # Review and approve patch.
    # Merge into master and push.
    git checkout master
    git merge patch-review
    git push
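
As a rough illustration of the mapping described above (it is not how
git-am is implemented), the following Python sketch parses one RFC 2822
message with the standard ``email`` module and derives the author, date,
and commit log the same way. The sample message and the
``commit_fields`` helper are hypothetical. The sketch also assumes that
a leading ``[PATCH]`` tag is dropped from the subject and that a line
containing only ``---`` marks the start of the patch.

```python
from email import message_from_string

# Hypothetical patch message, shaped like git-format-patch output.
SAMPLE = """\
From: So N. So <so@example.com>
Date: Mon, 2 Feb 2009 10:00:00 +0000
Subject: [PATCH] Fix socket timeout handling

Make the timeout apply to connect() as well.
---
 Lib/socket.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
"""


def commit_fields(raw_message):
    """Return (author, date, log) for one patch message."""
    msg = message_from_string(raw_message)
    subject = msg["Subject"]
    # Assumption: drop a leading "[PATCH ...]" tag from the subject.
    if subject.startswith("[") and "]" in subject:
        subject = subject.split("]", 1)[1].strip()
    # Keep the message body only up to the "---" separator line.
    body_lines = []
    for line in msg.get_payload().splitlines():
        if line == "---":
            break
        body_lines.append(line)
    body = "\n".join(body_lines).strip()
    # Subject line, a blank line, then the message body.
    log = subject + "\n\n" + body
    return msg["From"], msg["Date"], log


author, date, log = commit_fields(SAMPLE)
print(author)
print(date)
print(log)
```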


Backport
--------

As a core developer, I want to apply a patch to 2.6, 2.7, 3.0, and 3.1
so that I can fix a problem in all four versions.

Thanks to always having the cutting-edge and the latest release
version under development, Python currently has four branches being
worked on simultaneously. That makes it important for a change to
propagate easily through various branches.

svn
'''

Because of Python's use of svnmerge, changes start with the trunk
(2.7) and then get merged to the release version of 2.6. To get the
change into the 3.x series, the change is merged into 3.1, fixed up,
and then merged into 3.0 (2.7 -> 2.6; 2.7 -> 3.1 -> 3.0).

This is in contrast to a port-forward strategy where the patch would
have been added to 2.6 and then pulled forward into newer versions
(2.6 -> 2.7 -> 3.0 -> 3.1).

::

    # Assume patch applied to 2.7 in revision 0000.
    cd release26-maint
    svnmerge merge -r 0000
    # Resolve merge conflicts and make sure patch works.
    svn commit -F svnmerge-commit-message.txt  # revision 0001.
    cd ../py3k
    svnmerge merge -r 0000
    # Same as for 2.6, except Misc/NEWS changes are reverted.
    svn revert Misc/NEWS
    svn commit -F svnmerge-commit-message.txt  # revision 0002.
    cd ../release30-maint
    svnmerge merge -r 0002
    svn commit -F svnmerge-commit-message.txt  # revision 0003.


bzr
'''

Bazaar is pretty straightforward here, since it supports cherry
picking revisions manually. In the example below, we could have
given a revision id instead of a revision number, but that's usually
not necessary. Martin Pool suggests "We'd generally recommend doing
the fix first in the oldest supported branch, and then merging it
forward to the later releases."::

    # Assume patch applied to 2.7 in revision 0000
    cd release26-maint
    bzr merge ../trunk -c 0000
    # Resolve conflicts and make sure patch works
    bzr commit -m 'Back port patch NNNN'
    bzr push bzr+ssh://me@code.python.org/release26-maint
    cd ../py3k
    bzr merge ../trunk -c 0000
    # Same as for 2.6 except Misc/NEWS changes are reverted
    bzr revert Misc/NEWS
    bzr commit -m 'Forward port patch NNNN'
    bzr push bzr+ssh://me@code.python.org/py3k


hg
''

Mercurial, like other DVCSs, has only weak support for the current
workflow used by Python core developers to backport patches. Right
now, bug fixes are first applied to the development mainline
(i.e., trunk), then back-ported to the maintenance branches and
forward-ported, as necessary, to the py3k branch. This workflow
requires the ability to cherry-pick individual changes. Mercurial's
transplant extension provides this ability. Here is an example of
the scenario using this workflow::

    cd release26-maint
    # Assume patch applied to 2.7 in revision 0000
    hg transplant -s ../trunk 0000
    # Resolve conflicts, if any.
    cd ../py3k
    hg pull ../trunk
    hg merge
    hg revert Misc/NEWS
    hg commit -m "Merged trunk"
    hg push

In the above example, transplant acts much like the current svnmerge
command. When transplant is invoked without the revision, the command
launches an interactive loop useful for transplanting multiple
changes. Another useful feature is the --filter option which can be
used to modify changesets programmatically (e.g., it could be used
for removing changes to Misc/NEWS automatically).
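
To make the Misc/NEWS idea concrete, here is a minimal Python sketch of
the text transformation such a filter could perform: it drops every
per-file section of a patch that touches a given path. The
``drop_file_from_patch`` helper and the sample patch are hypothetical,
and the sketch assumes each file's section starts with a line beginning
with ``diff``. Wiring it up as an actual ``--filter`` program is left
out.

```python
# Hypothetical exported patch touching two files.
SAMPLE_PATCH = """\
# HG changeset patch
# User So N. So <so@example.com>
Fix socket bug

diff -r 000000000000 -r 111111111111 Lib/socket.py
--- a/Lib/socket.py
+++ b/Lib/socket.py
@@ -1,1 +1,1 @@
-old
+new
diff -r 000000000000 -r 111111111111 Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -1,1 +1,2 @@
 News
+- Fixed socket bug.
"""


def drop_file_from_patch(patch_text, path):
    """Return patch_text without the per-file sections touching path."""
    # sections[0] holds the patch header that precedes the first file.
    sections = [[]]
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff "):
            sections.append([])
        sections[-1].append(line)
    targets = ("--- a/" + path, "+++ b/" + path)
    kept = []
    for section in sections:
        # A file header may carry a tab-separated timestamp; ignore it.
        touches_path = any(
            line.rstrip("\n").split("\t")[0] in targets for line in section
        )
        if not touches_path:
            kept.extend(section)
    return "".join(kept)


filtered = drop_file_from_patch(SAMPLE_PATCH, "Misc/NEWS")
print(filtered)
```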

As an alternative to the traditional workflow, we could avoid
transplanting changesets by committing bug fixes to the oldest
supported release and then merging these fixes upward to the more
recent branches.
::

    cd release25-maint
    hg import fix_some_bug.diff
    # Review patch and run test suite. Revert if failure.
    hg push
    cd ../release26-maint
    hg pull ../release25-maint
    hg merge
    # Resolve conflicts, if any. Then, review patch and run test suite.
    hg commit -m "Merged patches from release25-maint."
    hg push
    cd ../trunk
    hg pull ../release26-maint
    hg merge
    # Resolve conflicts, if any, then review.
    hg commit -m "Merged patches from release26-maint."
    hg push

Although this approach makes the history non-linear and slightly
more difficult to follow, it encourages fixing bugs across all
supported releases. Furthermore, it scales better when there are many
changes to backport, because we do not need to seek the specific
revision IDs to merge.


git
'''

In git I would have a workspace which contains all of
the relevant master repository branches. git cherry-pick doesn't
work across repositories; you need to have the branches in the same
repository.
::

    # Assume patch applied to 2.7 in revision release27~3 (4th patch back from tip).
    cd integration
    git checkout release26
    git cherry-pick release27~3
    # If there are conflicts, resolve them, and commit those changes.
    # git commit -a -m "Resolve conflicts."
    # Run test suite. If fixes are necessary, record as a separate commit.
    # git commit -a -m "Fix code causing test failures."
    git checkout master
    git cherry-pick release27~3
    # Do any conflict resolution and test failure fixups.
    # Revert Misc/NEWS changes.
    git checkout HEAD^ -- Misc/NEWS
    git commit -m 'Revert cherry-picked Misc/NEWS changes.' Misc/NEWS
    # Push both ports (assuming the shared repository is the remote "origin").
    git push origin release26 master

If you are regularly merging (rather than cherry-picking) from a
given branch, then you can block a given commit from being
accidentally merged in the future by merging, then reverting it.
This does not prevent a cherry-pick from pulling in the unwanted
patch, and this technique requires blocking everything that you don't
want merged. I'm not sure if this differs from svn on this point.
::

    cd trunk
    # Merge in the alpha tested code.
    git merge experimental-branch
    # We don't want the 3rd-to-last commit from the experimental-branch,
    # and we don't want it to ever be merged.
    # The notation "^N" means Nth parent of the current commit. Thus HEAD^2^1^1
    # means the first parent of the first parent of the second parent of HEAD.
    git revert HEAD^2^1^1
    # Propagate the merge and the prohibition to the public repository.
    git push
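
The parent notation used in the comments above can be illustrated with
a small Python sketch that walks a toy history. The commit names, the
``PARENTS`` graph, and the ``resolve`` helper are all hypothetical; the
sketch only models the ``^N`` (Nth parent) and ``~N`` (N steps along
first parents) suffixes and is not how git resolves revisions
internally.

```python
import re

# Hypothetical history: each commit maps to its list of parents.
# "M" merges trunk commit "T2" with the experimental-branch tip "E3".
PARENTS = {
    "M": ["T2", "E3"],
    "T2": ["T1"],
    "T1": [],
    "E3": ["E2"],
    "E2": ["E1"],
    "E1": ["T1"],
}


def resolve(rev, parents, head):
    """Resolve expressions such as HEAD^2^1^1 or HEAD~2 to a commit name."""
    match = re.fullmatch(r"(\w+)((?:[\^~]\d*)*)", rev)
    if match is None:
        raise ValueError("unsupported revision expression: %r" % rev)
    name, suffix = match.groups()
    commit = head if name == "HEAD" else name
    for op, digits in re.findall(r"([\^~])(\d*)", suffix):
        count = int(digits) if digits else 1
        if op == "^":
            # ^N selects the Nth parent (1-based); ^0 is the commit itself.
            if count:
                commit = parents[commit][count - 1]
        else:
            # ~N follows the first parent N times.
            for _ in range(count):
                commit = parents[commit][0]
    return commit


# With HEAD at the merge commit, this names the 3rd-to-last commit
# of the experimental branch.
print(resolve("HEAD^2^1^1", PARENTS, "M"))
```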


Coordinated Development of a New Feature
----------------------------------------

Sometimes core developers end up working on a major feature with
several developers. As a core developer, I want to be able to
publish feature branches to a common public location so that I can
collaborate with other developers.

This requires creating a branch on a server that other developers
can access. All of the DVCSs support creating new repositories on
hosts where the developer is already able to commit, with
appropriate configuration of the repository host. This is
similar in concept to the existing sandbox in svn, although details
of repository initialization may differ.

For non-core developers, there are various more-or-less public-access
repository-hosting services. Bazaar has Launchpad_, Mercurial has
`bitbucket.org`_, and git has GitHub_. All also have easy-to-use CGI
interfaces for developers who maintain their own servers.


.. _Launchpad: http://www.launchpad.net/
.. _bitbucket.org: http://www.bitbucket.org/
.. _GitHub: http://www.github.com/

* Branch trunk.
* Pull from branch on the server.
* Pull from trunk.
* Push merge to trunk.


svn
'''
::

    # Create branch.
    svn copy svn+ssh://pythondev@svn.python.org/python/trunk svn+ssh://pythondev@svn.python.org/python/branches/NewHotness
    svn checkout svn+ssh://pythondev@svn.python.org/python/branches/NewHotness
    cd NewHotness
    svnmerge init
    svn commit -m "Initialize svnmerge."
    # Pull in changes from other developers.
    svn update
    # Pull in trunk and merge to the branch.
    svnmerge merge
    svn commit -F svnmerge-commit-message.txt


This scenario is incomplete as the decision for what DVCS to go with
was made before the work was complete.


Separation of Issue Dependencies
--------------------------------

Sometimes, while working on an issue, it becomes apparent that the
problem being worked on is actually a compound issue of various
smaller issues. Being able to take the current work and then begin
working on a separate issue is very helpful to separate out issues
into individual units of work instead of compounding them into a
single, large unit.

* Create a branch A (e.g. urllib has a bug).
* Edit some code.
* Create a new branch B that branch A depends on (e.g. the urllib
  bug exposes a socket bug).
* Edit some code in branch B.
* Commit branch B.
* Edit some code in branch A.
* Commit branch A.
* Clean up.


svn
'''

To make up for svn's lack of cheap branching, it has a changelist
option to associate a file with a single changelist. This is not as
powerful as being able to associate at the commit level. There is
also no way to express dependencies between changelists.
::

    cp -r trunk issue0000
    cd issue0000
    # Edit some code.
    echo "The cake is a lie!" > README
    svn changelist A README
    # Edit some other code.
    echo "I own Python!" > LICENSE
    svn changelist B LICENSE
    svn ci -m "Tell it how it is." --changelist B
    # Edit changelist A some more.
    svn ci -m "Speak the truth." --changelist A
    cd ..
    rm -rf issue0000


bzr
'''
Here's an approach that uses bzr shelf (now a standard part of bzr)
to squirrel away some changes temporarily while you take a detour to
fix the socket bugs.
::

    bzr branch trunk bug-0000
    cd bug-0000
    # Edit some code. Dang, we need to fix the socket module.
    bzr shelve --all
    # Edit some code.
    bzr commit -m "Socket module fixes"
    # Detour over, now resume fixing urllib
    bzr unshelve
    # Edit some code

Another approach uses the loom plugin. Looms can
greatly simplify working on dependent branches because they
automatically take care of the stacking dependencies for you.
Imagine looms as a stack of dependent branches (called "threads" in
loom parlance), with easy ways to move up and down the stack of
threads, merge changes up the stack to descendant threads, create
diffs between threads, etc. Occasionally, you may need or want to
export your loom threads into separate branches, either for review
or commit. Higher threads incorporate all the changes in the lower
threads, automatically.
::

    bzr branch trunk bug-0000
    cd bug-0000
    bzr loomify --base trunk
    bzr create-thread fix-urllib
    # Edit some code. Dang, we need to fix the socket module first.
    bzr commit -m "Checkpointing my work so far"
    bzr down-thread
    bzr create-thread fix-socket
    # Edit some code
    bzr commit -m "Socket module fixes"
    bzr up-thread
    # Manually resolve conflicts if necessary
    bzr commit -m 'Merge in socket fixes'
    # Edit me some more code
    bzr commit -m "Now that socket is fixed, complete the urllib fixes"
    bzr record done

For bonus points, let's say someone else fixes the socket module in
exactly the same way you just did. Perhaps this person even grabbed your
fix-socket thread and applied just that to the trunk. You'd like to
be able to merge their changes into your loom and delete your
now-redundant fix-socket thread.
::

    bzr down-thread trunk
    # Get all new revisions to the trunk. If you've done things
    # correctly, this will succeed without conflict.
    bzr pull
    bzr up-thread
    # See? The fix-socket thread is now identical to the trunk
    bzr commit -m 'Merge in trunk changes'
    bzr diff -r thread: | wc -l # returns 0
    bzr combine-thread
    bzr up-thread
    # Resolve any conflicts
    bzr commit -m 'Merge trunk'
    # Now our top-thread has an up-to-date trunk and just the urllib fix.


hg
''

One approach is to use the shelve extension; this extension is not included
with Mercurial, but it is easy to install. With shelve, you can select changes
to put temporarily aside.
::

    hg clone trunk issue0000
    cd issue0000
    # Edit some code (e.g. urllib).
    hg shelve
    # Select changes to put aside
    # Edit some other code (e.g. socket).
    hg commit
    hg unshelve
    # Complete initial fix.
    hg commit
    cd ../trunk
    hg pull ../issue0000
    hg merge
    hg commit
    rm -rf ../issue0000

There are several other ways to approach this scenario with Mercurial.
Alexander Solovyov presented a few `alternative approaches`_ on
Mercurial's mailing list.

.. _alternative approaches: http://selenic.com/pipermail/mercurial/2009-January/023710.html

git
'''
::

    cd trunk
    # Edit some code in urllib.
    # Discover a bug in socket, want to fix that first.
    # So save away our current work.
    git stash
    # Edit some code, commit some changes.
    git commit -a -m "Completed fix of socket."
    # Restore the in-progress work on urllib.
    git stash apply
    # Edit me some more code, commit some more fixes.
    git commit -a -m "Complete urllib fixes."
    # And push both patches to the public repository.
    git push

Bonus points: suppose you took your time, and someone else fixed
socket in the same way you just did and landed that in the trunk.  In
that case, your push will fail because your branch is not up-to-date.
If the fix was a one-liner, there's a very good chance that it's
*exactly* the same, character for character.  git would notice that
when you pull, and you are done; git will silently merge them.

Suppose we're not so lucky::

    # Update your branch.
    git pull git://code.python.org/public/trunk master

    # git has fetched all the necessary data, but reports that the
    # merge failed.  We discover the nearly-duplicated patch.
    # Neither our version of the master branch nor the workspace has
    # been touched.  Revert our socket patch and pull again:
    git revert HEAD^
    git pull git://code.python.org/public/trunk master

Like Bazaar and Mercurial, git has extensions to manage stacks of
patches.  You can use the original Quilt by Andrew Morton, or there is
StGit ("stacked git") which integrates patch-tracking for large sets
of patches into the VCS in a way similar to Mercurial Queues or Bazaar
looms.


Doing a Python Release
----------------------

How does PEP 101 change when using a DVCS?


bzr
'''

It will change, but not substantially so. When doing the
maintenance branch, we'll just push to the new location instead of
doing an svn cp. Tags are totally different, since in svn they are
directory copies, but in bzr (and I'm guessing hg), they are just
symbolic names for revisions on a particular branch. The release.py
script will have to change to use bzr commands instead. It's
possible that, because DVCSs (in particular, bzr) do cherry picking
and merging well enough, we'll be able to create the maint
branches sooner. It would be a useful exercise to try to do a
release off the bzr/hg mirrors.


hg
''

Clearly, details specific to Subversion in PEP 101 and in the
release script will need to be updated. In particular, the release
tagging and maintenance branch creation processes will have to be
modified to use Mercurial's features; this will simplify and
streamline certain aspects of the release process. For example,
tagging and re-tagging a release will become a trivial operation
since a tag, in Mercurial, is simply a symbolic name for a given
revision.


git
'''

It will change, but not substantially so. When doing the
maintenance branch, we'll just git push to the new location instead
of doing an svn cp. Tags are totally different, since in svn they
are directory copies, but in git they are just symbolic names for
revisions, as are branches. (The difference between a tag and a
branch is that tags refer to a particular commit, and will never
change unless you use git tag -f to force them to move. The
checked-out branch, on the other hand, is automatically updated by
git commit.) The release.py script will have to change to use git
commands instead. With git I would create a (local) maintenance
branch as soon as the release engineer is chosen. Then I'd "git
pull" until I didn't like a patch, when it would be "git pull; git
revert ugly-patch", until it started to look like the sensible thing
is to fork off, and start doing "git cherry-pick" on the good
patches.
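
The distinction between tags and branches drawn above can be sketched
with a toy model in Python. The ``Repo`` class is purely illustrative
(it is not an API of git or of any other tool): both kinds of ref are
names pointing at a commit, but only the checked-out branch advances on
commit, and a tag moves only when forced.

```python
class Repo:
    """Toy repository: refs are just names pointing at commit ids."""

    def __init__(self):
        self.branches = {"master": None}
        self.tags = {}
        self.current = "master"
        self._next_id = 0

    def commit(self):
        """Create a commit and advance the checked-out branch to it."""
        self._next_id += 1
        commit_id = "c%d" % self._next_id
        self.branches[self.current] = commit_id
        return commit_id

    def tag(self, name, force=False):
        """Name the current commit; refuse to move a tag unless forced."""
        if name in self.tags and not force:
            raise ValueError("tag %r already exists" % name)
        self.tags[name] = self.branches[self.current]


repo = Repo()
repo.commit()
repo.tag("v1.0")
repo.commit()
# The tag still names the first commit; the branch has moved on.
print(repo.tags["v1.0"], repo.branches["master"])
```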


Platform/Tool Support
=====================

Operating Systems
-----------------
==== ======================================= ============================================= =============================
DVCS Windows                                 OS X                                          UNIX
==== ======================================= ============================================= =============================
bzr  yes (installer) w/ tortoise             yes (installer, fink or MacPorts)             yes (various package formats)
hg   yes (third-party installer) w/ tortoise yes (third-party installer, fink or MacPorts) yes (various package formats)
git  yes (third-party installer)             yes (third-party installer, fink or MacPorts) yes (.deb or .rpm)
==== ======================================= ============================================= =============================

As the above table shows, all three DVCSs are available on all three
major OS platforms. But what it also shows is that Bazaar is the
only DVCS that directly supports Windows with a binary installer
while Mercurial and git require you to rely on a third party for
binaries. Both bzr and hg have a tortoise version while git does not.

Bazaar and Mercurial also have the benefit of being available in pure
Python with optional extensions available for performance.


CRLF -> LF Support
------------------

bzr
    My understanding is that support for this is being worked on as
    I type, landing in a version RSN. I will try to dig up details.

hg
    Supported via the win32text extension.

git
    I can't say from personal experience, but it looks like there's
    pretty good support via the core.autocrlf and core.safecrlf
    configuration attributes.
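
For readers unfamiliar with the issue, the conversion in question
amounts to the following, shown as a minimal Python sketch. The helper
name is hypothetical, and none of the three tools is claimed to
implement it this way; real support also has to decide which files are
text and when to convert them.

```python
def crlf_to_lf(data):
    """Convert Windows (CRLF) line endings in bytes to Unix (LF) ones."""
    return data.replace(b"\r\n", b"\n")


windows_text = b"first line\r\nsecond line\r\n"
unix_text = crlf_to_lf(windows_text)
print(unix_text)
```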


Case-insensitive filesystem support
-----------------------------------

bzr
    Should be OK. I share branches between Linux and OS X all the
    time. I've done case changes (e.g. ``bzr mv Mailman mailman``) and
    as long as I did it on Linux (obviously), when I pulled in the
    changes on OS X everything was hunky dory.

hg
    Mercurial uses a case safe repository mechanism and detects case
    folding collisions.

git
    Since OS X preserves case, you can do case changes there too.
    git does not have a problem with renames in either direction.
    However, case-insensitive filesystem support is usually taken
    to mean complaining about collisions on case-sensitive file
    systems. git does not do that.
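
A case-folding collision check of the kind mentioned above can be
sketched in a few lines of Python. This is only an illustration under
the assumption that two paths collide when they are equal after case
folding; the ``case_collisions`` helper is hypothetical and does not
reflect how Mercurial detects collisions.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of paths that differ only by letter case."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


tracked = ["Mailman/README", "mailman/README", "setup.py"]
print(case_collisions(tracked))
```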


Tools
-----

In terms of code review tools such as `Review Board`_ and Rietveld_,
the former supports all three while the latter supports hg and git but
not bzr. Bazaar does not yet have an online review board, but it
has several ways to manage email based reviews and trunk merging.
There's `Bundle Buggy`_, `Patch Queue Manager`_ (PQM), and
`Launchpad's code reviews <https://launchpad.net/+tour/code-review>`_.

.. _Review Board: http://www.review-board.org/
.. _Rietveld: http://code.google.com/p/rietveld/

.. _Bundle Buggy: http://code.aaronbentley.com/bundlebuggy/
.. _Patch Queue Manager: http://bazaar-vcs.org/PatchQueueManager

All three have some web site online that provides basic hosting
support for people who want to put a repository online. Bazaar has
Launchpad, Mercurial has bitbucket.org, and git has GitHub. Google
Code also has instructions on how to use git with the service, both
to hold a repository and how to act as a read-only mirror.

All three also `appear to be supported
<http://buildbot.net/repos/release/docs/buildbot.html#How-Different-VC-Systems-Specify-Sources>`_
by Buildbot_.

.. _Buildbot: http://buildbot.net


Usage On Top Of Subversion
==========================

==== ============
DVCS svn support
==== ============
bzr  bzr-svn_ (third-party)
hg   `multiple third-parties <http://www.selenic.com/mercurial/wiki/index.cgi/WorkingWithSubversion>`__
git  git-svn_
==== ============

.. _bzr-svn: http://bazaar-vcs.org/BzrForeignBranches/Subversion
.. _git-svn: http://www.kernel.org/pub/software/scm/git/docs/git-svn.html

All three DVCSs have svn support, although git is the only one to
come with that support out-of-the-box.


Server Support
==============

==== ==================
DVCS Web page interface
==== ==================
bzr  loggerhead_
hg   hgweb_
git  gitweb_
==== ==================

.. _loggerhead: https://launchpad.net/loggerhead
.. _hgweb: http://www.selenic.com/mercurial/wiki/index.cgi/HgWebDirStepByStep
.. _gitweb: http://git.or.cz/gitwiki/Gitweb

All three DVCSs support various hooks on the client and server side
for e.g. pre/post-commit verifications.


Development
===========

All three projects are under active development. Git seems to be on a
monthly release schedule. Bazaar is on a time-released monthly
schedule. Mercurial is on a 4-month, timed release schedule.


Special Features
================

bzr
---

Martin Pool adds: "bzr has a stable Python scripting interface, with
a distinction between public and private interfaces and a
deprecation window for APIs that are changing. Some plugins are
listed in https://edge.launchpad.net/bazaar and
http://bazaar-vcs.org/Documentation".


hg
--

Alexander Solovyov comments:

   Mercurial has easy to use extensive API with hooks for main events
   and ability to extend commands. Also there is the mq (mercurial
   queues) extension, distributed with Mercurial, which simplifies
   work with patches.


git
---

git has a cvsserver mode, i.e., you can check out a tree from git
using CVS. You can even commit to the tree, but features like
merging are absent, and branches are handled as CVS modules, which
is likely to shock a veteran CVS user.


Tests/Impressions
=================

As I (Brett Cannon) am left with the task of making the final
decision of which/any DVCS to go with and not my co-authors, I felt
it only fair to write down what tests I ran and my impressions as I
evaluate the various tools so as to be as transparent as possible.


Barrier to Entry
----------------

The amount of time and effort it takes to get a checkout of Python's
repository is critical. If the difficulty or time is too great then a
person wishing to contribute to Python may very well give up. That
cannot be allowed to happen.

I measured the checking out of the 2.x trunk as if I were a non-core
developer. Timings were done using the ``time`` command in zsh and
space was calculated with ``du -c -h``.

======= ================ ========= =====
DVCS    San Francisco    Vancouver Space
======= ================ ========= =====
svn        1:04           2:59     139 M
bzr       10:45          16:04     276 M
hg         2:30           5:24     171 M
git        2:54           5:28     134 M
======= ================ ========= =====

When comparing these numbers to svn, it is important to realize that
it is not a 1:1 comparison. Svn does not pull down the entire revision
history like all of the DVCSs do. That means svn can perform an
initial checkout much faster than the DVCSs purely based on the fact
that it has less information to download from the network.
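
To put the table in relative terms, the following Python sketch
converts the timings to seconds and expresses each tool's checkout time
as a multiple of svn's. The figures are copied from the table above;
the helper names are hypothetical, and the ratios inherit the caveat
that this is not a 1:1 comparison.

```python
# (San Francisco, Vancouver) checkout times from the table, as m:ss.
CHECKOUT_TIMES = {
    "svn": ("1:04", "2:59"),
    "bzr": ("10:45", "16:04"),
    "hg": ("2:30", "5:24"),
    "git": ("2:54", "5:28"),
}


def to_seconds(clock):
    """Convert an "m:ss" string to a number of seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_vs_svn(times):
    """Return each tool's times as multiples of svn's, per location."""
    base = [to_seconds(clock) for clock in times["svn"]]
    return {
        tool: tuple(
            round(to_seconds(clock) / svn_seconds, 1)
            for clock, svn_seconds in zip(clocks, base)
        )
        for tool, clocks in times.items()
    }


ratios = slowdown_vs_svn(CHECKOUT_TIMES)
for tool, (san_francisco, vancouver) in ratios.items():
    print(tool, san_francisco, vancouver)
```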


Performance of basic information functionality
----------------------------------------------

To see how the tools did for performing a command that required
querying the history, the log for the ``README`` file was timed.

====  =====
DVCS  Time
====  =====
bzr   4.5 s
hg    1.1 s
git   1.5 s
====  =====

One thing of note during this test was that, with git, it took me
longer than with the other tools to figure out how to get the log
without a pager. While the pager is a nice touch in general, working
out how to stop it from turning on automatically took some time (it
turns out the main ``git`` command has a ``--no-pager`` flag to
disable use of the pager).


Figuring out what command to use from built-in help
----------------------------------------------------

I ended up trying to find out what the command was to see what URL the
repository was cloned from. To do this I used nothing more than the
help provided by the tool itself or its man pages.

Bzr was the easiest: ``bzr info``. Running ``bzr help`` didn't show
what I wanted, but mentioned ``bzr help commands``. That list had the
command with a description that made sense.

Git was the second easiest. The command ``git help`` didn't show much
and did not have a way of listing all commands. That is when I viewed
the man page. Reading through the various commands I discovered ``git
remote``. The command itself spit out nothing more than ``origin``.
Trying ``git remote origin`` said it was an error and printed out the
command usage. That is when I noticed ``git remote show``. Running
``git remote show origin`` gave me the information I wanted.

For hg, I never found the information I wanted on my own. It turns out
I wanted ``hg paths``, but that was not obvious from the description
of "show definition of symbolic path names" as printed by ``hg help``
(it should be noted that reporting this in the PEP did lead the
Mercurial developers to clarify the wording to make the use of the
``hg paths`` command clearer).


Updating a checkout
---------------------

To see how long it takes to update an outdated repository I timed both
updating a repository 700 commits behind and 50 commits behind (three
weeks stale and one week stale, respectively).

====  ===========  ==========
DVCS  700 commits  50 commits
====  ===========  ==========
bzr   39 s         7 s
hg    17 s         3 s
git   N/A          4 s
====  ===========  ==========

.. note::
    Git lacks a value for the *700 commits* scenario as it does
    not seem to allow checking out a repository at a specific
    revision.

Git deserves special mention for its output from ``git pull``. It
not only lists the delta change information for each file but also
color-codes the information.


Decision
=========

At PyCon 2009 the decision was made to go with Mercurial.


Why Mercurial over Subversion
-----------------------------

While svn has served the development team well, it needs to be
admitted that svn does not serve the needs of non-committers as well
as a DVCS does. Because svn only provides its features such as version
control, branching, etc. to people with commit privileges on the
repository it can be a hindrance for people who lack commit
privileges. But DVCSs have no such limitation as anyone can create a
local branch of Python and perform their own local commits without the
burden that comes with cloning the entire svn repository. Allowing
anyone to have the same workflow as the core developers was the key
reason to switch from svn to hg.

Orthogonal to the benefit of allowing anyone to easily commit locally
to their own branches are fast, offline operations. Because hg stores
all data locally, there is no need to send requests to a remote
server; operations instead work off of the local disk. This improves
response times tremendously. It also allows for offline usage for when
one lacks an Internet connection. But this benefit is minor and
considered simply a side-effect benefit instead of a driving factor
for switching off of Subversion.


Why Mercurial over other DVCSs
------------------------------

Git was not chosen for three key reasons (see the `PyCon 2009
lightning talk <http://pycon.blip.tv/file/1947231/>`_ where Brett
Cannon lists these exact reasons; the talk starts at 3:45). First, git's
Windows support is the weakest out of the three DVCSs being considered
which is unacceptable as Python needs to support development on any
platform it runs on. Since Python runs on Windows and some people do
develop on the platform it needs solid support. And while git's
support is improving, as of this moment it is the weakest by a large
enough margin to warrant considering it a problem.

Second, and just as important as the first issue, is that the Python
core developers liked git the least out of the three DVCS options by a
wide margin. If you look at the following table you will see the
results of a survey taken of the core developers and how by a large
margin git is the least favorite version control system.

==== == ===== == ==========
DVCS ++ equal -- Uninformed
==== == ===== == ==========
git  5  1     8  13
bzr  10 3     2  12
hg   15 1     1  10
==== == ===== == ==========
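
One simple way to summarize the survey table is a net score per tool,
sketched below in Python. The net score is only an illustrative
summary and was not part of the survey itself; it assumes that ``++``
and ``--`` count favourable and unfavourable responses respectively.

```python
# Response counts copied from the survey table above.
SURVEY = {
    "git": {"++": 5, "equal": 1, "--": 8, "uninformed": 13},
    "bzr": {"++": 10, "equal": 3, "--": 2, "uninformed": 12},
    "hg": {"++": 15, "equal": 1, "--": 1, "uninformed": 10},
}


def net_preference(row):
    """Favourable minus unfavourable responses (an assumed metric)."""
    return row["++"] - row["--"]


ranking = sorted(SURVEY, key=lambda tool: net_preference(SURVEY[tool]), reverse=True)
for tool in ranking:
    print(tool, net_preference(SURVEY[tool]))
```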

Lastly, all things being equal (which they are not
as shown by the previous two issues), it is preferable to
use and support a tool written in Python and not one written in C and
shell. We are pragmatic enough to not choose a tool simply because it
is written in Python, but we do see the usefulness in promoting tools
that do use it when it is reasonable to do so as it is in this case.

As for why Mercurial was chosen over Bazaar, it came down to
popularity.  As the core developer survey shows, hg was preferred over
bzr. But the community also appears to prefer hg as was shown at PyCon
after git's removal from consideration was announced. Many people came
up to Brett and said in various ways that they wanted hg to be chosen.
While no one said they did not want bzr chosen, no one said they did
either.

Based on all of this information, Guido and Brett decided Mercurial
was to be the next version control system for Python.


Transition Plan
===============

PEP 385 outlines the transition from svn to hg.


Copyright
=========

This document has been placed in the public domain.