reposurgeon - a repository surgeon
reposurgeon enables risky operations that version-control systems don't
want to let you do, such as (a) editing past comments and metadata,
(b) excising commits, (c) coalescing commits, and (d) removing files
and subtrees from repo history. The original motivation for
reposurgeon was to clean up artifacts created by repository
conversions.
reposurgeon is also useful for scripting very high-quality conversions
from Subversion. It is better than git-svn at tag lifting,
automatically cleaning up cvs2svn conversion artifacts, dealing with
nonstandard repository layouts, recognizing branch merges, handling
mixed-branch commits, and generally at coping with Subversion's many
odd corner cases. Normally Subversion repos should be analyzed at a
rate of upwards of ten thousand commits per minute.
repodiffer is a program that reports differences between repository
histories. It uses a diff(1)-like algorithm to identify spans of
identical revisions, and to pick out revisions that have been
changed or deleted or inserted. It may be useful for comparing the
output of different repository-conversion tools in detail.
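As a rough illustration of what a diff(1)-like comparison of two
histories looks like, here is a minimal Python sketch. It is not
repodiffer's actual implementation; it assumes each revision has been
reduced to a hashable fingerprint (a hypothetical choice, for example a
tuple of author, date, and comment text) and uses the standard-library
difflib module to classify spans.

```python
from difflib import SequenceMatcher


def compare_histories(old, new):
    """Classify spans of two revision sequences.

    old and new are lists of hashable per-revision fingerprints
    (an assumption made for this sketch, not repodiffer's data model).
    Returns a list of (tag, old_span, new_span) tuples, where tag is one
    of 'equal', 'replace', 'delete', or 'insert'.
    """
    # autojunk=False so that frequently repeated fingerprints in long
    # histories are not silently ignored by the matcher.
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]
```

Calling compare_histories() on the revision lists produced by two
different conversion tools yields 'equal' spans where the histories
agree and 'replace', 'delete', or 'insert' spans where they diverge.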
Another auxiliary program, repopuller, assists in mirroring Subversion
repositories.
This distribution also includes a generic Makefile describing a
repeatable conversion workflow using these tools.
Finally, an Emacs Lisp mode with useful functions for editing large
comment mailboxes is included.
There is an extensive regression-test suite in the test/ directory.
To test the correctness of this software, ensure that pylint
is installed and then type 'make check'.
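The check above can be wrapped in a small script if you want to fail
early with a clear message when a prerequisite is missing. This is a
sketch only: the function names are hypothetical, and it assumes that
pylint and make are found through PATH and that it is run from the top
of the source tree.

```python
import shutil
import subprocess


def missing_tools(tools=("pylint", "make")):
    """Return the names in tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def run_check():
    """Run 'make check' if the prerequisites are present.

    Returns the exit status of make; exits with a message otherwise.
    """
    missing = missing_tools()
    if missing:
        raise SystemExit("missing required tools: " + ", ".join(missing))
    return subprocess.call(["make", "check"])
```

Invoke run_check() from the directory containing the Makefile; a
nonzero return value means the test suite reported a failure.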
reposurgeon is an extremely algorithmically complex program. It may
still have bugs when dealing with strange corner cases in older
repositories. It is often extremely difficult or impossible to
reproduce those bugs without a full copy of the history on which they
occur.
When you find a bug, please send me:
(a) An exact description of how observed behavior differed from expected
behavior. If reposurgeon died, include the backtrace.
(b) A git fast-import or Subversion dump file of the repository you
were operating on, or a pointer to where I can pull it from.
If your Subversion repo is large, or contains proprietary content that
you need to obscure, consider using the svncutter tool from the
Subversion contrib directory. The "skeletonize" option will replace
all content blobs with dummies, preserving the structure of the repo
and (almost certainly) its ability to trigger your bug. Skeletonizing
also makes the dump drastically smaller and faster to process.
(c) The sequence of reposurgeon commands that tickled the bug.