The Card Catalog

I want to follow Homebrew's lead here.

Each Python library the tool supports would have a card file in the tool's catalog. Want to add a library to the tool? Add a card file (we should have a command to generate a template, like Homebrew), commit, and ask a catalog maintainer to pull it in.

Paul: I believe this would be a good time to use Mercurial's subrepos instead of having the tool and the library in the same repo. This could give us tighter control of the tool and possibly allow more people to manage the library, much like Steve's setup with hgtip.

Steve: The bad part about using subrepos is that they're very new and not supported terribly well. If we use a subrepo then people cloning the tool to use it will need to have a very recent version of Mercurial installed.

Also, making a git mirror (for those who want to use git) would be much tougher. Without the subrepo we can use hg-git to push and pull from a git repo, so people can contribute via Mercurial or git. With a subrepo we'd need to do some trickery with git submodules, or only offer Mercurial and static archive installs.

Paul: Fair enough. I do like the model, but we certainly don't want to do anything to discourage participation. If the tool always runs from an hg or git repo, then it should be possible to provide a "selfupdate" command that will get the latest release of the library and system itself. It could autodetect the correct commands and handle it, a bit like aptitude full-upgrade.

I just thought that letting people contribute to one or the other might be easier to deal with. The only time we'd have to do the sub-repo thing would be for installation, and even then we wouldn't have to: the two repos could just live side-by-side in a common folder which our install script would create. In fact, that might make the "update" and "upgrade" style commands easier to write, since updating the system and updating the library would be separate operations. The gorilla program could then evolve independently of the collection of packages, and development in one wouldn't clutter the log of the other. We could also use a tagging system like django-south's, where every usable release gets the same tag, "stablish", so that hg pull; hg up stablish always gets you the latest release, but not necessarily the bleeding edge. That way we could have commands for upgrading directly to dev or stable. Users could of course manage the repos manually, but I like the single-command interface a la aptitude or brew.
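A rough sketch of how the single "upgrade" command described above might plan its work, assuming the two repos sit side by side in one folder. The directory names (gorilla, catalog) and the use of the default branch for the dev channel are assumptions made for illustration; only the "stablish" tag and the hg pull / hg up pair come from the discussion.

```python
import os

# Hypothetical directory names for the two side-by-side repos.
REPO_DIRS = ('gorilla', 'catalog')

# 'stablish' is the tag mentioned in the discussion. Mapping the dev channel
# to Mercurial's 'default' branch is an assumption.
CHANNEL_REVISIONS = {'stable': 'stablish', 'dev': 'default'}


def upgrade_plan(base_dir, channel='stable', repos=REPO_DIRS):
    """Return (working_dir, argv) pairs that would upgrade each repo.

    Nothing is executed here; the caller decides how to run the commands.
    """
    revision = CHANNEL_REVISIONS[channel]
    plan = []
    for repo in repos:
        repo_dir = os.path.join(base_dir, repo)
        plan.append((repo_dir, ['hg', 'pull']))
        plan.append((repo_dir, ['hg', 'update', revision]))
    return plan
```

Because each repo gets its own pair of commands, upgrading only the catalog is just a matter of passing repos=('catalog',).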

Steve: I think I'm coming around to the "keep the catalog in a separate repo" point of view.

It definitely makes things easier for maintainers. Homebrew's network graph is a spider web of changes to the core and formula additions. Breaking the two pieces apart makes things much easier.

Like you said, it also opens the possibility of having "catalog maintainers" and "core maintainers" of the tool, which could be a very nice option down the road.

We could handle the pulling/updating of the catalog in the tool itself like you said, which makes things easier for the users. If a catalog directory doesn't exist, we can clone it down using whatever VCS they cloned with to start (or grab a tarball if they got a static version).

I'm torn on the question of using subrepos/submodules vs. plain old repositories.

  • If a person has a version of their VCS that supports them (Mercurial 1.3+, for example), we can use that with subrepos.
  • If they don't, though, we still need to support cloning the repo separately.
  • For Mercurial, using a subrepo means that the .hgsubstate file changes (and needs to be committed) every time something in the subrepo changes. That would clutter the hell out of the main repo's changelog, which is one of the things we're trying to avoid by splitting the repos in the first place.
  • Subrepos are really new and Mercurial needs people to kick the tires a bit to reveal problems. I don't mind doing that for small stuff like hgtip.com but for something like this I kind of want to wait for things to settle down.

Paul: I agree. I think subrepos are out. I like the idea of having the user clone the core repo, and then having the tool clone the catalog (if it doesn't exist) using the same tool with which core was cloned or detecting which tools are available. Just 2 plain but separate repos sounds like it will work very nicely, and is fully supported by any VCS through which we choose to expose the tool.
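The "clone the catalog with whatever VCS the core was cloned with" idea could look roughly like the sketch below. It guesses the VCS from the presence of a .hg or .git directory in the core checkout and falls back to a tarball otherwise. The catalog URLs are placeholders and the function names are hypothetical; the tarball download itself is not sketched.

```python
import os

# Placeholder URLs: the real catalog locations are not given in the discussion.
CATALOG_URLS = {
    'hg': 'https://example.com/gorilla-catalog',
    'git': 'https://example.com/gorilla-catalog.git',
}


def detect_vcs(core_dir):
    """Return 'hg' or 'git' for a repo checkout, or None for a static copy."""
    for vcs, marker in (('hg', '.hg'), ('git', '.git')):
        if os.path.exists(os.path.join(core_dir, marker)):
            return vcs
    return None


def plan_catalog_fetch(core_dir, catalog_dir):
    """Decide how to obtain the catalog; returns (method, argv_or_None)."""
    if os.path.isdir(catalog_dir):
        return ('present', None)   # already there, nothing to do
    vcs = detect_vcs(core_dir)
    if vcs is None:
        return ('tarball', None)   # static install: download step not shown
    # Both `hg clone URL DEST` and `git clone URL DEST` take this form.
    return (vcs, [vcs, 'clone', CATALOG_URLS[vcs], catalog_dir])
```

Returning a plan instead of running the command keeps the detection logic easy to test without either VCS installed.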

Card File Format

A card file could look like this:

gorilla/catalog/mercurial.py

'''Mercurial is a fast, elegant distributed revision control system.'''

homepage = 'http://mercurial.selenic.com/'

# Some or all of these can be specified.
# The user can specify a preference of VCS if they want.
sources = {
    'hg': [
        'http://selenic.com/repo/hg-stable/',
        'http://bitbucket.org/mirror/mercurial-stable/',
    ],
    'git': [
        'git://github.com/sjl/hg-stable-mirror/',
    ],
    'static': {
        '1.3.1': 'http://mercurial.selenic.com/release/mercurial-1.3.1.tar.gz',
        '1.3': 'http://mercurial.selenic.com/release/mercurial-1.3.tar.gz',
    }
}

# These will be run in the cloned or extracted directory, before symlinking
build_commands = [
    'make local',
]

# These will be symlinked into the tool's packages directory, which users will
# add to their $PYTHONPATH
packages = [
    'mercurial',
    'hgext',
]

# These will be symlinked into the tool's bin directory, which users will add
# to their $PATH
scripts = [
    'hg',
]

# This allows for simple tests to be run to make sure the install was successful.
# The keys are the commands to run, the values are regexes
# which the output of that command should match.
tests = {
    'python -c "import mercurial; import hgext"': r'^$',
    'hg --version --quiet': r'^Mercurial Distributed SCM \(version .*\)$',
}

# Most projects use a sane version numbering and tagging scheme like 1.0,
# 1.0.1, 1.1, etc.  I'm sure there are some that don't though.
#
# To support those annoying edge cases we could allow recipes to define a
# parse_tags function that takes a list of strings representing the tags in
# the repo and returns a sorted list representing the versions.
#
# This would let you filter out non-version tags for a project, and handle
# non-easily-sortable version numbering schemes.
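def parse_tags(tags):
    # Example only: keep tags that look like dotted version numbers and
    # sort them numerically, so that '1.10' comes after '1.9'.
    versions = [t for t in tags if all(p.isdigit() for p in t.split('.'))]
    return sorted(versions, key=lambda t: [int(p) for p in t.split('.')])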
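To make the tests dictionary in the card file concrete, here is a minimal sketch of how the tool might evaluate it: run each command, then check the output against its regex. The function name run_card_tests is hypothetical, and running commands through the shell and merging stderr into stdout are assumptions, not decisions made on this page.

```python
import re
import subprocess


def run_card_tests(tests):
    """Run each command in `tests` and return the failures.

    `tests` maps a shell command to a regex, as in the card file.
    Returns a list of (command, output) pairs whose output did not match.
    """
    failures = []
    for command, pattern in tests.items():
        result = subprocess.run(
            command,
            shell=True,                 # card commands are shell strings
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # assumption: errors count as output
            text=True,
        )
        # Strip the trailing newline so anchored patterns like '^$' work.
        output = result.stdout.strip()
        if re.search(pattern, output) is None:
            failures.append((command, output))
    return failures
```

An empty return value means every command produced the expected output; anything else can be shown to the user along with what the command actually printed.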

Populate the catalog

Thomas: Well, I'm unsure whether it's really relevant to create this subsection, but here are my thoughts. Should Gorilla search directly in the Python Package Index, or is this out of its scope? Otherwise, could we write a small script that browses PyPI and creates a card for each package? I think it would be a shame not to use such a resource.

Steve: I think I want to avoid trying to use PyPI for a couple of reasons:

  • There are a lot of abandoned, stale, and crappy projects listed on there. See the comments on this issue.
  • We want to focus on using repositories instead of static download links, and most of the projects I've seen don't list their repository on their PyPI page.
