mmfhg

This is a mercurial extension with a collection of tools that I use. It has several goals:

  1. Provide a pip-installable package that collects all extensions I regularly use. (Mostly to provide ways of dealing with features of last resort.) If the extensions are not pip-installable, then we typically archive them from source using myrepos.
  2. Provide extensions for myrepos in mrconfig that provide the mr freeze and mr unfreeze commands as a rigorous but lightweight alternative to subrepos that works across version control systems. (This may eventually be included in the base myrepos project, but is provided here in the meantime.)
  3. Provide some additional commands for automating tasks. Additional commands that add functionality to regular commands have a suffix mmf. For example, hg initmmf will not only run hg init, but will initialize a README.rst file and a simple Makefile to build the corresponding documentation.
  4. Add a hook for mercurial that runs after each update command and adds %include ../.hgrc to the repository's .hg/hgrc file if the file .hgrc exists in the top level. This top-level file can be version controlled, allowing you to share a bunch of [paths], for example. (See [this answer at StackOverflow](http://stackoverflow.com/a/24195392/1088938).)
  5. Document some of my working patterns and provide a common place for tips etc. about Mercurial. See the files in the docs directory.
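
The effect of the update hook in item 4 can be sketched in a few lines of shell. This is only an illustration of the described behaviour, not the hook shipped with mmfhg: the function name add_hgrc_include is hypothetical, and the real hook may differ in how it detects an existing include.

```shell
# Hypothetical helper (not the mmfhg hook itself): append
# "%include ../.hgrc" to <repo>/.hg/hgrc when <repo>/.hgrc exists
# and the include line is not already present.
add_hgrc_include() {
    repo_root=$1
    include_line='%include ../.hgrc'
    hgrc="$repo_root/.hg/hgrc"

    # Nothing to do unless this is a repository with a top-level .hgrc.
    [ -d "$repo_root/.hg" ] || return 0
    [ -f "$repo_root/.hgrc" ] || return 0

    # -x: match the whole line, -F: fixed string, -q: quiet.
    if [ -f "$hgrc" ] && grep -qxF -- "$include_line" "$hgrc"; then
        return 0
    fi

    # Make sure the include starts on its own line if the existing
    # file does not end with a newline.
    if [ -s "$hgrc" ] && [ -n "$(tail -c 1 "$hgrc")" ]; then
        printf '\n' >> "$hgrc"
    fi
    printf '%s\n' "$include_line" >> "$hgrc"
}
```

The path in the include line is relative to the .hg directory, which is why ../.hgrc refers to the file in the top level of the repository.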

Installing

To install this, make sure you have done the following:

  1. Install python, preferably with conda (or a virtualenv) to isolate your environments.

  2. Install mercurial. You might like to use a version customized for your operating system so that you can get various GUI tools, but you can also install it into your python environment with conda install mercurial (or pip install mercurial).

  3. Install myrepos: port install myrepos. (This is using MacPorts: use your distribution's package manager.)

  4. Make this package: make install

  5. Edit your startup files to have something like:

    export MMFHG="$HOME/hg_work/mmfbb/mmfhg"
    export HGRCPATH="$HGRCPATH:$HOME/.hgrc:$MMFHG/hgrc"
    

    If you read about subhg and subgit, the author recommends you add the .subhg and .subgit directories to your global .hgignore and .gitignore files. You do not need to do this with the mercurial files: setting these variables will ensure that the mmfhg resource and ignore files are read. I do not use git, so cannot give suggestions on how to set this up, so you probably will have to create the git config files.

  6. If you have not added your username = section to your ~/.hgrc file, then do that now:

    [ui]
    username = Your Full Name <youremail@at.your.domain>
    

    This is required before mercurial will allow you to commit changes. Now you can further customize your ~/.hgrc file, but keep in mind that the hgrc file provided here will also be included (you included it in your HGRCPATH in step 5 above), so many of the useful extensions will already be enabled.
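
Since a mistyped path in HGRCPATH is easy to overlook, a quick check after editing your startup files can help. The following is a minimal sketch; the function name missing_hgrc_entries is hypothetical and not part of mmfhg.

```shell
# Hypothetical helper: print every entry of a colon-separated
# HGRCPATH-style list that does not exist on disk.
missing_hgrc_entries() {
    old_ifs=$IFS
    IFS=:
    set -f    # disable globbing while the list is split on colons
    for entry in $1; do
        # Skip empty entries such as the one produced by a leading colon.
        [ -z "$entry" ] && continue
        [ -e "$entry" ] || printf '%s\n' "$entry"
    done
    set +f
    IFS=$old_ifs
}

# Usage: missing_hgrc_entries "$HGRCPATH"
```

If this prints nothing, every listed file or directory exists. Running hg config ui.username afterwards should then show the name you set in step 6.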

myrepos

The myrepos project provides a version control agnostic way of managing a set of repositories. By issuing a single command such as mr update, the corresponding update command will be issued to each of the repositories under myrepos control. These repositories are specified in a project .mrconfig file. This is useful for

  • Managing a bunch of repositories (e.g. all of my Bitbucket repositories), allowing me to check them all out or in etc. and deploy them on a new machine.
  • Managing a set of dependencies where one does not need to track specific versions (e.g. if you can just always update to the latest version). By default myrepos does not keep track of the revision numbers of the repositories: we augment the behaviour here with mr freeze and mr unfreeze commands.

If you have a directory with a bunch of repos, the following can be a useful way of creating a complete config file:

touch .mrconfig
# Loop over subdirectories only; quoting keeps names with spaces intact.
for d in */; do
  mr register "${d%/}"
done

mr (un)freeze

The mrconfig file here defines two additional commands that allow you to freeze the current revision numbers of the packages. This feature may eventually be included in the default myrepos command, but for now, this mmfhg package provides the implementation and documentation. To enable these features, simply include the provided mrconfig file in your ~/.mrconfig file with a line:

include = cat ${MMFHG}/mrconfig

Here is the documentation:

mr freeze [-f]

Record the current revision numbers in the file mrfreeze, thereby freezing your repository in a consistent working state for future use. You should issue this command whenever you update the repositories controlled by myrepos after you have tested the configuration. The optional -f flag will force unfreeze to proceed, even if there are uncommitted changes in the repository.

mr unfreeze

Restore the repositories managed by myrepos to the states specified in mrfreeze. If a repo is not listed in mrfreeze, then it is simply updated as would be done with mr update.

mrfreeze

The mrfreeze file will be stored next to the .mrconfig file. (We use the same mechanism for locating this file as the mr command, so if the only config file is ~/.mrconfig, then the freeze file ~/mrfreeze will be stored in your home directory too.)
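
To make the location rule concrete, here is a rough sketch of a lookup that walks up from a starting directory to the nearest .mrconfig and falls back to a home directory. It only approximates the behaviour described above: the function name find_mrfreeze_path and its two arguments are hypothetical, and the code actually used by mr and by mmfhg may differ.

```shell
# Hypothetical helper: guess where the freeze file lives by finding the
# nearest .mrconfig at or above a start directory, falling back to a
# home directory. NOT the code used by mr or mmfhg.
#   $1: start directory (default: current directory)
#   $2: fallback home directory (default: $HOME)
find_mrfreeze_path() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    home=${2:-$HOME}
    while :; do
        if [ -f "$dir/.mrconfig" ]; then
            printf '%s/mrfreeze\n' "${dir%/}"
            return 0
        fi
        # Stop once the filesystem root has been checked.
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
    printf '%s/mrfreeze\n' "${home%/}"
}

# Usage: find_mrfreeze_path            # start from the current directory
```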

You should version control the mrfreeze file in a top level repository so you have a history of the working version. You might also like to freeze other aspects of your project, like the version of python packages installed (see pip freeze or conda list for example) but we do not manage those here.

The format is simply a list of two columns: the first is the name of the repository and the second is the revision number at the time the last mr freeze command was given.
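
Because the file is just two columns, it is easy to inspect from a script. The sketch below looks up the frozen revision of one repository. It assumes the columns are separated by whitespace, which is not stated above, and the function name frozen_revision is hypothetical.

```shell
# Hypothetical helper: print the frozen revision recorded for one
# repository in a two-column freeze file.
# Assumption: columns are separated by spaces or tabs.
#   $1: path to the freeze file
#   $2: repository name as it appears in the first column
frozen_revision() {
    freeze_file=$1
    repo=$2
    # The "|| [ -n ... ]" keeps a final line without a trailing newline.
    while read -r name rev rest || [ -n "$name" ]; do
        if [ "$name" = "$repo" ]; then
            printf '%s\n' "$rev"
            return 0
        fi
    done < "$freeze_file"
    return 1    # repository not listed in the freeze file
}

# Usage: frozen_revision mrfreeze mmfhg
```

A non-zero exit status corresponds to the case where a repository is not listed, in which mr unfreeze simply updates it.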

(By including this in your top level .mrconfig file, you tell mr to trust the commands used here to implement mr freeze and mr unfreeze. If you include this in a local .mrconfig file, you will probably get errors about that being insecure and will then need to explicitly run mr -t freeze to trust these commands.)

General Recommendations

Here are some general notes about good DVC practices.

  1. On commit messages: A very good discussion about how to commit in order to make the revision log useful. Summary: Each commit should be a single logical change to the code without any cruft (no whitespace changes -- keep those as separate commits.)

Extensions

  1. Some extensions are simply source repositories. These cannot be installed with pip because they have no setup.py file. For these, we simply check out the source with myrepos and then point to them in the hgrc file. These will be checked out into the _ext directory by myrepos. This includes extensions like rdiff.
  2. Other extensions can be pip installed, like hg-git. These are listed in requirements.txt and will be installed by pip when either conda pip . or pip install . is run by make.
  3. Some extensions have other dependencies (like hg-subversion). We do not support these yet.

Bitbucket

If you have a whole bunch of Bitbucket repositories that you want to clone, the following script can be useful:

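#!/bin/bash
# Clone all repositories belonging to a Bitbucket user.
# Usage: getAllRepos.sh username
user="$1"
# Only the user name is passed to -u, so curl prompts for the password.
curl -u "${user}" "https://api.bitbucket.org/1.0/users/${user}" > repoinfo
# Extract the value of every "name" field from the response.
for repo_name in $(grep \"name\" repoinfo | cut -f4 -d\")
do
        hg clone "ssh://hg@bitbucket.org/${user}/${repo_name}"
done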

Subrepositories

I often want to keep track of related projects. Mercurial has a Subrepository feature to do this, but this is a feature of last resort. A major problem is that your repository then depends on another repository, and updates etc. will fail if you do not have access to that repository. (Consider what happens if it moves, for example: you will no longer be able to update to old versions of your code. There are ways to deal with this, but they are not trivial.)

If the subproject is very tightly related, then it might be better to include it directly in the main project. This is the idea behind the article Submodules and Subrepos Done Right. One directly includes the subrepo in the parent project so that the parent project manages the files: if someone clones the project, they get everything and have no idea that the subrepo is actually maintained elsewhere. This assumes that users will not be modifying the subrepo. On your local machine, you keep the subrepo as an independent repository (the parent project ignores the .hg or .git folder) and manipulate it with the special subhg and subgit commands (included here in bin). I use this for .makefiles. To configure, add the following configurations:

~/.gitconfig:

[core]
excludesfile = "~/.gitignore"

~/.gitignore:

.subgit
.subhg

~/.hgrc:

[ui]
ignore=~/.hgignore

~/.hgignore:

.subgit
.subhg
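
If you prefer to script these ignore entries rather than edit the files by hand, a small idempotent helper is enough. This is only a sketch: add_ignore_entry is a hypothetical name and is not one of the commands provided in bin.

```shell
# Hypothetical helper: append a pattern to an ignore file unless an
# identical line is already present, so it is safe to run repeatedly.
#   $1: ignore file (created if missing)
#   $2: pattern to add
add_ignore_entry() {
    ignore_file=$1
    pattern=$2
    touch "$ignore_file"
    grep -qxF -- "$pattern" "$ignore_file" || printf '%s\n' "$pattern" >> "$ignore_file"
}

# Usage (this edits your real ignore files, so review before running):
#   for p in .subgit .subhg; do
#       add_ignore_entry "$HOME/.hgignore" "$p"
#       add_ignore_entry "$HOME/.gitignore" "$p"
#   done
```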

If you really need the subrepository functionality, consider instead the confman extension which allows one to manage these manually, without breaking the repo if something goes missing (but without the automatic update feature as well.)

Finally, for loose collections, consider myrepos as demonstrated here.

Largefiles

I often generate data as the result of a lengthy simulation and would like a way of archiving this. As a build product -- in principle reproducible from my source code -- such information should generally not be kept in the repository. Mercurial is also not particularly adept at dealing with large amounts of binary data. To address the second issue, mercurial has the Largefiles extension -- another feature of last resort. This is not designed for managing build products even though there are valid use-cases (see for example, this discussion). Another problem is that Bitbucket does not support the Largefiles extension.

My current recommendation is to use the subhg setupdata and then subhg commands as described above.