|Author:||Benoît Allard <firstname.lastname@example.org>|
It all began with a blog post about bug prediction that describes how some files can be considered hot spots inside a codebase. Such files should, for instance, be reviewed more carefully than others, and development on them should be done with a higher level of alertness. The only trouble is how to detect those files when you are new to the codebase. The Google blog post quotes a few papers showing that history can help.
I decided to implement their method as a Mercurial extension.
Like any other extension, this one has to be enabled in one of the Mercurial configuration files, like this:
[extensions]
hotfiles = path/to/hotfiles.py
This extension adds a new command, hotfiles, that displays the ten files most susceptible to having issues, or at least the ones that contained issues in the recent past.
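The ranking idea can be sketched in a few lines of Python. This is a minimal sketch, not the extension's actual code: it assumes the time-weighted score described in the Google post, where each issue-related changeset adds 1 / (1 + e^(-12t + 12)) to every file it touches, with t being the changeset date normalized between 0 (oldest) and 1 (newest). The names hot_scores and top_files and the input layout are hypothetical.

```python
import math


def hot_scores(fixes, first, last):
    """Score files from issue-related changesets.

    fixes: iterable of (timestamp, [file names]) pairs (hypothetical layout).
    first, last: timestamps of the oldest and newest changesets considered.
    """
    span = float(last - first) or 1.0  # avoid dividing by zero on a one-commit history
    scores = {}
    for when, files in fixes:
        t = (when - first) / span  # 0.0 = oldest, 1.0 = newest
        # Assumed weighting: recent fixes count far more than old ones.
        weight = 1.0 / (1.0 + math.exp(-12.0 * t + 12.0))
        for name in files:
            scores[name] = scores.get(name, 0.0) + weight
    return scores


def top_files(scores, count=10):
    """Return the `count` file names with the highest score, hottest first."""
    return sorted(scores, key=scores.get, reverse=True)[:count]
```

With this weighting, a file fixed twice last week ranks above a file fixed many times years ago, which matches the "contained issues in the recent past" reading above.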
This command takes up to four parameters:
|-r REV, --revision REV|
|This stops the computation at a particular revision and displays the hot files at that moment in time. By default, the current parent revision is used.|
|-p REGEX, --pattern REGEX|
|This is a regular expression used on the commit message to filter only the changesets that are related to issues.|
|-I PATTERN, --include PATTERN|
|The specified files will be explicitly included in the computation (the rest will be excluded).|
|-X PATTERN, --exclude PATTERN|
|The specified files will be excluded from the included files.|
hg hotfiles -p 'issue\d+'
When run in the Mercurial repository, this displays the ten files most susceptible to containing bugs.
Filtering of changesets
The regex used to filter issue changesets can be configured in the Mercurial configuration file. The configured value is used when no pattern is provided on the command line. It is configured as follows:
[hotfiles]
pattern = issue\d+
If the pattern is provided neither in the configuration file nor on the command line, all non-merge changesets are considered in the computation.
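The precedence described here (command line first, then configuration, then all non-merge changesets) can be illustrated as follows. This is a sketch under stated assumptions: changesets are plain dictionaries with hypothetical desc and parents keys rather than Mercurial objects, and the regex is assumed to be searched anywhere in the commit message.

```python
import re


def select_changesets(changesets, cli_pattern=None, config_pattern=None):
    """Pick the changesets that feed the computation.

    changesets: list of dicts with 'desc' (commit message) and 'parents'
    (list of parent ids); this layout is hypothetical.
    """
    # The command line value wins over the configured one.
    pattern = cli_pattern or config_pattern
    if pattern:
        regex = re.compile(pattern)
        # Assumption: the pattern may match anywhere in the message.
        return [cs for cs in changesets if regex.search(cs["desc"])]
    # No pattern at all: fall back to every non-merge changeset.
    return [cs for cs in changesets if len(cs["parents"]) < 2]
```

For example, calling select_changesets(log, cli_pattern=r"issue\d+") keeps only the changesets whose message mentions an issue number, whatever the configuration file says.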
Filtering of files
To filter some files out of the computation, such as the docs or the test directory, a set of exclude and include rules can be used, as in the following example:
[hotfiles]
exclude.glob = mercurial/util.py
include.glob = mercurial/
If the include.glob key is not set, all files are included. If the exclude.glob key is not set, all included files are considered.
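The way the two rules combine can be sketched like this. It is only an approximation: Mercurial has its own pattern matcher, and this sketch stands in for it with Python's fnmatch plus a simple directory-prefix check. The function names and the list-valued arguments are hypothetical.

```python
import fnmatch


def _matches(path, pattern):
    """Approximate glob match: exact file, directory prefix, or fnmatch glob."""
    prefix = pattern.rstrip("/")
    return (
        path == prefix
        or path.startswith(prefix + "/")
        or fnmatch.fnmatch(path, pattern)
    )


def keep_file(path, include=None, exclude=None):
    """Return True if `path` should take part in the computation."""
    # Without include rules, every file starts out included.
    if include and not any(_matches(path, p) for p in include):
        return False
    # Exclude rules are applied on top of the included files.
    if exclude and any(_matches(path, p) for p in exclude):
        return False
    return True
```

With the configuration above, keep_file("mercurial/commands.py", ["mercurial/"], ["mercurial/util.py"]) returns True, while mercurial/util.py and anything outside mercurial/ are dropped.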
For bug reports, pull requests, comments, and so on, simply use the Bitbucket interface or send me an email.