Commits

Junio C Hamano committed acca687

git-pickaxe: retire pickaxe

Just make it take over blame's place. Documentation and command
have all stopped mentioning "git-pickaxe". The built-in synonym
is left in the command table, so you can still say "git pickaxe",
but it probably is a good idea to retire it as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>

  • Parent commits 659db3f
  • Branches master

Files changed (10)

File Documentation/git-blame.txt

 
 SYNOPSIS
 --------
-'git-blame' [-c] [-l] [-t] [-f] [-n] [-p] [-S <revs-file>] [--] <file> [<rev>]
+[verse]
+'git-blame' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>]
+            [-M] [-C] [-C] [--since=<date>] [<rev>] [--] <file>
 
 DESCRIPTION
 -----------
 Annotates each line in the given file with information from the revision which
 last modified the line. Optionally, start annotating from the given revision.
 
+Also it can limit the range of lines annotated.
+
 This report doesn't tell you anything about lines which have been deleted or
 replaced; you need to use a tool such as gitlink:git-diff[1] or the "pickaxe"
 interface briefly mentioned in the following paragraph.
 -c, --compatibility::
 	Use the same output mode as gitlink:git-annotate[1] (Default: off).
 
+-L n,m::
+	Annotate only the specified line range (lines count from
+	1).  The range can be specified with a regexp.  For
+	example, `-L '/^sub esc_html /,/^}$/'` limits the
+	annotation only to the body of `esc_html` subroutine.
+
 -l, --long::
 	Show long rev (Default: off).
 
 -p, --porcelain::
 	Show in a format designed for machine consumption.
 
+-M::
+	Detect moving lines in the file as well.  When a commit
+	moves a block of lines in a file (e.g. the original file
+	has A and then B, and the commit changes it to B and
+	then A), traditional 'blame' algorithm typically blames
+	the lines that were moved up (i.e. B) to the parent and
+	assigns blame to the lines that were moved down (i.e. A)
+	to the child commit.  With this option, both groups of
+	lines are blamed on the parent.
+
+-C::
+	In addition to `-M`, detect lines copied from other
+	files that were modified in the same commit.  This is
+	useful when you reorganize your program and move code
+	around across files.  When this option is given twice,
+	the command looks for copies from all other files in the
+	parent for the commit that creates the file in addition.
+
 -h, --help::
 	Show help message.
 
 header, prefixed by a TAB. This is to allow adding more
 header elements later.
 
+
+SPECIFIYING RANGES
+------------------
+
+Unlike `git-blame` and `git-annotate` in older git, the extent
+of annotation can be limited to both line ranges and revision
+ranges.  When you are interested in finding the origin for
+ll. 40-60 for file `foo`, you can use `-L` option like this:
+
+	git blame -L 40,60 foo
+
+When you are not interested in changes older than the version
+v2.6.18, or changes older than 3 weeks, you can use revision
+range specifiers  similar to `git-rev-list`:
+
+	git blame v2.6.18.. -- foo
+	git blame --since=3.weeks -- foo
+
+When revision range specifiers are used to limit the annotation,
+lines that have not changed since the range boundary (either the
+commit v2.6.18 or the most recent commit that is more than 3
+weeks old in the above example) are blamed for that range
+boundary commit.
+
+A particularly useful way is to see if an added file have lines
+created by copy-and-paste from existing files.  Sometimes this
+indicates that the developer was being sloppy and did not
+refactor the code properly.  You can first find the commit that
+introduced the file with:
+
+	git log --diff-filter=A --pretty=short -- foo
+
+and then annotate the change between the commit and its
+parents, using `commit{caret}!` notation:
+
+	git blame -C -C -f $commit^! -- foo
+
+
 SEE ALSO
 --------
 gitlink:git-annotate[1]
 
 AUTHOR
 ------
-Written by Fredrik Kuivinen <freku045@student.liu.se>.
+Written by Junio C Hamano <junkio@cox.net>
 
 GIT
 ---
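
As a reading aid for the `-L n,m` option documented in the hunk above, here is a minimal C sketch of how a numeric range such as `40,60` could be parsed and checked against 1-based line numbers. The function name `parse_line_range` and the `num_lines` parameter are hypothetical; this is not the code in builtin-blame.c, the regexp form (`/re1/,/re2/`) is deliberately not handled, and rejecting a range that runs past the end of the file is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from git: parse "n,m" into a 1-based,
 * inclusive line range.  Returns 0 on success, -1 on malformed input.
 */
int parse_line_range(const char *spec, long num_lines,
		     long *start, long *end)
{
	long n, m;
	char trailing;

	assert(spec && start && end);

	/* Two integers separated by a comma, with nothing after them. */
	if (sscanf(spec, "%ld,%ld%c", &n, &m, &trailing) != 2)
		return -1;

	/* Lines count from 1, and the range must lie inside the file. */
	if (n < 1 || m < n || m > num_lines)
		return -1;

	*start = n;
	*end = m;
	return 0;
}
```

With a 100-line file, `parse_line_range("40,60", 100, &s, &e)` succeeds and yields the same range as `git blame -L 40,60 foo`, while `"0,5"` and `"60,40"` are rejected.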

File Documentation/git-pickaxe.txt

-git-pickaxe(1)
-==============
-
-NAME
-----
-git-pickaxe - Show what revision and author last modified each line of a file
-
-SYNOPSIS
---------
-[verse]
-'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>]
-              [-M] [-C] [-C] [--since=<date>] [<rev>] [--] <file>
-
-DESCRIPTION
------------
-
-Annotates each line in the given file with information from the revision which
-last modified the line. Optionally, start annotating from the given revision.
-
-Also it can limit the range of lines annotated.
-
-This report doesn't tell you anything about lines which have been deleted or
-replaced; you need to use a tool such as gitlink:git-diff[1] or the "pickaxe"
-interface briefly mentioned in the following paragraph.
-
-Apart from supporting file annotation, git also supports searching the
-development history for when a code snippet occured in a change. This makes it
-possible to track when a code snippet was added to a file, moved or copied
-between files, and eventually deleted or replaced. It works by searching for
-a text string in the diff. A small example:
-
------------------------------------------------------------------------------
-$ git log --pretty=oneline -S'blame_usage'
-5040f17eba15504bad66b14a645bddd9b015ebb7 blame -S <ancestry-file>
-ea4c7f9bf69e781dd0cd88d2bccb2bf5cc15c9a7 git-blame: Make the output
------------------------------------------------------------------------------
-
-OPTIONS
--------
--c, --compatibility::
-	Use the same output mode as gitlink:git-annotate[1] (Default: off).
-
--L n,m::
-	Annotate only the specified line range (lines count from 1).
-
--l, --long::
-	Show long rev (Default: off).
-
--t, --time::
-	Show raw timestamp (Default: off).
-
--S, --rev-file <revs-file>::
-	Use revs from revs-file instead of calling gitlink:git-rev-list[1].
-
--f, --show-name::
-	Show filename in the original commit.  By default
-	filename is shown if there is any line that came from a
-	file with different name, due to rename detection.
-
--n, --show-number::
-	Show line number in the original commit (Default: off).
-
--p, --porcelain::
-	Show in a format designed for machine consumption.
-
--M::
-	Detect moving lines in the file as well.  When a commit
-	moves a block of lines in a file (e.g. the original file
-	has A and then B, and the commit changes it to B and
-	then A), traditional 'blame' algorithm typically blames
-	the lines that were moved up (i.e. B) to the parent and
-	assigns blame to the lines that were moved down (i.e. A)
-	to the child commit.  With this option, both groups of
-	lines are blamed on the parent.
-
--C::
-	In addition to `-M`, detect lines copied from other
-	files that were modified in the same commit.  This is
-	useful when you reorganize your program and move code
-	around across files.  When this option is given twice,
-	the command looks for copies from all other files in the
-	parent for the commit that creates the file in addition.
-
--h, --help::
-	Show help message.
-
-
-THE PORCELAIN FORMAT
---------------------
-
-In this format, each line is output after a header; the
-header at the minumum has the first line which has:
-
-- 40-byte SHA-1 of the commit the line is attributed to;
-- the line number of the line in the original file;
-- the line number of the line in the final file;
-- on a line that starts a group of line from a different
-  commit than the previous one, the number of lines in this
-  group.  On subsequent lines this field is absent.
-
-This header line is followed by the following information
-at least once for each commit:
-
-- author name ("author"), email ("author-mail"), time
-  ("author-time"), and timezone ("author-tz"); similarly
-  for committer.
-- filename in the commit the line is attributed to.
-- the first line of the commit log message ("summary").
-
-The contents of the actual line is output after the above
-header, prefixed by a TAB. This is to allow adding more
-header elements later.
-
-
-SPECIFIYING RANGES
-------------------
-
-Unlike `git-blame` and `git-annotate` in older git, the extent
-of annotation can be limited to both line ranges and revision
-ranges.  When you are interested in finding the origin for
-ll. 40-60 for file `foo`, you can use `-L` option like this:
-
-	git pickaxe -L 40,60 foo
-
-When you are not interested in changes older than the version
-v2.6.18, or changes older than 3 weeks, you can use revision
-range specifiers  similar to `git-rev-list`:
-
-	git pickaxe v2.6.18.. -- foo
-	git pickaxe --since=3.weeks -- foo
-
-When revision range specifiers are used to limit the annotation,
-lines that have not changed since the range boundary (either the
-commit v2.6.18 or the most recent commit that is more than 3
-weeks old in the above example) are blamed for that range
-boundary commit.
-
-A particularly useful way is to see if an added file have lines
-created by copy-and-paste from existing files.  Sometimes this
-indicates that the developer was being sloppy and did not
-refactor the code properly.  You can first find the commit that
-introduced the file with:
-
-	git log --diff-filter=A --pretty=short -- foo
-
-and then annotate the change between the commit and its
-parents, using `commit{caret}!` notation:
-
-	git pickaxe -C -C -f $commit^! -- foo
-
-
-SEE ALSO
---------
-gitlink:git-blame[1]
-
-AUTHOR
-------
-Written by Junio C Hamano <junkio@cox.net>
-
-GIT
----
-Part of the gitlink:git[7] suite
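
The "THE PORCELAIN FORMAT" section in the removed page (its tail is also visible as context in the git-blame.txt hunk) describes a header line made of a 40-byte SHA-1, the original line number, the final line number, and an optional group size. The sketch below shows one way a consumer might parse that line in C. The struct and function names are hypothetical and are not part of git; the field order follows the description above and the `printf("%s %d %d %d\n", ...)` call in the removed blame.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the first header line of a porcelain entry. */
struct porcelain_header {
	char sha1[41];		/* 40 hex digits plus the terminating NUL */
	int orig_lineno;	/* line number in the original file */
	int final_lineno;	/* line number in the final file */
	int group_size;		/* lines in this group, 0 when the field is absent */
};

/* Returns 0 on success, -1 if the line does not look like a header line. */
int parse_porcelain_header(const char *line, struct porcelain_header *h)
{
	int n;

	assert(line && h);

	/* The group size only appears on the line that starts a group. */
	h->group_size = 0;
	n = sscanf(line, "%40[0123456789abcdef] %d %d %d",
		   h->sha1, &h->orig_lineno, &h->final_lineno,
		   &h->group_size);

	/* Need at least the SHA-1 and both line numbers. */
	if (n < 3 || strlen(h->sha1) != 40)
		return -1;
	return 0;
}
```

Lines that carry the file contents start with a TAB, and metadata lines such as `author ...` do not begin with 40 hex digits followed by two numbers, so both are rejected by this check.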

File Documentation/git.txt

 	Annotate file lines with commit info.
 
 gitlink:git-blame[1]::
-	Blame file lines on commits.
-
-gitlink:git-pickaxe[1]::
 	Find out where each line in a file came from.
 
 gitlink:git-check-ref-format[1]::

File Makefile

 	git-update-server-info$X \
 	git-upload-pack$X git-verify-pack$X \
 	git-pack-redundant$X git-var$X \
-	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X \
+	git-describe$X git-merge-tree$X git-imap-send$X \
 	git-merge-recursive$X \
 	$(EXTRA_PROGRAMS)
 
 	builtin-annotate.o \
 	builtin-apply.o \
 	builtin-archive.o \
+	builtin-blame.o \
 	builtin-branch.o \
 	builtin-cat-file.o \
 	builtin-checkout-index.o \
 	builtin-mv.o \
 	builtin-name-rev.o \
 	builtin-pack-objects.o \
-	builtin-pickaxe.o \
 	builtin-prune.o \
 	builtin-prune-packed.o \
 	builtin-push.o \

File blame.c

-/*
- * Copyright (C) 2006, Fredrik Kuivinen <freku045@student.liu.se>
- */
-
-#include <assert.h>
-#include <time.h>
-#include <sys/time.h>
-#include <math.h>
-
-#include "cache.h"
-#include "refs.h"
-#include "tag.h"
-#include "commit.h"
-#include "tree.h"
-#include "blob.h"
-#include "diff.h"
-#include "diffcore.h"
-#include "revision.h"
-#include "xdiff-interface.h"
-#include "quote.h"
-
-#ifndef DEBUG
-#define DEBUG 0
-#endif
-
-static const char blame_usage[] =
-"git-blame [-c] [-l] [-t] [-f] [-n] [-p] [-S <revs-file>] [--] file [commit]\n"
-"  -c, --compatibility Use the same output mode as git-annotate (Default: off)\n"
-"  -l, --long          Show long commit SHA1 (Default: off)\n"
-"  -t, --time          Show raw timestamp (Default: off)\n"
-"  -f, --show-name     Show original filename (Default: auto)\n"
-"  -n, --show-number   Show original linenumber (Default: off)\n"
-"  -p, --porcelain     Show in a format designed for machine consumption\n"
-"  -S revs-file        Use revisions from revs-file instead of calling git-rev-list\n"
-"  -h, --help          This message";
-
-static struct commit **blame_lines;
-static int num_blame_lines;
-static char *blame_contents;
-static int blame_len;
-
-struct util_info {
-	int *line_map;
-	unsigned char sha1[20];	/* blob sha, not commit! */
-	char *buf;
-	unsigned long size;
-	int num_lines;
-	const char *pathname;
-	unsigned meta_given:1;
-
-	void *topo_data;
-};
-
-struct chunk {
-	int off1, len1;	/* --- */
-	int off2, len2;	/* +++ */
-};
-
-struct patch {
-	struct chunk *chunks;
-	int num;
-};
-
-static void get_blob(struct commit *commit);
-
-/* Only used for statistics */
-static int num_get_patch;
-static int num_commits;
-static int patch_time;
-static int num_read_blob;
-
-struct blame_diff_state {
-	struct xdiff_emit_state xm;
-	struct patch *ret;
-};
-
-static void process_u0_diff(void *state_, char *line, unsigned long len)
-{
-	struct blame_diff_state *state = state_;
-	struct chunk *chunk;
-
-	if (len < 4 || line[0] != '@' || line[1] != '@')
-		return;
-
-	if (DEBUG)
-		printf("chunk line: %.*s", (int)len, line);
-	state->ret->num++;
-	state->ret->chunks = xrealloc(state->ret->chunks,
-				      sizeof(struct chunk) * state->ret->num);
-	chunk = &state->ret->chunks[state->ret->num - 1];
-
-	assert(!strncmp(line, "@@ -", 4));
-
-	if (parse_hunk_header(line, len,
-			      &chunk->off1, &chunk->len1,
-			      &chunk->off2, &chunk->len2)) {
-		state->ret->num--;
-		return;
-	}
-
-	if (chunk->len1 == 0)
-		chunk->off1++;
-	if (chunk->len2 == 0)
-		chunk->off2++;
-
-	if (chunk->off1 > 0)
-		chunk->off1--;
-	if (chunk->off2 > 0)
-		chunk->off2--;
-
-	assert(chunk->off1 >= 0);
-	assert(chunk->off2 >= 0);
-}
-
-static struct patch *get_patch(struct commit *commit, struct commit *other)
-{
-	struct blame_diff_state state;
-	xpparam_t xpp;
-	xdemitconf_t xecfg;
-	mmfile_t file_c, file_o;
-	xdemitcb_t ecb;
-	struct util_info *info_c = (struct util_info *)commit->util;
-	struct util_info *info_o = (struct util_info *)other->util;
-	struct timeval tv_start, tv_end;
-
-	get_blob(commit);
-	file_c.ptr = info_c->buf;
-	file_c.size = info_c->size;
-
-	get_blob(other);
-	file_o.ptr = info_o->buf;
-	file_o.size = info_o->size;
-
-	gettimeofday(&tv_start, NULL);
-
-	xpp.flags = XDF_NEED_MINIMAL;
-	xecfg.ctxlen = 0;
-	xecfg.flags = 0;
-	ecb.outf = xdiff_outf;
-	ecb.priv = &state;
-	memset(&state, 0, sizeof(state));
-	state.xm.consume = process_u0_diff;
-	state.ret = xmalloc(sizeof(struct patch));
-	state.ret->chunks = NULL;
-	state.ret->num = 0;
-
-	xdl_diff(&file_c, &file_o, &xpp, &xecfg, &ecb);
-
-	gettimeofday(&tv_end, NULL);
-	patch_time += 1000000 * (tv_end.tv_sec - tv_start.tv_sec) +
-		tv_end.tv_usec - tv_start.tv_usec;
-
-	num_get_patch++;
-	return state.ret;
-}
-
-static void free_patch(struct patch *p)
-{
-	free(p->chunks);
-	free(p);
-}
-
-static int get_blob_sha1_internal(const unsigned char *sha1, const char *base,
-				  int baselen, const char *pathname,
-				  unsigned mode, int stage);
-
-static unsigned char blob_sha1[20];
-static const char *blame_file;
-static int get_blob_sha1(struct tree *t, const char *pathname,
-			 unsigned char *sha1)
-{
-	const char *pathspec[2];
-	blame_file = pathname;
-	pathspec[0] = pathname;
-	pathspec[1] = NULL;
-	hashclr(blob_sha1);
-	read_tree_recursive(t, "", 0, 0, pathspec, get_blob_sha1_internal);
-
-	if (is_null_sha1(blob_sha1))
-		return -1;
-
-	hashcpy(sha1, blob_sha1);
-	return 0;
-}
-
-static int get_blob_sha1_internal(const unsigned char *sha1, const char *base,
-				  int baselen, const char *pathname,
-				  unsigned mode, int stage)
-{
-	if (S_ISDIR(mode))
-		return READ_TREE_RECURSIVE;
-
-	if (strncmp(blame_file, base, baselen) ||
-	    strcmp(blame_file + baselen, pathname))
-		return -1;
-
-	hashcpy(blob_sha1, sha1);
-	return -1;
-}
-
-static void get_blob(struct commit *commit)
-{
-	struct util_info *info = commit->util;
-	char type[20];
-
-	if (info->buf)
-		return;
-
-	info->buf = read_sha1_file(info->sha1, type, &info->size);
-	num_read_blob++;
-
-	assert(!strcmp(type, blob_type));
-}
-
-/* For debugging only */
-static void print_patch(struct patch *p)
-{
-	int i;
-	printf("Num chunks: %d\n", p->num);
-	for (i = 0; i < p->num; i++) {
-		printf("%d,%d %d,%d\n", p->chunks[i].off1, p->chunks[i].len1,
-		       p->chunks[i].off2, p->chunks[i].len2);
-	}
-}
-
-#if DEBUG
-/* For debugging only */
-static void print_map(struct commit *cmit, struct commit *other)
-{
-	struct util_info *util = cmit->util;
-	struct util_info *util2 = other->util;
-
-	int i;
-	int max =
-	    util->num_lines >
-	    util2->num_lines ? util->num_lines : util2->num_lines;
-	int num;
-
-	if (print_map == NULL)
-		; /* to avoid "unused function" warning */
-
-	for (i = 0; i < max; i++) {
-		printf("i: %d ", i);
-		num = -1;
-
-		if (i < util->num_lines) {
-			num = util->line_map[i];
-			printf("%d\t", num);
-		}
-		else
-			printf("\t");
-
-		if (i < util2->num_lines) {
-			int num2 = util2->line_map[i];
-			printf("%d\t", num2);
-			if (num != -1 && num2 != num)
-				printf("---");
-		}
-		else
-			printf("\t");
-
-		printf("\n");
-	}
-}
-#endif
-
-/* p is a patch from commit to other. */
-static void fill_line_map(struct commit *commit, struct commit *other,
-			  struct patch *p)
-{
-	struct util_info *util = commit->util;
-	struct util_info *util2 = other->util;
-	int *map = util->line_map;
-	int *map2 = util2->line_map;
-	int cur_chunk = 0;
-	int i1, i2;
-
-	if (DEBUG) {
-		if (p->num)
-			print_patch(p);
-		printf("num lines 1: %d num lines 2: %d\n", util->num_lines,
-		       util2->num_lines);
-	}
-
-	for (i1 = 0, i2 = 0; i1 < util->num_lines; i1++, i2++) {
-		struct chunk *chunk = NULL;
-		if (cur_chunk < p->num)
-			chunk = &p->chunks[cur_chunk];
-
-		if (chunk && chunk->off1 == i1) {
-			if (DEBUG && i2 != chunk->off2)
-				printf("i2: %d off2: %d\n", i2, chunk->off2);
-
-			assert(i2 == chunk->off2);
-
-			i1--;
-			i2--;
-			if (chunk->len1 > 0)
-				i1 += chunk->len1;
-
-			if (chunk->len2 > 0)
-				i2 += chunk->len2;
-
-			cur_chunk++;
-		}
-		else {
-			if (i2 >= util2->num_lines)
-				break;
-
-			if (map[i1] != map2[i2] && map[i1] != -1) {
-				if (DEBUG)
-					printf("map: i1: %d %d %p i2: %d %d %p\n",
-					       i1, map[i1],
-					       (void *) (i1 != -1 ? blame_lines[map[i1]] : NULL),
-					       i2, map2[i2],
-					       (void *) (i2 != -1 ? blame_lines[map2[i2]] : NULL));
-				if (map2[i2] != -1 &&
-				    blame_lines[map[i1]] &&
-				    !blame_lines[map2[i2]])
-					map[i1] = map2[i2];
-			}
-
-			if (map[i1] == -1 && map2[i2] != -1)
-				map[i1] = map2[i2];
-		}
-
-		if (DEBUG > 1)
-			printf("l1: %d l2: %d i1: %d i2: %d\n",
-			       map[i1], map2[i2], i1, i2);
-	}
-}
-
-static int map_line(struct commit *commit, int line)
-{
-	struct util_info *info = commit->util;
-	assert(line >= 0 && line < info->num_lines);
-	return info->line_map[line];
-}
-
-static struct util_info *get_util(struct commit *commit)
-{
-	struct util_info *util = commit->util;
-
-	if (util)
-		return util;
-
-	util = xcalloc(1, sizeof(struct util_info));
-	util->num_lines = -1;
-	commit->util = util;
-	return util;
-}
-
-static int fill_util_info(struct commit *commit)
-{
-	struct util_info *util = commit->util;
-
-	assert(util);
-	assert(util->pathname);
-
-	return !!get_blob_sha1(commit->tree, util->pathname, util->sha1);
-}
-
-static void alloc_line_map(struct commit *commit)
-{
-	struct util_info *util = commit->util;
-	int i;
-
-	if (util->line_map)
-		return;
-
-	get_blob(commit);
-
-	util->num_lines = 0;
-	for (i = 0; i < util->size; i++) {
-		if (util->buf[i] == '\n')
-			util->num_lines++;
-	}
-	if (util->buf[util->size - 1] != '\n')
-		util->num_lines++;
-
-	util->line_map = xmalloc(sizeof(int) * util->num_lines);
-
-	for (i = 0; i < util->num_lines; i++)
-		util->line_map[i] = -1;
-}
-
-static void init_first_commit(struct commit *commit, const char *filename)
-{
-	struct util_info *util = commit->util;
-	int i;
-
-	util->pathname = filename;
-	if (fill_util_info(commit))
-		die("fill_util_info failed");
-
-	alloc_line_map(commit);
-
-	util = commit->util;
-
-	for (i = 0; i < util->num_lines; i++)
-		util->line_map[i] = i;
-}
-
-static void process_commits(struct rev_info *rev, const char *path,
-			    struct commit **initial)
-{
-	int i;
-	struct util_info *util;
-	int lines_left;
-	int *blame_p;
-	int *new_lines;
-	int new_lines_len;
-
-	struct commit *commit = get_revision(rev);
-	assert(commit);
-	init_first_commit(commit, path);
-
-	util = commit->util;
-	num_blame_lines = util->num_lines;
-	blame_lines = xmalloc(sizeof(struct commit *) * num_blame_lines);
-	blame_contents = util->buf;
-	blame_len = util->size;
-
-	for (i = 0; i < num_blame_lines; i++)
-		blame_lines[i] = NULL;
-
-	lines_left = num_blame_lines;
-	blame_p = xmalloc(sizeof(int) * num_blame_lines);
-	new_lines = xmalloc(sizeof(int) * num_blame_lines);
-	do {
-		struct commit_list *parents;
-		int num_parents;
-		struct util_info *util;
-
-		if (DEBUG)
-			printf("\nProcessing commit: %d %s\n", num_commits,
-			       sha1_to_hex(commit->object.sha1));
-
-		if (lines_left == 0)
-			return;
-
-		num_commits++;
-		memset(blame_p, 0, sizeof(int) * num_blame_lines);
-		new_lines_len = 0;
-		num_parents = 0;
-		for (parents = commit->parents;
-		     parents != NULL; parents = parents->next)
-			num_parents++;
-
-		if (num_parents == 0)
-			*initial = commit;
-
-		if (fill_util_info(commit))
-			continue;
-
-		alloc_line_map(commit);
-		util = commit->util;
-
-		for (parents = commit->parents;
-		     parents != NULL; parents = parents->next) {
-			struct commit *parent = parents->item;
-			struct patch *patch;
-
-			if (parse_commit(parent) < 0)
-				die("parse_commit error");
-
-			if (DEBUG)
-				printf("parent: %s\n",
-				       sha1_to_hex(parent->object.sha1));
-
-			if (fill_util_info(parent)) {
-				num_parents--;
-				continue;
-			}
-
-			patch = get_patch(parent, commit);
-                        alloc_line_map(parent);
-                        fill_line_map(parent, commit, patch);
-
-                        for (i = 0; i < patch->num; i++) {
-                            int l;
-                            for (l = 0; l < patch->chunks[i].len2; l++) {
-                                int mapped_line =
-                                    map_line(commit, patch->chunks[i].off2 + l);
-                                if (mapped_line != -1) {
-                                    blame_p[mapped_line]++;
-                                    if (blame_p[mapped_line] == num_parents)
-                                        new_lines[new_lines_len++] = mapped_line;
-                                }
-                            }
-			}
-                        free_patch(patch);
-		}
-
-		if (DEBUG)
-			printf("parents: %d\n", num_parents);
-
-		for (i = 0; i < new_lines_len; i++) {
-			int mapped_line = new_lines[i];
-			if (blame_lines[mapped_line] == NULL) {
-				blame_lines[mapped_line] = commit;
-				lines_left--;
-				if (DEBUG)
-					printf("blame: mapped: %d i: %d\n",
-					       mapped_line, i);
-			}
-		}
-	} while ((commit = get_revision(rev)) != NULL);
-}
-
-static int compare_tree_path(struct rev_info *revs,
-			     struct commit *c1, struct commit *c2)
-{
-	int ret;
-	const char *paths[2];
-	struct util_info *util = c2->util;
-	paths[0] = util->pathname;
-	paths[1] = NULL;
-
-	diff_tree_setup_paths(get_pathspec(revs->prefix, paths),
-			      &revs->pruning);
-	ret = rev_compare_tree(revs, c1->tree, c2->tree);
-	diff_tree_release_paths(&revs->pruning);
-	return ret;
-}
-
-static int same_tree_as_empty_path(struct rev_info *revs, struct tree *t1,
-				   const char *path)
-{
-	int ret;
-	const char *paths[2];
-	paths[0] = path;
-	paths[1] = NULL;
-
-	diff_tree_setup_paths(get_pathspec(revs->prefix, paths),
-			      &revs->pruning);
-	ret = rev_same_tree_as_empty(revs, t1);
-	diff_tree_release_paths(&revs->pruning);
-	return ret;
-}
-
-static const char *find_rename(struct commit *commit, struct commit *parent)
-{
-	struct util_info *cutil = commit->util;
-	struct diff_options diff_opts;
-	const char *paths[1];
-	int i;
-
-	if (DEBUG) {
-		printf("find_rename commit: %s ",
-		       sha1_to_hex(commit->object.sha1));
-		puts(sha1_to_hex(parent->object.sha1));
-	}
-
-	diff_setup(&diff_opts);
-	diff_opts.recursive = 1;
-	diff_opts.detect_rename = DIFF_DETECT_RENAME;
-	paths[0] = NULL;
-	diff_tree_setup_paths(paths, &diff_opts);
-	if (diff_setup_done(&diff_opts) < 0)
-		die("diff_setup_done failed");
-
-	diff_tree_sha1(commit->tree->object.sha1, parent->tree->object.sha1,
-		       "", &diff_opts);
-	diffcore_std(&diff_opts);
-
-	for (i = 0; i < diff_queued_diff.nr; i++) {
-		struct diff_filepair *p = diff_queued_diff.queue[i];
-
-		if (p->status == 'R' &&
-		    !strcmp(p->one->path, cutil->pathname)) {
-			if (DEBUG)
-				printf("rename %s -> %s\n",
-				       p->one->path, p->two->path);
-			return p->two->path;
-		}
-	}
-
-	return 0;
-}
-
-static void simplify_commit(struct rev_info *revs, struct commit *commit)
-{
-	struct commit_list **pp, *parent;
-
-	if (!commit->tree)
-		return;
-
-	if (!commit->parents) {
-		struct util_info *util = commit->util;
-		if (!same_tree_as_empty_path(revs, commit->tree,
-					     util->pathname))
-			commit->object.flags |= TREECHANGE;
-		return;
-	}
-
-	pp = &commit->parents;
-	while ((parent = *pp) != NULL) {
-		struct commit *p = parent->item;
-
-		if (p->object.flags & UNINTERESTING) {
-			pp = &parent->next;
-			continue;
-		}
-
-		parse_commit(p);
-		switch (compare_tree_path(revs, p, commit)) {
-		case REV_TREE_SAME:
-			parent->next = NULL;
-			commit->parents = parent;
-			get_util(p)->pathname = get_util(commit)->pathname;
-			return;
-
-		case REV_TREE_NEW:
-		{
-			struct util_info *util = commit->util;
-			if (revs->remove_empty_trees &&
-			    same_tree_as_empty_path(revs, p->tree,
-						    util->pathname)) {
-				const char *new_name = find_rename(commit, p);
-				if (new_name) {
-					struct util_info *putil = get_util(p);
-					if (!putil->pathname)
-						putil->pathname = xstrdup(new_name);
-				}
-				else {
-					*pp = parent->next;
-					continue;
-				}
-			}
-		}
-
-		/* fallthrough */
-		case REV_TREE_DIFFERENT:
-			pp = &parent->next;
-			if (!get_util(p)->pathname)
-				get_util(p)->pathname =
-					get_util(commit)->pathname;
-			continue;
-		}
-		die("bad tree compare for commit %s",
-		    sha1_to_hex(commit->object.sha1));
-	}
-	commit->object.flags |= TREECHANGE;
-}
-
-struct commit_info
-{
-	char *author;
-	char *author_mail;
-	unsigned long author_time;
-	char *author_tz;
-
-	/* filled only when asked for details */
-	char *committer;
-	char *committer_mail;
-	unsigned long committer_time;
-	char *committer_tz;
-
-	char *summary;
-};
-
-static void get_ac_line(const char *inbuf, const char *what,
-			int bufsz, char *person, char **mail,
-			unsigned long *time, char **tz)
-{
-	int len;
-	char *tmp, *endp;
-
-	tmp = strstr(inbuf, what);
-	if (!tmp)
-		goto error_out;
-	tmp += strlen(what);
-	endp = strchr(tmp, '\n');
-	if (!endp)
-		len = strlen(tmp);
-	else
-		len = endp - tmp;
-	if (bufsz <= len) {
-	error_out:
-		/* Ugh */
-		person = *mail = *tz = "(unknown)";
-		*time = 0;
-		return;
-	}
-	memcpy(person, tmp, len);
-
-	tmp = person;
-	tmp += len;
-	*tmp = 0;
-	while (*tmp != ' ')
-		tmp--;
-	*tz = tmp+1;
-
-	*tmp = 0;
-	while (*tmp != ' ')
-		tmp--;
-	*time = strtoul(tmp, NULL, 10);
-
-	*tmp = 0;
-	while (*tmp != ' ')
-		tmp--;
-	*mail = tmp + 1;
-	*tmp = 0;
-}
-
-static void get_commit_info(struct commit *commit, struct commit_info *ret, int detailed)
-{
-	int len;
-	char *tmp, *endp;
-	static char author_buf[1024];
-	static char committer_buf[1024];
-	static char summary_buf[1024];
-
-	ret->author = author_buf;
-	get_ac_line(commit->buffer, "\nauthor ",
-		    sizeof(author_buf), author_buf, &ret->author_mail,
-		    &ret->author_time, &ret->author_tz);
-
-	if (!detailed)
-		return;
-
-	ret->committer = committer_buf;
-	get_ac_line(commit->buffer, "\ncommitter ",
-		    sizeof(committer_buf), committer_buf, &ret->committer_mail,
-		    &ret->committer_time, &ret->committer_tz);
-
-	ret->summary = summary_buf;
-	tmp = strstr(commit->buffer, "\n\n");
-	if (!tmp) {
-	error_out:
-		sprintf(summary_buf, "(%s)", sha1_to_hex(commit->object.sha1));
-		return;
-	}
-	tmp += 2;
-	endp = strchr(tmp, '\n');
-	if (!endp)
-		goto error_out;
-	len = endp - tmp;
-	if (len >= sizeof(summary_buf))
-		goto error_out;
-	memcpy(summary_buf, tmp, len);
-	summary_buf[len] = 0;
-}
-
-static const char *format_time(unsigned long time, const char *tz_str,
-			       int show_raw_time)
-{
-	static char time_buf[128];
-	time_t t = time;
-	int minutes, tz;
-	struct tm *tm;
-
-	if (show_raw_time) {
-		sprintf(time_buf, "%lu %s", time, tz_str);
-		return time_buf;
-	}
-
-	tz = atoi(tz_str);
-	minutes = tz < 0 ? -tz : tz;
-	minutes = (minutes / 100)*60 + (minutes % 100);
-	minutes = tz < 0 ? -minutes : minutes;
-	t = time + minutes * 60;
-	tm = gmtime(&t);
-
-	strftime(time_buf, sizeof(time_buf), "%Y-%m-%d %H:%M:%S ", tm);
-	strcat(time_buf, tz_str);
-	return time_buf;
-}
-
-static void topo_setter(struct commit *c, void *data)
-{
-	struct util_info *util = c->util;
-	util->topo_data = data;
-}
-
-static void *topo_getter(struct commit *c)
-{
-	struct util_info *util = c->util;
-	return util->topo_data;
-}
-
-static int read_ancestry(const char *graft_file,
-			 unsigned char **start_sha1)
-{
-	FILE *fp = fopen(graft_file, "r");
-	char buf[1024];
-	if (!fp)
-		return -1;
-	while (fgets(buf, sizeof(buf), fp)) {
-		/* The format is just "Commit Parent1 Parent2 ...\n" */
-		int len = strlen(buf);
-		struct commit_graft *graft = read_graft_line(buf, len);
-		register_commit_graft(graft, 0);
-		if (!*start_sha1)
-			*start_sha1 = graft->sha1;
-	}
-	fclose(fp);
-	return 0;
-}
-
-static int lineno_width(int lines)
-{
-	int i, width;
-
-	for (width = 1, i = 10; i <= lines + 1; width++)
-		i *= 10;
-	return width;
-}
-
-static int find_orig_linenum(struct util_info *u, int lineno)
-{
-	int i;
-
-	for (i = 0; i < u->num_lines; i++)
-		if (lineno == u->line_map[i])
-			return i + 1;
-	return 0;
-}
-
-static void emit_meta(struct commit *c, int lno,
-		      int sha1_len, int compatibility, int porcelain,
-		      int show_name, int show_number, int show_raw_time,
-		      int longest_file, int longest_author,
-		      int max_digits, int max_orig_digits)
-{
-	struct util_info *u;
-	int lineno;
-	struct commit_info ci;
-
-	u = c->util;
-	lineno = find_orig_linenum(u, lno);
-
-	if (porcelain) {
-		int group_size = -1;
-		struct commit *cc = (lno == 0) ? NULL : blame_lines[lno-1];
-		if (cc != c) {
-			/* This is the beginning of this group */
-			int i;
-			for (i = lno + 1; i < num_blame_lines; i++)
-				if (blame_lines[i] != c)
-					break;
-			group_size = i - lno;
-		}
-		if (0 < group_size)
-			printf("%s %d %d %d\n", sha1_to_hex(c->object.sha1),
-			       lineno, lno + 1, group_size);
-		else
-			printf("%s %d %d\n", sha1_to_hex(c->object.sha1),
-			       lineno, lno + 1);
-		if (!u->meta_given) {
-			get_commit_info(c, &ci, 1);
-			printf("author %s\n", ci.author);
-			printf("author-mail %s\n", ci.author_mail);
-			printf("author-time %lu\n", ci.author_time);
-			printf("author-tz %s\n", ci.author_tz);
-			printf("committer %s\n", ci.committer);
-			printf("committer-mail %s\n", ci.committer_mail);
-			printf("committer-time %lu\n", ci.committer_time);
-			printf("committer-tz %s\n", ci.committer_tz);
-			printf("filename ");
-			if (quote_c_style(u->pathname, NULL, NULL, 0))
-				quote_c_style(u->pathname, NULL, stdout, 0);
-			else
-				fputs(u->pathname, stdout);
-			printf("\nsummary %s\n", ci.summary);
-
-			u->meta_given = 1;
-		}
-		putchar('\t');
-		return;
-	}
-
-	get_commit_info(c, &ci, 0);
-	fwrite(sha1_to_hex(c->object.sha1), sha1_len, 1, stdout);
-	if (compatibility) {
-		printf("\t(%10s\t%10s\t%d)", ci.author,
-		       format_time(ci.author_time, ci.author_tz,
-				   show_raw_time),
-		       lno + 1);
-	}
-	else {
-		if (show_name)
-			printf(" %-*.*s", longest_file, longest_file,
-			       u->pathname);
-		if (show_number)
-			printf(" %*d", max_orig_digits,
-			       lineno);
-		printf(" (%-*.*s %10s %*d) ",
-		       longest_author, longest_author, ci.author,
-		       format_time(ci.author_time, ci.author_tz,
-				   show_raw_time),
-		       max_digits, lno + 1);
-	}
-}
-
-int main(int argc, const char **argv)
-{
-	int i;
-	struct commit *initial = NULL;
-	unsigned char sha1[20], *sha1_p = NULL;
-
-	const char *filename = NULL, *commit = NULL;
-	char filename_buf[256];
-	int sha1_len = 8;
-	int compatibility = 0;
-	int show_raw_time = 0;
-	int options = 1;
-	struct commit *start_commit;
-
-	const char *args[10];
-	struct rev_info rev;
-
-	struct commit_info ci;
-	const char *buf;
-	int max_digits, max_orig_digits;
-	int longest_file, longest_author, longest_file_lines;
-	int show_name = 0;
-	int show_number = 0;
-	int porcelain = 0;
-
-	const char *prefix = setup_git_directory();
-	git_config(git_default_config);
-
-	for (i = 1; i < argc; i++) {
-		if (options) {
-			if (!strcmp(argv[i], "-h") ||
-			   !strcmp(argv[i], "--help"))
-				usage(blame_usage);
-			if (!strcmp(argv[i], "-l") ||
-			    !strcmp(argv[i], "--long")) {
-				sha1_len = 40;
-				continue;
-			}
-			if (!strcmp(argv[i], "-c") ||
-			    !strcmp(argv[i], "--compatibility")) {
-				compatibility = 1;
-				continue;
-			}
-			if (!strcmp(argv[i], "-t") ||
-			    !strcmp(argv[i], "--time")) {
-				show_raw_time = 1;
-				continue;
-			}
-			if (!strcmp(argv[i], "-S")) {
-				if (i + 1 < argc &&
-				    !read_ancestry(argv[i + 1], &sha1_p)) {
-					compatibility = 1;
-					i++;
-					continue;
-				}
-				usage(blame_usage);
-			}
-			if (!strcmp(argv[i], "-f") ||
-			    !strcmp(argv[i], "--show-name")) {
-				show_name = 1;
-				continue;
-			}
-			if (!strcmp(argv[i], "-n") ||
-			    !strcmp(argv[i], "--show-number")) {
-				show_number = 1;
-				continue;
-			}
-			if (!strcmp(argv[i], "-p") ||
-			    !strcmp(argv[i], "--porcelain")) {
-				porcelain = 1;
-				sha1_len = 40;
-				show_raw_time = 1;
-				continue;
-			}
-			if (!strcmp(argv[i], "--")) {
-				options = 0;
-				continue;
-			}
-			if (argv[i][0] == '-')
-				usage(blame_usage);
-			options = 0;
-		}
-
-		if (!options) {
-			if (!filename)
-				filename = argv[i];
-			else if (!commit)
-				commit = argv[i];
-			else
-				usage(blame_usage);
-		}
-	}
-
-	if (!filename)
-		usage(blame_usage);
-	if (commit && sha1_p)
-		usage(blame_usage);
-	else if (!commit)
-		commit = "HEAD";
-
-	if (prefix)
-		sprintf(filename_buf, "%s%s", prefix, filename);
-	else
-		strcpy(filename_buf, filename);
-	filename = filename_buf;
-
-	if (!sha1_p) {
-		if (get_sha1(commit, sha1))
-			die("get_sha1 failed, commit '%s' not found", commit);
-		sha1_p = sha1;
-	}
-	start_commit = lookup_commit_reference(sha1_p);
-	get_util(start_commit)->pathname = filename;
-	if (fill_util_info(start_commit)) {
-		printf("%s not found in %s\n", filename, commit);
-		return 1;
-	}
-
-	init_revisions(&rev, setup_git_directory());
-	rev.remove_empty_trees = 1;
-	rev.topo_order = 1;
-	rev.prune_fn = simplify_commit;
-	rev.topo_setter = topo_setter;
-	rev.topo_getter = topo_getter;
-	rev.parents = 1;
-	rev.limited = 1;
-
-	commit_list_insert(start_commit, &rev.commits);
-
-	args[0] = filename;
-	args[1] = NULL;
-	diff_tree_setup_paths(args, &rev.pruning);
-	prepare_revision_walk(&rev);
-	process_commits(&rev, filename, &initial);
-
-	for (i = 0; i < num_blame_lines; i++)
-		if (!blame_lines[i])
-			blame_lines[i] = initial;
-
-	buf = blame_contents;
-	max_digits = lineno_width(num_blame_lines);
-
-	longest_file = 0;
-	longest_author = 0;
-	longest_file_lines = 0;
-	for (i = 0; i < num_blame_lines; i++) {
-		struct commit *c = blame_lines[i];
-		struct util_info *u;
-		u = c->util;
-
-		if (!show_name && strcmp(filename, u->pathname))
-			show_name = 1;
-		if (longest_file < strlen(u->pathname))
-			longest_file = strlen(u->pathname);
-		if (longest_file_lines < u->num_lines)
-			longest_file_lines = u->num_lines;
-		get_commit_info(c, &ci, 0);
-		if (longest_author < strlen(ci.author))
-			longest_author = strlen(ci.author);
-	}
-
-	max_orig_digits = lineno_width(longest_file_lines);
-
-	for (i = 0; i < num_blame_lines; i++) {
-		emit_meta(blame_lines[i], i,
-			  sha1_len, compatibility, porcelain,
-			  show_name, show_number, show_raw_time,
-			  longest_file, longest_author,
-			  max_digits, max_orig_digits);
-
-		if (i == num_blame_lines - 1) {
-			fwrite(buf, blame_len - (buf - blame_contents),
-			       1, stdout);
-			if (blame_contents[blame_len-1] != '\n')
-				putc('\n', stdout);
-		}
-		else {
-			char *next_buf = strchr(buf, '\n') + 1;
-			fwrite(buf, next_buf - buf, 1, stdout);
-			buf = next_buf;
-		}
-	}
-
-	if (DEBUG) {
-		printf("num read blob: %d\n", num_read_blob);
-		printf("num get patch: %d\n", num_get_patch);
-		printf("num commits: %d\n", num_commits);
-		printf("patch time: %f\n", patch_time / 1000000.0);
-		printf("initial: %s\n", sha1_to_hex(initial->object.sha1));
-	}
-
-	return 0;
-}

File builtin-blame.c

+/*
+ * Pickaxe
+ *
+ * Copyright (c) 2006, Junio C Hamano
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "blob.h"
+#include "commit.h"
+#include "tag.h"
+#include "tree-walk.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "revision.h"
+#include "xdiff-interface.h"
+
+#include <time.h>
+#include <sys/time.h>
+#include <regex.h>
+
+static char blame_usage[] =
+"git-blame [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [-M] [-C] [-C] [commit] [--] file\n"
+"  -c, --compatibility Use the same output mode as git-annotate (Default: off)\n"
+"  -l, --long          Show long commit SHA1 (Default: off)\n"
+"  -t, --time          Show raw timestamp (Default: off)\n"
+"  -f, --show-name     Show original filename (Default: auto)\n"
+"  -n, --show-number   Show original linenumber (Default: off)\n"
+"  -p, --porcelain     Show in a format designed for machine consumption\n"
+"  -L n,m              Process only line range n,m, counting from 1\n"
+"  -M, -C              Find line movements within and across files\n"
+"  -S revs-file        Use revisions from revs-file instead of calling git-rev-list\n";
+
+static int longest_file;
+static int longest_author;
+static int max_orig_digits;
+static int max_digits;
+static int max_score_digits;
+
+#ifndef DEBUG
+#define DEBUG 0
+#endif
+
+/* stats */
+static int num_read_blob;
+static int num_get_patch;
+static int num_commits;
+
+#define PICKAXE_BLAME_MOVE		01
+#define PICKAXE_BLAME_COPY		02
+#define PICKAXE_BLAME_COPY_HARDER	04
+
+/*
+ * blame for a blame_entry with score lower than these thresholds
+ * is not passed to the parent using move/copy logic.
+ */
+static unsigned blame_move_score;
+static unsigned blame_copy_score;
+#define BLAME_DEFAULT_MOVE_SCORE	20
+#define BLAME_DEFAULT_COPY_SCORE	40
+
+/* bits #0..7 in revision.h, #8..11 used for merge_bases() in commit.c */
+#define METAINFO_SHOWN		(1u<<12)
+#define MORE_THAN_ONE_PATH	(1u<<13)
+
+/*
+ * One blob in a commit that is being suspected
+ */
+struct origin {
+	int refcnt;
+	struct commit *commit;
+	mmfile_t file;
+	unsigned char blob_sha1[20];
+	char path[FLEX_ARRAY];
+};
+
+static char *fill_origin_blob(struct origin *o, mmfile_t *file)
+{
+	if (!o->file.ptr) {
+		char type[10];
+		num_read_blob++;
+		file->ptr = read_sha1_file(o->blob_sha1, type,
+					   (unsigned long *)(&(file->size)));
+		o->file = *file;
+	}
+	else
+		*file = o->file;
+	return file->ptr;
+}
+
+static inline struct origin *origin_incref(struct origin *o)
+{
+	if (o)
+		o->refcnt++;
+	return o;
+}
+
+static void origin_decref(struct origin *o)
+{
+	if (o && --o->refcnt <= 0) {
+		if (o->file.ptr)
+			free(o->file.ptr);
+		memset(o, 0, sizeof(*o));
+		free(o);
+	}
+}
+
+struct blame_entry {
+	struct blame_entry *prev;
+	struct blame_entry *next;
+
+	/* the first line of this group in the final image;
+	 * internally all line numbers are 0 based.
+	 */
+	int lno;
+
+	/* how many lines this group has */
+	int num_lines;
+
+	/* the commit that introduced this group into the final image */
+	struct origin *suspect;
+
+	/* true if the suspect is truly guilty; false while we have not
+	 * checked if the group came from one of its parents.
+	 */
+	char guilty;
+
+	/* the line number of the first line of this group in the
+	 * suspect's file; internally all line numbers are 0 based.
+	 */
+	int s_lno;
+
+	/* how significant this entry is -- cached to avoid
+	 * scanning the lines over and over
+	 */
+	unsigned score;
+};
+
+struct scoreboard {
+	/* the final commit (i.e. where we started digging from) */
+	struct commit *final;
+
+	const char *path;
+
+	/* the contents in the final; pointed into by buf pointers of
+	 * blame_entries
+	 */
+	const char *final_buf;
+	unsigned long final_buf_size;
+
+	/* linked list of blames */
+	struct blame_entry *ent;
+
+	/* look-up a line in the final buffer */
+	int num_lines;
+	int *lineno;
+};
+
+static int cmp_suspect(struct origin *a, struct origin *b)
+{
+	int cmp = hashcmp(a->commit->object.sha1, b->commit->object.sha1);
+	if (cmp)
+		return cmp;
+	return strcmp(a->path, b->path);
+}
+
+#define cmp_suspect(a, b) ( ((a)==(b)) ? 0 : cmp_suspect(a,b) )
+
+static void sanity_check_refcnt(struct scoreboard *);
+
+static void coalesce(struct scoreboard *sb)
+{
+	struct blame_entry *ent, *next;
+
+	for (ent = sb->ent; ent && (next = ent->next); ent = next) {
+		if (!cmp_suspect(ent->suspect, next->suspect) &&
+		    ent->guilty == next->guilty &&
+		    ent->s_lno + ent->num_lines == next->s_lno) {
+			ent->num_lines += next->num_lines;
+			ent->next = next->next;
+			if (ent->next)
+				ent->next->prev = ent;
+			origin_decref(next->suspect);
+			free(next);
+			ent->score = 0;
+			next = ent; /* again */
+		}
+	}
+
+	if (DEBUG) /* sanity */
+		sanity_check_refcnt(sb);
+}
+
+static struct origin *make_origin(struct commit *commit, const char *path)
+{
+	struct origin *o;
+	o = xcalloc(1, sizeof(*o) + strlen(path) + 1);
+	o->commit = commit;
+	o->refcnt = 1;
+	strcpy(o->path, path);
+	return o;
+}
+
+static struct origin *get_origin(struct scoreboard *sb,
+				 struct commit *commit,
+				 const char *path)
+{
+	struct blame_entry *e;
+
+	for (e = sb->ent; e; e = e->next) {
+		if (e->suspect->commit == commit &&
+		    !strcmp(e->suspect->path, path))
+			return origin_incref(e->suspect);
+	}
+	return make_origin(commit, path);
+}
+
+static int fill_blob_sha1(struct origin *origin)
+{
+	unsigned mode;
+	char type[10];
+
+	if (!is_null_sha1(origin->blob_sha1))
+		return 0;
+	if (get_tree_entry(origin->commit->object.sha1,
+			   origin->path,
+			   origin->blob_sha1, &mode))
+		goto error_out;
+	if (sha1_object_info(origin->blob_sha1, type, NULL) ||
+	    strcmp(type, blob_type))
+		goto error_out;
+	return 0;
+ error_out:
+	hashclr(origin->blob_sha1);
+	return -1;
+}
+
+static struct origin *find_origin(struct scoreboard *sb,
+				  struct commit *parent,
+				  struct origin *origin)
+{
+	struct origin *porigin = NULL;
+	struct diff_options diff_opts;
+	const char *paths[2];
+
+	if (parent->util) {
+		/* This is a freestanding copy of origin and not
+		 * refcounted.
+		 */
+		struct origin *cached = parent->util;
+		if (!strcmp(cached->path, origin->path)) {
+			porigin = get_origin(sb, parent, cached->path);
+			if (porigin->refcnt == 1)
+				hashcpy(porigin->blob_sha1, cached->blob_sha1);
+			return porigin;
+		}
+		/* otherwise it was not very useful; free it */
+		free(parent->util);
+		parent->util = NULL;
+	}
+
+	/* See if the origin->path is different between parent
+	 * and origin first.  Most of the time they are the
+	 * same and diff-tree is fairly efficient about this.
+	 */
+	diff_setup(&diff_opts);
+	diff_opts.recursive = 1;
+	diff_opts.detect_rename = 0;
+	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	paths[0] = origin->path;
+	paths[1] = NULL;
+
+	diff_tree_setup_paths(paths, &diff_opts);
+	if (diff_setup_done(&diff_opts) < 0)
+		die("diff-setup");
+	diff_tree_sha1(parent->tree->object.sha1,
+		       origin->commit->tree->object.sha1,
+		       "", &diff_opts);
+	diffcore_std(&diff_opts);
+
+	/* It is either one entry that says "modified", or "created",
+	 * or nothing.
+	 */
+	if (!diff_queued_diff.nr) {
+		/* The path is the same as parent */
+		porigin = get_origin(sb, parent, origin->path);
+		hashcpy(porigin->blob_sha1, origin->blob_sha1);
+	}
+	else if (diff_queued_diff.nr != 1)
+		die("internal error in blame::find_origin");
+	else {
+		struct diff_filepair *p = diff_queued_diff.queue[0];
+		switch (p->status) {
+		default:
+			die("internal error in blame::find_origin (%c)",
+			    p->status);
+		case 'M':
+			porigin = get_origin(sb, parent, origin->path);
+			hashcpy(porigin->blob_sha1, p->one->sha1);
+			break;
+		case 'A':
+		case 'T':
+			/* Did not exist in parent, or type changed */
+			break;
+		}
+	}
+	diff_flush(&diff_opts);
+	if (porigin) {
+		struct origin *cached;
+		cached = make_origin(porigin->commit, porigin->path);
+		hashcpy(cached->blob_sha1, porigin->blob_sha1);
+		parent->util = cached;
+	}
+	return porigin;
+}
+
+static struct origin *find_rename(struct scoreboard *sb,
+				  struct commit *parent,
+				  struct origin *origin)
+{
+	struct origin *porigin = NULL;
+	struct diff_options diff_opts;
+	int i;
+	const char *paths[2];
+
+	diff_setup(&diff_opts);
+	diff_opts.recursive = 1;
+	diff_opts.detect_rename = DIFF_DETECT_RENAME;
+	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_opts.single_follow = origin->path;
+	paths[0] = NULL;
+	diff_tree_setup_paths(paths, &diff_opts);
+	if (diff_setup_done(&diff_opts) < 0)
+		die("diff-setup");
+	diff_tree_sha1(parent->tree->object.sha1,
+		       origin->commit->tree->object.sha1,
+		       "", &diff_opts);
+	diffcore_std(&diff_opts);
+
+	for (i = 0; i < diff_queued_diff.nr; i++) {
+		struct diff_filepair *p = diff_queued_diff.queue[i];
+		if ((p->status == 'R' || p->status == 'C') &&
+		    !strcmp(p->two->path, origin->path)) {
+			porigin = get_origin(sb, parent, p->one->path);
+			hashcpy(porigin->blob_sha1, p->one->sha1);
+			break;
+		}
+	}
+	diff_flush(&diff_opts);
+	return porigin;
+}
+
+struct chunk {
+	/* line number in postimage; up to but not including this
+	 * line is the same as preimage
+	 */
+	int same;
+
+	/* preimage line number after this chunk */
+	int p_next;
+
+	/* postimage line number after this chunk */
+	int t_next;
+};
+
+struct patch {
+	struct chunk *chunks;
+	int num;
+};
+
+struct blame_diff_state {
+	struct xdiff_emit_state xm;
+	struct patch *ret;
+	unsigned hunk_post_context;
+	unsigned hunk_in_pre_context : 1;
+};
+
+static void process_u_diff(void *state_, char *line, unsigned long len)
+{
+	struct blame_diff_state *state = state_;
+	struct chunk *chunk;
+	int off1, off2, len1, len2, num;
+
+	num = state->ret->num;
+	if (len < 4 || line[0] != '@' || line[1] != '@') {
+		if (state->hunk_in_pre_context && line[0] == ' ')
+			state->ret->chunks[num - 1].same++;
+		else {
+			state->hunk_in_pre_context = 0;
+			if (line[0] == ' ')
+				state->hunk_post_context++;
+			else
+				state->hunk_post_context = 0;
+		}
+		return;
+	}
+
+	if (num && state->hunk_post_context) {
+		chunk = &state->ret->chunks[num - 1];
+		chunk->p_next -= state->hunk_post_context;
+		chunk->t_next -= state->hunk_post_context;
+	}
+	state->ret->num = ++num;
+	state->ret->chunks = xrealloc(state->ret->chunks,
+				      sizeof(struct chunk) * num);
+	chunk = &state->ret->chunks[num - 1];
+	if (parse_hunk_header(line, len, &off1, &len1, &off2, &len2)) {
+		state->ret->num--;
+		return;
+	}
+
+	/* Line numbers in patch output are one based. */
+	off1--;
+	off2--;
+
+	chunk->same = len2 ? off2 : (off2 + 1);
+
+	chunk->p_next = off1 + (len1 ? len1 : 1);
+	chunk->t_next = chunk->same + len2;
+	state->hunk_in_pre_context = 1;
+	state->hunk_post_context = 0;
+}
+
+static struct patch *compare_buffer(mmfile_t *file_p, mmfile_t *file_o,
+				    int context)
+{
+	struct blame_diff_state state;
+	xpparam_t xpp;
+	xdemitconf_t xecfg;
+	xdemitcb_t ecb;
+
+	xpp.flags = XDF_NEED_MINIMAL;
+	xecfg.ctxlen = context;
+	xecfg.flags = 0;
+	ecb.outf = xdiff_outf;
+	ecb.priv = &state;
+	memset(&state, 0, sizeof(state));
+	state.xm.consume = process_u_diff;
+	state.ret = xmalloc(sizeof(struct patch));
+	state.ret->chunks = NULL;
+	state.ret->num = 0;
+
+	xdl_diff(file_p, file_o, &xpp, &xecfg, &ecb);
+
+	if (state.ret->num) {
+		struct chunk *chunk;
+		chunk = &state.ret->chunks[state.ret->num - 1];
+		chunk->p_next -= state.hunk_post_context;
+		chunk->t_next -= state.hunk_post_context;
+	}
+	return state.ret;
+}
+
+static struct patch *get_patch(struct origin *parent, struct origin *origin)
+{
+	mmfile_t file_p, file_o;
+	struct patch *patch;
+
+	fill_origin_blob(parent, &file_p);
+	fill_origin_blob(origin, &file_o);
+	if (!file_p.ptr || !file_o.ptr)
+		return NULL;
+	patch = compare_buffer(&file_p, &file_o, 0);
+	num_get_patch++;
+	return patch;
+}
+
+static void free_patch(struct patch *p)
+{
+	free(p->chunks);
+	free(p);
+}
+
+static void add_blame_entry(struct scoreboard *sb, struct blame_entry *e)
+{
+	struct blame_entry *ent, *prev = NULL;
+
+	origin_incref(e->suspect);
+
+	for (ent = sb->ent; ent && ent->lno < e->lno; ent = ent->next)
+		prev = ent;
+
+	/* prev, if not NULL, is the last one that is below e */
+	e->prev = prev;
+	if (prev) {
+		e->next = prev->next;
+		prev->next = e;
+	}
+	else {
+		e->next = sb->ent;
+		sb->ent = e;
+	}
+	if (e->next)
+		e->next->prev = e;
+}
+
+static void dup_entry(struct blame_entry *dst, struct blame_entry *src)
+{
+	struct blame_entry *p, *n;
+
+	p = dst->prev;
+	n = dst->next;
+	origin_incref(src->suspect);
+	origin_decref(dst->suspect);
+	memcpy(dst, src, sizeof(*src));
+	dst->prev = p;
+	dst->next = n;
+	dst->score = 0;
+}
+
+static const char *nth_line(struct scoreboard *sb, int lno)
+{
+	return sb->final_buf + sb->lineno[lno];
+}
+
+static void split_overlap(struct blame_entry *split,
+			  struct blame_entry *e,
+			  int tlno, int plno, int same,
+			  struct origin *parent)
+{
+	/* it is known that lines between tlno to same came from
+	 * parent, and e has an overlap with that range.  it also is
+	 * known that parent's line plno corresponds to e's line tlno.
+	 *
+	 *                <---- e ----->
+	 *                   <------>
+	 *                   <------------>
+	 *             <------------>
+	 *             <------------------>
+	 *
+	 * Potentially we need to split e into three parts; before
+	 * this chunk, the chunk to be blamed for parent, and after
+	 * that portion.
+	 */
+	int chunk_end_lno;
+	memset(split, 0, sizeof(struct blame_entry [3]));
+
+	if (e->s_lno < tlno) {
+		/* there is a pre-chunk part not blamed on parent */
+		split[0].suspect = origin_incref(e->suspect);
+		split[0].lno = e->lno;
+		split[0].s_lno = e->s_lno;
+		split[0].num_lines = tlno - e->s_lno;
+		split[1].lno = e->lno + tlno - e->s_lno;
+		split[1].s_lno = plno;
+	}
+	else {
+		split[1].lno = e->lno;
+		split[1].s_lno = plno + (e->s_lno - tlno);
+	}
+
+	if (same < e->s_lno + e->num_lines) {
+		/* there is a post-chunk part not blamed on parent */
+		split[2].suspect = origin_incref(e->suspect);
+		split[2].lno = e->lno + (same - e->s_lno);
+		split[2].s_lno = e->s_lno + (same - e->s_lno);
+		split[2].num_lines = e->s_lno + e->num_lines - same;
+		chunk_end_lno = split[2].lno;
+	}
+	else
+		chunk_end_lno = e->lno + e->num_lines;
+	split[1].num_lines = chunk_end_lno - split[1].lno;
+
+	if (split[1].num_lines < 1)
+		return;
+	split[1].suspect = origin_incref(parent);
+}
+
+static void split_blame(struct scoreboard *sb,
+			struct blame_entry *split,
+			struct blame_entry *e)
+{
+	struct blame_entry *new_entry;
+
+	if (split[0].suspect && split[2].suspect) {
+		/* we need to split e into two and add another for parent */
+		dup_entry(e, &split[0]);
+
+		new_entry = xmalloc(sizeof(*new_entry));
+		memcpy(new_entry, &(split[2]), sizeof(struct blame_entry));
+		add_blame_entry(sb, new_entry);
+
+		new_entry = xmalloc(sizeof(*new_entry));
+		memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));
+		add_blame_entry(sb, new_entry);
+	}
+	else if (!split[0].suspect && !split[2].suspect)
+		/* parent covers the entire area */
+		dup_entry(e, &split[1]);
+	else if (split[0].suspect) {
+		dup_entry(e, &split[0]);
+
+		new_entry = xmalloc(sizeof(*new_entry));
+		memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));
+		add_blame_entry(sb, new_entry);
+	}
+	else {
+		dup_entry(e, &split[1]);
+
+		new_entry = xmalloc(sizeof(*new_entry));
+		memcpy(new_entry, &(split[2]), sizeof(struct blame_entry));
+		add_blame_entry(sb, new_entry);
+	}
+
+	if (DEBUG) { /* sanity */
+		struct blame_entry *ent;
+		int lno = sb->ent->lno, corrupt = 0;
+
+		for (ent = sb->ent; ent; ent = ent->next) {
+			if (lno != ent->lno)
+				corrupt = 1;
+			if (ent->s_lno < 0)
+				corrupt = 1;
+			lno += ent->num_lines;
+		}
+		if (corrupt) {
+			lno = sb->ent->lno;
+			for (ent = sb->ent; ent; ent = ent->next) {
+				printf("L %8d l %8d n %8d\n",
+				       lno, ent->lno, ent->num_lines);
+				lno = ent->lno + ent->num_lines;
+			}
+			die("oops");
+		}
+	}
+}
+
+static void decref_split(struct blame_entry *split)
+{
+	int i;
+
+	for (i = 0; i < 3; i++)
+		origin_decref(split[i].suspect);
+}
+
+static void blame_overlap(struct scoreboard *sb, struct blame_entry *e,
+			  int tlno, int plno, int same,
+			  struct origin *parent)
+{
+	struct blame_entry split[3];
+
+	split_overlap(split, e, tlno, plno, same, parent);
+	if (split[1].suspect)
+		split_blame(sb, split, e);
+	decref_split(split);
+}
+
+static int find_last_in_target(struct scoreboard *sb, struct origin *target)
+{
+	struct blame_entry *e;
+	int last_in_target = -1;
+
+	for (e = sb->ent; e; e = e->next) {
+		if (e->guilty || cmp_suspect(e->suspect, target))
+			continue;
+		if (last_in_target < e->s_lno + e->num_lines)
+			last_in_target = e->s_lno + e->num_lines;
+	}
+	return last_in_target;
+}
+
+static void blame_chunk(struct scoreboard *sb,
+			int tlno, int plno, int same,
+			struct origin *target, struct origin *parent)
+{
+	struct blame_entry *e;
+
+	for (e = sb->ent; e; e = e->next) {
+		if (e->guilty || cmp_suspect(e->suspect, target))
+			continue;
+		if (same <= e->s_lno)
+			continue;
+		if (tlno < e->s_lno + e->num_lines)
+			blame_overlap(sb, e, tlno, plno, same, parent);
+	}
+}
+
+static int pass_blame_to_parent(struct scoreboard *sb,
+				struct origin *target,
+				struct origin *parent)
+{
+	int i, last_in_target, plno, tlno;
+	struct patch *patch;
+
+	last_in_target = find_last_in_target(sb, target);
+	if (last_in_target < 0)
+		return 1; /* nothing remains for this target */
+
+	patch = get_patch(parent, target);
+	plno = tlno = 0;
+	for (i = 0; i < patch->num; i++) {
+		struct chunk *chunk = &patch->chunks[i];
+
+		blame_chunk(sb, tlno, plno, chunk->same, target, parent);
+		plno = chunk->p_next;
+		tlno = chunk->t_next;
+	}
+	/* rest (i.e. anything above tlno) are the same as parent */
+	blame_chunk(sb, tlno, plno, last_in_target, target, parent);
+
+	free_patch(patch);
+	return 0;
+}
+
+static unsigned ent_score(struct scoreboard *sb, struct blame_entry *e)
+{
+	unsigned score;
+	const char *cp, *ep;
+
+	if (e->score)
+		return e->score;
+
+	score = 1;
+	cp = nth_line(sb, e->lno);
+	ep = nth_line(sb, e->lno + e->num_lines);
+	while (cp < ep) {
+		unsigned ch = *((unsigned char *)cp);
+		if (isalnum(ch))
+			score++;
+		cp++;
+	}
+	e->score = score;
+	return score;
+}
+
+static void copy_split_if_better(struct scoreboard *sb,
+				 struct blame_entry *best_so_far,
+				 struct blame_entry *this)
+{
+	int i;
+
+	if (!this[1].suspect)
+		return;
+	if (best_so_far[1].suspect) {
+		if (ent_score(sb, &this[1]) < ent_score(sb, &best_so_far[1]))
+			return;
+	}
+
+	for (i = 0; i < 3; i++)
+		origin_incref(this[i].suspect);
+	decref_split(best_so_far);
+	memcpy(best_so_far, this, sizeof(struct blame_entry [3]));
+}
+
+static void find_copy_in_blob(struct scoreboard *sb,
+			      struct blame_entry *ent,
+			      struct origin *parent,
+			      struct blame_entry *split,
+			      mmfile_t *file_p)
+{
+	const char *cp;
+	int cnt;
+	mmfile_t file_o;
+	struct patch *patch;
+	int i, plno, tlno;
+
+	cp = nth_line(sb, ent->lno);
+	file_o.ptr = (char*) cp;
+	cnt = ent->num_lines;
+
+	while (cnt && cp < sb->final_buf + sb->final_buf_size) {
+		if (*cp++ == '\n')
+			cnt--;
+	}
+	file_o.size = cp - file_o.ptr;
+
+	patch = compare_buffer(file_p, &file_o, 1);
+
+	memset(split, 0, sizeof(struct blame_entry [3]));
+	plno = tlno = 0;
+	for (i = 0; i < patch->num; i++) {
+		struct chunk *chunk = &patch->chunks[i];
+
+		/* tlno to chunk->same are the same as ent */
+		if (ent->num_lines <= tlno)
+			break;
+		if (tlno < chunk->same) {
+			struct blame_entry this[3];
+			split_overlap(this, ent,
+				      tlno + ent->s_lno, plno,
+				      chunk->same + ent->s_lno,
+				      parent);
+			copy_split_if_better(sb, split, this);
+			decref_split(this);
+		}
+		plno = chunk->p_next;
+		tlno = chunk->t_next;
+	}
+	free_patch(patch);
+}
+
+static int find_move_in_parent(struct scoreboard *sb,
+			       struct origin *target,
+			       struct origin *parent)
+{
+	int last_in_target, made_progress;
+	struct blame_entry *e, split[3];
+	mmfile_t file_p;
+
+	last_in_target = find_last_in_target(sb, target);
+	if (last_in_target < 0)
+		return 1; /* nothing remains for this target */
+
+	fill_origin_blob(parent, &file_p);
+	if (!file_p.ptr)
+		return 0;
+
+	made_progress = 1;
+	while (made_progress) {
+		made_progress = 0;
+		for (e = sb->ent; e; e = e->next) {
+			if (e->guilty || cmp_suspect(e->suspect, target))
+				continue;
+			find_copy_in_blob(sb, e, parent, split, &file_p);
+			if (split[1].suspect &&
+			    blame_move_score < ent_score(sb, &split[1])) {
+				split_blame(sb, split, e);
+				made_progress = 1;
+			}
+			decref_split(split);
+		}
+	}
+	return 0;
+}
+
+
+struct blame_list {
+	struct blame_entry *ent;
+	struct blame_entry split[3];
+};
+
+static struct blame_list *setup_blame_list(struct scoreboard *sb,
+					   struct origin *target,
+					   int *num_ents_p)
+{
+	struct blame_entry *e;
+	int num_ents, i;
+	struct blame_list *blame_list = NULL;
+
+	/* Count the number of entries the target is suspected for,
+	 * and prepare a list of entry and the best split.
+	 */
+	for (e = sb->ent, num_ents = 0; e; e = e->next)
+		if (!e->guilty && !cmp_suspect(e->suspect, target))
+			num_ents++;
+	if (num_ents) {
+		blame_list = xcalloc(num_ents, sizeof(struct blame_list));
+		for (e = sb->ent, i = 0; e; e = e->next)
+			if (!e->guilty && !cmp_suspect(e->suspect, target))
+				blame_list[i++].ent = e;
+	}
+	*num_ents_p = num_ents;
+	return blame_list;
+}
+
+static int find_copy_in_parent(struct scoreboard *sb,
+			       struct origin *target,