Commits

Anonymous committed 5d86501

Set up for better tree diff optimizations

This is mainly just a cleanup patch, and sets up for later changes where
the tree-diff.c "interesting()" function can return more than just a
yes/no value.

In particular, it should be quite possible to say "no subsequent entries
in this tree can possibly be interesting any more", and thus allow the
callers to short-circuit the tree entirely.

In fact, changing the callers to do so is trivial, and is really all this
patch really does, because changing "interesting()" itself to say that
nothing further is going to be interesting is definitely more complicated,
considering that we may have arbitrary pathspecs.

But in cleaning up the callers, this actually fixes a potential small
performance issue in diff_tree(): if the second tree has a lot of
uninterestign crud in it, we would keep on doing the "is it interesting?"
check on the first tree for each uninteresting entry in the second one.

The answer is obviously not going to change, so that was just not helping.
The new code is clearer and simpler and avoids this issue entirely.

I also renamed "interesting()" to "tree_entry_interesting()", because I
got frustrated by the fact that

- we actually had *another* function called "interesting()" in another
file, and I couldn't tell from the profiles which one was the one that
mattered more.

- when rewriting it to return a ternary value, you can't just do

if (interesting(...))
...

any more, but want to assign the return value to a local variable. The
name of choice for that variable would normally be "interesting", so
I just wanted to make the function name be more specific, and avoid
that whole issue (even though I then didn't choose that name for either
of the users, just to avoid confusion in the patch itself ;)

In other words, this doesn't really change anything, but I think it's a
good thing to do, and if somebody comes along and writes the logic for
"yeah, none of the pathspecs you have are interesting", we now support
that trivially.

It could easily be a meaningful optimization for things like "blame",
where there's just one pathspec, and stopping when you've seen it would
allow you to avoid about 50% of the tree traversals on average.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>

  • Participants
  • Parent commits c711a21

Comments (0)

Files changed (1)

 	return 0;
 }
 
-static int interesting(struct tree_desc *desc, const char *base, int baselen, struct diff_options *opt)
+/*
+ * Is a tree entry interesting given the pathspec we have?
+ *
+ * Return:
+ *  - positive for yes
+ *  - zero for no
+ *  - negative for "no, and no subsequent entries will be either"
+ */
+static int tree_entry_interesting(struct tree_desc *desc, const char *base, int baselen, struct diff_options *opt)
 {
 	const char *path;
 	const unsigned char *sha1;
 static void show_tree(struct diff_options *opt, const char *prefix, struct tree_desc *desc, const char *base, int baselen)
 {
 	while (desc->size) {
-		if (interesting(desc, base, baselen, opt))
+		int show = tree_entry_interesting(desc, base, baselen, opt);
+		if (show < 0)
+			break;
+		if (show)
 			show_entry(opt, prefix, desc, base, baselen);
 		update_tree_entry(desc);
 	}
 	}
 }
 
+static void skip_uninteresting(struct tree_desc *t, const char *base, int baselen, struct diff_options *opt)
+{
+	while (t->size) {
+		int show = tree_entry_interesting(t, base, baselen, opt);
+		if (!show) {
+			update_tree_entry(t);
+			continue;
+		}
+		/* Skip it all? */
+		if (show < 0)
+			t->size = 0;
+		return;
+	}
+}
+
 int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)
 {
 	int baselen = strlen(base);
 
-	while (t1->size | t2->size) {
+	for (;;) {
 		if (opt->quiet && opt->has_changes)
 			break;
-		if (opt->nr_paths && t1->size && !interesting(t1, base, baselen, opt)) {
-			update_tree_entry(t1);
-			continue;
-		}
-		if (opt->nr_paths && t2->size && !interesting(t2, base, baselen, opt)) {
-			update_tree_entry(t2);
-			continue;
+		if (opt->nr_paths) {
+			skip_uninteresting(t1, base, baselen, opt);
+			skip_uninteresting(t2, base, baselen, opt);
 		}
 		if (!t1->size) {
+			if (!t2->size)
+				break;
 			show_entry(opt, "+", t2, base, baselen);
 			update_tree_entry(t2);
 			continue;