Anonymous committed dfd05e3

filter-branch: Big syntax change; support rewriting multiple refs

We used to take the first non-option argument as the name for the new
branch. This syntax is not extensible to support rewriting more than just
HEAD.

Instead, we now have the following syntax:

git filter-branch [<filter options>...] [<rev-list options>]

All positive refs given in <rev-list options> are rewritten. Yes,
in-place. If a ref was changed, the original head is stored in
refs/original/$ref now, for your inspecting pleasure, in addition to the
reflogs (since it is easier to inspect "git show-ref | grep original" than
to inspect all the reflogs).
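
The backup naming described above can be sketched as follows. This is only an illustration, not code from the script: the helper name backup_ref_for is hypothetical, and the sketch assumes the ref has already been normalized to its full name (refs/heads/..., refs/tags/...), as the script does before storing the backup.

```shell
#!/bin/sh
# Illustration only: where the pre-rewrite value of a ref ends up.
# "backup_ref_for" is a hypothetical helper, not part of git.
orig_namespace=refs/original/

backup_ref_for () {
	# The backup keeps the full ref name and prefixes the namespace.
	printf '%s%s\n' "$orig_namespace" "$1"
}

backup_ref_for refs/heads/master
backup_ref_for refs/heads/topic
```

After a real rewrite, the refs stored this way are the ones that "git show-ref | grep original" lists.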

This commit also adds the --force option to remove .git-rewrite/ and all
refs from refs/original/ before filtering.
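
A minimal sketch of the refs/original/ guard, modelled on the case statement this commit adds to git-filter-branch.sh. To stay runnable without a repository, it reads "sha1 type name" lines from stdin and only reports what would happen instead of calling git; the function name check_backup_refs and the sample ref lines are assumptions made for the illustration.

```shell
#!/bin/sh
# Sketch of the refs/original/ check; reports actions instead of running git.
# "check_backup_refs" and the sample input below are hypothetical.
orig_namespace=refs/original/

check_backup_refs () {
	force=$1	# "t" when -f/--force was given, empty otherwise
	while read -r sha1 type name
	do
		case "$force,$name" in
		,"$orig_namespace"*)
			# Not forced and a backup ref exists: refuse to start.
			echo "refuse: $name exists"
			return 1
			;;
		t,"$orig_namespace"*)
			# Forced: the real script deletes the stale backup ref here.
			echo "would delete: $name"
			;;
		esac
	done
}

refs='1111 commit refs/heads/master
2222 commit refs/original/refs/heads/master'

printf '%s\n' "$refs" | check_backup_refs "" || echo "not forced: stopped"
printf '%s\n' "$refs" | check_backup_refs t
```

Refs outside the namespace match neither pattern, so they pass through untouched in both modes.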

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Files changed (3)

Documentation/git-filter-branch.txt

 	[--index-filter <command>] [--parent-filter <command>]
 	[--msg-filter <command>] [--commit-filter <command>]
 	[--tag-name-filter <command>] [--subdirectory-filter <directory>]
-	[-d <directory>] <new-branch-name> [<rev-list options>...]
+	[-d <directory>] [-f | --force] [<rev-list options>...]
 
 DESCRIPTION
 -----------
 The command takes the new branch name as a mandatory argument and
 the filters as optional arguments.  If you specify no filters, the
 commits will be recommitted without any changes, which would normally
-have no effect and result in the new branch pointing to the same
-branch as your current branch.  Nevertheless, this may be useful in
-the future for compensating for some git bugs or such, therefore
-such a usage is permitted.
+have no effect.  Nevertheless, this may be useful in the future for
+compensating for some git bugs or such, therefore such a usage is
+permitted.
 
 *WARNING*! The rewritten history will have different object names for all
 the objects and will not converge with the original branch.  You will not
 full implications, and avoid using it anyway, if a simple single commit
 would suffice to fix your problem.
 
-Always verify that the rewritten version is correct before disposing
-the original branch.
+Always verify that the rewritten version is correct: The original refs,
+if different from the rewritten ones, will be stored in the namespace
+'refs/original/'.
 
 Note that since this operation is extensively I/O expensive, it might
 be a good idea to redirect the temporary directory off-disk, e.g. on
 	does this in the '.git-rewrite/' directory but you can override
 	that choice by this parameter.
 
+-f\|--force::
+	`git filter-branch` refuses to start with an existing temporary
+	directory or when there are already refs starting with
+	'refs/original/', unless forced.
+
 <rev-list-options>::
 	When options are given after the new branch name, they will
 	be passed to gitlink:git-rev-list[1].  Only commits in the resulting
 or copyright violation) from all commits:
 
 -------------------------------------------------------
-git filter-branch --tree-filter 'rm filename' newbranch
+git filter-branch --tree-filter 'rm filename' HEAD
 -------------------------------------------------------
 
 A significantly faster version:
 
--------------------------------------------------------------------------------
-git filter-branch --index-filter 'git update-index --remove filename' newbranch
--------------------------------------------------------------------------------
+--------------------------------------------------------------------------
+git filter-branch --index-filter 'git update-index --remove filename' HEAD
+--------------------------------------------------------------------------
 
 Now, you will get the rewritten history saved in the branch 'newbranch'
 (your current branch is left untouched).
 history) to be the parent of the current initial commit, in
 order to paste the other history behind the current history:
 
-------------------------------------------------------------------------
-git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' newbranch
-------------------------------------------------------------------------
+-------------------------------------------------------------------
+git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD
+-------------------------------------------------------------------
 
 (if the parent string is empty - therefore we are dealing with the
 initial commit - add graftcommit as a parent).  Note that this assumes
 history with a single root (that is, no merge without common ancestors
 happened).  If this is not the case, use:
 
--------------------------------------------------------------------------------
+--------------------------------------------------------------------------
 git filter-branch --parent-filter \
-	'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' newbranch
--------------------------------------------------------------------------------
+	'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' HEAD
+--------------------------------------------------------------------------
 
 or even simpler:
 
 -----------------------------------------------
 echo "$commit-id $graft-id" >> .git/info/grafts
-git filter-branch newbranch $graft-id..
+git filter-branch $graft-id..HEAD
 -----------------------------------------------
 
 To remove commits authored by "Darl McBribe" from the history:
 		done;
 	else
 		git commit-tree "$@";
-	fi' newbranch
+	fi' HEAD
 ------------------------------------------------------------------------------
 
 The shift magic first throws away the tree id and then the -p
 To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:
 
 --------------------------------
-git filter-branch ... new-H C..H
+git filter-branch ... C..H
 --------------------------------
 
 To rewrite commits E,F,G,H, use one of these:
 
 ----------------------------------------
-git filter-branch ... new-H C..H --not D
-git filter-branch ... new-H D..H --not C
+git filter-branch ... C..H --not D
+git filter-branch ... D..H --not C
 ----------------------------------------
 
 To move the whole tree into a subdirectory, or remove it from there:
 	'git ls-files -s | sed "s-\t-&newsubdir/-" |
 		GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
 			git update-index --index-info &&
-	 mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' directorymoved
+	 mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
 ---------------------------------------------------------------
 
 

git-filter-branch.sh

 filter_commit='git commit-tree "$@"'
 filter_tag_name=
 filter_subdir=
+orig_namespace=refs/original/
+force=
 while case "$#" in 0) usage;; esac
 do
 	case "$1" in
 		shift
 		break
 		;;
+	--force|-f)
+		shift
+		force=t
+		continue
+		;;
 	-*)
 		;;
 	*)
 	--subdirectory-filter)
 		filter_subdir="$OPTARG"
 		;;
+	--original)
+		orig_namespace="$OPTARG"
+		;;
 	*)
 		usage
 		;;
 	esac
 done
 
-dstbranch="$1"
-shift
-test -n "$dstbranch" || die "missing branch name"
-git show-ref "refs/heads/$dstbranch" 2> /dev/null &&
-	die "branch $dstbranch already exists"
-
-test ! -e "$tempdir" || die "$tempdir already exists, please remove it"
+case "$force" in
+t)
+	rm -rf "$tempdir"
+;;
+'')
+	test -d "$tempdir" &&
+		die "$tempdir already exists, please remove it"
+esac
 mkdir -p "$tempdir/t" &&
+tempdir="$(cd "$tempdir"; pwd)" &&
 cd "$tempdir/t" &&
 workdir="$(pwd)" ||
 die ""
 
+# Make sure refs/original is empty
+git for-each-ref > "$tempdir"/backup-refs
+while read sha1 type name
+do
+	case "$force,$name" in
+	,$orig_namespace*)
+		die "Namespace $orig_namespace not empty"
+	;;
+	t,$orig_namespace*)
+		git update-ref -d "$name" $sha1
+	;;
+	esac
+done < "$tempdir"/backup-refs
+
 case "$GIT_DIR" in
 /*)
 	;;
 esac
 export GIT_DIR GIT_WORK_TREE=.
 
+# These refs should be updated if their heads were rewritten
+
+git rev-parse --revs-only --symbolic "$@" |
+while read ref
+do
+	# normalize ref
+	case "$ref" in
+	HEAD)
+		ref="$(git symbolic-ref "$ref")"
+	;;
+	refs/*)
+	;;
+	*)
+		ref="$(git for-each-ref --format='%(refname)' |
+			grep /"$ref")"
+	esac
+
+	git check-ref-format "$ref" && echo "$ref"
+done > "$tempdir"/heads
+
+test -s "$tempdir"/heads ||
+	die "Which ref do you want to rewrite?"
+
 export GIT_INDEX_FILE="$(pwd)/../index"
 git read-tree || die "Could not seed the index"
 
 
 test $commits -eq 0 && die "Found nothing to rewrite"
 
+# Rewrite the commits
+
 i=0
 while read commit parents; do
 	i=$(($i+1))
 		$(git write-tree) $parentstr < ../message > ../map/$commit
 done <../revs
 
-src_head=$(tail -n 1 ../revs | sed -e 's/ .*//')
-target_head=$(head -n 1 ../map/$src_head)
-case "$target_head" in
-'')
-	echo Nothing rewritten
+# In case of a subdirectory filter, it is possible that a specified head
+# is not in the set of rewritten commits, because it was pruned by the
+# revision walker.  Fix it by mapping these heads to the next rewritten
+# ancestor(s), i.e. the boundaries in the set of rewritten commits.
+
+# NEEDSWORK: we should sort the unmapped refs topologically first
+while read ref
+do
+	sha1=$(git rev-parse "$ref"^0)
+	test -f "$workdir"/../map/$sha1 && continue
+	# Assign the boundarie(s) in the set of rewritten commits
+	# as the replacement commit(s).
+	# (This would look a bit nicer if --not --stdin worked.)
+	for p in $((cd "$workdir"/../map; ls | sed "s/^/^/") |
+		git rev-list $ref --boundary --stdin |
+		sed -n "s/^-//p")
+	do
+		map $p >> "$workdir"/../map/$sha1
+	done
+done < "$tempdir"/heads
+
+# Finally update the refs
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+count=0
+echo
+while read ref
+do
+	# avoid rewriting a ref twice
+	test -f "$orig_namespace$ref" && continue
+
+	sha1=$(git rev-parse "$ref"^0)
+	rewritten=$(map $sha1)
+
+	test $sha1 = "$rewritten" &&
+		warn "WARNING: Ref '$ref' is unchanged" &&
+		continue
+
+	case "$rewritten" in
+	'')
+		echo "Ref '$ref' was deleted"
+		git update-ref -m "filter-branch: delete" -d "$ref" $sha1 ||
+			die "Could not delete $ref"
 	;;
-*)
-	git update-ref refs/heads/"$dstbranch" $target_head ||
-		die "Could not update $dstbranch with $target_head"
-	if [ $(wc -l <../map/$src_head) -gt 1 ]; then
-		echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
-		sed 's/^/	/' ../map/$src_head >&2
-		ret=1
-	fi
+	$_x40)
+		echo "Ref '$ref' was rewritten"
+		git update-ref -m "filter-branch: rewrite" \
+				"$ref" $rewritten $sha1 ||
+			die "Could not rewrite $ref"
 	;;
-esac
+	*)
+		# NEEDSWORK: possibly add -Werror, making this an error
+		warn "WARNING: '$ref' was rewritten into multiple commits:"
+		warn "$rewritten"
+		warn "WARNING: Ref '$ref' points to the first one now."
+		rewritten=$(echo "$rewritten" | head -n 1)
+		git update-ref -m "filter-branch: rewrite to first" \
+				"$ref" $rewritten $sha1 ||
+			die "Could not rewrite $ref"
+	;;
+	esac
+	git update-ref -m "filter-branch: backup" "$orig_namespace$ref" $sha1
+	count=$(($count+1))
+done < "$tempdir"/heads
+
+# TODO: This should possibly go, with the semantics that all positive given
+#       refs are updated, and their original heads stored in refs/original/
+# Filter tags
 
 if [ "$filter_tag_name" ]; then
 	git for-each-ref --format='%(objectname) %(objecttype) %(refname)' refs/tags |
 
 cd ../..
 rm -rf "$tempdir"
-printf "\nRewritten history saved to the $dstbranch branch\n"
+echo
+test $count -gt 0 && echo "These refs were rewritten:"
+git show-ref | grep ^"$orig_namespace"
 
 exit $ret

t/t7003-filter-branch.sh

 H=$(git rev-parse H)
 
 test_expect_success 'rewrite identically' '
-	git-filter-branch H2
+	git-filter-branch branch
 '
-
 test_expect_success 'result is really identical' '
-	test $H = $(git rev-parse H2)
+	test $H = $(git rev-parse HEAD)
 '
 
 test_expect_success 'rewrite, renaming a specific file' '
-	git-filter-branch --tree-filter "mv d doh || :" H3
+	git-filter-branch -f --tree-filter "mv d doh || :" HEAD
 '
 
 test_expect_success 'test that the file was renamed' '
-	test d = $(git show H3:doh)
+	test d = $(git show HEAD:doh)
 '
 
-git tag oldD H3~4
+git tag oldD HEAD~4
 test_expect_success 'rewrite one branch, keeping a side branch' '
-	git-filter-branch --tree-filter "mv b boh || :" modD D..oldD
+	git branch modD oldD &&
+	git-filter-branch -f --tree-filter "mv b boh || :" D..modD
 '
 
 test_expect_success 'common ancestor is still common (unchanged)' '
 	git rm a &&
 	test_tick &&
 	git commit -m "again not subdir" &&
-	git-filter-branch --subdirectory-filter subdir sub
+	git branch sub &&
+	git-filter-branch -f --subdirectory-filter subdir refs/heads/sub
 '
 
 test_expect_success 'subdirectory filter result looks okay' '
 	test_tick &&
 	git commit -m "again subdir on master" &&
 	git merge branch &&
-	git-filter-branch --subdirectory-filter subdir sub-master
+	git branch sub-master &&
+	git-filter-branch -f --subdirectory-filter subdir sub-master
 '
 
 test_expect_success 'subdirectory filter result looks okay' '
 '
 
 test_expect_success 'use index-filter to move into a subdirectory' '
-	git-filter-branch --index-filter \
+	git branch directorymoved &&
+	git-filter-branch -f --index-filter \
 		 "git ls-files -s | sed \"s-\\t-&newsubdir/-\" |
 	          GIT_INDEX_FILE=\$GIT_INDEX_FILE.new \
 			git update-index --index-info &&
 	test -z "$(git diff HEAD directorymoved:newsubdir)"'
 
 test_expect_success 'stops when msg filter fails' '
-	! git-filter-branch --msg-filter false nonono &&
-	rm -rf .git-rewrite &&
-	! git rev-parse nonono
+	old=$(git rev-parse HEAD) &&
+	! git-filter-branch -f --msg-filter false &&
+	test $old = $(git rev-parse HEAD) &&
+	rm -rf .git-rewrite
 '
 
 test_expect_success 'author information is preserved' '
 	git add i &&
 	test_tick &&
 	GIT_AUTHOR_NAME="B V Uips" git commit -m bvuips &&
-	git-filter-branch --msg-filter "cat; \
+	git branch preserved-author &&
+	git-filter-branch -f --msg-filter "cat; \
 			test \$GIT_COMMIT != $(git rev-parse master) || \
 			echo Hallo" \
 		preserved-author &&
 	echo i > i &&
 	test_tick &&
 	git commit -m i i &&
-	git-filter-branch --commit-filter "\
+	git branch removed-author &&
+	git-filter-branch -f --commit-filter "\
 		if [ \"\$GIT_AUTHOR_NAME\" = \"B V Uips\" ];\
 		then\
 			shift;\
 	test 0 = $(git rev-list --author="B V Uips" removed-author | wc -l)
 '
 
+test_expect_success 'barf on invalid name' '
+	! git filter-branch -f master xy-problem &&
+	! git filter-branch -f HEAD^
+'
+
 test_done