
The Instability Principle

An intrinsic contradiction

XXX starts by talking about getting rid of changesets.

DVCSes bring two new major concepts to the version control scene:

  • History is organized as a robust DAG,
  • History can be rewritten.

However, the two concepts are in contradiction:

To achieve a robust history, three key elements are gathered in changesets:

  • Full snapshot of the versioned content,
  • Reference to the previous full snapshot used to build the new one,
  • A description of the change that led from the old content to the new one.

All three elements, together with various other metadata, are used to compute a unique hash that identifies the changeset. This identification is a key part of DVCS design.

This is a very useful property, because changing B's parent means changing B's content too. This requires the creation of another changeset, which is semantically good.

Schema base,  A, B and B'
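The dependency between a changeset's identity and its parent can be illustrated with a minimal sketch. The `changeset_id` function below is hypothetical and is not how Mercurial actually computes its hashes; it only assumes that the identifier is a hash over the parent identifier, the content and the description.

```python
import hashlib


def changeset_id(parent_id, snapshot, description):
    """Hypothetical identifier: a hash over parent, content and description."""
    h = hashlib.sha1()
    for part in (parent_id, snapshot, description):
        data = part.encode("utf-8")
        # Length-prefix each field so that field boundaries are unambiguous.
        h.update(str(len(data)).encode("ascii") + b":" + data)
    return h.hexdigest()


base = changeset_id("", "file v0", "initial")
a = changeset_id(base, "file v1", "change A")
b = changeset_id(a, "file v2", "change B")

# Moving B onto base keeps its content and description but changes its
# parent, so the result is a different changeset: B'.
b_prime = changeset_id(base, "file v2", "change B")
```

In this sketch `b` and `b_prime` differ even though the snapshot and the description are identical, which is the property described above.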

To avoid duplication, the older changeset is usually discarded from accessible history. I call such changesets obsolete changesets.

But rewriting a changeset with children does not change these children's parent! And because the children of the rewritten changeset still depend on the older "dead" version of the changeset, we cannot get rid of this dead version.

Schema base,  A and A' and B.

I call these children unstable because they are based on a dead changeset and prevent people from getting rid of it.

This instability is an unavoidable consequence of the strict dependency between changesets. History rewriting always needs to take it into account and provide a way to rewrite the descendants on top of the new changeset, to avoid the coexistence of the old and new versions of a rewritten changeset.
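As a rough illustration of the definition above, unstable changesets can be found by looking for non-obsolete changesets that have an obsolete ancestor. The data layout (a `parents` mapping and an `obsolete` set) and the function name are assumptions made for this sketch, not Mercurial's internal structures.

```python
def unstable_changesets(parents, obsolete):
    """Return the non-obsolete changesets that have an obsolete ancestor.

    parents: dict mapping each changeset name to a list of its parents.
    obsolete: set of changeset names that have been rewritten or discarded.
    """
    cache = {}

    def has_obsolete_ancestor(node):
        if node not in cache:
            cache[node] = any(
                p in obsolete or has_obsolete_ancestor(p)
                for p in parents.get(node, ())
            )
        return cache[node]

    return {
        node
        for node in parents
        if node not in obsolete and has_obsolete_ancestor(node)
    }


# A was rewritten into A', but B is still based on the old A.
parents = {"base": [], "A": ["base"], "A'": ["base"], "B": ["A"]}
obsolete = {"A"}
unstable = unstable_changesets(parents, obsolete)
```

Here B is reported as unstable, and A cannot be dropped from the history as long as B depends on it.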

Everybody is working around the issue

I'm not claiming that rewriting history is impossible. People have been doing it successfully for years. However, they all need to work around instability. Several workaround strategies exist.

Rewriting all at once

The simplest way to avoid instability is to ensure rewriting operations always end in a stable situation. This is achieved by rewriting all affected changesets at the same time: when a changeset is rewritten, all of its descendants are rewritten along with it.

Schema!
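The "all at once" strategy can be sketched on a linear history. This is a simplified model, assuming that a changeset identifier is a hash of its parent identifier and its content; `build_chain` and `rewrite_all_at_once` are hypothetical helpers, not Mercurial functions.

```python
import hashlib


def build_chain(contents):
    """Return a list of (id, parent_id, content) for a linear history."""
    chain, parent = [], ""
    for content in contents:
        raw = f"{parent}\n{content}".encode("utf-8")
        node = hashlib.sha1(raw).hexdigest()[:12]
        chain.append((node, parent, content))
        parent = node
    return chain


def rewrite_all_at_once(contents, index, new_content):
    """Rewrite one changeset and rebuild every descendant on top of it."""
    rewritten = list(contents)
    rewritten[index] = new_content
    return build_chain(rewritten)


history = ["base", "A", "B", "C"]
old = build_chain(history)
new = rewrite_all_at_once(history, 1, "A'")

# Identifiers of the changesets that were replaced by the rewrite.
replaced = {node for node, _, _ in old[1:]}
```

After the rewrite, no changeset in `new` has a parent in `replaced`, so the old versions can be discarded without leaving unstable children behind.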

Several Mercurial commands apply this strategy: rebase, collapse, histedit. Mercurial also refuses to amend a changeset that has descendants. The git branch design enforces this approach in git too.

However, DVCSes are distributed. This means that you do not control what happens outside your repository. Once a changeset has been exchanged outside, there is no way to be sure it does not have descendants somewhere else. Therefore, if you rewrite a changeset that exists elsewhere, you can't eradicate the risk of instability.

Do not rewrite exchanged changesets

To work around the issue above, Mercurial introduced phases, which prevent you from rewriting shared changesets and ensure others can't pull certain changesets from you. But this is a very frustrating limitation that prevents you from efficiently sharing, reviewing and collaborating on mutable changesets.

In the Git world, another approach is used to prevent instability. By convention, only a single developer works on a changeset contained in a named branch. But once again, this is a huge blocker for collaboration. Moreover, clueless people will break the social convention sooner or later.

Lose the DAG robustness

The other approach in Mercurial is to keep the mutable part of the history outside the DVCS constraint. This is the MQ approach of sticking a quilt queue over Mercurial.

This allows a much more flexible workflow, but two major features are lost in the process:

Graceful merge: MQ uses plain patches to store changeset content, and patches have trouble applying in a changing context. Applying your queue becomes very painful when the context changes.
Easy branching: A quilt queue is by definition a linear queue, which increases the risk of conflict.

It is possible to collaborate over a versioned MQ! But you are going to have a lot of trouble.

Ignore conflicts

Another issue that is usually ignored is conflicting rewrites of the same changeset. If a changeset is rewritten twice, we end up with two newer versions, and duplicated history is complicated to merge.

Mercurial work around by

The "One set of mutable changesets == One developer" mantra is also a way to work around conflicting rewrites of changesets. If two different people are able to

The git branch model allows one version of a changeset to be overwritten by another, but it does not care about divergent versions. It is the equivalent of "common ftp" source management for changesets.

Facing The Danger Once And For All

Above we saw that the more effort you put into avoiding instability, the more options you deny yourself. And even the most restrictive workflow can't guarantee that instability will never show up!

Obsolete markers can handle the job

It is time to provide a full-featured solution to deal with instability and to stop working around the issue! This is why I am developing a new feature for Mercurial called "Obsolete markers". Obsolete markers have two key properties:

  • Any "old" changeset we want to get rid of is explicitly marked as "obsolete" by history rewriting operations.

    By explicitly marking the obsolete part of the history, we will be able to easily detect unstable situations.

  • Relations between old and new version of changesets are tracked by obsolete markers.

    By storing a meta-history of changeset evolution, we are able to easily resolve instability and edition conflicts [1].

[1] Edition conflicts are another major obstacle to collaboration. See the section dedicated to obsolete markers for details.
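To make the second property more concrete, here is a minimal sketch of how a meta-history of markers could be followed. It assumes that a marker is a plain `(old, new)` pair, which is an illustration only and not the actual marker format; `latest_successors` is a hypothetical helper, and markers without a successor (pruned changesets) are not modeled.

```python
def latest_successors(markers, node):
    """Follow (old, new) markers from node to its most recent versions.

    A single result means the changeset has one current version.
    Several results mean it was rewritten in divergent ways, which is
    an edition conflict.
    """
    successors = {}
    for old, new in markers:
        successors.setdefault(old, set()).add(new)

    finals, stack, seen = set(), [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        following = successors.get(current)
        if following:
            stack.extend(following)
        else:
            # No marker starts here: this is a current version.
            finals.add(current)
    return finals


# A was amended twice in a row: the current version is A2.
linear = latest_successors([("A", "A1"), ("A1", "A2")], "A")

# A was rewritten independently by two people: both A1 and A2 remain.
divergent = latest_successors([("A", "A1"), ("A", "A2")], "A")
```

With this information, an unstable child of A can be rebased onto the single current version in the first case, while the second case can be reported as a conflict instead of being silently overwritten.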

Improving robustness == improving simplicity

This proposal should first be seen as a safety measure.

It allows instability to be detected as soon as possible:

$ hg pull
added 3 changeset
+2 unstable changeset
(do you want "hg stabilize" ?)
working directory parent is obsolete!
$ hg push
outgoing unstable changesets
(use "hg stabilize" or force the push)

And it should not encourage people to create instability:

$ hg up 42
$ hg commit --amend
changeset have descendant.
$ hg commit --amend -f
+5 unstable changeset

$ hg rebase -D --rev 40::44
rebasing already obsolete changeset 42:AAA will conflict with newer version 48:BBB

While allowing powerful features:

  • Help to automatically solve instability.

  • "Prune" changesets remotely.

  • Track the resulting changesets when submitting patches or pull requests.

  • Focus on what you do:

    I do not like the "all at once" model of history rewriting. I'm comfortable with instability, and obsolete markers offer all the tools to safely create and handle instability locally.
