Welcome to Untangler

Untangler is a tool built on the Chronos framework that helps a developer organize changes during the commit phase. Unlike the staging area in a typical version control system, Untangler uses a fine-grained development history to help a developer cherry-pick separate development tasks into different commits. Untangler relies on the following hypothesis: when a developer is implementing a new feature or fixing a bug, the developer’s changes are usually localized to a few code elements (methods, classes, etc.). Therefore, Untangler creates a commit summary by transforming the fine-grained development history into a coarser granularity (code-element-level granularity). My collaborators for this project are Kıvanç Muşlu, Michael D. Ernst, and Yuriy Brun in the UW Programming Languages and Software Engineering group.

Untangler's Main Algorithm

Untangler's main algorithm is a two-pass algorithm that clusters fine-grained commits from a fine-grained history (a history of keystroke-level diffs) into a meaningful hierarchy of coarser-grained clusters of commits. The first pass traverses the fine-grained commits to create clusters of commits that touch contiguous regions within the same resource. Each of these clusters, made up of overlapping fine-grained commits within a file, is defined as a "coherent cluster". The second pass traverses all of the coherent clusters and groups them into clusters of a coarser granularity. This second pass is similar to the first pass, but it groups coherent clusters instead of fine-grained commits. The grouping criterion for these coarser-grained clusters, called "final clusters", is customizable: overriding a method like 'areCoherentClustersSimilar(coherentCluster1, coherentCluster2)' allows us to group the coherent clusters by resource, by time, or both. In most cases we group coherent clusters by resource, such as the file name, so that the two-pass algorithm yields one final cluster per changed file. The result is a three-level hierarchy: final clusters contain coherent clusters, which in turn contain fine-grained commits.
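The two passes described above can be sketched roughly as follows. This is a minimal illustration in Python, not Untangler's actual code: the 'FineCommit' and 'CoherentCluster' types, their fields, and the function names are all assumptions made for the example, and 'same_resource' stands in for an override of 'areCoherentClustersSimilar'. The sketch also assumes character offsets are directly comparable across edits and does not merge coherent clusters that grow into each other.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FineCommit:
    """One keystroke-level diff (hypothetical shape, for illustration only)."""
    resource: str     # file the edit touched
    start: int        # first character offset affected
    end: int          # last character offset affected (inclusive)
    timestamp: float  # when the edit happened


@dataclass
class CoherentCluster:
    """Fine-grained commits covering one contiguous region of one resource."""
    resource: str
    start: int
    end: int
    commits: List[FineCommit] = field(default_factory=list)

    def touches(self, c: FineCommit) -> bool:
        # Same resource, and the edit overlaps or directly abuts this region.
        return (c.resource == self.resource
                and c.start <= self.end + 1
                and c.end >= self.start - 1)

    def add(self, c: FineCommit) -> None:
        # Absorb the edit and widen the region to cover it.
        self.commits.append(c)
        self.start = min(self.start, c.start)
        self.end = max(self.end, c.end)


def first_pass(history: List[FineCommit]) -> List[CoherentCluster]:
    """Pass 1: group fine-grained commits into coherent clusters."""
    clusters: List[CoherentCluster] = []
    for c in history:
        home = next((k for k in clusters if k.touches(c)), None)
        if home is None:
            clusters.append(CoherentCluster(c.resource, c.start, c.end, [c]))
        else:
            home.add(c)
    return clusters


def same_resource(a: CoherentCluster, b: CoherentCluster) -> bool:
    """Default similarity: two coherent clusters belong together if they share a file."""
    return a.resource == b.resource


def second_pass(
    coherent: List[CoherentCluster],
    similar: Callable[[CoherentCluster, CoherentCluster], bool] = same_resource,
) -> List[List[CoherentCluster]]:
    """Pass 2: group coherent clusters into final clusters using `similar`."""
    finals: List[List[CoherentCluster]] = []
    for k in coherent:
        # Compare against the first member of each existing final cluster.
        home = next((f for f in finals if similar(f[0], k)), None)
        if home is None:
            finals.append([k])
        else:
            home.append(k)
    return finals


history = [
    FineCommit("A.java", 10, 12, 1.0),
    FineCommit("A.java", 13, 15, 2.0),   # abuts the previous edit
    FineCommit("B.java", 0, 3, 3.0),
    FineCommit("A.java", 200, 205, 4.0), # far from the first region
]
for final in second_pass(first_pass(history)):
    print(final[0].resource, [(k.start, k.end) for k in final])
```

Passing a different 'similar' function (for example, one that compares timestamps) changes how final clusters are formed without touching the first pass, which mirrors the role the overridable method plays in the description above.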

Developer Interaction

After the main algorithm has executed, the developer can select each cluster that will be included in her next commit. The developer is initially presented with only the coarsest-granularity clusters, which are the final clusters. The developer can change the cluster granularity by "exploding" a final cluster into its constituent coherent clusters. After exploding any cluster, the developer can selectively add its constituent clusters and "collapse" them into a single diff, called a "collapsed" cluster. This lets the developer choose the content and granularity of each cluster that will be added to her commit.
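The explode and collapse operations can be pictured as moves on a small tree of clusters. The following is only a sketch under assumed names ('Cluster', 'explode', 'collapse', and the labels are hypothetical and do not come from Untangler's code base); it models a cluster as a label plus its child clusters.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """A node in the cluster hierarchy (hypothetical shape)."""
    label: str
    children: List["Cluster"] = field(default_factory=list)


def explode(cluster: Cluster) -> List[Cluster]:
    """Return the clusters one level below `cluster`, or the cluster itself if it is a leaf."""
    return list(cluster.children) if cluster.children else [cluster]


def collapse(selected: List[Cluster], label: str = "collapsed") -> Cluster:
    """Combine the selected clusters into a single collapsed cluster."""
    return Cluster(label, list(selected))


# A final cluster for one file, made of three coherent clusters.
final = Cluster("Parser.java", [
    Cluster("parse() body"),
    Cluster("new helper method"),
    Cluster("import cleanup"),
])

parts = explode(final)            # developer explodes the final cluster
chosen = [parts[0], parts[1]]     # and selects two of its three parts
to_commit = collapse(chosen)      # which are collapsed into one diff
print([c.label for c in to_commit.children])
```

Because a collapsed cluster is itself a 'Cluster', it can be selected, exploded again, or combined with other clusters in the same way as any other node.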

Although this has not yet been implemented, the process can in principle be repeated until the developer has control over intra-line edits, enabling the developer to select the changes within each line that will be part of her commit.

Who do I talk to?