
Replication Study: "The Impact of Human Discussions on Just-in-Time Quality Assurance"


Author: Mike Schaekermann

This repository provides a replication environment for the results reported in the paper "The Impact of Human Discussions on Just-in-Time Quality Assurance".

This replication study was done as a course project for CS846 Empirical Software Engineering Using Ultra-Large Repositories, taught by Mei Nagappan in the School of Computer Science at the University of Waterloo, Canada. This repository is not the official replication environment for the original paper, but the result of an independent attempt to reproduce the results of this paper.

Getting Started


In this environment, all code is run and all data is hosted inside a Docker container. This will save users from having to manually install any dependencies and will ensure that all the data and code is encapsulated in its own environment. Follow these steps to get started:

  1. Install Docker CE. If you are on a Linux system, also install docker-compose using pip install docker-compose. Note: this is not required on Mac or Windows.

  2. Start the docker container (this may take a few minutes): docker-compose up

  3. Enter the docker container: docker exec -it replicationpapertouraniadams_db_1 bash

  4. Download the MetricsGrimoire data set (this will take around an hour): cd /code/grimoire && ./download_dataset.sh

  5. Import the MetricsGrimoire data set into MySQL: cd /code/grimoire && ./import_dataset.py

  6. Generate the final data set used for analysis: cd /code && ./generate_data_set.py

Step 5 imported MetricsGrimoire data about source code commits, issue reports and code reviews from the Eclipse and OpenStack ecosystems into a MySQL database. While the Docker container is running, you can visually explore the databases at http://localhost:8080/ (username: root; password: grimoire).

Step 6 generated the final data set used for analysis. You will now find it in the paper directory located in the repository root under the name final_data_set.csv.
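
As a quick sanity check after step 6, the sketch below verifies that the generated file exists and is not empty before you move on to compiling the paper. The function name check_data_set is hypothetical and not part of this repository; the default path assumes you are inside the Docker container, where the paper directory is mounted at /code/paper/.

```shell
# Hypothetical helper (not part of this repository): confirm that the
# generated data set exists and is non-empty, then report its line count.
check_data_set() {
  # Default path assumes the in-container location /code/paper/.
  file="${1:-/code/paper/final_data_set.csv}"

  # -s is true only if the file exists and has a size greater than zero.
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi

  # Arithmetic expansion strips any padding that wc may add to its output.
  lines=$(($(wc -l < "$file")))
  echo "$file: $lines lines"
}
```

Calling check_data_set with no argument checks the default location; passing a path checks that file instead. A non-zero exit status means step 6 should be re-run.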

Generate Final Paper


The paper, including all result tables, can be generated dynamically from the file final_data_set.csv (see previous section). We use LaTeX markup to typeset the paper content and knitr to output analysis results (e.g., tables) directly into the text flow by inlining R code into the LaTeX markup. To produce the paper, run the following commands from inside the Docker container (for this to work, final_data_set.csv needs to be in the /code/paper/ directory):

  • cd /code/paper/
  • ./compile.sh

The compile command may take a while the first time you run it because it also triggers the full data analysis script (paper/analysis.R). Subsequent runs use cached analysis results, so the paper compiles significantly faster. Once the command completes, you will find the final paper.pdf file inside the paper/ directory.

Download Review IDs with Corresponding Commit IDs


The original paper mentions:

“In particular, links from an accepted review to its corresponding git commit can be identified by searching the Gerrit reviews for the commit identifier of the accepted revision of a patch. These commit identifiers had not been stored in Barahona’s exposed databases, hence we modified MetricsGrimoire to download this additional information from the review repository, then updated Barahona’s database with the extracted commit identifiers.”

We replicated this step by creating a modified version of the Bicho tool from the MetricsGrimoire suite and storing it in this repository in the directory grimoire/Bicho. Use this tool to download Gerrit review IDs with their corresponding commit IDs by following these steps:

  1. Create Gerrit accounts for OpenStack and Eclipse:
  • OpenStack: Create an Ubuntu One user account at review.openstack.org; after that, make sure to set a username in your Ubuntu One profile; finally, follow the instructions here to generate and upload your SSH public key for OpenStack's review system. It is crucial that you generate your SSH key from inside the running Docker container.
  • Eclipse: Create an Eclipse Foundation user account at git.eclipse.org/r/; after that, enable Gerrit for your Eclipse account by filing a bug as described here (you can copy the bug title and description from this bug); finally, follow the instructions here to generate and upload your SSH public key for Eclipse's review system. It is crucial that you generate your SSH key from inside the running Docker container.
  2. Enter your running Docker container, as described in steps 2 and 3 under Getting Started

  3. Download the Gerrit review IDs with corresponding commit IDs (make sure to replace {YOUR_OPENSTACK_USERNAME} and {YOUR_ECLIPSE_USERNAME} with the correct values): cd /code/grimoire/reviews_with_commit_ids && export OPENSTACK_USERNAME={YOUR_OPENSTACK_USERNAME} && export ECLIPSE_USERNAME={YOUR_ECLIPSE_USERNAME} && ./download_reviews_with_commit_ids.sh
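
Because the download command above depends on two environment variables, it is easy to launch it with a placeholder left unreplaced. The following sketch guards against that. The function name require_usernames is hypothetical and not part of this repository; it only assumes the two variable names shown in the command above.

```shell
# Hypothetical helper (not part of this repository): fail early if either
# Gerrit username is unset, empty, or still a {PLACEHOLDER} value.
require_usernames() {
  for name in OPENSTACK_USERNAME ECLIPSE_USERNAME; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    case "$value" in
      "" | \{*\})
        echo "error: $name is not set to a real username" >&2
        return 1
        ;;
    esac
  done
  return 0
}
```

Inside the container, after exporting both variables, you could run require_usernames && ./download_reviews_with_commit_ids.sh from /code/grimoire/reviews_with_commit_ids so that the download only starts when both usernames look valid.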