CSE 6230, Fall 2014: Lab 2, Tu Sep 9: OpenMP

In this lab, you will start by converting your Cilk Plus program into an OpenMP program.

After you've done that, you'll write your own OpenMP program to test an effect visible on real server systems, called non-uniform memory access, or NUMA.

You may, if you wish, work in teams of two. To simplify grading, each person should submit their own assignment; however, all team members may submit identical code. Be sure to indicate with whom you worked by creating a README file as part of your submission. (See below for details.)

Lastly, note that we are using the same compiler (Intel's icpc), so if you set up your environment correctly last week, you don't need to do anything differently this week.

Part 0: Get the assignment code

Use the same fork-checkout procedure from last week. The repo you want is gtcse6230fa14/lab2. As a reminder, the basic steps to get started are:

  1. Log into your Bitbucket account.

  2. Fork the code for this week's lab into your account. The URL is: https://bitbucket.org/gtcse6230fa14/lab2.git. Be sure to rename your repo with your Bitbucket ID, and to also mark your repo as "Private" if you do not want the world to see your commits.

  3. Check out your forked repo on Jinx. Assuming your Bitbucket login is MyBbLogin and that you renamed your forked repo to lab2--MyBbLogin as described in step 2, you would use the following command on Jinx:

#!bash
git clone https://MyBbLogin@bitbucket.org/MyBbLogin/lab2--MyBbLogin.git

Alternatively, if you have set up passwordless checkouts using ssh keys, you can use the ssh checkout style instead: git clone git@bitbucket.org:MyBbLogin/lab2--MyBbLogin.git.

If it worked, you'll have a lab2--MyBbLogin subdirectory that you can start editing.

Part 1: Implement a fully-parallel Quicksort using OpenMP

Inside the qsort subdirectory of the code you just checked out is a slightly modified clone of the original driver. The main modifications are:

  1. The Makefile is set up to enable OpenMP pragmas, instead of Cilk Plus pragmas.
  2. The file in which you should create your OpenMP-based implementation is called parallel-qsort--omp.cc.
  3. To build the driver that includes your implementation, run make qsort-omp, which on success produces a binary executable called qsort-omp.
  4. You can run this executable in the same way as Lab 1. Additionally, we've provided a different job script, qsort-omp.pbs, which you can use for testing.

Implement an OpenMP-based version of Quicksort in the file named parallel-qsort--omp.cc. If you did the Cilk Plus version of this assignment successfully, it should be possible to produce an OpenMP-based variant from it by translating Cilk Plus primitives into their OpenMP equivalents.

There is one potentially tricky bit: unlike Cilk Plus, OpenMP requires you to create threads explicitly, so you need to recognize where to create the parallel region (or regions).
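
As an illustration of the translation described above, here is a minimal sketch on a different problem (a recursive array sum, not Quicksort). The names sumRange and parallelSum and the cutoff value are hypothetical and do not appear in the lab code. The sketch assumes one common mapping, in which cilk_spawn becomes an OpenMP task and cilk_sync becomes a taskwait, with a single parallel region created at the top level; other arrangements are possible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example (not the lab's Quicksort): divide-and-conquer sum.
// A Cilk Plus version of the recursive step might read:
//   long long left  = cilk_spawn sumRange(a, lo, mid);
//   long long right = sumRange(a, mid, hi);
//   cilk_sync;
static long long sumRange(const long long* a, std::size_t lo, std::size_t hi)
{
  const std::size_t cutoff = 1024;  // assumed cutoff: small ranges run serially
  if (hi - lo <= cutoff) {
    long long s = 0;
    for (std::size_t i = lo; i < hi; ++i) s += a[i];
    return s;
  }
  const std::size_t mid = lo + (hi - lo) / 2;
  long long left = 0, right = 0;

  // Plays the role of cilk_spawn. 'left' must be shared so that the
  // task's result is visible to the parent after the taskwait.
  #pragma omp task shared(left)
  left = sumRange(a, lo, mid);

  right = sumRange(a, mid, hi);  // the parent continues with the other half

  // Plays the role of cilk_sync: wait for the child task created above.
  #pragma omp taskwait
  return left + right;
}

long long parallelSum(const std::vector<long long>& v)
{
  long long total = 0;
  // Create the team of threads once, at the top level.
  #pragma omp parallel
  {
    // Exactly one thread starts the recursion; the other threads in the
    // team pick up the tasks it generates.
    #pragma omp single
    total = sumRange(v.data(), 0, v.size());
  }
  return total;
}
```

Without the enclosing parallel region, the task pragmas would still compile, but the tasks would all run on one thread. If the code is compiled without OpenMP enabled, the pragmas are ignored and the functions run serially with the same result.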

Once you've got something working, use the usual add-commit-push steps to save this version of your repo. You do not need to transfer it to us until you have completed Part 2 below.

Part 2: Seeing NUMA

In class, we will discuss a performance effect known as non-uniform memory access, or NUMA. For Part 2, modify the skeleton code under the numa subdirectory to exhibit this effect, exploiting the Linux first-touch policy.
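
Under first-touch, Linux places a page of memory on the NUMA node of the thread that first writes to it, not the thread that allocates it. The following is a minimal sketch of one way to set up such an experiment; it is not the skeleton code from the numa subdirectory, and the names allocAndTouch and timedSum are hypothetical. It assumes the array is large enough to span many pages and that threads stay on the same cores between the initialization and the timed loop (for example, through thread pinning); without that, the two cases may not differ.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate n doubles and initialize them either
// serially (all pages first touched by one thread) or in parallel
// (each thread first touches the pages it will later read).
std::unique_ptr<double[]> allocAndTouch(std::size_t n, bool parallelTouch)
{
  // 'new double[n]' without '()' leaves the elements uninitialized, so the
  // allocation itself does not write to the pages.
  std::unique_ptr<double[]> x(new double[n]);
  double* p = x.get();
  const long len = static_cast<long>(n);
  if (parallelTouch) {
    // Same static schedule as the timed loop below, so each thread
    // first-touches the chunk it will later sum.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < len; ++i) p[i] = 1.0;
  } else {
    // One thread touches everything.
    for (long i = 0; i < len; ++i) p[i] = 1.0;
  }
  return x;
}

// Hypothetical helper: sum the array in parallel and return elapsed seconds.
double timedSum(const double* x, std::size_t n, double* sumOut)
{
  const long len = static_cast<long>(n);
  const auto t0 = std::chrono::steady_clock::now();
  double s = 0.0;
  #pragma omp parallel for schedule(static) reduction(+:s)
  for (long i = 0; i < len; ++i) s += x[i];
  const auto t1 = std::chrono::steady_clock::now();
  *sumOut = s;
  return std::chrono::duration<double>(t1 - t0).count();
}
```

A driver would call allocAndTouch once with parallelTouch set to false and once with it set to true, run timedSum on each array several times, and compare the timings across array sizes and thread counts.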

When you commit your code, be sure to

  • include the data that supports your observation of the NUMA effect;
  • plot this data in a way that shows the effect (you will see an example in class);
  • and document in a README file your experimental approach and list the various files we should consider as your supporting evidence, e.g., where the data and plot reside.

We will give extra points to a really good analysis, e.g., one in which you dig deeper into the experimental results, do a nice statistical analysis, etc.
