Wiki

Clone wiki

CS5220-S14 / HW3

Homework 3: Smoothed Particle Hydrodynamics

-> Due:Friday, March 28th by 11:59pm <-

Picture of simulation

Problem description

The purpose of this assignment is to get a somewhat more serious parallel programming experience using OpenMP. Your goal is to parallelize a three dimensional smoothed particle hydrodynamics simulation. A reference serial implementation is documented here. The current version uses a spatial hashing algorithm to significantly reduce the cost of checking interactions. The assignment tasks are as follows:

  • Complete the missing calls to the spatial hashing algorithm which will improve the complexity by changing from the naive O(n^2) algorithm to an O(n) (roughly) force evaluation algorithm based on spatial partitioning.

  • Profile the serial code and report on any bottlenecks. Is the time going where you expected? Ideally, both your serial and parallel performance tuning should be informed by profiling experiments.

  • Parallelize your code using OpenMP, and spend some time tuning the parallel implementation. You will need to go beyond just adding pragmas before your for loops in order to get reasonable speed!

  • Do a scaling study to characterize the performance of both your serial and parallel codes as a function of the number of particles in the system and the number of processors. Try to explain what you see. Where are the bottlenecks? What are the parallel overheads? Do you have good load balance? Are you making good use of the memory system? For several of these questions, you may find it useful to try timing your code with Intel VTune, gperftools, IPM, or TAU (all of which are installed on the cluster).

  • If you have extra time, play a little! Improve or extend the code in some way that appeals to you, either by doing something clever with the time integrator (e.g. automatic adaptation of the time step), or adding a feature (We deliberately left out the surface tension calculations), or by doing something else. If this project appeals to you, extensions to it could easily be the basis of a class project.

To visualize the output files generated by the code, use the provided webgl viewer: viewer.html

You can get started with the serial code by reading the annotated source code, which I generated from the file sources using dsbweb, my home-grown literate programming tool.

Submission

You are expected to do this assignment in groups of 2-5, though everybody should get familiar enough with all the tasks to be able to do them independently. One person in your group should be a non-CS student (if possible), but otherwise you're responsible for finding a group. You do not have to have the same groups as last time.

You will submit your code on CMS, but the main deliverable for this project is a report. Here is a list of items you might show in your report:

  • A plot in log-log scale that shows that your serial and parallel codes run in O(n) time, and a description of the data structures that you used to achieve it.

  • Profiling data to illustrate the bottlenecks in your code.

  • A description of the synchronization you used in the OpenMP code.

  • A description of the design choices that you tried and how did they affect the performance.

  • Speedup plots that show how closely your parallel codes approach the idealized p-times speedup and a discussion on whether it is possible to do better.

  • Where does the time go? Consider breaking down the runtime into computation time, synchronization time and/or communication time. How do they scale with the number of processors? The number of particles?

Resources

Updated