CSE 6230, Fall 2013: Lab 5, Th Sep 26: Profiling
- This lab: http://j.mp/gtcse6230fa13lab5
- Info on the Jinx cluster: http://support.cc.gatech.edu/facilities/instructional-labs/jinx-cluster
In this lab you will practice profiling and compiler-assisted optimization on a real-world program: ImageMagick, a library for image transformation, conversion, and manipulation. The lab consists of two parts that will be done in class; we will also post a take-home assignment.
You may work in teams of two if you wish. To simplify our grading of your assignments, each person should submit his/her own assignment; however, all team members may submit identical code. Be sure to indicate with whom you worked by creating a README file as part of your submission. (See below for details.)
Please try to use Jinx compute nodes for this assignment (use qsub -I -q class_long to get a node in interactive mode).
Part 0: Getting started
Different nodes of the Jinx cluster might have different processors, so it is important that you report the processor of the node you used. You can get information about the processor from the file /proc/cpuinfo. Execute the command cat /proc/cpuinfo, look at the value of model name, and answer the question:
- What is the name of the processor which you used for this assignment?
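If you would rather not scan the full output, the model name can be extracted directly. The sketch below wraps this in a small helper; the function name cpu_model is made up for this example, and it assumes the usual Linux "model name : ..." line format.

```shell
# Hypothetical helper (not part of the lab materials): print the CPU model name.
# Reads /proc/cpuinfo by default; pass another file to parse a saved copy.
cpu_model() {
    # Take the first "model name" line, keep everything after the first colon,
    # and trim the surrounding whitespace.
    grep -m 1 'model name' "${1:-/proc/cpuinfo}" \
        | cut -d: -f2- \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}
```

Calling cpu_model on your compute node should print a single line that you can paste into your answer.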
Execute the following command to set up your environment and get recent versions of gcc (4.8.1), clang (3.3), and valgrind:
source /nethome/mdukhan3/install/envvars.sh
Create a directory ProfilingLab in your home directory on Jinx:
mkdir ProfilingLab
cd ProfilingLab
Download sources for ImageMagick:
wget http://www.imagemagick.org/download/ImageMagick-6.8.7-0.tar.bz2
Unpack the source archive:
tar -xjf ImageMagick-6.8.7-0.tar.bz2
Navigate to the unpacked ImageMagick directory:
cd ImageMagick-6.8.7-0
Run the configure script, which detects the compilers and libraries available on Jinx and sets the compilation settings accordingly:
./configure --prefix=$HOME/ProfilingLab --disable-shared --disable-openmp --without-bzlib --without-dps --without-djvu --without-jbig --without-jp2 --without-lcms --without-lcms2 --without-lqr --without-lzma --without-openexr --without-pango --without-tiff --without-webp --without-xml
The --prefix=$HOME/ProfilingLab parameter specifies that the library (and accompanying tools) must be installed to the ProfilingLab folder in your home directory, --disable-openmp disables multithreading for this build, and the other parameters disable ImageMagick features which we don't need for this assignment. To get the list of all supported parameters, run ./configure --help.
After the script completes configuration, build the program with the command
make -j8
The -j8 parameter means that make may run up to 8 jobs in parallel while building.
Once the build has finished, execute make install to install the library and accompanying tools to $HOME/ProfilingLab.
Now navigate back to the ProfilingLab directory. We will use the convert utility, which is installed to $HOME/ProfilingLab/bin. Run bin/convert --version to make sure that it was built and installed correctly. You should see a banner like the one below:
Version: ImageMagick 6.8.7-0 2013-09-26 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2013 ImageMagick Studio LLC
Features: DPC
Delegates: fftw fontconfig freetype jng jpeg png png x zlib
Download the sample image (a cat photo created by Stephan "Macphreak" Brunet and freely available on Wikimedia Commons) which we will use for this assignment:
cd $HOME/ProfilingLab # Make sure you are in the ProfilingLab directory
wget http://upload.wikimedia.org/wikipedia/commons/1/10/Louis-%26-Chanel-taking-a-nap.jpg -O input.jpg
In this assignment you will use the convert utility to:
- Load JPG image
- Blur the image
- Transform it to grayscale
- Save as PNG
To do all of that run the utility with the following parameters:
bin/convert -blur 15x15 -colorspace gray input.jpg output.png
You should get a blurred grayscale version of input.jpg in the file output.png.
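A quick way to confirm that the result really is a PNG file (and not just a file with a .png name) is to look at its first bytes. The helper below is a minimal sketch; the name is_png is made up for this example, and it only checks the 8-byte PNG signature, not whether the image content is correct.

```shell
# Hypothetical helper (not part of ImageMagick): succeed if the file starts
# with the 8-byte PNG signature 89 50 4E 47 0D 0A 1A 0A.
is_png() {
    # Dump the first 8 bytes as hex and strip the spaces and newline od adds.
    sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "89504e470d0a1a0a" ]
}
```

For example, is_png output.png && echo OK prints OK only when the signature matches.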
Part 1: Profiling
Run the conversion (preferably several times, since the counters vary between runs) under the perf stat utility. Answer the questions:
- What is the IPC of the utility in our use case? Is it good?
- What is the fraction of mispredicted branches? Is it acceptable?
- What is the rate of cache misses? How do you feel about this number?
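perf stat reports raw counters, and the three questions above are ratios of those counters. A minimal sketch of the arithmetic follows; the helper names ipc and miss_rate are made up for this example, and the event names in the comment are the generic perf events, which may not all be available on every Jinx node.

```shell
# To collect the counters, something like the following should work
# ("..." stands for the conversion parameters from Part 0):
#   perf stat -e cycles,instructions,branches,branch-misses,cache-references,cache-misses bin/convert ...

# Hypothetical helper: IPC = instructions / cycles
ipc() {
    awk -v ins="$1" -v cyc="$2" 'BEGIN { printf "%.2f\n", ins / cyc }'
}

# Hypothetical helper: miss rate in percent = misses / total * 100
# (works for branch-misses/branches and cache-misses/cache-references alike)
miss_rate() {
    awk -v miss="$1" -v total="$2" 'BEGIN { printf "%.2f%%\n", 100 * miss / total }'
}
```

Pass the counter values you measured, e.g. ipc INSTRUCTIONS CYCLES and miss_rate BRANCH_MISSES BRANCHES. Recent versions of perf stat also print some of these ratios next to the counters, so the helpers are mainly useful for cross-checking.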
Now run the use case again under perf record. After it finishes, use perf report to browse the results. Answer the questions:
- Which two functions take most of the execution time? What do they do?
Part 2: Compiler Optimizations
Measure the time it takes the convert utility to do the processing:
time bin/convert ...
In the rest of this lab you will apply profile-guided optimization to ImageMagick.
Navigate back to the ImageMagick source directory:
cd ImageMagick-6.8.7-0
First, you need to delete all build products and temporary files from the previous build. Run make clean to do that.
Next, reconfigure ImageMagick by running the ./configure script again, but with additional parameters. You may specify additional options for the C compiler via the CFLAGS variable, and additional options for the linker via the LDFLAGS variable. E.g. to compile with the -fprofile-generate option (which must be specified for both the compiler and the linker), run:
./configure ... CFLAGS="-fprofile-generate" LDFLAGS="-fprofile-generate"
(You can also tell configure to use a different C compiler via the CC variable, e.g. ./configure ... CC=icc will configure the build to use the Intel C compiler.)
Using profile-guided optimization with GCC involves three steps:
- Build the program with -fprofile-generate (both for the C compiler and the linker).
- Run the program on representative inputs.
- Build the program again with -fprofile-use (both for the C compiler and the linker).
Apply these steps to ImageMagick to get a profile-optimized version of the program. Run the image conversion again under the time utility and answer the question:
- What is the "User Time" for program execution after you completed all three steps and ran the program built with -fprofile-use?
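The three steps can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name pgo_build and the DRY_RUN switch are made up for this example, and the sketch assumes that make clean leaves the *.gcda profile files written during step 2 in place (verify that they still exist before step 3).

```shell
# Hypothetical helper (not part of the lab materials): rebuild ImageMagick
# with one profiling flag applied to both the compiler and the linker.
# Usage: pgo_build FLAG [your usual ./configure options...]
# Set DRY_RUN=1 to print the commands instead of running them.
pgo_build() {
    flag="$1"; shift
    run=""
    [ -n "${DRY_RUN:-}" ] && run="echo"
    $run make clean &&
    $run ./configure "$@" CFLAGS="$flag" LDFLAGS="$flag" &&
    $run make -j8 &&
    $run make install
}

# The three steps, run from the ImageMagick source directory
# ("..." stands for the options you used in Part 0):
#   pgo_build -fprofile-generate ...    # 1. instrumented build
#   (run the conversion on input.jpg)   # 2. collect the profile
#   pgo_build -fprofile-use ...         # 3. optimized build
```

Running with DRY_RUN=1 first is a cheap way to check the exact configure line before starting a full rebuild.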
Part 3: More Compiler Optimizations (out-of-class, due October 3rd 4:30 PM)
In Part 2 you tried one specific compiler optimization (profile-guided optimization). In this part you may try any compiler optimizations except OpenMP (i.e. keep the configuration parameter --disable-openmp as is, and do NOT add -fopenmp or -openmp to CFLAGS) in order to achieve maximum performance.
Besides the compiler options from the lecture, you may also consider the fine-grained optimization options described in the compilers' documentation:
- GCC Optimization options (also compatible with clang).
- Use the command man icc for reference on Intel Compiler options.
What to submit
Add to your repository the following files:
- Your compiled convert binary (you will find it in $HOME/ProfilingLab/ImageMagick-6.8.7-0/utilities/convert)
- A readme file (in plain text or Markdown format) detailing the CC, CFLAGS, and LDFLAGS parameters, explaining the compiler optimizations you used, and reporting the performance you achieved.
Performance targets
To get an A for this assignment you need to reduce the execution time to 3.2 seconds or lower (as measured on Jinx-login).