Homework 0: Cluster Warmup and Class Survey
-> Due: Tuesday, Feb 4th by 11:59pm <-
This homework is meant to get you started with the C4 cluster. We need
to make sure that you can access the cluster and that you are
comfortable with its basic functionality. You may ask anyone you wish
for help, but please make sure that ultimately you understand how to
do each task by yourself. Once you have tried the tasks below, you
should submit the files lstopo.out, timings.csv.pdf and writeup.txt
through CMS.
-
Follow the instructions under First steps. You should now have a folder cs5220-s14 under the path from which you called git clone, presumably under your home directory on C4. Note that you can use git pull origin to retrieve the latest updates from the class repository.
-
Go to the membench directory and type make run, which first builds the membench executable and then submits the job to the cluster using the csub script. Note that you should not run membench directly on the front-end node. You can inspect the Makefile to see what commands make executes. The contents of the csub script may also be of interest (typing which csub will reveal the location of the script).
-
The make run command will return almost immediately, but it will actually take a few moments to run membench on the cluster nodes. You can monitor the state of your job using the condor_q and condor_status commands. The file csub-XXXX.sub contains the job description. The files csub-XXXX-o.0, csub-XXXX-e.0 and csub-XXXX-l.0 are also generated (XXXX is a number given by the process id that created them). They store output, error and log information, respectively, for the particular job. If for some reason your job is still not running after you have waited for a while, use condor_q -analyze XXXX to get a diagnostic message. The command condor_hold will suspend a submitted job until it is later released for execution by the command condor_release. You may also use condor_rm XXXX.X to cancel the job.
-
Once the program is done running, you should have a file called timings.csv which contains the memory benchmark results. We have also supplied the timing results from a prior benchmark on a pac node, which are available in the timings-pac.csv file. Note that while the program is running, timings.csv will be visible in your directory but will be empty. Once the file is ready, type make pdf to produce some plots using the membench.py Python script. You can then fetch the files to your own machine for inspection.
-
Now that we have our performance results, we may wish to retrieve some additional information about the hardware. The pinfo script found under the membench folder will output lstopo.out and cpuinfo.out, which contain useful information about the processor. Suppose that we want to run the job on a specific node. We can restrict execution to a particular group, machine or slot in the HTCondor pool. Type condor_status -master to get a list of all the nodes. Pick the name of a pac machine at random. Let's say we decided to run the job on slot 5 of pac-0-2. Use your favorite editor (maybe nano if you don't have a favorite) to create a text file, here called myreq.txt, which will contain the job requirements in a line of the form requirements = Machine == "pac-0-2.local". We can then execute the pinfo script on the target node using: csub -a -f myreq.txt ./pinfo. Try running pinfo on both a cs-instructional node and a pac node.
-
Transfer a copy of the results from steps 4 and 5 to your local machine with the help of scp or sftp. If you are using Linux or OS X, type scp netID@c4.coecis.cornell.edu:~/cs5220-s14/membench/timings.csv.pdf . (note the trailing dot, which names the current directory as the destination) in a terminal on your local machine. Alternatively, you can connect to C4 with sftp netID@c4.coecis.cornell.edu, change directories with cd cs5220-s14/membench and then use get *.pdf to fetch the files. Windows users may want to check WinSCP, Cygwin and Xming.
-
The cache line length on the instructional nodes is 64 bytes; if you look at the behavior at stride 64, can you see the four basic memory access times for a hit in L1, a hit in L2, a hit in L3, and an access to main memory? Try to identify how the features of the memory hierarchy show up in the plots. How does the memory hierarchy of a cs-instructional node differ from that of a pac node? Give a brief comparison of the timing results in which you identify major features of the memory performance for each system and justify the difference using the processor information we have previously retrieved.
-
Please include a short document (as writeup.txt) in your submission containing your memory hierarchy analysis and any assignment feedback you may have. You can also tell us a bit about yourself, your reasons for taking the class, any particular topics you would like to see us cover during the semester and whether you currently have a good idea for the final class project.
-
To complete the assignment, submit your results (output files) for the cs-instructional node of your choosing, along with your writeup, on CMS.
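To see why the memory hierarchy step above singles out stride 64, it can help to count how many accesses land on each 64-byte cache line. The sketch below is only an illustration and is not part of membench: it assumes the benchmark walks an array of a given size at a fixed byte stride starting at offset 0, and the function names are hypothetical.

```python
def touched_lines(size, stride, line_bytes=64):
    """Number of distinct cache lines touched by one pass over the array.

    ASSUMPTION: accesses are at byte offsets 0, stride, 2*stride, ... < size,
    and the array starts on a cache-line boundary.
    """
    return len({offset // line_bytes for offset in range(0, size, stride)})


def accesses_per_line(size, stride, line_bytes=64):
    """Average number of accesses that land on each touched cache line."""
    accesses = len(range(0, size, stride))
    return accesses / touched_lines(size, stride, line_bytes)


if __name__ == "__main__":
    # Compare a few strides for a 4096-byte array.
    for stride in (4, 16, 64, 256):
        print(stride, accesses_per_line(4096, stride))
```

Under these assumptions, a 4-byte stride puts 16 accesses on every line, so most accesses reuse a line that was just fetched. At stride 64 and above each access touches a different line, so (hardware prefetching aside) the measured time is more likely to reflect the level of the hierarchy that the array fits in.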
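One way to start the comparison asked for in the memory hierarchy step is to pull the stride-64 curve out of timings.csv and look for the array sizes at which the access time jumps. The following is a minimal sketch, not part of the class repository. It assumes each data row of timings.csv has the form size,stride,ns, which you should check against your own file. The helper names and the 1.5x jump threshold are hypothetical choices made for illustration.

```python
import csv
import io
import os


def load_timings(text):
    """Parse CSV text into (size, stride, ns) tuples.

    ASSUMPTION: each data row is `size,stride,ns` (bytes, bytes, nanoseconds).
    Rows that do not parse as numbers, such as a header, are skipped.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        try:
            size = int(float(record[0]))
            stride = int(float(record[1]))
            ns = float(record[2])
        except (IndexError, ValueError):
            continue
        rows.append((size, stride, ns))
    return rows


def curve_at_stride(rows, stride=64):
    """Return [(size, ns), ...] for one stride, sorted by array size."""
    return sorted((size, ns) for size, s, ns in rows if s == stride)


def find_jumps(curve, ratio=1.5):
    """Return the sizes where the time exceeds `ratio` times the previous one.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    jumps = []
    for (_, prev_ns), (size, ns) in zip(curve, curve[1:]):
        if prev_ns > 0 and ns / prev_ns > ratio:
            jumps.append(size)
    return jumps


if __name__ == "__main__":
    # Only runs if a timings.csv is present in the current directory.
    if os.path.exists("timings.csv"):
        with open("timings.csv") as f:
            curve = curve_at_stride(load_timings(f.read()), stride=64)
        print("access time jumps at sizes (bytes):", find_jumps(curve))
```

The sizes reported by find_jumps are only candidates for cache capacity boundaries. Compare them with the cache sizes listed in lstopo.out before drawing conclusions, and run the same code on timings-pac.csv to contrast the two node types.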