Homework 0: Cluster Warmup and Class Survey

-> Due: Tuesday, Feb 4th by 11:59pm <-

This homework is meant to get you started with the C4 cluster. We need to make sure that you can access the cluster and that you are comfortable with its basic functionality. You may ask anyone you wish for help, but please make sure that ultimately you understand how to do each task by yourself. Once you have tried the tasks below, you should submit the files lstopo.out, timings.csv.pdf and writeup.txt through CMS.

  1. Follow the instructions under First steps. You now have a folder cs5220-s14 in the directory from which you ran git clone, presumably your home directory on C4. Note that you can use git pull origin to retrieve the latest updates from the class repository.

  2. Go to the membench directory and type make run, which first builds the membench executable and then submits the job to the cluster using the csub script. Note that you should not run membench directly on the front-end node. You can inspect the Makefile to see what commands make executes. The contents of the csub script may also be of interest (typing which csub will reveal the location of the script).

  3. The make run command will return almost immediately, but it will actually take a few moments to run membench on the cluster nodes. You can monitor the state of your job using the condor_q and condor_status commands. The file csub-XXXX.sub contains the job description. The files csub-XXXX-o.0, csub-XXXX-e.0 and csub-XXXX-l.0 are also generated (XXXX is a number given by the id of the process that created them). They store output, error and log information, respectively, for the particular job. If your job is still not running after you have waited for a while, use condor_q -analyze XXXX to get a diagnostic message. condor_hold will suspend a submitted job until it is later released for execution by the command condor_release. You may also use condor_rm XXXX.X to cancel the job.

  4. Once the program is done running you should have a file called timings.csv which contains the memory benchmark results. We have also supplied you with the timing results from a prior benchmark on a pac node which are available in the timings-pac.csv file. Note that while the program is running, timings.csv will be visible in your directory, but will not have anything in it. Once the file is ready, type make pdf to produce some plots using the membench.py python script. You can then fetch the files to your own machine for inspection.

  5. Now that we have our performance results, we may wish to retrieve some additional information about the hardware. The pinfo script found under the membench folder will output lstopo.out and cpuinfo.out, which contain useful information about the processor. Suppose that we want to run the job on a specific node. We can restrict execution to a particular group, machine or slot in the htcondor pool. Type condor_status -master to get a list of all the nodes. Pick the name of a pac machine at random. Let's say we decided to run the job on slot 5 of pac-0-2. Use your favorite editor (maybe nano if you don't have a favorite) to create a text file, here called myreq.txt, which will contain the job requirements in a line of the form requirements = Machine == "pac-0-2.local". We can then execute the pinfo script on the target node using: csub -a -f myreq.txt ./pinfo. Try running pinfo both on a cs-instructional node and a pac node.

  6. Transfer a copy of the results from steps 4 and 5 to your local machine with the help of scp or sftp. If you are using Linux or OS X, type scp netID@c4.coecis.cornell.edu:~/cs5220-s14/membench/timings.csv.pdf . in a terminal on your local machine. Alternatively, you can connect to C4 with sftp netID@c4.coecis.cornell.edu, change directories with cd cs5220-s14/membench and then fetch the files with get *.pdf. Windows users may want to check WinSCP, Cygwin and Xming.

  7. The cache line length on the instructional nodes is 64 bytes; if you look at the behavior at stride 64, can you see the four basic memory access times for a hit in L1, a hit in L2, a hit in L3, and an access to main memory? Try to identify how the features of the memory hierarchy show up in the plots. How does the memory hierarchy of a cs-instructional node differ from that of a pac node? Give a brief comparison of the timing results in which you identify major features of the memory performance for each system and justify the difference using the processor information we have previously retrieved.

  8. Please include a short document (as writeup.txt) in your submission which should have your memory hierarchy analysis and any assignment feedback you may have. You can also tell us a bit about yourself, reasons for taking the class, any particular topics you would like to see us cover during the semester and whether you currently have a good idea for the final class project.

  9. To complete the assignment, submit your results (the output files) for the cs-instructional node of your choosing, along with your writeup, on CMS.
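
If you would like to look at the numbers behind the plots while working on step 7, the sketch below shows one way to pull out the stride-64 measurements and flag the array sizes at which the access time jumps. It is only a sketch under assumptions: we assume each row of timings.csv has the form size_in_bytes,stride_in_bytes,ns_per_access (check the file and membench.py before relying on this), the helper names stride_profile and find_jumps are made up for illustration, the 1.5x jump threshold is arbitrary, and the sample rows are invented so that the example runs anywhere.

```python
import csv
import io


def stride_profile(rows, stride=64):
    """Return sorted (array_size_bytes, ns_per_access) pairs for one stride.

    ASSUMPTION: each row looks like [size_bytes, stride_bytes, ns_per_access].
    Rows that do not parse (blank lines, a header line) are skipped.
    """
    profile = []
    for row in rows:
        if len(row) < 3:
            continue
        try:
            size, row_stride, ns = int(row[0]), int(row[1]), float(row[2])
        except ValueError:
            continue
        if row_stride == stride:
            profile.append((size, ns))
    return sorted(profile)


def find_jumps(profile, ratio=1.5):
    """Return (smaller_size, larger_size) pairs between which the access
    time grows by more than `ratio`. The threshold is an arbitrary choice."""
    jumps = []
    for (prev_size, prev_ns), (size, ns) in zip(profile, profile[1:]):
        if prev_ns > 0 and ns / prev_ns > ratio:
            jumps.append((prev_size, size))
    return jumps


# Invented sample data in the assumed format; these are NOT real measurements.
SAMPLE = """4096,64,1.0
4096,128,1.1
65536,64,3.0
1048576,64,3.5
16777216,64,12.0
"""

if __name__ == "__main__":
    # To use real data, replace io.StringIO(SAMPLE) with open("timings.csv").
    profile = stride_profile(csv.reader(io.StringIO(SAMPLE)))
    for size, ns in profile:
        print(f"{size:>10d} bytes: {ns:6.2f} ns")
    for lo, hi in find_jumps(profile):
        print(f"access time jumps between {lo} and {hi} bytes")
```

A jump between two array sizes suggests that the working set stopped fitting in one level of the hierarchy somewhere in that range; comparing those ranges against the cache sizes reported in lstopo.out is one way to support the analysis in your writeup.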
