Getting Started on Farm
This page walks you through the process of using Farm with the example code contained in the Computing repository. For more details see the Farm page.
Outline
- Create & Test Code locally (on your computer)
- Create Cluster Version of Code
- Upload Data
- Upload Code
- Run Code
  - How to set up a job
  - How to set up a batch job
  - Keep tabs on the work
  - Get notified when done
- Retrieve results
Example, Step 1: Creating and testing code locally
1. Get a copy of the code
git clone https://bitbucket.org/hijmans-lab/computing.git #git clone git@bitbucket.org:hijmans-lab/computing.git
2. Generate sample data
The repository includes a script that generates fake data for testing purposes: 10 text files of random numbers between 0 and 100.
Run the following script once. Note: if you are not running it from the command line, you will need to set the working directory first.
Rscript generateexample.R
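If you are curious what the generator does, a minimal sketch looks roughly like this (the real generateexample.R may differ in details such as row counts and file format):

```r
# Hypothetical sketch of a generator like generateexample.R:
# write 10 CSV files of random numbers 0-100 into an example/ directory.
dir.create("example", showWarnings = FALSE)
for (i in 1:10) {
  write.csv(data.frame(value = sample(0:100, 1000, replace = TRUE)),
            file.path("example", paste0(i, ".csv")),
            row.names = FALSE)
}
```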
3. Test a simple script
Once you have the sample data created, you can test a local script that works with one file at a time.
The single-test.R script calculates summary statistics and plots a histogram. It takes one argument: the path to the file you want to process.
Rscript single-test.R example/1.csv
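The essence of such a script can be sketched as follows; this is a hypothetical outline, and the real single-test.R may differ:

```r
# Hypothetical sketch of a per-file script like single-test.R:
# read one CSV of numbers given as the first command-line argument,
# print summary statistics, and save a histogram next to the input.
args <- commandArgs(trailingOnly = TRUE)
x <- read.csv(args[1])[[1]]
print(summary(x))
png(sub("\\.csv$", ".png", args[1]))
hist(x, main = basename(args[1]))
dev.off()
```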
4. Try adding looping
The single-loop-test.R script calculates summary statistics and plots a histogram for every numbered #.csv file in a directory. It takes one argument: the path to the directory you want to process.
Rscript single-loop-test.R example
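The loop version just wraps the per-file logic; a hypothetical sketch (the real single-loop-test.R may differ):

```r
# Hypothetical sketch of a loop like single-loop-test.R:
# find every numbered #.csv file in the given directory and
# process each one in turn.
args <- commandArgs(trailingOnly = TRUE)
files <- list.files(args[1], pattern = "^[0-9]+\\.csv$", full.names = TRUE)
for (f in files) {
  x <- read.csv(f)[[1]]
  print(summary(x))
}
```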
Example, Step 2+: Moving to the cluster
1. Clone the code to the cluster
Biogeo users have a deployment key set up; otherwise, log in to [FARM] and use git clone as you normally do on other machines.
Upload: using an SFTP client (e.g., Filezilla) you can upload your code and data to Farm.
Login to Farm
ssh <user>@farm.cse.ucdavis.edu
Clone your Code if you didn't upload it already
git clone git@bitbucket.org:hijmans-lab/computing.git
2. Running code
For sanity, start with code that does the bare minimum on the cluster. You can test the single-file scripts from the previous section to get used to running commands on a Linux machine.
Next, try cluster-slurm-test-simple.R; this script just prints the array task ID. sbatch requires that you send it a shell script (the .sh file). In this case we've provided a generic one that can wrap any R script you want to run on the cluster. Have a peek in clusterR.sh to see how it works.
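Conceptually, a generic wrapper works like this; the following is a hypothetical sketch, and the real clusterR.sh may differ:

```shell
# Hypothetical sketch of what a wrapper like clusterR.sh does: sbatch
# runs the shell script, SLURM exports SLURM_ARRAY_TASK_ID into the job
# environment, and the wrapper forwards the R script path it received
# as its first argument on to Rscript.
SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-1}  # set by SLURM on Farm; defaulted here for illustration
script="cluster-slurm-test-simple.R"           # would be "$1" inside the wrapper
cmd="Rscript $script"                          # the command the wrapper would run
echo "array task $SLURM_ARRAY_TASK_ID would run: $cmd"
```

Because SLURM_ARRAY_TASK_ID is an environment variable, the wrapped R script can read it without any extra plumbing.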
sbatch --array=1-10 --job-name=test1 --partition=serial --mail-type=ALL --mail-user=<you>@ucdavis.edu /home/<username>/computing/clusterR.sh /home/<username>/computing/cluster-slurm-test-simple.R
3. Running complicated code
Now that you've gotten a simple script to work on the cluster, you can run the same analysis we did earlier, but in parallel on the cluster (all files at the same time instead of one by one). To do this you'll want the cluster version of single-test, called cluster-slurm-test.R. Why the single version rather than the loop? Because when you run things in parallel, each unit runs independently of all the others; the scheduler, not a loop in your script, handles iterating over the files.
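The key difference from the local version can be sketched like this; a hypothetical outline, as the real cluster-slurm-test.R may differ:

```r
# Hypothetical sketch of how a cluster version picks its file:
# instead of looping, each array task reads SLURM_ARRAY_TASK_ID
# and processes only the matching file, so ten tasks cover ten
# files in parallel.
i <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID"))
f <- file.path("example", paste0(i, ".csv"))
x <- read.csv(f)[[1]]
print(summary(x))
```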
Of course, for an initial test you might want to run only one item. To do this, simply provide a single number to the --array option:
sbatch --array=2 --job-name=test2 --partition=serial --mail-type=ALL --mail-user=<you>@ucdavis.edu /home/<username>/computing/clusterR.sh /home/<username>/computing/cluster-slurm-test.R
Once you've confirmed that it works as expected, you can create an array as long as the number of items you need to process. In our example we have 10 files to process, so --array=1-10:
sbatch --array=1-10 --job-name=test10 --partition=serial --mail-type=ALL --mail-user=<you>@ucdavis.edu /home/<username>/computing/clusterR.sh /home/<username>/computing/cluster-slurm-test.R
Want to check on your job?
squeue -u $(whoami)
4. Getting Results
When it finishes, you can reconnect your SFTP client (e.g., Filezilla) and download your results. As a courtesy to others, please clean up your files after you've confirmed your downloads. See Checksums for tips on how to verify that you downloaded all your files correctly. If you are using Hadoop, see the relevant wiki page on how to transfer your results directly into Hadoop instead of downloading them.
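As a hypothetical example of the checksum workflow (file names here are placeholders), you can generate a checksum list on Farm before downloading and verify it locally afterwards:

```shell
# Hypothetical example of verifying downloads with checksums:
# generate a checksum list on Farm, download it along with your
# results, then check it on your own machine.
echo "1,2,3" > results.csv            # stand-in for a real result file
sha256sum results.csv > checksums.txt # on Farm, before downloading
sha256sum -c checksums.txt            # locally, after downloading; prints "results.csv: OK"
```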
Next Steps
There are more helpful tips about using FARM on the FARM page. In particular, learning to use the array mechanism and command-line arguments to control which analysis is run will take a little practice.