tutorial-edinburgh2016 / Running Sampling Workflows Using ExTASY

Documentation

Detailed documentation for ExTASY is available online at http://extasy-workflows.readthedocs.org. We will follow the instructions there closely.

If you run into problems, there is a troubleshooting page which gives answers to many common problems.

Installation & Setup

Log in to workflow.iu.xsede.org, where we have set up the ExTASY scripts.

A Python virtual environment is already set up, with the dependency radical.ensemblemd installed. To activate it, run:

source ~/ve/bin/activate

Check that everything is ready by running:

(ve)[ibethune@workflow ~]$ ensemblemd-version 

You should see that version 0.4 is installed.

You can skip the subsection Preparing the Environment since a MongoDB server is provided for the tutorial, and SSH keys to access the HPC machines have already been installed. You would need to follow these instructions if you were using ExTASY on your own.

We have prepared a specialised version of the scripts referred to in the documentation. You can download them all and unpack them in your home directory on workflow:

(ve)[ibethune@workflow ~]$ wget https://bitbucket.org/extasy-project/extasy-workflows/downloads/edinburgh_tutorial_2016.tar
(ve)[ibethune@workflow ~]$ tar -xvf edinburgh_tutorial_2016.tar
(ve)[ibethune@workflow ~]$ cd edinburgh_tutorial_2016

All of the scripts referred to in the documentation are contained here. You should not need to download any more scripts; just look in this directory.

Running a simple workflow script locally

Section 4 of the documentation covers how to set up and run a workflow using the Simulation-Analysis Loop pattern, with test kernels which generate random data and count the occurrence of characters within it. This will give a sense of the output to be expected when running workflows like ExTASY.

To begin with, we can just run the workflow locally. It should complete in around 30s.

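Note: if this is the first time you have run on workflow, there will be a pause of around 2 minutes after the Job waiting on queue... message while the execution environment is set up.

If successful, you should see output like the following. This example was captured with epsrc.archer as the target, so the resource name in your own command and output may differ: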

(extasy-test)[ibethune@workflow ]$ cd generic
(extasy-test)[ibethune@workflow generic]$ python multiple_simulations_single_analysis.py epsrc.archer
================================================================================
 EnsembleMD (0.4)                                                            
================================================================================

Starting Allocation                                                           ok
Verifying pattern                                                             ok
Starting pattern execution                                                    ok
--------------------------------------------------------------------------------
Executing simulation-analysis loop with 4 iterations on 1 allocated core(s) on 'epsrc.archer'

Job waiting on queue...
Job is now running !
Iteration 1: Waiting for 16 simulation tasks: misc.mkfile to complete          done
Iteration 1: Waiting for analysis tasks: misc.ccount to complete            done
Iteration 2: Waiting for 16 simulation tasks: misc.mkfile to complete          done
Iteration 2: Waiting for analysis tasks: misc.ccount to complete            done
Iteration 3: Waiting for 16 simulation tasks: misc.mkfile to complete          done
Iteration 3: Waiting for analysis tasks: misc.ccount to complete            done
Iteration 4: Waiting for 16 simulation tasks: misc.mkfile to complete          done
Iteration 4: Waiting for analysis tasks: misc.ccount to complete            done
--------------------------------------------------------------------------------
Pattern execution successfully finished                                         

Starting Deallocation                                                       done 

Congratulations, you have successfully run a simple simulation-analysis workflow!
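If you capture the workflow output to a file, the progress lines can be checked automatically. The following is a minimal sketch, not part of ExTASY: the function name summarise_log is hypothetical, and the line formats it matches are assumed to be exactly those in the sample output above.

```python
import re

# Matches simulation progress lines such as:
#   "Iteration 1: Waiting for 16 simulation tasks: misc.mkfile to complete   done"
# Analysis lines carry no task count and are deliberately not matched.
SIM_LINE = re.compile(
    r"Iteration (\d+): Waiting for (\d+) simulation tasks: (\S+) to complete\s+(\w+)"
)
SUCCESS_MARKER = "Pattern execution successfully finished"


def summarise_log(log_text):
    """Return (succeeded, {iteration: simulation_task_count}) for captured output."""
    tasks_per_iteration = {}
    for match in SIM_LINE.finditer(log_text):
        iteration, n_tasks, _kernel, status = match.groups()
        # Only count iterations whose simulation stage reported "done".
        if status == "done":
            tasks_per_iteration[int(iteration)] = int(n_tasks)
    return SUCCESS_MARKER in log_text, tasks_per_iteration
```

For the run shown above, this would report success with four iterations of 16 simulation tasks each, that is, 64 simulation tasks in total.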

You should also find in your current directory a set of files produced by the analysis part of each iteration. These contain the counts of the characters occurring in the randomly generated files of each 'simulation' task:

(extasy-test)[ibethune@workflow generic]$ ls -ltr
total 120
-rwxr-xr-x 1 ibethune portal  3157 Feb 18 09:02 multiple_simulations_multiple_analysis.py
-rwxr-xr-x 1 ibethune portal  3345 May 11 10:08 multiple_simulations_single_analysis.py
-rw------- 1 ibethune portal 27072 May 11 10:15 cfreqs-1.dat
-rw------- 1 ibethune portal 27072 May 11 10:18 cfreqs-2.dat
-rw------- 1 ibethune portal 27072 May 11 10:20 cfreqs-3.dat
-rw------- 1 ibethune portal 27072 May 11 10:24 cfreqs-4.dat
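The check for one output file per iteration can also be scripted. This is a minimal sketch under stated assumptions: the helper name missing_outputs is hypothetical, and the cfreqs-N.dat naming is taken from the listing above rather than from the ExTASY documentation.

```python
import os


def missing_outputs(directory, num_iterations, pattern="cfreqs-{}.dat"):
    """List expected per-iteration output files that are absent or empty."""
    missing = []
    for iteration in range(1, num_iterations + 1):
        path = os.path.join(directory, pattern.format(iteration))
        # Treat a zero-length file as missing: the analysis wrote nothing useful.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(os.path.basename(path))
    return missing
```

Calling missing_outputs(".", 4) in the generic directory after the run above should return an empty list if all four analysis outputs were written.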

You can also run this script on a remote HPC machine. You can select the machine you want to run on by passing its name as an argument to the script, for example to run on ARCHER:

python multiple_simulations_single_analysis.py epsrc.archer

A list of possible targets to run the workflow on is in the supplied config.json file. For Stampede, choose xsede.stampede. Don't forget to set the correct project and queue in the .rcfg file which can be found on the Computer Environment Setup page.
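To see the available target names without opening the file by hand, something like the following may help. It is only a sketch: it assumes that config.json holds a JSON object whose top-level keys are the resource names (such as epsrc.archer and xsede.stampede), which this page does not confirm, and list_targets is a hypothetical helper name.

```python
import json


def list_targets(config_path="config.json"):
    """Return the sorted top-level keys of a JSON config file.

    ASSUMPTION: the top-level keys are the target names; check the file
    itself if the result does not look like a list of machines.
    """
    with open(config_path) as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(config)
```

If the file is laid out differently, open it in an editor and look for the resource names directly.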

Running ExTASY Workflows on ARCHER or Stampede

Having run example workflows, you will now run real Simulation-Analysis workflows that implement the CoCo-MD and DM-d-MD algorithms introduced earlier.

Example ExTASY scripts are provided for running workflows based on CoCo-MD (section 5) in edinburgh_tutorial_2016/amber-coco and DM-d-MD (section 6) in edinburgh_tutorial_2016/gromacs-lsdmap.

Unlike the 'generic' example above, these run real MD calculations and use the CoCo and LSDMap analysis tools you have used already. The input files are designed to run quickly, so the individual simulations are shorter, there are fewer of them, and there are fewer simulation-analysis loop iterations than would be typical for a real calculation.

The Amber-CoCo and Gromacs-LSDMap workflows have been implemented as Python programs which can be configured via external files. The .rcfg file contains details of the target HPC machine, job size and length, and the .wcfg file contains parameters that control the workflow itself. You'll need to set the .rcfg up with your username, allocation and queue. You should not need to modify the source code, although you are welcome to look at it to understand how the workflow is implemented.

Depending on the machine you are using, the ExTASY workflows with the provided parameters should take up to 10 minutes to run (excluding any queue waiting time).

Once the workflow has run successfully and finishes with:

--------------------------------------------------------------------------------
Pattern execution successfully finished                                         

Starting Deallocation                                                       done 

then check to see that the output files from the workflow have been produced in your current directory. There are sections of the documentation for CoCo-MD and DM-d-MD that explain what output is expected and how to interpret it.

Make sure that you can run both workflows on your chosen HPC machine.

Extension Exercises

If you have completed all of the above successfully, try experimenting with some of the workflow parameters, for example num_CUs, which controls how many MD runs will be executed at each iteration, and num_iterations, which controls the number of simulation-analysis loops.

Explore how the time taken to execute the workflow varies when you increase PILOTSIZE, the total number of compute cores which may be used.
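One simple way to compare runs is to wrap the workflow command in a small timing helper. The sketch below is an illustration only: the file name time_run.py and the function time_command are hypothetical. The measured wall time includes any time the job spends waiting in the queue, so compare runs with that in mind.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run a command to completion and return (exit_code, elapsed_seconds)."""
    start = time.time()
    exit_code = subprocess.call(command)  # blocks until the command finishes
    return exit_code, time.time() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the helper's own name is treated as the command to time.
    code, seconds = time_command(sys.argv[1:])
    print("exit code {}, wall time {:.1f} s".format(code, seconds))
```

For example, python time_run.py python multiple_simulations_single_analysis.py epsrc.archer would time the generic workflow used earlier. Change one parameter between runs and record the wall time each time.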
