ATLAS-Pipeline / Pallas

Population ALLele-frequency AnalysiS

This part of the workflow handles your desired downstream analysis. At the moment, Pallas cannot run unless Perses has been run first, because it depends on the files Perses creates.

Before running the pipeline:

  1. Make sure you ran Perses successfully.
  2. Create a config file. An example can be found at example_files/example_config_Perses.yaml

Configfile

Provide an individual configfile in YAML format for each project. This file can be shared with other researchers so they can perform exactly the same analysis independently.
You can adapt the default behavior of all ATLAS tasks by passing a string to the keyword atlasParams. In addition, each individual task accepts its own parameter string (e.g. glfParams for creating GLF-files). If the same parameter is set more than once, the task-specific string (e.g. glfParams) takes precedence over atlasParams, and atlasParams takes precedence over the pipeline's default behavior.
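
The precedence described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the helper names `parse_params` and `resolve_params`, the "key=value" tokenization, and the default `minDepth=1` are assumptions made for illustration only.

```python
def parse_params(param_string):
    """Turn a string like "minDepth=2 filterSoftClips" into a dict.

    Flags without "=" (e.g. filterSoftClips) are stored with value None.
    """
    params = {}
    for token in param_string.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def resolve_params(defaults, atlas_params, task_params):
    """Merge parameters; later sources override earlier ones."""
    merged = dict(defaults)                      # lowest priority
    merged.update(parse_params(atlas_params))    # overrides the defaults
    merged.update(parse_params(task_params))     # highest priority
    return merged


# Hypothetical pipeline default, chosen only to show the override order.
defaults = {"minDepth": "1"}
resolved = resolve_params(defaults, "filterSoftClips minDepth=3", "minDepth=2")
print(resolved)
```

In this example the task-specific `minDepth=2` wins over `minDepth=3` from atlasParams, while `filterSoftClips` from atlasParams is still applied.
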
This is a template for an example configfile for Pallas:


runScript: Pallas

#-------------------------
# 1. sample_file (required) - use the same input-file as for the Perses pipeline!
sample_file: supportingFiles/samples.tsv

# 2. programs, references, etc.
ref: /data/projects/p243_ancientdna_unifr/Reference/hs37d5.fa
atlas: /home/ischulz/atlas_develop/atlas

# 3. specify which ATLAS parameters should apply to all tasks (leave empty "" to keep the defaults)
atlasParams: "filterSoftClips"

# 4. create GLF-files. Can be *T* or *F*.
glf: T
glfParams: "minDepth=2"

# 5. estimate heterozygosity (theta). Can be *global* (one theta over the whole genome), *window* or *F* to not estimate theta at all.
# additionally specify *thetaGlobalParams* or *thetaWindowParams*
theta: global
thetaGlobalParams: ""

# 6. call. Can be *Bayes*, *MLE*, *AllelePresence* or *F* to disable calling.
# additionally specify *bayesParams*, *mleParams* or *allelePresenceParams*
call: Bayes
bayesParams: "prior=theta fixedTheta=0.001 formatFields=GT,AD,AB,AI,DP,GQ,GP infoFields=DP"
#mleParams: "formatFields=GT,AD,AB,AI,DP,GQ infoFields=DP "
#allelePresenceParams: "prior=theta fixedTheta=0.001"

# 7. estimate the genetic sex (currently only available for human data). Can be *T* or *F*.
# uses a script from Skoglund (2013, doi.org/10.1016/j.jas.2013.07.004)
sex: F
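
Before launching a run, the switch values in the template above can be checked with a small sketch like the following. It works on a plain dictionary (for example, the result of loading the YAML file) and is not part of the pipeline; the function name `validate_config` and the error messages are hypothetical. The allowed values are taken from the comments in the template.

```python
# Allowed values per switch, taken from the comments in the template.
ALLOWED_VALUES = {
    "glf": {"T", "F"},
    "theta": {"global", "window", "F"},
    "call": {"Bayes", "MLE", "AllelePresence", "F"},
    "sex": {"T", "F"},
}


def validate_config(config):
    """Return a list of problems found in a Pallas config dictionary."""
    problems = []
    # sample_file is the only key the template marks as required.
    if not config.get("sample_file"):
        problems.append("sample_file is required")
    for key, allowed in ALLOWED_VALUES.items():
        if key in config and config[key] not in allowed:
            problems.append(
                f"{key}: {config[key]!r} is not one of {sorted(allowed)}"
            )
    return problems


example = {
    "runScript": "Pallas",
    "sample_file": "supportingFiles/samples.tsv",
    "glf": "T",
    "theta": "global",
    "call": "Bayes",
    "sex": "F",
}
print(validate_config(example))                    # no problems expected
print(validate_config({"theta": "genome-wide"}))   # two problems expected
```

Returning a list instead of raising on the first error lets all mistakes in a config be reported at once.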

Samples file

Please use the same input file you used for Perses (the one containing the Perses input). Do not update the paths.

Parameters:

For detailed information on the individual parameters you want to apply to each job, please refer to these ATLAS-Wiki pages: Engine-Parameters, GLF, Theta, Calling

Results:

The results can be found in Results/4.Pallas/

File                      Description
Results/4.Pallas/GLF/     GLF-files you can use e.g. in MajorMinor
Results/4.Pallas/theta/   Heterozygosity estimates (theta)
Results/4.Pallas/call/    Calling results
Results/4.Pallas/sex/     Sex estimation
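
To see which of these folders a given config should produce, the mapping can be sketched as below. This assumes that a step switched off with *F* creates no output folder, which this page does not state; `expected_output_dirs` is a hypothetical helper, not a pipeline function.

```python
from pathlib import PurePosixPath

RESULTS_ROOT = PurePosixPath("Results/4.Pallas")


def expected_output_dirs(config):
    """List the result folders expected for the enabled steps."""
    dirs = []
    if config.get("glf") == "T":
        dirs.append(RESULTS_ROOT / "GLF")
    if config.get("theta") in ("global", "window"):
        dirs.append(RESULTS_ROOT / "theta")
    if config.get("call") in ("Bayes", "MLE", "AllelePresence"):
        dirs.append(RESULTS_ROOT / "call")
    if config.get("sex") == "T":
        dirs.append(RESULTS_ROOT / "sex")
    return [str(d) for d in dirs]


# Switch values from the example configfile (sex estimation disabled).
config = {"glf": "T", "theta": "global", "call": "Bayes", "sex": "F"}
for folder in expected_output_dirs(config):
    print(folder)
```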
