!! OUR WIKI HAS MOVED !!
Starting with commit 9ef31db, please refer to https://atlaswiki.netlify.app/pipeline.html
For older commits, this wiki page stays online.
Welcome to the ATLAS-Pipeline,
A pipeline for ancient and modern Low-Depth DNA Analysis using the tool ATLAS!
ATLAS-Pipeline converts all of your fastq files into bamfiles in parallel, performs local realignment, recalibrates your base quality scores, and corrects your data for Post-Mortem Damage (PMD). It does so by combining the most common standard tools into one pipeline while remaining highly flexible. You can also directly call variants, create vcf/glf files, or estimate the heterozygosity of your samples.
Fastq-Files --> Bamfiles --> vcf/glf/θ
Requirements
- ATLAS-Pipeline runs on a Linux-based machine. Support for SLURM clusters is included.
- The program ATLAS must be installed on your local machine.
- To run Rhea (local InDel-realignment), you need a valid GATK license on your machine.
To ensure data continuity, ATLAS-Pipeline works best in a conda environment.
A suggested environment setup is provided within the repository (environment_5.yaml).
If you prefer to work with locally installed programs, here is a list of the packages and versions used throughout this pipeline:
- bcftools=1.9
- bwa=0.7.17
- fastqc=0.11.8
- gatk=3.8
- graphviz=2.40.1
- picard=2.21.1
- python=3.6
- pyyaml=5.1.2
- rpy2=2.9.4
- samtools=1.9
- snakemake=5.4.4
- trim-galore=0.6.4
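If you work with locally installed programs, a quick check that the tools are visible on your PATH can save a failed run. The following is a minimal sketch, not part of the pipeline: `check_tools` is a hypothetical helper, and the executable names you pass to it are assumptions, since a package name in the list above does not always match the name of the installed executable.

```shell
# Report which of the given executables are available on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (executable names are assumptions, adjust as needed):
#   check_tools bcftools bwa fastqc samtools snakemake
```

The check only confirms that an executable exists. It does not compare versions against the list above.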
How to run the ATLAS-Pipeline
You can download the repository to the location on your computer where the analysis should be executed by typing
git clone git@bitbucket.org:wegmannlab/atlas-pipeline.bam.git
General command for execution:
bash Atlas-Pipeline.sh -f [configfile.yaml] [options]
To save the output of a run to a log file, add
&> logs/[logname].txt
to your command.
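The invocation and the log redirection can be combined in a small wrapper. This is a sketch under stated assumptions: `run_atlas` is a hypothetical helper name, it assumes you call it from the directory that contains Atlas-Pipeline.sh, and it uses the portable form `> file 2>&1`, which has the same effect as `&>` in bash.

```shell
# Run the pipeline with a config file and write stdout and stderr to a log.
# Usage: run_atlas <configfile.yaml> <logname> [further pipeline options]
run_atlas() {
    config="$1"
    logname="$2"
    shift 2
    # Create the log directory if it does not exist yet.
    mkdir -p logs
    bash Atlas-Pipeline.sh -f "$config" "$@" > "logs/${logname}.txt" 2>&1
}

# Example call (file names are placeholders):
#   run_atlas configfile.yaml logname
```

The wrapper returns the exit status of the pipeline script, so it can be used in a larger script that stops when a run fails.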
Hint: If you have changed something in your config files and want to rerun one part of the pipeline, delete the outdated output files and the corresponding summaries. For example, if you want to run GLF with different parameters, run
rm Results/4.Pallas/GLF/*
and
rm Results/4.Pallas/summaries/*
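If you clean up outputs often, the deletion step can be wrapped in a helper that also reports what it removed. This is a minimal sketch: `clean_outputs` is a hypothetical name, and the function only deletes regular files directly inside the directories you pass, leaving subdirectories untouched.

```shell
# Delete the regular files directly inside each given directory,
# printing every path that is removed. Subdirectories are left alone.
clean_outputs() {
    for dir in "$@"; do
        for f in "$dir"/*; do
            if [ -f "$f" ]; then
                rm -- "$f"
                echo "removed: $f"
            fi
        done
    done
}

# Example call for the GLF case described in the hint:
#   clean_outputs Results/4.Pallas/GLF Results/4.Pallas/summaries
```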
All available options are listed here, or you can display them with bash Atlas-Pipeline.sh -h
Config-File
To run the ATLAS-Pipeline, you need to provide a config-file. Here you specify all major information, input-files and thresholds needed for your project. You can find example-config-files in 'example_files/example.config.*' and on the wiki pages of each module.
Overview
The complete ATLAS-Pipeline workflow is split into four major parts. Find out more by following the links to:
- Gaia -- Genome Wide Alignment Including Adapter-trimming
  from your sequencing results in fastq format to aligned bamfiles
- Rhea -- Local InDel-Realignment
  locally realign alongside known InDels and a dataset from your population of interest
- Perses -- Post-Mortem-Damage and Error Rate Estimation for Sequence Data
  using ATLAS to merge paired-end reads, split single-end reads, and produce PMD and recal files for further analysis
- Pallas -- Population ALLele-frequency AnalysiS
  produce vcf- and glf-files, estimate heterozygosity and (for mammals) the sex of your individuals.
Disclaimer
ATLAS-Pipeline is under active construction and, although we have a test suite, we do not guarantee that our code is bug-free.
Questions?
Please contact ilektra.schulz@unifr.ch