- changed status to resolved
rna-seq gold standard variant calling function
Issue #17
resolved
Amjad: feel free to reassign it to someone else
follow gatk gold standards for variant calling for making this function, here is already some code for it in anduril1, but check the gatk documentation as well
reference = INPUT(path="/opt/share/annotation/human-ensembl38/genome.fa")
for r:std.iterArray(bams.in) {
regroup = BashEvaluate( var1 = bams.in[r.key],
script = "java -Xmx4096m -jar /mnt/csc-gc5/opt/picard/picard.jar AddOrReplaceReadGroups I=@var1@ O=@optOut1@ RGID=r.key RGLB=paired RGSM=r.key RGPL=ILLUMINA RGPU=ILLUMINA",
@name = "regroup_"+r.key)
@out.optOut1.filename = "dedupped.bam"
dup = BashEvaluate( var1 = regroup.optOut1,
script = "java -Xmx4096m -jar /mnt/csc-gc5/opt/picard/picard.jar MarkDuplicates I=@var1@ O=@optOut1@ CREATE_INDEX=true VALIDATION_STRINGENCY=SILENT M=@optOut2@",
@name = "marked_"+r.key)
@out.optOut1.filename = "split.bam"
splitNtrim = BashEvaluate( var1 = dup.optOut1,
var2 = reference,
script ="""export _JAVA_OPTIONS=-Djava.io.tmpdir=@folder1@
java -jar /opt/share/gatk-3.5/GenomeAnalysisTK.jar -T SplitNCigarReads -R @var2@ -I @var1@ -o @optOut1@ -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS --fix_misencoded_quality_scores -fixMisencodedQuals""",
@name = "splitNtrim_"+r.key,
@execute= "once")
@out.optOut1.filename = "calls.vcf"
hapCalls = BashEvaluate(var1 = splitNtrim.optOut1,
var2 = reference,
script ="""java -jar /opt/share/gatk-3.5/GenomeAnalysisTK.jar -T HaplotypeCaller -R @var2@ -I @var1@ -dontUseSoftClippedBases -stand_call_conf 20.0 -stand_emit_conf 20.0 -o @optOut1@""",
@name = "hapCalls_"+r.key)
@out.optOut1.filename = "filtered.vcf"
filtered = BashEvaluate(var1 = hapCalls.optOut1,
var2 = reference,
script ="""java -jar /opt/share/gatk-3.5/GenomeAnalysisTK.jar -T VariantFiltration -R @var2@ -V @var1@ -window 35 -cluster 3 -filterName FS -filter "FS > 30.0" -filterName QD -filter "QD < 2.0" -o @optOut1@""",
@name = "filtered_"+r.key)
}
Comments (1)
-
- Log in to comment
created PrepRnaBam function that should do all preprocessing. Variant calling should be separate since it can be done in several different ways