RSEM fails when adjacent reads are not mates of a paired-end read

Issue #9 resolved
Per Unneberg created an issue

Command:

pytest tests/test_scrnaseq.py::TestSampleOrganization --slow  -W

Output:

38 of 48 steps (79%) done
rule rsem_calculate_expression:
    input: s1/s1.merge.tx.sortn.bam, ../ref/rsem_index.transcripts.fa
    output: s1/s1.merge.tx.sortn.isoforms.results, s1/s1.merge.tx.sortn.genes.results
Read SRR358836.311489: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)
Error in job rsem_calculate_expression while creating output files s1/s1.merge.tx.sortn.isoforms.results, s1/s1.merge.tx.sortn.genes.results.
RuleException:
CalledProcessError in line 818 of /home/peru/dev/snakemake-workflows/snakemake_workflows/scrnaseq/workflow.sm:
Command 'rsem-calculate-expression --alignments --paired-end -p 1 s1/s1.merge.tx.sortn.bam ../ref/rsem_index s1/s1.merge.tx.sortn' returned non-zero exit status 255
  File "/home/peru/anaconda3/envs/py3.4-devel/lib/python3.4/concurrent/futures/thread.py", line 54, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
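
The error message says RSEM expects the two mates of a paired-end read to sit on adjacent lines of the alignment file. To look for the first place where an input breaks that expectation, a rough check along the following lines might help. This is a minimal sketch and not RSEM's own validation: the function name `first_unpaired_read` is made up for illustration, it works on SAM text lines (e.g. from `samtools view -h`), and it assumes records come strictly in pairs and that mates are identified by sharing a read name and carrying the SAM first-in-pair (0x40) and second-in-pair (0x80) flag bits.

```python
def first_unpaired_read(sam_lines):
    """Return the name of the first read whose neighbour is not its mate.

    Hypothetical helper, not part of RSEM or the workflow. Returns None
    when every consecutive pair of records looks like two mates.
    """
    records = []
    for line in sam_lines:
        # Skip SAM header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 1 is the read name (QNAME), column 2 the bitwise FLAG.
        records.append((fields[0], int(fields[1])))

    # Walk the records two at a time and test each pair.
    for i in range(0, len(records), 2):
        pair = records[i:i + 2]
        if len(pair) < 2:
            # A trailing record without a partner.
            return pair[0][0]
        (name1, flag1), (name2, flag2) = pair
        same_name = name1 == name2
        # One record must be flagged first-in-pair, the other second-in-pair.
        both_mates = {flag1 & 0xC0, flag2 & 0xC0} == {0x40, 0x80}
        if not (same_name and both_mates):
            return name1
    return None
```

Feeding it the lines of a coordinate-sorted file would typically report a read early on, while a file in which each alignment's mates are kept together should return `None`.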

Comments (2)

  1. Per Unneberg reporter

    Run rsem on unique reads and sorted output

    Fixes #9

    • Merging is now done on the toTranscriptome.out_unique.bam files, i.e. bamtools unique is run on the STAR alignment before merging.
    • Merged files are then sorted by read name (tag 'sortn') by the new rule samtools_sort_by_name.

    → <<cset 5e676a8922b8>>
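
For readers who want to reproduce the last two steps by hand, the commands could be assembled roughly as below. This is only a sketch: the helper names and the unsorted input file name are hypothetical, the actual body of the samtools_sort_by_name rule is not shown in this issue, and the `samtools sort -n -o` form assumes a samtools 1.x release. The rsem-calculate-expression arguments are copied from the log above.

```python
def sort_by_name_command(in_bam, out_bam, threads=1):
    """Build a samtools call that sorts a BAM file by read name.

    Hypothetical helper; assumes samtools 1.x, where -n sorts by read
    name, -@ sets extra threads and -o names the output file.
    """
    return ["samtools", "sort", "-n", "-@", str(threads), "-o", out_bam, in_bam]


def rsem_command(bam, index, prefix, threads=1):
    """Build the rsem-calculate-expression call as it appears in the log."""
    return [
        "rsem-calculate-expression",
        "--alignments",
        "--paired-end",
        "-p", str(threads),
        bam,
        index,
        prefix,
    ]


# "s1/s1.merge.tx.bam" is an assumed name for the merged, unsorted input.
sort_cmd = sort_by_name_command("s1/s1.merge.tx.bam", "s1/s1.merge.tx.sortn.bam")
rsem_cmd = rsem_command(
    "s1/s1.merge.tx.sortn.bam", "../ref/rsem_index", "s1/s1.merge.tx.sortn"
)
```

Each list can be passed to `subprocess.run(cmd, check=True)`; returning argument lists rather than shell strings avoids quoting problems with unusual file names.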
