More guidelines for pre-processing step please

Issue #10 resolved
mabelwong created an issue

According to the paper, 'As a pre-requisite for binning, the user must create BAM files by aligning the reads of each sample separately to the assembled metagenome'

As a non bioinformatician who is trying to work through some metagenomics data, I am confused... May I ask:

  • Can I use this software for my data please? I have 6 different samples in total, and 1 Illumina PE run per sample. I would like to perform genome binning per sample.

  • How to create BAM files please? I have the raw Illumina reads, and assembled reads generated by Abyss.

I am not sure whether this is the place for asking questions, but this is the closest thing to a Q&A forum I find here. Sorry in advance if the message shouldn't be posted here. Your answer will be greatly appreciated.

Thank you very much, Mabel

Comments (1)

  1. Rob Egan

    Yes, you absolutely can use MetaBAT on your data. I recommend generating 6 separate BAMs, one for each sample/Illumina run. Generating BAMs is outside the scope of MetaBAT, but it is easy to do, and there are many ways to do it if you search Google and the SEQanswers forums: http://seqanswers.com.

    I recommend downloading and installing bwa and samtools. You can use bwa to align your reads (I recommend the mem module) into a SAM file, and then use samtools to convert the SAM into a sorted BAM file.

        bwa index reference.fasta
        bwa mem [options] reference.fasta <in1.fq> [in2.fq] > in.fq.sam
        samtools view -Sbu in.fq.sam | samtools sort - in.fq.sam

    Repeat for each sample; you can then use reference.fasta and the resulting 6 different *.fq.sam.bam files as input into runMetabat.sh
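For anyone scripting the per-sample repetition, here is a minimal sketch of the same workflow as a loop. It rests on several assumptions that are not in the comment above: the sample names (sample1 to sample6), the read file naming (<sample>_R1.fq and <sample>_R2.fq), the output name (<sample>.sorted.bam), and samtools 1.x, where the sort output is given with -o rather than as a prefix as in the older syntax shown earlier. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: align each sample's paired-end reads to the assembly
# and write one sorted BAM per sample.
# Hypothetical names (assumptions, not from the original post):
#   sample1..sample6, <sample>_R1.fq / <sample>_R2.fq, <sample>.sorted.bam
# Assumes bwa and samtools 1.x are installed when DRY_RUN=0.

REF="reference.fasta"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands; 0 = print and run them

# Print a command line, and execute it unless we are in dry-run mode.
run() {
    echo "$1"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$1"
    fi
}

# Name of the sorted BAM produced for a given sample.
bam_name() {
    echo "$1.sorted.bam"
}

# Build the align-and-sort pipeline for one sample.
# bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-").
align_cmd() {
    echo "bwa mem $REF $1_R1.fq $1_R2.fq | samtools sort -o $(bam_name "$1") -"
}

# Index the assembly once, then align every sample against it.
run "bwa index $REF"
for s in sample1 sample2 sample3 sample4 sample5 sample6; do
    run "$(align_cmd "$s")"
done
```

Piping bwa mem straight into samtools sort avoids keeping the intermediate SAM file on disk. The six sorted BAMs, together with reference.fasta, are then the inputs to runMetabat.sh as described above.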
