KMA only outputs a single alignment
I have a test fastq file of 125,000 interleaved metagenomics reads that generates alignments to 42 reference genomes when I run it on the CCMetagen server: https://cge.cbs.dtu.dk//cgi-bin/webface.fcgi?jobid=5E3DB53E00005FFB8AD4C041
When I run the same file locally using KMA-1.2.21 with the ncbi_nt_no_env_11jun2019 database, only one alignment appears. My command was:
kma -int ../rqc_data/reads/FRCS-D1-R1-238._sample.rqc.fq -o test1 -t_db ../../gbru_fy20_rice_methane/reference_db/ncbi_nt_no_env_11jun2019/ncbi_nt_no_env_11jun2019 -t 76 -1t1 -and -apm f -mem_mode
The file used to generate the issue is attached.
Thanks for looking into it.
Comments (5)
-
reporter Okay, thanks. Does this mean that only 42 reads are being aligned out of 125,000? If so, this seems quite low relative to Diamond, etc. Can you suggest parameter changes to increase recall?
-
Each alignment in the *.aln file is the consensus sequence for one template that reached the thresholds. In other words, each alignment summarizes all the reads that aligned to that template, with each base determined by majority voting, so 42 alignments means 42 templates, not 42 reads.
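To illustrate the majority-voting idea, here is a minimal Python sketch. It is not KMA's actual implementation: the function name `majority_consensus`, the toy reads, and the choice to treat `-` as "no coverage" are all assumptions made for the example.

```python
from collections import Counter


def majority_consensus(aligned_reads, gap="-"):
    """Per-column majority vote over equal-length, already-aligned reads.

    Hypothetical helper for illustration only; KMA's own consensus
    calling is more involved than this.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # Assumption: a gap means the read does not cover this position.
        bases = [base for base in column if base != gap]
        if not bases:
            consensus.append(gap)
            continue
        # The most frequent base wins; ties go to the first base seen.
        consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


# Four toy reads aligned to the same template, collapsed into one sequence.
reads = ["ACGTAC", "ACGTTC", "ACCTAC", "--GTAC"]
print(majority_consensus(reads))
```

However many reads map to a template, they collapse into a single consensus entry, which is why the number of alignments in the output does not equal the number of aligned reads.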
-
- changed status to closed
Hi Adam
The CCMetagen webserver reads your fastq sample as single-end reads instead of interleaved paired-end reads. I will add a note about this to the webserver.
When you analyse the sample using the paired-end information, KMA splits the input reads over several templates, which means that the individual templates are no longer significantly overrepresented. There seem to be too few fragments to use the paired-end information properly. You can adjust this by lowering the threshold for including templates, e.g. by setting the option “-mrs” to 0.01.
Best,
Philip
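As a sketch of the suggestion above, the snippet below rebuilds the original command with `-mrs 0.01` appended and prints it for pasting into a shell. The helper `build_kma_command` is hypothetical; the flags and paths are copied from the command in the issue, and 0.01 is the example value given in the reply, not a tuned recommendation.

```python
import shlex


def build_kma_command(reads, out_prefix, db_prefix, threads=76, mrs=0.01):
    """Return the KMA argument list from the issue, plus a lowered -mrs.

    Hypothetical helper: it only assembles the command, it does not run KMA.
    """
    return [
        "kma",
        "-int", reads,          # interleaved paired-end input
        "-o", out_prefix,
        "-t_db", db_prefix,
        "-t", str(threads),
        "-1t1", "-and",
        "-apm", "f",
        "-mem_mode",
        "-mrs", str(mrs),       # lowered threshold for including templates
    ]


cmd = build_kma_command(
    "../rqc_data/reads/FRCS-D1-R1-238._sample.rqc.fq",
    "test1",
    "../../gbru_fy20_rice_methane/reference_db/ncbi_nt_no_env_11jun2019/ncbi_nt_no_env_11jun2019",
)
print(shlex.join(cmd))
```

Comparing the number of templates reported before and after the change shows whether the lower threshold recovers the templates seen on the webserver.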