kma index error
Hello!
I am trying to use KMA for metagenomic classification and want to index a database consisting of the bacterial, viral and fungal RefSeq complete genomes as well as the human genome. However, I get an error when running the index command:
Invalid option: index
This is my code:
module load KMA/2018-Nov-12-foss-2018a
kma index -i path/to/inputfile/file.fna -k 20 -o kmadb208/kmadb208
module purge
Comments (8)
-
reporter -
reporter - changed status to resolved
It seems that you need to add an underscore ("_"), so that the command is:
kma_index -i path/to .......
-
You are running an old version of KMA, which does not include most of the updates made in recent years.
I would recommend updating it.
Best,
Philip -
reporter Dear Philip,
We had memory issues with the installed version and updated to the newest, running with
-NI -Sparse TG
Finalizing the build, the following messages were printed:
# Templates key-value pairs: 1711887037.
# Total time used for DB indexing: 14693.47 s.
# Compressing templates
# Preparing compressed DB.
# Calculating relative indexes.
# Compression overflow.
# Finalizing indexes.
# Dumping compressed DB
# Template database created.
# Total time used for DB compression: 20089.82 s.
Does the ‘compression overflow’ mean that the database build was not finalized?
Sincerely,
Morten
-
Hi Morten
It means that the number of inferred taxis exceeded what could be stored in an unsigned integer, which causes KMA to store them in a long unsigned integer.
So the build is complete and valid. If something unexpected or potentially unintended happens, it is printed to stderr without a preceding '#'. If an error occurs, the exit code will also be different from 0, and an exit code above 1 usually results in an invalid index.
Best,
Philip -
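The check described above can be sketched in shell. This is only an illustration: `check_kma_build` and its labels (`ok`, `warnings`, `error`, `likely-invalid`) are hypothetical names that are not part of KMA, and the sketch assumes stderr of the indexing run was captured to a log file.

```shell
# Hypothetical helper, not part of KMA: classify a finished index build
# from its exit code ($1) and its captured stderr log ($2).
check_kma_build() {
    local status="$1" log="$2"
    local unexpected
    # Progress messages start with '#'; anything else on stderr is
    # something unexpected. Blank lines are ignored.
    unexpected=$(grep -v '^#' "$log" | grep -v '^[[:space:]]*$' || true)
    if [ "$status" -gt 1 ]; then
        echo "likely-invalid"   # exit code above 1: index is usually invalid
    elif [ "$status" -ne 0 ]; then
        echo "error"            # nonzero exit code: an error occurred
    elif [ -n "$unexpected" ]; then
        echo "warnings"         # exit code 0, but non-'#' lines on stderr
    else
        echo "ok"               # exit code 0 and only '#' messages
    fi
}
```

A possible use, with a placeholder log name: run the build with `2> build.log` appended, then call `check_kma_build "$?" build.log` on the next line. A log containing only '#' lines, including "# Compression overflow.", is reported as `ok`.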
reporter Thanks a lot Philip. I am testing different mappers for performance on short, erroneous Nanopore reads (cell-free DNA). Which settings for kma would you suggest for such a task?
The reads are generated with badread (rrwick) and consist of 99.5% human reads (~170bp) and 0.5% bacterial reads (~70bp) at a mean quality of 93%. Besides kma, I am testing Centrifuge, minimap2, bowtie2 and kraken2 with the most recent RefSeq complete genomes release (bacterial, viral, fungal, human). If you have suggestions for classifiers that I did not include but should consider, please don't hesitate to post them. I should also mention that the bacterial coverage will be very low.
I hope that my question is OK
Kind regards,
Morten
-
Hi Morten
For that I would use something like:
-mem_mode -bc 0.7 -bcNano -mrs 0 -ID 0 -ef -1t1 -ca
For the testing, KrakenUniq would be a good addition. You might have to lower the k-mer size for Kraken2 and KrakenUniq. For Centrifuge, Minimap2 and Bowtie2, there might be some mapping quality and other scoring thresholds you have to lower.
Best,
Philip -
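For readers who want to try the suggested settings, one way to wrap them is sketched below. The function name `map_with_kma`, the `KMA_BIN` variable and all file paths are hypothetical placeholders; the sketch assumes single-end reads passed with `-i`, an output prefix given with `-o`, and the indexed database prefix given with `-t_db`. The remaining flags are copied unchanged from the suggestion above.

```shell
# Hypothetical wrapper around the settings suggested in this thread.
# $1 = reads file, $2 = database prefix (as given to -o when indexing),
# $3 = output prefix. KMA_BIN can point at a specific kma binary.
map_with_kma() {
    "${KMA_BIN:-kma}" -i "$1" -o "$3" -t_db "$2" \
        -mem_mode -bc 0.7 -bcNano -mrs 0 -ID 0 -ef -1t1 -ca
}
```

Example call with placeholder paths: `map_with_kma reads.fastq kmadb208/kmadb208 results/sample1`. Setting `KMA_BIN=echo` prints the assembled arguments instead of running the mapper, which is a simple way to check the command before a long run.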
reporter Hi Philip,
Thank you for the help!
Kind regards,
Morten