Computational requirement indexing

Issue #52 resolved
alvanuffelen created an issue

Hi

I would like to index a multi-FASTA file (RefSeq Genomes) of around 140 GB.
I used the following command on a server with 1 TB of RAM:

 kma index -i sequences_20210211.fna -o kma_db

The output is:

# Total time used for DB indexing: 211063.00 s.
#
# Compressing templates
# Calculating relative indexes.
# Compressing indexes.
# Compression overflow.
# Bypassing overflow.
# Overflow bypassed.
# Finalizing indexes.
Killed
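
For reference, the peak memory of a run like this can be measured with GNU time, assuming /usr/bin/time on the server is the GNU version:

 /usr/bin/time -v kma index -i sequences_20210211.fna -o kma_db

The "Maximum resident set size" line in its report is the peak RAM used by the process.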

I have two questions:

  1. I assume the process was killed because of too much memory usage.
    Is there an estimate of the memory usage depending on the size of the provided FASTA?
    Does splitting the FASTA file into multiple smaller files and updating the database with -t_db use less memory than creating the database from one big FASTA file? (See the sketch below.)
  2. Is “Overflow bypassed” a problem?
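
For question 1, the split-and-update approach would look roughly like this; the part file names are made up, and I am assuming -t_db points at the existing database while -o names the updated one:

 kma index -i part_1.fna -o kma_db
 kma index -i part_2.fna -t_db kma_db -o kma_db_updated

and so on for the remaining parts.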

Comments (3)

  1. ptlcc

    Hi Alexander

    The strange thing is that the peak memory has already been reached at that point, without giving a “Cannot allocate memory” error.
    Usually the system kills a process if it uses too many resources, so it is not unlikely that it was killed because it was the most memory-heavy job running while the memory was nearly full.
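
    On Linux, an OOM kill usually leaves a trace in the kernel log, so you can check with something like (may require root):

     dmesg -T | grep -i 'killed process'

    If a line there mentions kma, the kernel killed it to free memory.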

    1. The memory usage depends on the redundancy in the FASTA file as well as its size. Splitting it up will not decrease the memory usage.
    2. The “Overflow bypassed” message is not a problem. It means that some variables started to overflow, which caused kma to allocate them in a larger type.

    Best,
    Philip
