Method to reduce the memory needed when running a huge file

Issue #32 on hold
Former user created an issue

Hi, I'm running vcontact2 on a 2.4G protein .faa file and I get a MemoryError: Unable to allocate 9.01 TiB for an array with shape (1113079, 1113079) and data type float64. Are there any ways to reduce the memory needed when running such a large protein file? Here is the code:

source activate vContact2
vcontact2_gene2genome -p prot_vir.faa \
                      -o g2g.csv \
                      -s 'Prodigal-FAA'

vcontact2 --raw-proteins prot_vir.faa \
          --rel-mode 'Diamond' \
          --proteins-fp g2g.csv \
          --db 'ProkaryoticViralRefSeq201-Merged' \
          --pcs-mode MCL \
          --vcs-mode ClusterONE \
          --c1-bin /lustre/home/liutang/01software/MAVERICLab-vcontact2-aaa065683c99/bin/cluster_one-1.0.jar \
          --output-dir vcontact2_ref201

Thank you.
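For context, the size of the failing allocation follows directly from the error message: a dense 1,113,079 x 1,113,079 array of 8-byte floats requires roughly 9 TiB. A quick back-of-the-envelope check (a hypothetical one-liner using bc, not part of the vConTACT2 workflow):

echo "scale=2; 1113079 * 1113079 * 8 / 1024^4" | bc
# prints 9.01 -- the TiB needed for the dense all-vs-all matrix named in the traceback

Because the requirement grows with the square of the number of items, shrinking the input set (see the comments below) reduces it quadratically.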

Comments (2)

  1. Ben Bolduc

    This is a very good question and one that I’ve tried to solve. There are some technical limitations that would require a more skilled developer to help solve.

    For most large datasets (usually 500K+ genomes), I’ve been advising users to dereplicate their genomes using ClusterGenomes (an app available on CyVerse.org), dRep, or the support scripts that ship with CheckV (a minimal dRep sketch follows the comments below).

    Sorry, that's not much of an answer. Moving forward, we hope to replace a portion of the code with another method that's better suited to huge datasets.

    Cheers,

    Ben

  2. Ben Bolduc

    An update is planned that will resolve this issue. Unfortunately, it's not going to be ready within the next couple of weeks.
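As a concrete illustration of the dereplication route suggested in the first comment, here is a minimal sketch using dRep and Prodigal. This is not part of vConTACT2, and it assumes the input genomes are already split into one FASTA file per genome under a hypothetical genomes/ directory; the dereplicated set is then re-run through gene prediction before vcontact2_gene2genome:

# Hypothetical pre-processing sketch: dereplicate genomes, then rebuild the protein .faa
dRep dereplicate drep_out -g genomes/*.fasta
# dRep writes the winning genomes to drep_out/dereplicated_genomes/
cat drep_out/dereplicated_genomes/*.fasta > derep_genomes.fna
# re-predict proteins on the reduced genome set
prodigal -i derep_genomes.fna -a prot_vir_derep.faa -p meta
# then re-run vcontact2_gene2genome and vcontact2 on prot_vir_derep.faa as in the original commands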
