Missing genomes in output

I've run the latest version of the tool with the test data and it completed without errors, but some genomes are missing from the output.

There are 575 user-supplied genomes in the test file:

$ cut -f2 MAVERICLab-vcontact2-a3541dd53c3e/test_data/proteins.csv -d ',' | sed 1d | sort -u | wc -l
575

But only 246 are found in the viral_cluster_overview file:

$ grep -o 'VIR' viral_cluster_overview.csv | wc -l
246
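
In case it helps narrow this down, here is a rough sketch for listing which genomes are missing rather than just counting them. It assumes genome names sit in the second column of proteins.csv (as in the cut command above) and appear verbatim somewhere in viral_cluster_overview.csv. The function name missing_genomes is made up for illustration and is not part of the tool.

```shell
# Hypothetical helper: print every genome ID from column 2 of a proteins CSV
# that never appears as a whole word in a second file.
missing_genomes() {
    proteins_csv=$1
    overview_csv=$2
    # Unique genome IDs, skipping the header row.
    cut -d ',' -f2 "$proteins_csv" | sed 1d | sort -u |
    while IFS= read -r genome; do
        # -F: fixed string, -w: whole-word match, -q: no output, exit status only.
        grep -Fwq -- "$genome" "$overview_csv" || printf '%s\n' "$genome"
    done
}

# Example call (paths taken from the commands above):
# missing_genomes MAVERICLab-vcontact2-a3541dd53c3e/test_data/proteins.csv viral_cluster_overview.csv
```

The whole-word match is there so that a name such as VIR_1 is not counted as present just because VIR_10 is.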

And the genome_by_genome file is not present:

$ ls
c1.clusters                             modules_mcl_5.0_modules.pandas
c1.ntw                                  modules_mcl_5.0_pcs.pandas
merged.dmnd                             sig1.0_mcl2.0_clusters.csv
merged.faa                              sig1.0_mcl2.0_contigs.csv
merged.self-diamond.tab                 sig1.0_mcl2.0_modsig1.0_modmcl5.0_minshared3_link_mod_cluster.csv
merged.self-diamond.tab.abc             sig1.0_mcl5.0_minshared3_modules.csv
merged.self-diamond.tab.mci             vConTACT_contigs.csv
merged.self-diamond.tab_mcl20.clusters  vConTACT_pcs.csv
merged.self-diamond.tab_mcxload.tab     vConTACT_profiles.csv
merged_df.csv                           vConTACT_proteins.csv
modules.ntwk                            viral_cluster_overview.csv
modules_mcl_5.0.clusters

Any help would be appreciated.
