Still identifying genome substrings. Consider adjusting input genomes naming. 'Shewanella~sp.~phage~1/4'

Issue #63 new
Toscan created an issue

Hello Developers,

thank you for developing this super nice tool!!

I am running into an interesting issue here.

At the step “----------------------------Exporting results files-----------------------------” I get the following error:

“Still identifying genome substrings. Consider adjusting input genomes naming.”

And then that is it, the execution stops…

Snapshot of my .out file from SLURM job:

Snapshot of my .err file from SLURM job:

I have used this tool successfully many times with other datasets, but this particular error occurs only with this dataset.

I checked whether the job had exceeded its runtime or memory limits, and it had not:

$ seff 8841463
Job ID: 8841463
Cluster: eve
User/Group: brizolat/umb
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 14:41:22
CPU Efficiency: 90.48% of 16:14:06 core-walltime
Job Wall-clock time: 16:14:06
Memory Utilized: 129.31 GB
Memory Efficiency: 64.65% of 200.00 GB

I also checked the intermediate files to give you some more details:

$ ls -lh vcontact-output
total 5,7G
-rw-rw-r--+ 1 brizolat eve_umbmsb 3,1M 15. Dez 06:43 c1.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 89M 15. Dez 06:36 c1.ntw
-rw-rw-r--+ 1 brizolat eve_umbmsb 15M 15. Dez 03:08 merged_df.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 431M 14. Dez 17:57 merged.dmnd
-rw-rw-r--+ 1 brizolat eve_umbmsb 418M 14. Dez 17:56 merged.faa
-rw-rw-r--+ 1 brizolat eve_umbmsb 2,3G 14. Dez 22:27
-rw-rw-r--+ 1 brizolat eve_umbmsb 1,6G 14. Dez 22:29
-rw-rw-r--+ 1 brizolat eve_umbmsb 431M 14. Dez 23:01
-rw-rw-r--+ 1 brizolat eve_umbmsb 41M 14. Dez 23:20 merged.self-diamond.tab_mcl20.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 47M 14. Dez 23:01
-rw-rw-r--+ 1 brizolat eve_umbmsb 419K 15. Dez 07:09 modules_mcl_5.0.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 259K 15. Dez 07:10 modules_mcl_5.0_modules.pandas
-rw-rw-r--+ 1 brizolat eve_umbmsb 7,9M 15. Dez 07:10 modules_mcl_5.0_pcs.pandas
-rw-rw-r--+ 1 brizolat eve_umbmsb 134M 15. Dez 07:09 modules.ntwk
-rw-rw-r--+ 1 brizolat eve_umbmsb 196K 15. Dez 07:14 sig1.0_mcl2.0_clusters.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 17M 15. Dez 07:14 sig1.0_mcl2.0_contigs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 204K 15. Dez 07:14 sig1.0_mcl2.0_modsig1.0_modmcl5.0_minshared3_link_mod_cluster.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 158K 15. Dez 07:14 sig1.0_mcl5.0_minshared3_modules.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 11M 15. Dez 03:08 vConTACT_contigs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 8,1M 15. Dez 03:08 vConTACT_pcs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 53M 15. Dez 03:08 vConTACT_profiles.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 123M 15. Dez 03:08 vConTACT_proteins.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 3,0M 15. Dez 07:56 viral_cluster_overview.csv

Could you please help me out on this one?

Thanks a lot and thank you for developing this amazing tool.



Comments (2)

  1. Adhip Mukhopadhyay


    Thanks for developing this nice tool.

    I am also facing the same issue as mentioned by Rodolfo.

    However, I am not sure whether it is an error or not, as the run ends with only the message:

    ----------------------------Exporting results files-----------------------------

    There were 687 genomes (including refs) that were singleton, outlier or overlaps.

    Still identifying genome substrings. Consider adjusting input genomes naming.


    There was not even an error message or a warning!

    Interestingly, the important genome_by_genome_overview.csv file is missing from the output folder.

    The run log is provided below.

    I am a beginner, so please pardon and guide me.



    (vcontact2) virology@virology:~$ vcontact --raw-proteins ./Documents/Adhip/prodigal_outputs/prodigal_nchoe_02.faa --rel-mode Diamond --proteins-fp ./Documents/Adhip/vcontact2/vc2_g2g_nc_02.csv --db ProkaryoticViralRefSeq94-Merged --pcs-mode MCL --vcs-mode ClusterONE --c1-bin /home/virology/miniconda3/pkgs/clusterone-1.0-hdfd78af_0/lib/cluster_one-v1.0.jar --output-dir ./Documents/Adhip/vcontact2/vc2_nchoe_02/ -t 12
    ============================This is vConTACT2 0.9.13============================
    INFO:vcontact2: Found Diamond: /home/virology/miniconda3/envs/vcontact2/bin/diamond
    INFO:vcontact2: Found MCL: /home/virology/miniconda3/envs/vcontact2/bin/mcxload
    INFO:vcontact2: Identified 12 CPUs
    INFO:vcontact2: Using reference database: ProkaryoticViralRefSeq94-Merged
    INFO:vcontact2: Using existing directory ./Documents/Adhip/vcontact2/vc2_nchoe_02/.
    ------------------------------Reference databases-------------------------------
    INFO:vcontact2: Merging ProkaryoticViralRefSeq94-Merged to user sequences...
    INFO:vcontact2: Creating Diamond database and running Diamond...
    diamond v2.0.14.152 (C) Max Planck Society for the Advancement of Science
    Documentation, support and updates available at
    Please cite: Nature Methods (2021)
    #CPU threads: 12
    Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
    Database input file: ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.faa
    Opening the database file...  [0s]
    Loading sequences...  [0.222s]
    Masking sequences...  [0.309s]
    Writing sequences...  [0.041s]
    Hashing sequences...  [0.017s]
    Loading sequences...  [0s]
    Writing trailer...  [0.002s]
    Closing the input file...  [0s]
    Closing the database file...  [0.002s]
    Database sequences  270595
      Database letters  55506732
        Database hash  c401340008b9bf4b11c40fc71c68b6a8
            Total time  0.596000s
    diamond v2.0.14.152 (C) Max Planck Society for the Advancement of Science
    Documentation, support and updates available at
    Please cite: Nature Methods (2021)
    #CPU threads: 12
    Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
    Temporary directory: ./Documents/Adhip/vcontact2/vc2_nchoe_02
    #Target sequences to report alignments for: 25
    Opening the database...  [0.044s]
    Database: ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.dmnd (type: Diamond database, sequences: 270595, letters: 55506732)
    Block size = 2000000000
    Opening the input file...  [0.023s]
    Opening the output file...  [0s]
    Loading query sequences...  [0.207s]
    Masking queries...  [0.314s]
    Algorithm: Double-indexed
    Building query histograms...  [0.993s]
    Allocating buffers...  [0s]
    Loading reference sequences...  [0.093s]
    Masking reference...  [0.324s]
    Initializing temporary storage...  [0s]
    Building reference histograms...  [1.358s]
    Allocating buffers...  [0s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.09s]
    Computing hash join...  [0.067s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [3.74s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 2/4.
    Building reference seed array...  [0.097s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.736s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.525s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.43s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.737s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 2/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.538s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.105s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.425s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.369s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.825s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.579s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.109s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.491s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 4/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.389s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 1/4.
    Building reference seed array...  [0.085s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.805s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.57s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.469s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.38s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.785s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.099s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.556s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 4/4.
    Building reference seed array...  [0.08s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.421s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.789s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 2/4.
    Building reference seed array...  [0.096s]
    Building query seed array...  [0.1s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.546s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.456s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.399s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.72s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 2/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.1s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.554s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 3/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.437s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 4/4.
    Building reference seed array...  [0.078s]
    Building query seed array...  [0.082s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.381s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.772s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.559s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.399s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.81s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.556s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.108s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.487s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.401s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.746s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.546s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 3/4.
    Building reference seed array...  [0.102s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.385s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.794s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.593s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 3/4.
    Building reference seed array...  [0.102s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.466s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 4/4.
    Building reference seed array...  [0.091s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.434s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.068s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.755s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 2/4.
    Building reference seed array...  [0.096s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.534s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.449s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.391s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.749s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.551s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.472s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 4/4.
    Building reference seed array...  [0.08s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.389s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 1/4.
    Building reference seed array...  [0.085s]
    Building query seed array...  [0.088s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.77s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.584s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.11s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.473s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 4/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.4s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.793s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.557s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 3/4.
    Building reference seed array...  [0.104s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.468s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.435s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.752s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.529s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.105s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.429s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.402s]
    Deallocating buffers...  [0.004s]
    Clearing query masking...  [0.005s]
    Computing alignments...  [42.947s]
    Deallocating reference...  [0.003s]
    Loading reference sequences...  [0s]
    Deallocating buffers...  [0.004s]
    Deallocating queries...  [0.002s]
    Loading query sequences...  [0s]
    Closing the input file...  [0s]
    Closing the output file...  [0.002s]
    Cleaning up...  [0s]
    Total time = 228.434s
    Reported 4038385 pairwise alignments, 4038385 HSPs.
    270568 queries aligned.
    -------------------------------Protein clustering-------------------------------
    INFO:vcontact2: Loading proteins...
    INFO:vcontact2: Merging ProkaryoticViralRefSeq94-Merged to user gene-to-genome mapping...
    .................................................. 1M
    .................................................. 2M
    .................................................. 3M
    [mclIO] writing <./Documents/Adhip/vcontact2/vc2_nchoe_02/>
    [mclIO] wrote native interchange 241042x241042 matrix with 4652476 entries to stream <./Documents/Adhip/vcontact2/vc2_nchoe_02/>
    [mclIO] wrote 241042 tab entries to stream <./Documents/Adhip/vcontact2/vc2_nchoe_02/>
    [mcxload] tab has 241042 entries
    [mclIO] reading <./Documents/Adhip/vcontact2/vc2_nchoe_02/>
    [mclIO] read native interchange 241042x241042 matrix with 4652476 entries
    [mcl] pid 19358
     ite -------------------  chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
      1  ...................  37.70  1.29 0.98/0.23/3.52 2.33 2.20 2.20   0
      2  ...................  57.50  6.04 0.88/0.14/5.08 3.17 0.88 1.95   2
      3  ...................  31.00  3.60 0.84/0.14/6.01 2.12 0.75 1.47   0
      4  ...................  21.25  1.66 0.83/0.09/6.92 1.47 0.74 1.09   0
      5  ...................  15.16  0.81 0.82/0.14/5.31 1.21 0.72 0.78   0
      6  ...................  10.98  0.42 0.82/0.11/4.13 1.08 0.74 0.57   0
      7  ...................   9.18  0.26 0.83/0.20/2.55 1.03 0.78 0.45   0
      8  ...................   6.50  0.19 0.84/0.20/2.10 1.01 0.81 0.36   0
      9  ...................   5.23  0.16 0.86/0.20/1.27 1.00 0.82 0.30   0
     10  ...................   5.41  0.13 0.89/0.22/1.29 1.00 0.82 0.24   0
     11  ...................   4.31  0.11 0.92/0.19/1.32 1.00 0.82 0.20   0
     12  ...................   4.50  0.09 0.95/0.19/1.05 1.00 0.83 0.17   0
     13  ...................   4.73  0.08 0.97/0.22/1.00 1.00 0.86 0.14   0
     14  ...................   5.19  0.07 0.98/0.20/1.00 1.00 0.89 0.13   0
     15  ...................   4.09  0.07 0.99/0.24/1.00 1.00 0.92 0.12   0
     16  ...................   4.28  0.07 0.99/0.19/1.00 1.00 0.95 0.11   0
     17  ...................   4.21  0.06 1.00/0.23/1.00 1.00 0.97 0.11   0
     18  ...................   5.36  0.06 1.00/0.36/1.00 1.00 0.98 0.11   0
     19  ...................   2.49  0.06 1.00/0.39/1.00 1.00 0.99 0.10   0
     20  ...................   0.67  0.06 1.00/0.57/1.00 1.00 0.99 0.10   0
     21  ...................   0.49  0.06 1.00/0.64/1.00 1.00 1.00 0.10   0
     22  ...................   0.39  0.06 1.00/0.70/1.00 1.00 1.00 0.10   0
     23  ...................   0.30  0.06 1.00/0.75/1.00 1.00 1.00 0.10   0
     24  ...................   0.24  0.06 1.00/0.79/1.00 1.00 1.00 0.10   0
     25  ...................   0.22  0.06 1.00/0.78/1.00 1.00 1.00 0.10   0
     26  ...................   0.08  0.06 1.00/0.93/1.00 1.00 1.00 0.10   0
     27  ...................   0.14  0.06 1.00/0.93/1.00 1.00 1.00 0.10   0
     28  ...................   0.22  0.06 1.00/0.83/1.00 1.00 1.00 0.10   0
     29  ...................   0.24  0.06 1.00/0.76/1.00 1.00 1.00 0.10   0
     30  ...................   0.12  0.06 1.00/0.88/1.00 1.00 1.00 0.10   0
     31  ...................   0.01  0.06 1.00/0.99/1.00 1.00 1.00 0.10   0
     32  ...................   0.00  0.06 1.00/1.00/1.00 1.00 1.00 0.10   0
    [mcl] jury pruning marks: <99,99,99>, out of 100
    [mcl] jury pruning synopsis: <99.0 or perfect> (cf -scheme, -do log)
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab_mcl20.clusters
    [mcl] 31100 clusters found
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab_mcl20.clusters
    Please cite:
        Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
        University of Utrecht, May 2000.
        Stijn van Dongen, A cluster algorithm for graphs. Technical
        Report INS-R0010, National Research Institute for Mathematics
        and Computer Science in the Netherlands, Amsterdam, May 2000.
    INFO:vcontact2: Building the cluster and profiles (this may take some time...)
    If it fails, try re-running using --blast-fp flag and specifiying (or
    INFO:vcontact2: Saving intermediate files...
    ----------------------------------Loading data----------------------------------
    INFO:vcontact2: Read 233640 entries (dropped 2626 singletons) from ./Documents/Adhip/vcontact2/vc2_nchoe_02/vConTACT_profiles.csv
    --------------------------------Adding Taxonomy---------------------------------
    ------------------------Calculating Similarity Networks-------------------------
    ------------------------Contig Clustering & Affiliation-------------------------
    Loaded graph with 2634 nodes and 96104 edges
    [====================] 100% Growing clusters from seeds...
    [====================] 100% Finding highly overlapping clusters...
    [====================] 100% Merging highly overlapping clusters...
    Detected 362 complexes
    --------------------------------Protein modules---------------------------------
    .................................................. 1M
    .................................................. 2M
    [mcl] new tab created
    [mcl] pid 19686
     ite -------------------  chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
      1  ...................  99.19  0.82 1.04/0.01/9.75 5.47 1.81 1.81  72
      2  ...................  49.59  1.67 0.72/0.04/4.18 10.33 0.11 0.20  91
      3  ...................   7.42  0.09 0.87/0.08/11.03 2.06 0.22 0.04  13
      4  ...................   7.14  0.01 0.96/0.14/11.33 1.04 0.58 0.03   0
      5  ...................   1.35  0.01 0.99/0.30/1.23 1.00 0.80 0.02   0
      6  ...................   0.30  0.01 1.00/0.76/1.22 1.00 0.91 0.02   0
      7  ...................   0.25  0.01 1.00/0.77/1.16 1.00 0.98 0.02   0
      8  ...................   0.75  0.01 1.00/0.54/1.03 1.00 1.00 0.02   0
      9  ...................   0.01  0.01 1.00/1.00/1.00 1.00 1.00 0.02   0
     10  ...................   0.00  0.01 1.00/1.00/1.00 1.00 1.00 0.02   0
    [mcl] jury pruning marks: <97,99,99>, out of 100
    [mcl] jury pruning synopsis: <97.8 or superb> (cf -scheme, -do log)
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/modules_mcl_5.0.clusters
    [mcl] 707 clusters found
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/modules_mcl_5.0.clusters
    Please cite:
        Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
        University of Utrecht, May 2000.
        Stijn van Dongen, A cluster algorithm for graphs. Technical
        Report INS-R0010, National Research Institute for Mathematics
        and Computer Science in the Netherlands, Amsterdam, May 2000.
    ---------------------------Link modules and clusters----------------------------
    ----------------------------Exporting results files-----------------------------
    There were 687 genomes (including refs) that were singleton, outlier or overlaps.
    Still identifying genome substrings. Consider adjusting input genomes naming.
    (vcontact2) virology@virology:~$

  2. Ben Bolduc

    Similar reports of this genome affecting runs have come up elsewhere. This seems to be an issue where vConTACT2 can’t identify a specific genome within the network because its name is contained within another name, e.g. “Pseudomonas Phage P1” is a substring of “Pseudomonas Phage P10”. A fix for this was implemented quite some time ago, but it seems to apply only to user genomes, not the reference database.

    We’ll be updating the reference DB to deal with this sequence.
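For anyone who wants to check their own genome names for this kind of clash in the meantime, here is a minimal sketch. It is not part of vConTACT2; `find_substring_names` is a hypothetical helper, and it assumes you have already collected the genome names into a plain Python list (for example from your gene-to-genome mapping file). It only reports names that are fully contained in another name, as in the P1 / P10 example above.

```python
from typing import Dict, List


def find_substring_names(names: List[str]) -> Dict[str, List[str]]:
    """Map each genome name to the other names that contain it as a substring.

    Hypothetical helper for illustration only; not a vConTACT2 function.
    """
    # De-duplicate, then sort by length (ties broken alphabetically) so that
    # a name can only be contained in names that come after it in the list.
    unique = sorted(set(names), key=lambda n: (len(n), n))
    clashes: Dict[str, List[str]] = {}
    for i, short in enumerate(unique):
        containers = [other for other in unique[i + 1:] if short in other]
        if containers:
            clashes[short] = containers
    return clashes


if __name__ == "__main__":
    # Example names; the third one is made up for the demonstration.
    example = [
        "Pseudomonas Phage P1",
        "Pseudomonas Phage P10",
        "Example phage X",
    ]
    for name, containers in find_substring_names(example).items():
        print(f"{name!r} is contained in: {containers}")
```

The comparison is quadratic in the number of names, which should be acceptable for a few thousand genomes. Any name it reports in your own input is a candidate for renaming (for instance by adding a unique suffix) before re-running.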
