Still identifying genome substrings. Consider adjusting input genomes naming. 'Shewanella~sp.~phage~1/4'

Issue #63 new
Toscan created an issue

Hello Developers,

Thank you for developing this super nice tool!

I am running into an interesting issue here.

At the step “----------------------------Exporting results files-----------------------------” I get the following error:

“Still identifying genome substrings. Consider adjusting input genomes naming.
'Shewanella~sp.~phage~1/4'”

And then that is it, the execution stops…

Snapshot of my .out file from SLURM job:

Snapshot of my .err file from SLURM job:

I have used this tool successfully many times with other datasets; this particular error occurs only with this dataset.

I checked whether the job had exceeded its runtime or memory limits, and it had not:

$ seff 8841463
Job ID: 8841463
Cluster: eve
User/Group: brizolat/umb
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 14:41:22
CPU Efficiency: 90.48% of 16:14:06 core-walltime
Job Wall-clock time: 16:14:06
Memory Utilized: 129.31 GB
Memory Efficiency: 64.65% of 200.00 GB

I also checked the intermediate files to give you some more details:

$ ls -lh vcontact-output
total 5,7G
-rw-rw-r--+ 1 brizolat eve_umbmsb 3,1M 15. Dez 06:43 c1.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 89M 15. Dez 06:36 c1.ntw
-rw-rw-r--+ 1 brizolat eve_umbmsb 15M 15. Dez 03:08 merged_df.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 431M 14. Dez 17:57 merged.dmnd
-rw-rw-r--+ 1 brizolat eve_umbmsb 418M 14. Dez 17:56 merged.faa
-rw-rw-r--+ 1 brizolat eve_umbmsb 2,3G 14. Dez 22:27 merged.self-diamond.tab
-rw-rw-r--+ 1 brizolat eve_umbmsb 1,6G 14. Dez 22:29 merged.self-diamond.tab.abc
-rw-rw-r--+ 1 brizolat eve_umbmsb 431M 14. Dez 23:01 merged.self-diamond.tab.mci
-rw-rw-r--+ 1 brizolat eve_umbmsb 41M 14. Dez 23:20 merged.self-diamond.tab_mcl20.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 47M 14. Dez 23:01 merged.self-diamond.tab_mcxload.tab
-rw-rw-r--+ 1 brizolat eve_umbmsb 419K 15. Dez 07:09 modules_mcl_5.0.clusters
-rw-rw-r--+ 1 brizolat eve_umbmsb 259K 15. Dez 07:10 modules_mcl_5.0_modules.pandas
-rw-rw-r--+ 1 brizolat eve_umbmsb 7,9M 15. Dez 07:10 modules_mcl_5.0_pcs.pandas
-rw-rw-r--+ 1 brizolat eve_umbmsb 134M 15. Dez 07:09 modules.ntwk
-rw-rw-r--+ 1 brizolat eve_umbmsb 196K 15. Dez 07:14 sig1.0_mcl2.0_clusters.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 17M 15. Dez 07:14 sig1.0_mcl2.0_contigs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 204K 15. Dez 07:14 sig1.0_mcl2.0_modsig1.0_modmcl5.0_minshared3_link_mod_cluster.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 158K 15. Dez 07:14 sig1.0_mcl5.0_minshared3_modules.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 11M 15. Dez 03:08 vConTACT_contigs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 8,1M 15. Dez 03:08 vConTACT_pcs.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 53M 15. Dez 03:08 vConTACT_profiles.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 123M 15. Dez 03:08 vConTACT_proteins.csv
-rw-rw-r--+ 1 brizolat eve_umbmsb 3,0M 15. Dez 07:56 viral_cluster_overview.csv

Could you please help me out on this one?

Thanks a lot and thank you for developing this amazing tool.

Best,

Rodolfo

Comments (2)

  1. Adhip Mukhopadhyay

    Hello!

    Thanks for developing this nice tool.

    I am also facing the same issue as mentioned by Rodolfo.

    However, I am not sure whether it is an error or not, as the run ends with only the message:

    ----------------------------Exporting results files-----------------------------

    There were 687 genomes (including refs) that were singleton, outlier or overlaps.

    Still identifying genome substrings. Consider adjusting input genomes naming.

    'nchoe_02_contig_11'

    There was no error message or warning even!

    Interestingly, the important genome_by_genome_overview.csv file is missing from the output folder.

    The run log is provided below:

    I am a beginner, so please bear with me and guide me.

    Best

    Adhip

    (vcontact2) virology@virology:~$ vcontact --raw-proteins ./Documents/Adhip/prodigal_outputs/prodigal_nchoe_02.faa --rel-mode Diamond --proteins-fp ./Documents/Adhip/vcontact2/vc2_g2g_nc_02.csv --db ProkaryoticViralRefSeq94-Merged --pcs-mode MCL --vcs-mode ClusterONE --c1-bin /home/virology/miniconda3/pkgs/clusterone-1.0-hdfd78af_0/lib/cluster_one-v1.0.jar --output-dir ./Documents/Adhip/vcontact2/vc2_nchoe_02/ -t 12
    
    ============================This is vConTACT2 0.9.13============================
    
    
    
    ----------------------------------Pre-Analysis----------------------------------
    INFO:vcontact2: Found Diamond: /home/virology/miniconda3/envs/vcontact2/bin/diamond
    INFO:vcontact2: Found MCL: /home/virology/miniconda3/envs/vcontact2/bin/mcxload
    INFO:vcontact2: Identified 12 CPUs
    INFO:vcontact2: Using reference database: ProkaryoticViralRefSeq94-Merged
    INFO:vcontact2: Using existing directory ./Documents/Adhip/vcontact2/vc2_nchoe_02/.
    
    
    ------------------------------Reference databases-------------------------------
    INFO:vcontact2: Merging ProkaryoticViralRefSeq94-Merged to user sequences...
    INFO:vcontact2: Creating Diamond database and running Diamond...
    diamond v2.0.14.152 (C) Max Planck Society for the Advancement of Science
    Documentation, support and updates available at http://www.diamondsearch.org
    Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)
    
    #CPU threads: 12
    Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
    Database input file: ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.faa
    Opening the database file...  [0s]
    Loading sequences...  [0.222s]
    Masking sequences...  [0.309s]
    Writing sequences...  [0.041s]
    Hashing sequences...  [0.017s]
    Loading sequences...  [0s]
    Writing trailer...  [0.002s]
    Closing the input file...  [0s]
    Closing the database file...  [0.002s]
    
    Database sequences  270595
      Database letters  55506732
        Database hash  c401340008b9bf4b11c40fc71c68b6a8
            Total time  0.596000s
    diamond v2.0.14.152 (C) Max Planck Society for the Advancement of Science
    Documentation, support and updates available at http://www.diamondsearch.org
    Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)
    
    #CPU threads: 12
    Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
    Temporary directory: ./Documents/Adhip/vcontact2/vc2_nchoe_02
    #Target sequences to report alignments for: 25
    Opening the database...  [0.044s]
    Database: ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.dmnd (type: Diamond database, sequences: 270595, letters: 55506732)
    Block size = 2000000000
    Opening the input file...  [0.023s]
    Opening the output file...  [0s]
    Loading query sequences...  [0.207s]
    Masking queries...  [0.314s]
    Algorithm: Double-indexed
    Building query histograms...  [0.993s]
    Allocating buffers...  [0s]
    Loading reference sequences...  [0.093s]
    Masking reference...  [0.324s]
    Initializing temporary storage...  [0s]
    Building reference histograms...  [1.358s]
    Allocating buffers...  [0s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.09s]
    Computing hash join...  [0.067s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [3.74s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 2/4.
    Building reference seed array...  [0.097s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.736s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.525s]
    Processing query block 1, reference block 1/1, shape 1/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.43s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.737s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 2/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.538s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.105s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.425s]
    Processing query block 1, reference block 1/1, shape 2/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.369s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.825s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.579s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.109s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.491s]
    Processing query block 1, reference block 1/1, shape 3/16, index chunk 4/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.389s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 1/4.
    Building reference seed array...  [0.085s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.805s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.57s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.469s]
    Processing query block 1, reference block 1/1, shape 4/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.38s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.785s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.099s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.556s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 3/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 5/16, index chunk 4/4.
    Building reference seed array...  [0.08s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.421s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.789s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 2/4.
    Building reference seed array...  [0.096s]
    Building query seed array...  [0.1s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.546s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.456s]
    Processing query block 1, reference block 1/1, shape 6/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.399s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.72s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 2/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.1s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.554s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 3/4.
    Building reference seed array...  [0.099s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.437s]
    Processing query block 1, reference block 1/1, shape 7/16, index chunk 4/4.
    Building reference seed array...  [0.078s]
    Building query seed array...  [0.082s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.381s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.772s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.559s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 8/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.399s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.087s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.81s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.556s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.108s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.487s]
    Processing query block 1, reference block 1/1, shape 9/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.401s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 1/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.746s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.546s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 3/4.
    Building reference seed array...  [0.102s]
    Building query seed array...  [0.106s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.451s]
    Processing query block 1, reference block 1/1, shape 10/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.385s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.066s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.794s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.593s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 3/4.
    Building reference seed array...  [0.102s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.466s]
    Processing query block 1, reference block 1/1, shape 11/16, index chunk 4/4.
    Building reference seed array...  [0.091s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.434s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 1/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.068s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.755s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 2/4.
    Building reference seed array...  [0.096s]
    Building query seed array...  [0.101s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.534s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.449s]
    Processing query block 1, reference block 1/1, shape 12/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.391s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 1/4.
    Building reference seed array...  [0.084s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.749s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.551s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.472s]
    Processing query block 1, reference block 1/1, shape 13/16, index chunk 4/4.
    Building reference seed array...  [0.08s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.025s]
    Searching alignments...  [2.389s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 1/4.
    Building reference seed array...  [0.085s]
    Building query seed array...  [0.088s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.77s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 2/4.
    Building reference seed array...  [0.101s]
    Building query seed array...  [0.104s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.584s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 3/4.
    Building reference seed array...  [0.105s]
    Building query seed array...  [0.11s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.473s]
    Processing query block 1, reference block 1/1, shape 14/16, index chunk 4/4.
    Building reference seed array...  [0.083s]
    Building query seed array...  [0.086s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.4s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.083s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.793s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 2/4.
    Building reference seed array...  [0.1s]
    Building query seed array...  [0.103s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.557s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 3/4.
    Building reference seed array...  [0.104s]
    Building query seed array...  [0.107s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.468s]
    Processing query block 1, reference block 1/1, shape 15/16, index chunk 4/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.065s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.435s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 1/4.
    Building reference seed array...  [0.081s]
    Building query seed array...  [0.085s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.752s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 2/4.
    Building reference seed array...  [0.098s]
    Building query seed array...  [0.102s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.529s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 3/4.
    Building reference seed array...  [0.103s]
    Building query seed array...  [0.105s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.429s]
    Processing query block 1, reference block 1/1, shape 16/16, index chunk 4/4.
    Building reference seed array...  [0.082s]
    Building query seed array...  [0.084s]
    Computing hash join...  [0.064s]
    Masking low complexity seeds...  [0.024s]
    Searching alignments...  [2.402s]
    Deallocating buffers...  [0.004s]
    Clearing query masking...  [0.005s]
    Computing alignments...  [42.947s]
    Deallocating reference...  [0.003s]
    Loading reference sequences...  [0s]
    Deallocating buffers...  [0.004s]
    Deallocating queries...  [0.002s]
    Loading query sequences...  [0s]
    Closing the input file...  [0s]
    Closing the output file...  [0.002s]
    Cleaning up...  [0s]
    Total time = 228.434s
    Reported 4038385 pairwise alignments, 4038385 HSPs.
    270568 queries aligned.
    
    
    -------------------------------Protein clustering-------------------------------
    INFO:vcontact2: Loading proteins...
    INFO:vcontact2: Merging ProkaryoticViralRefSeq94-Merged to user gene-to-genome mapping...
    .................................................. 1M
    .................................................. 2M
    .................................................. 3M
    ......................................
    [mclIO] writing <./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab.mci>
    .......................................
    [mclIO] wrote native interchange 241042x241042 matrix with 4652476 entries to stream <./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab.mci>
    [mclIO] wrote 241042 tab entries to stream <./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab_mcxload.tab>
    [mcxload] tab has 241042 entries
    [mclIO] reading <./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab.mci>
    .......................................
    [mclIO] read native interchange 241042x241042 matrix with 4652476 entries
    [mcl] pid 19358
     ite -------------------  chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
      1  ...................  37.70  1.29 0.98/0.23/3.52 2.33 2.20 2.20   0
      2  ...................  57.50  6.04 0.88/0.14/5.08 3.17 0.88 1.95   2
      3  ...................  31.00  3.60 0.84/0.14/6.01 2.12 0.75 1.47   0
      4  ...................  21.25  1.66 0.83/0.09/6.92 1.47 0.74 1.09   0
      5  ...................  15.16  0.81 0.82/0.14/5.31 1.21 0.72 0.78   0
      6  ...................  10.98  0.42 0.82/0.11/4.13 1.08 0.74 0.57   0
      7  ...................   9.18  0.26 0.83/0.20/2.55 1.03 0.78 0.45   0
      8  ...................   6.50  0.19 0.84/0.20/2.10 1.01 0.81 0.36   0
      9  ...................   5.23  0.16 0.86/0.20/1.27 1.00 0.82 0.30   0
     10  ...................   5.41  0.13 0.89/0.22/1.29 1.00 0.82 0.24   0
     11  ...................   4.31  0.11 0.92/0.19/1.32 1.00 0.82 0.20   0
     12  ...................   4.50  0.09 0.95/0.19/1.05 1.00 0.83 0.17   0
     13  ...................   4.73  0.08 0.97/0.22/1.00 1.00 0.86 0.14   0
     14  ...................   5.19  0.07 0.98/0.20/1.00 1.00 0.89 0.13   0
     15  ...................   4.09  0.07 0.99/0.24/1.00 1.00 0.92 0.12   0
     16  ...................   4.28  0.07 0.99/0.19/1.00 1.00 0.95 0.11   0
     17  ...................   4.21  0.06 1.00/0.23/1.00 1.00 0.97 0.11   0
     18  ...................   5.36  0.06 1.00/0.36/1.00 1.00 0.98 0.11   0
     19  ...................   2.49  0.06 1.00/0.39/1.00 1.00 0.99 0.10   0
     20  ...................   0.67  0.06 1.00/0.57/1.00 1.00 0.99 0.10   0
     21  ...................   0.49  0.06 1.00/0.64/1.00 1.00 1.00 0.10   0
     22  ...................   0.39  0.06 1.00/0.70/1.00 1.00 1.00 0.10   0
     23  ...................   0.30  0.06 1.00/0.75/1.00 1.00 1.00 0.10   0
     24  ...................   0.24  0.06 1.00/0.79/1.00 1.00 1.00 0.10   0
     25  ...................   0.22  0.06 1.00/0.78/1.00 1.00 1.00 0.10   0
     26  ...................   0.08  0.06 1.00/0.93/1.00 1.00 1.00 0.10   0
     27  ...................   0.14  0.06 1.00/0.93/1.00 1.00 1.00 0.10   0
     28  ...................   0.22  0.06 1.00/0.83/1.00 1.00 1.00 0.10   0
     29  ...................   0.24  0.06 1.00/0.76/1.00 1.00 1.00 0.10   0
     30  ...................   0.12  0.06 1.00/0.88/1.00 1.00 1.00 0.10   0
     31  ...................   0.01  0.06 1.00/0.99/1.00 1.00 1.00 0.10   0
     32  ...................   0.00  0.06 1.00/1.00/1.00 1.00 1.00 0.10   0
    [mcl] jury pruning marks: <99,99,99>, out of 100
    [mcl] jury pruning synopsis: <99.0 or perfect> (cf -scheme, -do log)
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab_mcl20.clusters
    [mcl] 31100 clusters found
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/merged.self-diamond.tab_mcl20.clusters
    
    Please cite:
        Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
        University of Utrecht, May 2000.
        (  http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf
        or  http://micans.org/mcl/lit/svdthesis.pdf.gz)
    OR
        Stijn van Dongen, A cluster algorithm for graphs. Technical
        Report INS-R0010, National Research Institute for Mathematics
        and Computer Science in the Netherlands, Amsterdam, May 2000.
        (  http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z
        or  http://micans.org/mcl/lit/INS-R0010.ps.Z)
    
    INFO:vcontact2: Building the cluster and profiles (this may take some time...)
    If it fails, try re-running using --blast-fp flag and specifiying merged.self-diamond.tab (or merged.self-blastp.tab)
    INFO:vcontact2: Saving intermediate files...
    
    
    ----------------------------------Loading data----------------------------------
    INFO:vcontact2: Read 233640 entries (dropped 2626 singletons) from ./Documents/Adhip/vcontact2/vc2_nchoe_02/vConTACT_profiles.csv
    
    
    --------------------------------Adding Taxonomy---------------------------------
    
    
    ------------------------Calculating Similarity Networks-------------------------
    
    
    ------------------------Contig Clustering & Affiliation-------------------------
    Loaded graph with 2634 nodes and 96104 edges
    [====================] 100% Growing clusters from seeds...
    [====================] 100% Finding highly overlapping clusters...
    [====================] 100% Merging highly overlapping clusters...
    Detected 362 complexes
    
    
    --------------------------------Protein modules---------------------------------
    .................................................. 1M
    .................................................. 2M
    ...............
    [mcl] new tab created
    [mcl] pid 19686
     ite -------------------  chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
      1  ...................  99.19  0.82 1.04/0.01/9.75 5.47 1.81 1.81  72
      2  ...................  49.59  1.67 0.72/0.04/4.18 10.33 0.11 0.20  91
      3  ...................   7.42  0.09 0.87/0.08/11.03 2.06 0.22 0.04  13
      4  ...................   7.14  0.01 0.96/0.14/11.33 1.04 0.58 0.03   0
      5  ...................   1.35  0.01 0.99/0.30/1.23 1.00 0.80 0.02   0
      6  ...................   0.30  0.01 1.00/0.76/1.22 1.00 0.91 0.02   0
      7  ...................   0.25  0.01 1.00/0.77/1.16 1.00 0.98 0.02   0
      8  ...................   0.75  0.01 1.00/0.54/1.03 1.00 1.00 0.02   0
      9  ...................   0.01  0.01 1.00/1.00/1.00 1.00 1.00 0.02   0
     10  ...................   0.00  0.01 1.00/1.00/1.00 1.00 1.00 0.02   0
    [mcl] jury pruning marks: <97,99,99>, out of 100
    [mcl] jury pruning synopsis: <97.8 or superb> (cf -scheme, -do log)
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/modules_mcl_5.0.clusters
    [mcl] 707 clusters found
    [mcl] output is in ./Documents/Adhip/vcontact2/vc2_nchoe_02/modules_mcl_5.0.clusters
    
    Please cite:
        Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
        University of Utrecht, May 2000.
        (  http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf
        or  http://micans.org/mcl/lit/svdthesis.pdf.gz)
    OR
        Stijn van Dongen, A cluster algorithm for graphs. Technical
        Report INS-R0010, National Research Institute for Mathematics
        and Computer Science in the Netherlands, Amsterdam, May 2000.
        (  http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z
        or  http://micans.org/mcl/lit/INS-R0010.ps.Z)
    
    
    
    ---------------------------Link modules and clusters----------------------------
    
    
    ----------------------------Exporting results files-----------------------------
    There were 687 genomes (including refs) that were singleton, outlier or overlaps.
    Still identifying genome substrings. Consider adjusting input genomes naming.
    'nchoe_02_contig_11'
    (vcontact2) virology@virology:~$
    

  2. Ben Bolduc

    Similar reports of this genome affecting runs have come up elsewhere. This seems to be a case where vConTACT2 can’t identify a specific genome within the network because its name is contained within another name, e.g. "Pseudomonas Phage P1" is a substring of "Pseudomonas Phage P10". A fix for this was implemented quite some time ago, but that fix seems to apply only to user genomes, not the reference database.

    We’ll be updating the reference DB to deal with this sequence.
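
The name-containment problem described above can be checked for before starting a run. The following is a minimal sketch and not part of vConTACT2: `find_substring_names` is a hypothetical helper, and the names in the example are illustrative ones modelled on those mentioned in this thread. It assumes you have already collected your genome names (for instance from your gene-to-genome mapping file) into a list.

```python
from typing import Dict, Iterable, List


def find_substring_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each name to the longer names that contain it as a substring.

    Hypothetical helper for illustration; vConTACT2 has its own internal
    handling, which this does not reproduce.
    """
    # Sort by length (then alphabetically for a stable order) so that any
    # name that could contain `short` appears after it in the list.
    unique = sorted(set(names), key=lambda n: (len(n), n))
    clashes: Dict[str, List[str]] = {}
    for i, short in enumerate(unique):
        containing = [other for other in unique[i + 1:] if short in other]
        if containing:
            clashes[short] = containing
    return clashes


if __name__ == "__main__":
    # Illustrative names only (assumed, not taken from a real dataset).
    genomes = [
        "Pseudomonas Phage P1",
        "Pseudomonas Phage P10",
        "nchoe_02_contig_11",
        "nchoe_02_contig_110",
        "nchoe_02_contig_2",
    ]
    for short, containing in find_substring_names(genomes).items():
        print(f"{short!r} is contained in: {containing}")
```

The comparison is quadratic in the number of names, which should be acceptable for a few thousand genomes. If clashes are reported among your own genomes, renaming them so that no name is a prefix or substring of another (for example by zero-padding contig numbers) is one way to follow the "Consider adjusting input genomes naming" hint.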
