General Questions

Issue #9 resolved
Bishav Bhattarai created an issue

Hello,

I used vConTACT2 in the CyVerse environment. Are the results in the CyVerse environment updated?

I have some questions about using vConTACT2.

  1. The protein file used as input is from VirSorter. However, the contig file uploaded to VirSorter contains non-viral contigs as well. Does vConTACT2 take that into consideration when assigning taxonomy?
  2. I assume we look at the taxonomic composition in ‘genome_by_genome_overview.csv’. How do we calculate the relative abundance? Do we use the ‘Size’ column to calculate that?

I am also trying to install this in a Linux terminal. If I run into any problems there, I will come back with more questions.

Thanks,

Comments (3)

  1. Ben Bolduc

    Hi Bishav,

    1. vConTACT assumes that all genomes/proteins within the input file are viral, or at least could be related to other genomes within the file. That said, there’s no hard requirement, and no special “viral exclusive” techniques are applied to the dataset. You could just as easily cluster plasmid sequences or large operons (I won’t make any promises though!). If you have non-viral sequences, they will either 1) be excluded if they do not share significant sequence similarity with another sequence or 2) cluster with similar sequences. I’d strongly suggest taking VirSorter categories 1-2 and 4-5, concatenating the FASTA files, and then predicting their ORFs using Prodigal or similar. Then take those results and parse them using Gene2Genome in CyVerse.
    2. The “Size” column corresponds to the number of genomes found in the initial genome clustering, so it is not a measure of abundance. Calculating abundance is a bit trickier. If you’re using CyVerse, check out https://dx.doi.org/10.17504/protocols.io.gv2bw8e. Essentially, you’re mapping reads to your viral contigs and generating a relative abundance table.
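The two suggestions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of vConTACT or of the linked protocol: the function names are hypothetical, the FASTA paths are whatever your VirSorter run produced for the chosen categories, and the abundance step assumes you already have a per-contig mapped-read count from your read mapper. Normalizing by contig length before scaling to a sum of 1 is one common convention; the protocol may do it differently.

```python
from pathlib import Path


def concatenate_fastas(fasta_paths, out_path):
    """Write the contents of several FASTA files into one file, in order."""
    with open(out_path, "w") as out:
        for path in fasta_paths:
            text = Path(path).read_text()
            out.write(text)
            # A missing trailing newline would glue the next header
            # onto the last sequence line, so add one if needed.
            if text and not text.endswith("\n"):
                out.write("\n")
    return out_path


def relative_abundance(read_counts, contig_lengths):
    """Length-normalize mapped-read counts, then scale them to sum to 1.

    read_counts:    dict of contig name -> number of mapped reads (assumed input)
    contig_lengths: dict of contig name -> contig length in bp
    """
    coverage = {
        contig: read_counts[contig] / contig_lengths[contig]
        for contig in read_counts
    }
    total = sum(coverage.values())
    if total == 0:
        # No reads mapped anywhere: report zero abundance for every contig.
        return {contig: 0.0 for contig in coverage}
    return {contig: value / total for contig, value in coverage.items()}
```

You would call `concatenate_fastas` with the category 1-2 and 4-5 files, run ORF prediction on the merged file, and later feed per-contig read counts into `relative_abundance`. Summing the resulting values per viral cluster from `genome_by_genome_overview.csv` would then give a cluster-level abundance, again as an assumption about how you want to aggregate.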

    If you do have any Qs about installation, consult the readme or wiki, or post a new issue!

    Thanks!
