"protein(s) without contig" leading to "ERROR:vcontact2: Error in contig clustering"

Issue #30 resolved
Former user created an issue

Hi,

I'm trying to run vcontact2 on two phage contigs, but I keep getting "protein(s) without contig" warnings in "Building the cluster and profiles", followed by a possibly related fatal error in "Contig Clustering & Affiliation":

ERROR:vcontact2: Error in contig clustering
ERROR:vcontact2: 'DataFrame' object has no attribute 'ix'

A log of the run and my gene-to-genome mapping file are attached. I'm running vcontact2-0.9.13, installed via conda on an Ubuntu 18.04 system.

Any help would be appreciated.

Comments (11)

  1. Thomas Hackl

    OK, red herring: the "protein(s) without contig" errors occurred because I forgot the header line in the gene-to-genome CSV file… 🙈

    However, even with the proper mapping file I get:

    ------------------------Contig Clustering & Affiliation-------------------------
    Loaded graph with 2243 nodes and 79278 edges
    [====================] 100% Growing clusters from seeds...
    [====================] 100% Finding highly overlapping clusters...
    [====================] 100% Merging highly overlapping clusters...
    Detected 319 complexes
    ERROR:vcontact2: Error in contig clustering
    ERROR:vcontact2: 'DataFrame' object has no attribute 'ix'
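
For context, `DataFrame.ix` was deprecated in pandas 0.20 and removed in pandas 1.0, which is why this error shows up in environments that resolved a newer pandas. The sketch below is one way to tell from a version string whether the accessor should still exist. `parse_version` and `has_ix` are hypothetical helper names, not part of vcontact2 or pandas.

```python
import re


def parse_version(text):
    """Turn a version string such as '0.25.3' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def has_ix(pandas_version):
    """Return True if this pandas release predates 1.0, where DataFrame.ix was removed."""
    return parse_version(pandas_version) < (1, 0)
```

In a real environment the argument would typically come from `pandas.__version__`, for example `has_ix(pandas.__version__)`.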
    

  2. Thomas Hackl

    Ok, I also had to downgrade to python=3.7 for pandas=0.25.3 to work. But now I get another error…

    ------------------------Contig Clustering & Affiliation-------------------------
    Loaded graph with 2243 nodes and 79278 edges
    [====================] 100% Growing clusters from seeds...
    [====================] 100% Finding highly overlapping clusters...
    [====================] 100% Merging highly overlapping clusters...
    Detected 319 complexes
    ERROR:vcontact2: Error in viral clusters
    ERROR:vcontact2: type object 'object' has no attribute 'dtype'
    

  3. Eugen Pfeifer

    I have the same issue when running the test data set.

    ERROR:vcontact2: Error in viral clusters
    ERROR:vcontact2: type object 'object' has no attribute 'dtype'

    python=3.8.6 and pandas=0.25.3

    Looking forward to any help!

  4. Thomas Hackl

    What worked for me, in the end, was installing the right pandas and python first:

    # python needs to be 3.7 for pandas < 1.0
    conda create --name vContact2 python=3.7 # NOTE the explicit 3.7 python here
    conda activate vContact2
    # pandas needs to be 0.25.3 - https://bitbucket.org/MAVERICLab/vcontact2/issues/17/error-vcontact2-error-in-contig-clustering
    conda install pandas=0.25.3
    conda install vcontact2
    conda install -y mcl blast diamond
    

  5. Eugen Pfeifer

    Thanks a lot @Thomas Hackl!!

    I reinstalled using your order and package version. Now it is working!

  6. Bridget Hegarty

    I’m having the same error with the example data. I tried specifying the versions of pandas (0.25.3) and python (3.7.9) as you suggested, @Thomas Hackl, but I’m still getting the same error message. If anyone has any other suggestions, I’d greatly appreciate them. Thanks!

  7. Samuel Barnett

    In case someone is still seeing the “ERROR:vcontact2: Error in viral clusters” and “ERROR:vcontact2: type object 'object' has no attribute 'dtype'” errors: downgrading to numpy=1.19.5 seemed to work for me. I’m not sure whether specifying python=3.7 and pandas=0.25.3 is also needed with this fix, because I had already done that and kept getting the same error until I downgraded numpy.

  8. Ben Bolduc

    I updated both the Singularity definition file and setup.py requirements to enforce numpy and pandas versions, but I haven’t yet gotten to Anaconda, so if you installed vContact2 through Anaconda, a quick fix was posted by Thomas. Indeed, using pandas <=0.25.3 and numpy <1.19.5 should resolve this issue.

    I will also add a check to ensure headers are enforced in the gene-to-genome file…
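
A check of this kind could look roughly like the sketch below. It is not the check that ships with vcontact2: the function name is hypothetical, and the expected header (`protein_id,contig_id,keywords`) is an assumption based on the column names in the vcontact2 example gene-to-genome files.

```python
import csv
import io

# Assumed header row; adjust if your vcontact2 version expects different columns.
EXPECTED_HEADER = ("protein_id", "contig_id", "keywords")


def has_expected_header(handle, expected=EXPECTED_HEADER):
    """Return True if the first CSV row of the open file matches the expected column names."""
    first_row = next(csv.reader(handle), [])  # [] when the file is empty
    return [cell.strip() for cell in first_row] == list(expected)


# Illustration with made-up in-memory data instead of a real mapping file.
with_header = io.StringIO("protein_id,contig_id,keywords\nprot_1,contig_1,None\n")
without_header = io.StringIO("prot_1,contig_1,None\n")
print(has_expected_header(with_header))
print(has_expected_header(without_header))
```

For a file on disk, the handle would come from something like `open("gene-to-genome.csv", newline="")`, where the file name is only a placeholder.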

  9. Bridget Hegarty

    Thanks for the additional fixes! I appreciate the help greatly. Unfortunately, I have a new error that is identical to the one in issue 36, so I commented there with the new problem.

  10. Ben Bolduc

    Closing this as it's due to absent headers or use of .ix. A check for headers will be included in 0.9.22. Pandas versioning will be used to avoid the .ix deprecation.
