Error with r-rcpp during rafah step

Issue #19 resolved
Former user created an issue

Hi there,

I'm having trouble getting iphop to run during the rafah step. It gives me the "FileNotFoundError: [...] rafah_out/Full_Host_Predictions.tsv" error, but I don't believe this has to do with running out of memory or with the perl libraries (I already added the update bioperl scripts). When I look at rafah.log, the issue appears to be related to r-rcpp:

    Error in rangerCpp(treetype, x, y, forest$independent.variable.names, :
      function 'Rcpp_precious_remove' not provided by package 'Rcpp'
    Calls: predict ... predict.ranger -> predict -> predict.ranger.forest -> rangerCpp
    Execution halted
    No such file or directory at /home/ubuntu/sdb/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/utils/RaFAH_v0.3.pl line 313.
    Parsing output of host prediction iphop_results/Wdir/rafah_out/Full_Host_Predictions.tsv

I've tried downgrading Rcpp from 1.0.8 to 1.0.7/1.0.6, as well as downgrading iphop from 1.2 to 1.1, but neither changed anything. Do you have any idea how to fix this?

P.S. Merry Christmas!

Comments (7)

  1. Stanley Ho

    Not sure why the error code didn’t format correctly above. Here’s the entirety of the rafah.log file.

    Running host prediction mode
    Indexing sequences from iphop_results/Wdir/split_input/
    Processing AJ421943.1.fasta
    Processed 1 Genomic Sequences
    Running Prodigal
    Indexing sequences from iphop_results/Wdir/rafah_out/Full_CDS_Prediction.faa
    Running hmmsearch. Query: iphop_results/Wdir/rafah_out/Full_CDS_Prediction.faa DB: Test_db/db/rafah_data/HP_Ranger_Model_3_Filtered_0.9_Valids.hmm
    Obtained 43644 ids from Test_db/db/rafah_data/HP_Ranger_Model_3_Valid_Cols.txt
    Parsing iphop_results/Wdir/rafah_out/Full_CDSxClusters_Prediction
    Detected 88 OGs across 1 genomic sequences
    Performing host prediction
    [1] "Loading Model from  Test_db/db/rafah_data/MMSeqs_Clusters_Ranger_Model_1+2+3_Clean.RData"
    [1] "Reading input file  iphop_results/Wdir/rafah_out/Full_Genome_to_OG_Score_Min_Score_50-Max_evalue_1e-05_Prediction.tsv"
                 used    (Mb) gc trigger    (Mb)   max used    (Mb)
    Ncells    8386030   447.9   14550460   777.1   11684437   624.1
    Vcells 2145114740 16366.0 2774702278 21169.3 2191067527 16716.6
    [1] "Passing data to Random Forest using  16  threads"
    Error in rangerCpp(treetype, x, y, forest$independent.variable.names,  :
      function 'Rcpp_precious_remove' not provided by package 'Rcpp'
    Calls: predict ... predict.ranger -> predict -> predict.ranger.forest -> rangerCpp
    Execution halted
    No such file or directory at /home/ubuntu/sdb/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/utils/RaFAH_v0.3.pl line 313.
    Parsing output of host prediction iphop_results/Wdir/rafah_out/Full_Host_Predictions.tsv
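
For anyone landing here with the same FileNotFoundError: the missing Full_Host_Predictions.tsv is only a symptom, and the actual failure is the R error recorded in rafah.log. Below is a minimal sketch in Python for pulling such errors out of a log like the one above. The function name `find_r_errors` and the sample text are illustrative assumptions; this helper is not part of iPHoP or RaFAH.

```python
from typing import List


def find_r_errors(log_text: str) -> List[str]:
    """Return R error messages found in a RaFAH-style log.

    Hypothetical helper for illustration; not part of iPHoP or RaFAH.
    """
    errors = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("Error in "):
            message = [line.strip()]
            # R wraps long error messages onto indented continuation lines;
            # collect them until the call trace or the halt notice begins.
            for follow in lines[i + 1:]:
                stripped = follow.strip()
                if (not stripped
                        or stripped.startswith("Calls:")
                        or stripped == "Execution halted"):
                    break
                message.append(stripped)
            errors.append(" ".join(message))
    return errors


# Sample text copied from the tail of the rafah.log shown above.
sample_log = """\
[1] "Passing data to Random Forest using  16  threads"
Error in rangerCpp(treetype, x, y, forest$independent.variable.names,  :
  function 'Rcpp_precious_remove' not provided by package 'Rcpp'
Calls: predict ... predict.ranger -> predict -> predict.ranger.forest -> rangerCpp
Execution halted
"""

for message in find_r_errors(sample_log):
    print(message)
```

In practice the text would come from the rafah.log in the run's working directory rather than an inline string. An empty result would suggest the failure lies elsewhere, for example in memory limits or the perl libraries.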
    

  2. Simon Roux repo owner

    Yikes, sorry, I’ve never seen this issue with RaFAH in a conda install. I wonder if something weird is happening because you are only processing a single sequence (it shouldn’t be, but ...). Did you try the test dataset, and do you see the same error there?

  3. Stanley Ho

    I just tried the test dataset and got the same error in the log:

    Running host prediction mode
    Indexing sequences from iphop_results/Wdir/split_input/
    Processing AJ421943.1.fasta
    Processing CP017905.1.fasta
    Processing IMGVR_UViG_3300013274_000001.fasta
    Processing IMGVR_UViG_3300013456_000001.fasta
    Processing MT657335.1.fasta
    Processed 5 Genomic Sequences
    Running Prodigal
    Indexing sequences from iphop_results/Wdir/rafah_out/Full_CDS_Prediction.faa
    Running hmmsearch. Query: iphop_results/Wdir/rafah_out/Full_CDS_Prediction.faa DB: Test_db/db/rafah_data/HP_Ranger_Model_3_Filtered_0.9_Valids.hmm
    Obtained 43644 ids from Test_db/db/rafah_data/HP_Ranger_Model_3_Valid_Cols.txt
    Parsing iphop_results/Wdir/rafah_out/Full_CDSxClusters_Prediction
    Detected 653 OGs across 5 genomic sequences
    Performing host prediction
    [1] "Loading Model from  Test_db/db/rafah_data/MMSeqs_Clusters_Ranger_Model_1+2+3_Clean.RData"
    [1] "Reading input file  iphop_results/Wdir/rafah_out/Full_Genome_to_OG_Score_Min_Score_50-Max_evalue_1e-05_Prediction.tsv"
                 used    (Mb) gc trigger    (Mb)   max used    (Mb)
    Ncells    8386022   447.9   14550437   777.1   11684437   624.1
    Vcells 2145248296 16367.0 2774702278 21169.3 2191973437 16723.5
    [1] "Passing data to Random Forest using  16  threads"
    Error in rangerCpp(treetype, x, y, forest$independent.variable.names,  :
      function 'Rcpp_precious_remove' not provided by package 'Rcpp'
    Calls: predict ... predict.ranger -> predict -> predict.ranger.forest -> rangerCpp
    Execution halted
    No such file or directory at /home/ubuntu/sdb/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/utils/RaFAH_v0.3.pl line 313.
    Parsing output of host prediction iphop_results/Wdir/rafah_out/Full_Host_Predictions.tsv
    

  4. Stanley Ho

    It worked! Thanks a lot for investigating. I wonder why the conda package didn’t work for this.

  5. Simon Roux repo owner

    Good question, conda is still mysterious to me sometimes, but glad that it worked :-)
