iPHoP fails at step [6/2] "Get relevant RaFAH scores", FileNotFoundError for Full_Host_Predictions.tsv

Issue #15 resolved
Michael Shamash created an issue

Hi,

I followed the instructions in the README to get iPHoP up and running with the test database + dataset (installed via mamba), and unfortunately am running into an error consistently after RaFAH finishes running, where iPHoP cannot find the output file it’s looking for. The same thing happens when using the full database. I’ve attached my stdout below, as well as a list of the files in the rafah_out directory.

Any ideas why this would be happening on a new install?

Thanks in advance.

Michael

$ iphop predict --fa_file test_input_phages.fna --db_dir iphop_db/Test_db/ --out_dir iphop_test_results/test_input_phages_iphop
### Welcome to iPHoP ###
Looks like everything is now set up, we will first clean up the input file, and then we will start the host prediction steps themselves
[1/1/Run] Running blastn against genomes...
[1/3/Run] Get relevant blast matches...
[2/1/Run] Running blastn against CRISPR...
[2/2/Run] Get relevant crispr matches...
[3/1/Run] Running WIsH...
[3/2/Run] Get relevant WIsH hits...
[4/1/Run] Running VHM s2 similarities...
[4/2/Run] Get relevant VHM hits...
[5/1/Run] Running PHP...
[5/2/Run] Get relevant PHP hits...
[6/1/Run] Running RaFAH...
[6/2/Run] Get relevant RaFAH scores...
Traceback (most recent call last):
  File "/home/mshamash/miniconda3/envs/iphop_env/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/iphop.py", line 121, in cli
    args["func"](args)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_predict.py", line 85, in main
    rafah.run_and_parse_rafah(args)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/rafah.py", line 32, in run_and_parse_rafah
    get_rafah_results(args["fasta_file"],rafahinput,args["rafahrawresult"],args["rafahparsed"],args["rafahfullresult"],args["genus_file"])
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/rafah.py", line 53, in get_rafah_results
    df_pred = pd.read_csv(pred_file,delimiter='\t',quotechar='"', index_col=0)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in __init__
    self._open_handles(src, kwds)
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/parsers/base_parser.py", line 222, in _open_handles
    self.handles = get_handle(
  File "/home/mshamash/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/io/common.py", line 701, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: 'iphop_test_results/test_input_phages_iphop/Wdir/rafah_out/Full_Host_Predictions.tsv'

$ ls iphop_test_results/test_input_phages_iphop/Wdir/rafah_out/
Full_CDS_Prediction.faa
Full_CDS_Prediction.fna
Full_CDS_Prediction.gff
Full_CDSxClusters_Prediction
Full_Genome_to_OG_Score_Min_Score_50-Max_evalue_1e-05_Prediction.tsv
Full_Genomes_Prediction.fasta

Comments (2)

  1. Michael Shamash reporter

    Just realized that the job is running out of memory at this step. So this is likely the issue, as mentioned in the README. Closing for now!

  2. Simon Roux repo owner

    Haha I was just replying “Well, that’s weird, it seems like RaFAH does not finish, it sometimes runs out of memory” :-)

    In the “Wdir” folder, you should have a file called “rafah.log”. This should give you more information about what exactly went wrong, but most likely it is an out-of-memory error.
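
A quick way to look at the end of that log, where a failure message would normally appear, is sketched below. This is only an illustration: `tail_log` is a hypothetical helper, not part of iPHoP, and the log path in the usage note is assumed from the output directory used in this report.

```python
from collections import deque
from pathlib import Path


def tail_log(log_path, n=20):
    """Return the last n lines of a text log as a list of strings.

    Hypothetical helper, not part of iPHoP. Undecodable bytes are
    replaced rather than raising, since log contents are not known.
    """
    with Path(log_path).open("r", errors="replace") as handle:
        # A bounded deque keeps only the final n lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

For the run in this report, the call would presumably be `tail_log("iphop_test_results/test_input_phages_iphop/Wdir/rafah.log")`, with each returned line then printed or inspected.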

    To avoid re-running everything, you can remove the folder “rafah_out” and the files “rafah.log” and “rafahparsed.csv”. If you then re-run iPHoP with the same output directory, it should pick up straight at the RaFAH step.
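
The cleanup described above can be sketched as a small Python helper. It is a minimal sketch under stated assumptions, not something iPHoP ships: `reset_rafah_step` is a hypothetical name, and it assumes that `rafah_out`, `rafah.log` and `rafahparsed.csv` all sit directly under `Wdir` inside the output directory (the thread confirms this for `rafah_out` and `rafah.log`; the location of `rafahparsed.csv` is an assumption).

```python
import shutil
from pathlib import Path


def reset_rafah_step(out_dir):
    """Remove RaFAH intermediates so a re-run can restart at the RaFAH step.

    Hypothetical helper, not part of iPHoP. Assumes all three items sit
    directly under <out_dir>/Wdir. Returns the paths actually removed.
    """
    wdir = Path(out_dir) / "Wdir"
    removed = []
    for name in ("rafah_out", "rafah.log", "rafahparsed.csv"):
        target = wdir / name
        if target.is_dir():
            shutil.rmtree(target)   # the rafah_out folder and its contents
        elif target.exists():
            target.unlink()         # a single file
        else:
            continue                # nothing to remove, skip quietly
        removed.append(target)
    return removed
```

Calling `reset_rafah_step("iphop_test_results/test_input_phages_iphop")` and then repeating the original `iphop predict` command with the same `--out_dir` (ideally with more memory available) should let the run resume from RaFAH. Other files in `Wdir` are left untouched.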
