How to successfully run add_to_db when archaeal genomes are absent?

Issue #83 open
em_v created an issue

Hi,

Thank you for creating this valuable tool. I encountered an issue while running add_to_db. My MAGs are all classified as bacteria, so there are no tree files for archaea. When running add_to_db, I got the following output:

“Starting
[1] Get a list of genomes to import...
[2] Import information from GTDBtk trees...
No archaeal tree, we will use the one from the original db
Traceback (most recent call last):
  File "/home/miniconda3/envs/iphop_env/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/iphop.py", line 128, in cli
    args["func"](args)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_add_to_db.py", line 188, in main
    get_tree_members(args['tree_a'],args['genome_list'],logger)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_add_to_db.py", line 29, in get_tree_members
    tree = Phylo.read(tree_file, 'newick')
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/Phylo/_io.py", line 60, in read
    tree = next(tree_gen)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/Phylo/_io.py", line 48, in parse
    with File.as_handle(file) as fp:
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/File.py", line 72, in as_handle
    with open(handleish, mode, **kwargs) as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/mydatabase/iphop_db/Aug_2023_pub_rw/db_infos/gtdbtk.ar122.decorated.tree'”

The directory Aug_2023_pub_rw/db_infos only contains a file named gtdbtk.ar53.decorated.tree, not gtdbtk.ar122.decorated.tree. How should I handle this? Should I rename gtdbtk.ar53.decorated.tree to gtdbtk.ar122.decorated.tree? Any advice would be appreciated.
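
The traceback above shows add_to_db trying to open gtdbtk.ar122.decorated.tree in a directory that only holds gtdbtk.ar53.decorated.tree. A minimal sketch of a pre-flight check for this kind of mismatch is given here. The helper names `find_archaeal_trees` and `has_expected_tree` are hypothetical, and the file-name pattern is only inferred from the two names quoted in this issue, not taken from iPHoP's source.

```python
from pathlib import Path


def find_archaeal_trees(db_infos):
    """Return the sorted names of archaeal decorated trees found in db_infos.

    Hypothetical helper: the 'gtdbtk.ar*.decorated.tree' pattern is an
    assumption based on the file names mentioned in this issue.
    """
    pattern = "gtdbtk.ar*.decorated.tree"
    return sorted(p.name for p in Path(db_infos).glob(pattern))


def has_expected_tree(db_infos, expected="gtdbtk.ar122.decorated.tree"):
    """Return True if the file name from the FileNotFoundError is present."""
    return (Path(db_infos) / expected).is_file()
```

In the situation described here, calling `find_archaeal_trees` on the db_infos directory would be expected to list only gtdbtk.ar53.decorated.tree, and `has_expected_tree` would return False.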

Comments (6)

  1. Simon Roux repo owner

    Hi,
    Sorry you encountered this issue. This should have been fixed in the latest version. Can you check which version of iphop you are running, and update to 1.3.3 if needed?

  2. em_v reporter

    Thank you for your response. When I run 'iphop -h', the version displayed is “iPHoP v1.3.3: integrating Host Phage Predictions”. Could the problem lie somewhere else?

  3. Simon Roux repo owner

    No, sorry, I have just understood that this is a bug on our side. We’ll try to fix it as soon as possible, but in your case you can simply copy the file “gtdbtk.ar53.decorated.tree” to “gtdbtk.ar122.decorated.tree” and the rest of the script should work just fine. Sorry again, and let me know if you encounter any other issue!

  4. em_v reporter

    Thank you very much for your response. After copying 'gtdbtk.ar53.decorated.tree' to 'gtdbtk.ar122.decorated.tree', the 'add_to_db' process ran successfully, and the log file reads as follows:

    “…

    Processing bin.2750 fa

    Preparing output file

    done.

    [8] Now build the new host genome metadata file...

    Reading/home/mydatabase/iphop_db/Aug_2023_pub_rw/db_infos/gtdbtk.ar122.decorated.tree

    We added 2750 additional bacteria genomes and 0 additional archaea genomes

    [9] All done

    !#!#!#!#!#! WARNING --- SOME UNEXPECTED EVENTS HAPPENED -- WE LIST THEM BELOW, IT COULD BE NOTHING, BUT YOU SHOULD STILL DOUBLE-CHECK #!#!#!#!#!#!#

    Note - we did not find an archaeal tree, so we did not use any data from a new archaeal genome

    Note - we did not find a decorated file for the archaeal tree, so we did not use any data from a new archaeal genome.”

    I'm now running 'iphop predict' with the newly generated host database. It seems to be running smoothly, but the results will take some time (my dataset is quite large, and the run with the standard database took approximately 2 weeks).
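
The workaround suggested in this thread (copying gtdbtk.ar53.decorated.tree to gtdbtk.ar122.decorated.tree inside db_infos) can be sketched in Python as follows. This is only an illustration: the function name `copy_archaeal_tree` is hypothetical, and the two default file names are taken from the messages above. The original file is left in place, and an existing target is not overwritten.

```python
import shutil
from pathlib import Path


def copy_archaeal_tree(
    db_infos,
    src_name="gtdbtk.ar53.decorated.tree",
    dst_name="gtdbtk.ar122.decorated.tree",
):
    """Copy the existing archaeal tree to the name add_to_db looks for.

    Hypothetical helper; file names come from this issue thread.
    Returns the path of the target file.
    """
    db_infos = Path(db_infos)
    src = db_infos / src_name
    dst = db_infos / dst_name
    if dst.exists():
        # Do not overwrite a tree that is already there.
        return dst
    if not src.is_file():
        raise FileNotFoundError(f"Source tree not found: {src}")
    # Copy rather than rename, so the original ar53 file stays available.
    shutil.copyfile(src, dst)
    return dst
```

Called with the path to the db_infos directory of the database, it returns the path of the gtdbtk.ar122.decorated.tree copy, after which add_to_db can be re-run as described above.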
