How to successfully run add_to_db when archaeal genomes are absent?
Hi,
Thank you for creating this valuable tool. I encountered an issue while running add_to_db. My MAGs are all classified as bacteria, so there is no tree file for archaea. When running add_to_db, I got the following output:
“Starting
[1] Get a list of genomes to import...
[2] Import information from GTDBtk trees...
No archaeal tree, we will use the one from the original db
Traceback (most recent call last):
  File "/home/miniconda3/envs/iphop_env/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/iphop.py", line 128, in cli
    args["func"](args)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_add_to_db.py", line 188, in main
    get_tree_members(args['tree_a'],args['genome_list'],logger)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_add_to_db.py", line 29, in get_tree_members
    tree = Phylo.read(tree_file, 'newick')
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/Phylo/_io.py", line 60, in read
    tree = next(tree_gen)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/Phylo/_io.py", line 48, in parse
    with File.as_handle(file) as fp:
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/miniconda3/envs/iphop_env/lib/python3.8/site-packages/Bio/File.py", line 72, in as_handle
    with open(handleish, mode, **kwargs) as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/mydatabase/iphop_db/Aug_2023_pub_rw/db_infos/gtdbtk.ar122.decorated.tree'”
The directory Aug_2023_pub_rw/db_infos contains only gtdbtk.ar53.decorated.tree, not the gtdbtk.ar122.decorated.tree that the script is looking for. How should I handle this? Should I rename gtdbtk.ar53.decorated.tree to gtdbtk.ar122.decorated.tree? Any advice would be appreciated.
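For anyone wanting to confirm the same mismatch on their own database, a minimal Python sketch follows. The function name list_archaeal_trees is hypothetical and not part of iPHoP; it only lists which archaeal decorated tree files are present in a db_infos directory, assuming the file names follow the gtdbtk.ar*.decorated.tree pattern seen in this report.

```python
from pathlib import Path


def list_archaeal_trees(db_infos):
    """Return the names of archaeal decorated tree files found in db_infos.

    Hypothetical helper, not part of iPHoP. The glob pattern is an assumption
    based on the two file names mentioned in this thread
    (gtdbtk.ar53.decorated.tree and gtdbtk.ar122.decorated.tree).
    """
    return sorted(p.name for p in Path(db_infos).glob("gtdbtk.ar*.decorated.tree"))
```

Calling list_archaeal_trees on the db_infos directory of the database in question would return only the ar53 file name if the ar122 file expected by the traceback is missing.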
Comments (6)
-
repo owner -
reporter Thank you for your response. When I run 'iphop -h', the version displayed is “iPHoP v1.3.3: integrating Host Phage Predictions”. Could something else be causing the problem?
-
repo owner No, sorry, I just realized this is a bug on our side. We’ll try to fix it as soon as possible, but in the meantime you can simply copy the file “gtdbtk.ar53.decorated.tree” to “gtdbtk.ar122.decorated.tree”, and the rest of the script should work just fine. Sorry again, and let me know if you encounter any other issues!
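The copy suggested above could be scripted roughly as follows. This is only a sketch of the workaround, not code from iPHoP: the function name ensure_ar122_tree is hypothetical, and it assumes both files live directly in the db_infos directory of the database, as in the error message.

```python
import shutil
from pathlib import Path


def ensure_ar122_tree(db_infos):
    """Copy gtdbtk.ar53.decorated.tree to gtdbtk.ar122.decorated.tree if needed.

    Hypothetical helper illustrating the workaround from this thread.
    Returns the path of the ar122 file.
    """
    db_infos = Path(db_infos)
    src = db_infos / "gtdbtk.ar53.decorated.tree"
    dst = db_infos / "gtdbtk.ar122.decorated.tree"
    if dst.exists():
        # Never overwrite an existing ar122 tree.
        return dst
    if not src.exists():
        raise FileNotFoundError(f"expected tree not found: {src}")
    # Copy rather than rename, so the original ar53 file stays in place.
    shutil.copy2(src, dst)
    return dst
```

For the database in this report, the argument would be the Aug_2023_pub_rw/db_infos directory. Copying instead of renaming keeps the ar53 file available in case other parts of the tool read it.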
-
reporter Thank you very much for your response. After copying 'gtdbtk.ar53.decorated.tree' to 'gtdbtk.ar122.decorated.tree', the 'add_to_db' process ran successfully, and the log file is as follows:
“…
Processing bin.2750 fa
Preparing output file
done.
[8] Now build the new host genome metadata file...
Reading/home/mydatabase/iphop_db/Aug_2023_pub_rw/db_infos/gtdbtk.ar122.decorated.tree
We added 2750 additional bacteria genomes and 0 additional archaea genomes
[9] All done
!#!#!#!#!#! WARNING --- SOME UNEXPECTED EVENTS HAPPENED -- WE LIST THEM BELOW, IT COULD BE NOTHING, BUT YOU SHOULD STILL DOUBLE-CHECK #!#!#!#!#!#!#
Note - we did not find an archaeal tree, so we did not use any data from a new archaeal genome
Note - we did not find a decorated file for the archaeal tree, so we did not use any data from a new archaeal genome.”
I'm now running 'iphop predict' with the newly generated host database. It appears to be running smoothly, but the results will take some time (my dataset is quite large, and the run with the standard database took approximately 2 weeks).
-
repo owner - changed status to closed
-
repo owner - changed status to open
Hi,
Sorry you encountered this issue. This should have been fixed in the latest version. Can you check which version of iphop you are running, and update to 1.3.3 if needed?