Building custom database with no archaea

Issue #93 closed
huaiyu wang created an issue

Hi Simon

Thank you for developing iPHoP. I have just started using this tool, but I cannot build the archaea database from my own MAGs. The bacteria database was built successfully.

The error is

[2024-02-28 19:25:09] ERROR: Input file does not exist: MAGs_GTDB-tk_results/align/gtdbtk.ar53.msa.fasta.gz
[2024-02-28 19:25:09] ERROR: Controlled exit resulting from an unrecoverable error or warning.

I am sure that my MAGs do not include any archaea, because I did SIP experiments and all the MAGs have been annotated.

So I am wondering whether the error comes from having no archaeal input. If so, is it possible to continue with the following steps?

Thanks

Best

Huaiyu

Comments (4)

  1. huaiyu wang reporter

    Thank you Simon, I have successfully added my MAGs to the database and obtained a new database named 2024_hosts_63_iphop_1.3.3, next to Aug_2023_pub_rw. My question is: should I merge these two databases, or use only 2024_hosts_63_iphop_1.3.3? I ran iPHoP with both databases and found that the results differ: Aug_2023_pub_rw gave me more host-virus predictions. Could you please give me some suggestions?

  2. Simon Roux repo owner

    You should not merge databases after “add_to_db”; all the possible merging is already done. What you will have is two distinct databases (Aug_2023_pub_rw and 2024_hosts_63_iphop_1.3.3). The recommendation is then to run iPHoP predict separately for each database. It is expected that the results will be different, and some predictions available with Aug_2023_pub_rw may be missing from 2024_hosts_63_iphop_1.3.3 (because some genomes cannot be imported into the custom database). Our current recommendation is to take the prediction with the highest score for each virus across the two databases (there is no reason to trust one database more than the other, so the iPHoP score is the best thing we have to select the “best” prediction).
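
    For readers who want to automate the "highest score per virus" selection described above, here is a minimal Python sketch. It is not part of iPHoP. It assumes each run produced a CSV table with one prediction per row, and that the virus and score columns are named "Virus" and "Confidence score"; these names are assumptions, so check the header of your own output files and adjust the parameters. The function name and the toy rows are hypothetical and for illustration only.

```python
import csv
import io


def best_predictions(tables, virus_col="Virus", score_col="Confidence score"):
    """Keep, for each virus, the highest-scoring prediction across databases.

    tables: dict mapping a database label to an open CSV file handle.
    virus_col / score_col: column names (assumed, adjust to your files).
    Returns a dict: virus -> (database label, score, full row as dict).
    """
    best = {}
    for db_name, handle in tables.items():
        for row in csv.DictReader(handle):
            virus = row[virus_col]
            score = float(row[score_col])
            # Strictly greater: on a tie, the first database seen is kept.
            if virus not in best or score > best[virus][1]:
                best[virus] = (db_name, score, row)
    return best


# Made-up toy data standing in for the two prediction tables.
pub = io.StringIO(
    "Virus,Host genus,Confidence score\n"
    "v1,g__A,92.1\n"
    "v2,g__B,95.0\n"
)
custom = io.StringIO(
    "Virus,Host genus,Confidence score\n"
    "v1,g__C,96.3\n"
)

merged = best_predictions(
    {"Aug_2023_pub_rw": pub, "2024_hosts_63_iphop_1.3.3": custom}
)
for virus, (db, score, row) in sorted(merged.items()):
    print(virus, db, row["Host genus"], score)
```

    With real output, you would replace the io.StringIO objects with files opened from each run's output directory. Viruses predicted with only one database are kept as they are, which matches the point that some predictions may be missing from the custom database.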
