Error downloading the iPHoP host database
Hi developer:
I tried two ways to download the large iPHoP host database to my server: iphop download --db_dir path_to_iPHoP_db
and wget https://portal.nersc.gov/cfs/m342/iphop/db/iPHoP.latest_rw.tar.gz
Both failed.
The wget error looked like this:
21071450K .......... .......... .......... .......... .......... 11% 10.8M 8h9m
21071500K .......... .......... .......... .......... .......... 11% 10.7M 8h9m
21071550K .......... .......... .......... .......... .......... 11% 11.3M 8h9m
21071600K .......... .......... .......... .......... .......... 11% 10.4M 8h9m
21071650K .......... .......... .......... .......... .......... 11% 11.3M 8h9m
21071700K .......... .......... .......... .......... .......... 11% 10.4M 8h9m
21071750K .......... 11% 18.1M=3m18s
2024-02-25 00:52:37 (5.22 MB/s) - Connection closed at byte 21577482770. Giving up.
I guess the database is so large that the connection gets closed during the download? I tried several times and it always stops at 11%.
I don’t know why this happens or how to solve it.
Another question: if I use iPHoP to predict hosts, is it necessary to also predict hosts with VirHostMatcher and a CRISPR spacer BLAST (with spacers found by the MinCED tool)? I am not sure whether these would be algorithmically redundant, since many papers predict hosts with more than one tool.
Thanks!
Comments (7)
-
repo owner -
repo owner As for the use of other tools, iPHoP is designed to run a series of tools (which include VirHostMatcher and CRISPR spacer blast) and then provide a single consensus host prediction. So in my opinion you do not need to run another tool, unless there is a specific signal and/or specific host genomes you want to look into.
-
reporter Thank you so much for your help! I will try to use '--split' to download the large database.
-
repo owner I got an email saying there was another message in this issue, but can’t see it here. So just checking: was there an issue on Bitbucket or was this message about something you fixed in the meantime, and then deleted the message ? Thanks !
-
reporter Yes, I’m sorry, I was about to ask something related and then I understood what was wrong. Thank you!
-
repo owner No worries and thanks for confirming !
-
repo owner - changed status to closed
Hi !
Unfortunately, the database is very large, and on some connections it cannot be downloaded in one step. You can try the parameter “--split” in “iphop download”: the database is then downloaded in chunks of 10 GB, which may be sufficient. If this does not work, my recommendation would be to use a download manager like aria2c (https://aria2.github.io/manual/en/html/aria2c.html), and ask it to download the file https://portal.nersc.gov/cfs/m342/iphop/db/iPHoP.latest_rw.tar.gz. Hopefully it will handle the closed connection better and let you download the whole database eventually.
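As a rough illustration of the suggestions above, here is a minimal shell sketch that keeps re-running a resumable download until it completes. The helper name fetch_with_resume, the attempt limit and the pause length are made up for this example and are not part of iPHoP. It also assumes the server accepts resumed (byte-range) requests, which I have not verified for this URL.

```shell
# Sketch only: retry a resumable wget download until it finishes.
# DB_URL is the database URL quoted in this thread.
DB_URL="https://portal.nersc.gov/cfs/m342/iphop/db/iPHoP.latest_rw.tar.gz"

# fetch_with_resume URL [MAX_ATTEMPTS] [PAUSE_SECONDS]
# Hypothetical helper: the name and the default values are assumptions.
fetch_with_resume() {
    fwr_url="$1"
    fwr_max="${2:-50}"     # stop after this many attempts
    fwr_pause="${3:-30}"   # seconds to wait between attempts
    fwr_attempt=1
    # "wget -c" continues a partially downloaded file instead of restarting it.
    until wget -c "$fwr_url"; do
        if [ "$fwr_attempt" -ge "$fwr_max" ]; then
            echo "Giving up after $fwr_attempt attempts" >&2
            return 1
        fi
        fwr_attempt=$((fwr_attempt + 1))
        sleep "$fwr_pause"
    done
}

# Usage (commented out so that sourcing this file does not start the download):
#   fetch_with_resume "$DB_URL"
# Alternatives mentioned in this thread:
#   iphop download --db_dir path_to_iPHoP_db --split
#   aria2c -c --max-tries=0 --retry-wait=30 "$DB_URL"
```

The loop only matters if a single wget run gives up, as in the log above. Each new run picks up from the bytes already on disk, so a connection that closes at 11% should not cost the progress made so far, provided the server supports resuming.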