How to get the genome fasta files in the default database
Issue #84
closed
Hello! Thank you for developing such a comprehensive and convenient tool!
I would like to ask how to obtain the genome file (FASTA) for a known host genome name (e.g., 3300019135_4, GB_GCA_003224935.1, or RS_GCF_002802985.1), as I would like to further analyze phage-host interactions. However, I don't see the FASTA files of the genomes in the Aug_2023_pub_rw file. Thanks again!
Comments (3)
-
repo owner -
reporter - Thank you for your quick response. Your answer was helpful to me.
-
repo owner - changed status to closed
Hi! We do not compile a FASTA of all genome files, since these are pulled from other resources. Specifically, you can look at:
- GTDB (https://gtdb.ecogenomic.org) / NCBI for any genome starting with GB_ (“GenBank”) or RS_ (“RefSeq”)
- the GEM data release for genome bins like 3300019135_4 (https://portal.nersc.gov/GEM/genomes/)
- IMG (img.jgi.doe.gov/m/) for IMG isolate genomes
Best,
Simon
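For anyone scripting this over a long list of host names, here is a minimal sketch of the routing described in the reply above. It only decides which resource to look in and strips the GB_/RS_ prefix to leave the accession; it does not download anything. The function name `locate_genome`, the regular expression used to recognise GEM bins (digits, underscore, digits, modelled on 3300019135_4), and the fallback to IMG for any other name are assumptions made for illustration, not part of the tool.

```python
import re

# Source locations quoted from the reply above.
GTDB_URL = "https://gtdb.ecogenomic.org"
GEM_URL = "https://portal.nersc.gov/GEM/genomes/"
IMG_URL = "img.jgi.doe.gov/m/"

# Assumption: GEM bins look like "<digits>_<digits>", as in 3300019135_4.
GEM_BIN_PATTERN = re.compile(r"^\d+_\d+$")


def locate_genome(genome_id):
    """Return (source, accession, url) for a host genome name.

    Hypothetical helper: it only classifies the name by its prefix or
    shape and never contacts any of the listed resources.
    """
    if genome_id.startswith("GB_"):
        # GB_ marks a GenBank genome; the rest is the assembly accession.
        return ("GenBank", genome_id[len("GB_"):], GTDB_URL)
    if genome_id.startswith("RS_"):
        # RS_ marks a RefSeq genome; the rest is the assembly accession.
        return ("RefSeq", genome_id[len("RS_"):], GTDB_URL)
    if GEM_BIN_PATTERN.match(genome_id):
        return ("GEM", genome_id, GEM_URL)
    # Assumption: anything else is treated as an IMG isolate genome.
    return ("IMG", genome_id, IMG_URL)


if __name__ == "__main__":
    for name in ["3300019135_4", "GB_GCA_003224935.1", "RS_GCF_002802985.1"]:
        print(name, locate_genome(name))
```

The accession returned for GB_ and RS_ names (for example GCA_003224935.1) is the identifier to search for at GTDB or NCBI.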