KeyError: '3300006428_5_vs_RS_GCF_008727735.1'
Hi Simon, I found the error…
[7.5] Aggregating all results and formatting for RF...
### Welcome to iPHoP ###
write
Traceback (most recent call last):
File "/home/yc/miniconda3/envs/iphop_env/bin/iphop", line 10, in <module>
sys.exit(cli())
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/iphop.py", line 122, in cli
args["func"](args)
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_predict.py", line 96, in main
dataprep_rf.aggregate_rf(args)
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/dataprep_rf.py", line 35, in aggregate_rf
compute_matrices(df_blast,df_crispr,df_labels,args)
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/dataprep_rf.py", line 147, in compute_matrices
selected_blast['Dist'] = selected_blast['Repr'].apply(lambda x: update_dist(host_pivot,x,store_dist))
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/core/series.py", line 4357, in apply
return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/core/apply.py", line 1043, in apply
return self.apply_standard()
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/pandas/core/apply.py", line 1099, in apply_standard
mapped = lib.map_infer(
File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/dataprep_rf.py", line 147, in <lambda>
selected_blast['Dist'] = selected_blast['Repr'].apply(lambda x: update_dist(host_pivot,x,store_dist))
File "/home/yc/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/dataprep_rf.py", line 218, in update_dist
dist = store_dist[code]
KeyError: '3300006428_5_vs_RS_GCF_008727735.1'
I still haven't found out how to solve the problem. I look forward to your reply.
Comments (8)
-
repo owner -
reporter Default database… that’s weird.
I checked the previous steps and there seems to be no problem.
-
repo owner Can you check that the database downloaded correctly? It seems like iPHoP has some issues loading the information from the trees (there should be a file named “gtdbtk.bac120.decorated.tree” in the folder “db_infos”).
-
reporter It’s “iPHoP_db_Sept21.tar.gz”, and it is complete. By the way, I haven't had any problems using the test data before…
“gtdbtk.bac120.decorated.tree” is in the folder “db_infos”, and RS_GCF_008727735.1 is also included in it.
Is there perhaps a way I can abandon this sequence, so that the final result can still be exported successfully?
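For anyone repeating the check described in this comment, here is a minimal sketch of looking up genome IDs in the tree file. It is a plain text search rather than a real Newick parser, the function name `ids_in_tree` is hypothetical, and the path to “gtdbtk.bac120.decorated.tree” inside your own database folder is an assumption you will need to adjust.

```python
import re


def ids_in_tree(tree_path, ids):
    """Return {id: bool} telling whether each ID occurs as a whole label in the tree text.

    Plain text search, not a Newick parser: an ID counts as present only if it
    is not directly surrounded by other label characters (letters, digits,
    underscore, dot), so "GCF_1" does not match inside "GCF_10".
    """
    with open(tree_path) as handle:
        text = handle.read()
    found = {}
    for seq_id in ids:
        pattern = r"(?<![\w.])" + re.escape(seq_id) + r"(?![\w.])"
        found[seq_id] = re.search(pattern, text) is not None
    return found


# Example call (the path is an assumption, adjust it to your database location):
# ids_in_tree("db_infos/gtdbtk.bac120.decorated.tree",
#             ["3300006428_5", "RS_GCF_008727735.1"])
```

If both IDs come back as present, as reported here, the tree file itself is unlikely to be the cause.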
-
repo owner Is 3300006428_5 also found in the file “gtdbtk.bac120.decorated.tree”? If not, that’s probably an issue with this specific file.
You can’t “abandon” a reference sequence. However, it is possible (even likely) that the issue is linked to a single specific sequence in your input file, so you may want to try splitting this input file into smaller groups and see whether some of these groups finish successfully.
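The splitting suggested above can be sketched as follows. This is only an illustration that assumes the input is a plain FASTA file. The function name `split_fasta`, the batch size, and the output file names are all hypothetical and not part of iPHoP.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir, seqs_per_file=100):
    """Split a FASTA file into consecutive batches of at most seqs_per_file records.

    Returns the list of batch files written, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    n_records = 0
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # Start a new batch file every seqs_per_file records.
                if n_records % seqs_per_file == 0:
                    if out is not None:
                        out.close()
                    batch_path = out_dir / f"batch_{len(written) + 1:03d}.fna"
                    out = open(batch_path, "w")
                    written.append(batch_path)
                n_records += 1
            # Lines before the first header (if any) are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

Each batch can then be run through iPHoP separately, with the same options as the original run. A batch that fails with the same KeyError narrows down where the problematic input sequence is, and that batch can be split again if needed.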
-
reporter 3300006428_5 is also in the file “gtdbtk.bac120.decorated.tree”.
I will follow your suggestion and try to find the specific sequence, thanks!
If that works, we can then look for the cause.
-
repo owner This was an unexpected error with “database_prep_rf”. It should be fixed now in 1.3.1, thanks for reporting!
-
repo owner - changed status to resolved
Good question, this seems like a potential issue with the database. Are you using a custom host database, or the default one?