Predict step [9/2]: AttributeError: 'float' object has no attribute 'split'
Thank you for being responsive to issues on this new and useful tool. I've run into an error in step 9 of the predict workflow: "[9/2] Combining all results (Blast, CRISPR, iPHoP, and RaFAH) in a single file..." All of the preceding steps seem to have worked fine, so I'm not sure what the issue is. This is the error that I get at step 9/2:
Traceback (most recent call last):
  File "/storage1/rstudent/miniconda3/envs/iphop/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/iphop.py", line 121, in cli
    args["func"](args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/master_predict.py", line 102, in main
    runaggregatormodel.run_model(args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 65, in run_model
    merged = merge_all_results(args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 232, in merge_all_results
    merged['Repr host genus'] = merged['Repr host taxonomy'].apply(lambda x: transform_into_genus(x))
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 232, in <lambda>
    merged['Repr host genus'] = merged['Repr host taxonomy'].apply(lambda x: transform_into_genus(x))
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 161, in transform_into_genus
    tab = taxo_string.split(";")
AttributeError: 'float' object has no attribute 'split'
I looked at the intermediate outputs, and there are results in "blastparsed.tsv", "crisprparsed.tsv", "phpparsed.csv", "rafahparsed.csv", "vhmparsed.csv", "wishparsed.csv", and "All_scores_iPHoP_by_instance.csv". It seems like results are available for every individual tool (unless I missed something), but they just aren't getting merged together. Thanks in advance for the help.
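For anyone landing here with the same traceback: the last frame shows `.split(";")` being called on a float, which is what a pandas `apply` passes along when a cell in 'Repr host taxonomy' is missing (NaN is a float). The sketch below reproduces that failure without pandas and shows a defensive variant. Only the `taxo_string.split(";")` line comes from the traceback; the function names, the rest of their bodies, and the "g__" genus prefix are assumptions for illustration, not iPHoP's actual code.

```python
def genus_from_taxonomy(taxo_string):
    """Hypothetical stand-in for iPHoP's transform_into_genus."""
    tab = taxo_string.split(";")  # the line shown in the traceback
    # Assumption: GTDB-style ranks, with the genus prefixed by "g__".
    return next((rank.strip() for rank in tab if rank.strip().startswith("g__")), "")


def safe_genus_from_taxonomy(taxo_string):
    """Same lookup, but return None for missing values instead of crashing."""
    if not isinstance(taxo_string, str):
        return None
    return genus_from_taxonomy(taxo_string)


# A missing taxonomy cell reaches the function as a float NaN.
missing = float("nan")
try:
    genus_from_taxonomy(missing)
except AttributeError as err:
    error_message = str(err)

# The guarded version lets the missing entries be spotted and reported.
guarded_result = safe_genus_from_taxonomy(missing)
```

A guard like this would only hide the symptom; the missing taxonomy itself still has to be found and filled in at its source.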
Comments (3)
- repo owner: Hi James, I don’t remember seeing this error before, and it looks like there may be an issue in at least one of these csv files. Would it be possible to upload (or share via email: sroux at lbl.gov) the “…parsed” files and the “All_scores_iPHoP_by_instance.csv” file? Thanks!
- reporter: Email sent, thank you. BTW, I am using a combined database built with the “add_to_db” module using MAGs from our own data. That step was successful, and the add_to_db module ended without errors.
- reporter (changed status to resolved): Resolved. I was using a custom database with some archaeal MAGs built with GTDBtk 2.1.1, which used 53 archaeal marker genes instead of 122. As a result, the Host_genomes.tsv file in my database's "db_info" directory had NA values for the taxonomy of my MAGs, which led to the error in step 9/2. The solution was to fill in all of these NA values with the taxonomy information given by GTDBtk in "gtdbtk.ar53.decorated.tree-taxonomy" (from the results of the de novo workflow used to build my database), then remove "matrixlabels.csv", all files named "Prediction_Model ... .csv", and "All_scores_iPHoP_by_instance.csv" from the "Wdir" subdirectory of the iPHoP predict output directory, and finally re-run iPHoP predict. Thank you to Simon Roux for coming up with this solution.
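The fix described in the resolving comment (fill in the NA taxonomy entries, then clear the cached files in "Wdir") could be prepared with something like the sketch below. It rests on assumptions: the column layout of Host_genomes.tsv is not shown in this thread, so the check simply flags any row containing an empty or NA field, and the "Prediction_Model ... .csv" files are matched by prefix and extension because the middle of the name is elided above. Both function names are hypothetical helpers, not part of iPHoP.

```python
import csv
from pathlib import Path

# Strings treated as "missing" when scanning the table (an assumption).
NA_TOKENS = {"", "NA", "NaN", "nan"}


def rows_with_missing_fields(tsv_path):
    """Return (line_number, row) for every row that has an empty or NA field."""
    flagged = []
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_number, row in enumerate(reader, start=1):
            if any(field.strip() in NA_TOKENS for field in row):
                flagged.append((line_number, row))
    return flagged


def stale_wdir_files(wdir):
    """List the cached files that the resolving comment says to remove from Wdir."""
    wdir = Path(wdir)
    exact_names = ["matrixlabels.csv", "All_scores_iPHoP_by_instance.csv"]
    stale = [wdir / name for name in exact_names if (wdir / name).is_file()]
    # The full file names are elided in the thread, so match on prefix and extension.
    stale += sorted(p for p in wdir.glob("Prediction_Model*.csv") if p.is_file())
    return stale
```

Neither function changes anything on disk. After reviewing the flagged rows and the listed files, the taxonomy can be filled in from the GTDBtk output and the files deleted before re-running iPHoP predict.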