Predict step [9/2]: AttributeError: 'float' object has no attribute 'split'

Issue #9 resolved
James Kosmopoulos created an issue

Thank you for being responsive to issues on this new and useful tool. I've run into an error at step 9 of the predict workflow, "[9/2] Combining all results (Blast, CRISPR, iPHoP, and RaFAH) in a single file...". All of the preceding steps seem to have worked fine, and I'm not sure what the issue is. This is the error I get at step 9/2:

Traceback (most recent call last):
  File "/storage1/rstudent/miniconda3/envs/iphop/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/iphop.py", line 121, in cli
    args["func"](args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/master_predict.py", line 102, in main
    runaggregatormodel.run_model(args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 65, in run_model
    merged = merge_all_results(args)
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 232, in merge_all_results
    merged['Repr host genus'] = merged['Repr host taxonomy'].apply(lambda x: transform_into_genus(x))
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 232, in <lambda>
    merged['Repr host genus'] = merged['Repr host taxonomy'].apply(lambda x: transform_into_genus(x))
  File "/storage1/rstudent/miniconda3/envs/iphop/lib/python3.8/site-packages/iphop/modules/runaggregatormodel.py", line 161, in transform_into_genus
    tab = taxo_string.split(";")
AttributeError: 'float' object has no attribute 'split'

I looked at the intermediate outputs; there are results in "blastparsed.tsv", "crisprparsed.tsv", "phpparsed.csv", "rafahparsed.csv", "vhmparsed.csv", "wishparsed.csv", and "All_scores_iPHoP_by_instance.csv". It seems like results are available for every individual tool (unless I missed something), but they just aren't getting merged together. Thanks in advance for the help.
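
For reference, the failure can be reproduced with a minimal pandas sketch (the helper below is a simplified stand-in for iPHoP's transform_into_genus, and the taxonomy strings are invented): when a taxonomy cell is missing, pandas stores it as a float NaN, which has no .split() method.

import pandas as pd
import numpy as np

# Minimal reproduction: a missing value in the taxonomy column is a float NaN,
# so any string method called on it inside apply() raises AttributeError.
merged = pd.DataFrame({
    "Repr host taxonomy": [
        "d__Archaea;p__Methanobacteriota;g__Methanobrevibacter",
        np.nan,  # e.g. a host genome whose taxonomy was NA in Host_genomes.tsv
    ],
})

def transform_into_genus(taxo_string):
    # Simplified stand-in for iPHoP's helper: keep the last ';'-separated field.
    tab = taxo_string.split(";")  # AttributeError: 'float' object has no attribute 'split'
    return tab[-1]

merged["Repr host genus"] = merged["Repr host taxonomy"].apply(transform_into_genus)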

Comments (3)

  1. Simon Roux repo owner

    Hi James,

    I don’t remember seeing this error before, and it looks like there may be some issue in at least one of these csv files. Would it be possible to upload (or share via email: sroux at lbl.gov) “…parsed” files and the “All_scores_iPHoP_by_instance.csv” file ? Thanks !

  2. James Kosmopoulos reporter

    Email sent, thank you. By the way, I am using a combined database built with the "add_to_db" module, using MAGs from our own data. That step was successful, and the add_to_db module finished without errors.

  3. James Kosmopoulos reporter

    Resolved. I was using a custom database with some archaeal MAGs built with GTDBtk 2.1.1, which uses 53 archaeal marker genes instead of 122. As a result, the Host_genomes.tsv file in my database's "db_info" directory had NA's for the taxonomy of my MAGs, which led to the error at step 9/2. The solution was to: (1) fill all of these NA's with the taxonomy given by GTDBtk in "gtdbtk.ar53.decorated.tree-taxonomy" from the results of the de novo workflow used to build my database; (2) remove "matrixlabels.csv", all files named "Prediction_Model ... .csv", and "All_scores_iPHoP_by_instance.csv" from the "Wdir" subdirectory of the iPHoP predict output directory; and (3) re-run iPHoP predict. Thank you to Simon Roux for coming up with this solution.
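
    For anyone hitting the same issue, here is a rough sketch of that NA-filling step (the paths and column names below are assumptions, and the GTDB-Tk taxonomy file is assumed to be two tab-separated columns, genome name and taxonomy string; adjust to your own database layout):

    import pandas as pd

    # Assumed paths and column names; check the actual header of db_info/Host_genomes.tsv.
    hosts = pd.read_csv("db_info/Host_genomes.tsv", sep="\t")
    gtdb = pd.read_csv("gtdbtk.ar53.decorated.tree-taxonomy", sep="\t",
                       header=None, names=["genome", "taxonomy"])

    # Map genome name -> GTDB taxonomy string, then fill only the missing entries.
    taxo_map = dict(zip(gtdb["genome"], gtdb["taxonomy"]))
    missing = hosts["Host taxonomy"].isna()
    hosts.loc[missing, "Host taxonomy"] = hosts.loc[missing, "Host genome"].map(taxo_map)

    hosts.to_csv("db_info/Host_genomes.tsv", sep="\t", index=False)

    Removing the cached files listed above from "Wdir" before re-running then lets the final aggregation step be recomputed with the corrected taxonomy.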
