Possible error with taxonomy assignment?

Issue #7 new

Hi - This may not be a bug so much as a misinterpretation on my end, so I apologize if I am missing something simple.

Either way, I am running the command-line version of metannotate. I can run the test data set with no errors as instructed in the README file, and all the output files are created without problems. When I try the program on my own data with a different HMM, no taxonomy assignment analyses seem to run. This is the command I am running:

sudo python run_metannotate.py --orf_files=/media/4TB_drive1/Cladophora_Metannotate/Calumet20B_all.out.out.faa,/media/4TB_drive1/Cladophora_Metannotate/JeorsePark22B_all.out.out.faa --hmm_files=data/hmms/TIGR01287.HMM --reference_database=data/Refseq.fa --output_dir=Cladophora_test --tmp_dir=test_tmp --run_mode=both --orfs_hmm_evalue=0.01 --refseq_hmm_evalue=0.01

This is the output:
Running commands: hmmstat data/hmms/TIGR01287.HMM
Running hmmsearch on provided sequences.
Running commands: hmmsearch -o /dev/null -A /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpZJNLh9 --domtblout /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpiTH6EV --domE 0.01 --cpu 6 data/hmms/TIGR01287.HMM /media/4TB_drive1/Cladophora_Metannotate/Calumet20B_all.out.out.faa
Running commands: esl-reformat -o /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmp2R_5e1 fasta /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpZJNLh9
Running commands: hmmsearch -o /dev/null -A /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpYMAp94 --domtblout /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmp_9b6rQ --domE 0.01 --cpu 6 data/hmms/TIGR01287.HMM /media/4TB_drive1/Cladophora_Metannotate/JeorsePark22B_all.out.out.faa
Running commands: esl-reformat -o /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpaLVs54 fasta /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpYMAp94
Running hmmsearch on Reference database.
Running commands: hmmsearch -o /dev/null -A /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpZZQwtt --domE 0.01 --domtblout cache/60189aab19612de37ecc2709d76ad4cb749037d221324b25.domtblout --cpu 6 data/hmms/TIGR01287.HMM data/Refseq.fa
Running commands: esl-reformat -o cache/60189aab19612de37ecc2709d76ad4cb749037d221324b25.converted.msa fasta /media/4TB_drive1/doxeylab-metannotateinstaller-0207d4a79dad/metannotate/test_tmp/tmpZZQwtt
Job ran successfully. The following files are now available:


The program does not run usearch, FastTree, pplacer, or any of the other taxonomic assignment analyses on my data. If I rerun the analysis on my data with RPOB.HMM, as in the README, it works fine and all the output files are generated.

So, does this mean that of the sequences matching TIGR01287.HMM in my data, none can be assigned taxonomy, or is there some kind of error here?

Thanks for your time and help -

Comments (3)

  1. Andrew Doxey repo owner

    Yes, I am guessing that there were no hits. What are the contents of the
    normalized_counts_sCmrJs397902270.csv file?
    To verify that no hits were detected, you could try running hmmsearch
    directly against your fasta file.

  2. aaunins reporter

    Thanks for the reply. In the csv file of raw counts for my own dataset, there are 992, 1002, and 751 hits from hmmsearch of my data against the TIGR01287 HMM. I can reproduce these numbers by running hmmsearch independently of the run_metannotate.py script. Since there were many hits, I am guessing the bug is then related to the format of the Refseq database.

    I'm looking forward to the fix, and I appreciate your making the metannotate tool available - the results I generate will be very useful for my analyses.

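For anyone who wants to repeat the check suggested in the first comment (running hmmsearch directly and counting the hits), here is a minimal Python sketch. It is not part of metannotate. The hmmsearch flags are the same ones that appear in the log above (`-o`, `--domtblout`, `--domE`); the function names and the `hits.domtblout` file name are made up for illustration, and the parser assumes the standard HMMER3 domain table layout, where comment lines start with `#` and the first whitespace-separated column is the target sequence name.

```python
import subprocess


def build_hmmsearch_command(hmm_path, fasta_path, domtblout_path, dom_evalue=0.01):
    """Build an hmmsearch call using the same flags shown in the log above."""
    return [
        "hmmsearch",
        "-o", "/dev/null",               # discard the human-readable report
        "--domtblout", domtblout_path,   # per-domain tabular output
        "--domE", str(dom_evalue),       # per-domain E-value reporting threshold
        hmm_path,
        fasta_path,
    ]


def count_domtblout_hits(lines):
    """Return (number of distinct target sequences, number of domain rows)."""
    targets = set()
    domain_rows = 0
    for line in lines:
        # Skip blank lines and the '#' header/footer comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        targets.add(fields[0])  # column 1 is the target sequence name
        domain_rows += 1
    return len(targets), domain_rows


def run_and_count(hmm_path, fasta_path, domtblout_path="hits.domtblout"):
    """Run hmmsearch (must be on PATH) and count what it reported."""
    command = build_hmmsearch_command(hmm_path, fasta_path, domtblout_path)
    subprocess.run(command, check=True)
    with open(domtblout_path) as handle:
        return count_domtblout_hits(handle)
```

Calling `run_and_count("data/hmms/TIGR01287.HMM", "/media/4TB_drive1/Cladophora_Metannotate/Calumet20B_all.out.out.faa")` should give counts that can be compared against the csv files. Pointing `count_domtblout_hits` at the cached Refseq `.domtblout` file named in the log would show whether the reference search itself returned any hits.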