Inconsistency in "match" or my misunderstanding of it?

Issue #83 resolved
Jessica Rowell created an issue

Hi ResFinder team!

I think I don’t understand the term “Match” in pheno_table.txt. My understanding is that Match=2 and Match=3 both mean 100% identity:

# The 'Match' column stores one of the integers 0, 1, 2, 3.
#      0: No match found
#      1: Match < 100% ID AND match length < ref length
#      2: Match = 100% ID AND match length < ref length
#      3: Match = 100% ID AND match length = ref length
# If several hits causing the same resistance are found,
# the highest number will be stored in the 'Match' column.

But in some of my results, I get Match=2 in pheno_table.txt and <100% identity in Resfinder_results_tab.txt. Here is an example, focusing on tetracyclines:

pheno_table.txt results

# Antimicrobial  Class         WGS-predicted phenotype  Match  Genetic background
tetracycline     tetracycline  Resistant                2      tetA(46) (tetA(46)_HQ652506), tetB(46) (tetB(46)_HQ652506), tetA(60) (tetA(60)_KX000272), tet(M) (tet(M)_X75073)
doxycycline      tetracycline  Resistant                2      tetA(46) (tetA(46)_HQ652506), tetB(46) (tetB(46)_HQ652506), tetA(60) (tetA(60)_KX000272), tet(M) (tet(M)_X75073)
minocycline      tetracycline  Resistant                1      tet(M) (tet(M)_X75073)
tigecycline      tetracycline  Resistant                2      tetA(46) (tetA(46)_HQ652506), tetB(46) (tetB(46)_HQ652506), tetA(60) (tetA(60)_KX000272)

All tet genes in my Resfinder_results_tab.txt results

Resistance gene  Identity  Alignment Length/Gene Length  Coverage  Position in reference  Contig  Position in contig  Phenotype                                                         Accession no.
tetA(46)         92.70     1725/1725                     97.33     1..1726                NA      NA..NA              Warning: gene is missing from Notes file. Please inform curator.  HQ652506
tetB(46)         81.35     1637/1737                     86.36     1..1638                NA      NA..NA              Warning: gene is missing from Notes file. Please inform curator.  HQ652506
tetA(60)         88.62     1704/1740                     93.68     1..1705                NA      NA..NA              Warning: gene is missing from Notes file. Please inform curator.  KX000272
tet(M)           92.66     1922/1920                     92.86     1..1923                NA      NA..NA              Tetracycline resistance                                           X75073

(All identities for tet genes are <100%, which is why I think I must be misunderstanding “Match” in pheno_table.txt.)

A related question: for tetA(46) and tetB(46), these have the same accession ID (HQ652506) but different reference gene lengths (1725, 1737 bp, respectively). Why?

And what does the “Warning: gene is missing from Notes file. Please inform curator.” message mean exactly? Is there a form somewhere where I should report these genes when I find them? Does it mean there is less confidence about these genes?

Thanks for your help!

Comments (3)

  1. CGE Helpdesk

    Dear Jessica,
    Thank you very much for your interest in ResFinder.

    To answer your questions:
    1) Match types: I understand the confusion; it was caused by the help text being wrong. Match = 2 means that the identity is below 100% and the match length = ref length. The text is now fixed.

    2) Length of tetA and tetB: an accession number can contain multiple genes, as is the case here. tetA and tetB are two different genes, and therefore they can have different lengths.

    3) The warning: ResFinder contains a notes.txt file which should include all genes in ResFinder, but some are missing. We therefore ask users to report such genes to us when they come across them so that we can fill them in. Their absence does not affect the results.
    If you find additional genes, please send the output to food-cgehelp@dtu.dk and we will ensure they are added.

    I hope that clears everything up. Please do not hesitate to contact us if you have further questions.

    Best regards,
    Maja, CGE Helpdesk
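
Following up on point 3 of the reply above: collecting the flagged genes before emailing the helpdesk could be scripted. This is a minimal Python sketch, not part of ResFinder. It assumes the gene table in Resfinder_results_tab.txt is tab-separated with the header row shown in the question; the function name genes_missing_notes and the SAMPLE data are hypothetical and only restate the rows posted in this thread.

```python
import csv
import io

# Prefix of the warning text quoted in this thread.
WARNING_PREFIX = "Warning: gene is missing from Notes file"


def genes_missing_notes(tab_text):
    """Return (gene, accession) pairs whose Phenotype field carries the warning.

    Assumption: tab_text is a single tab-separated table whose first line is
    the header row shown in the thread.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    flagged = []
    for row in reader:
        phenotype = row.get("Phenotype") or ""
        if phenotype.startswith(WARNING_PREFIX):
            flagged.append((row["Resistance gene"], row["Accession no."]))
    return flagged


# The four tet rows from the question, rebuilt as tab-separated text.
HEADER = ["Resistance gene", "Identity", "Alignment Length/Gene Length",
          "Coverage", "Position in reference", "Contig",
          "Position in contig", "Phenotype", "Accession no."]
NOTE = "Warning: gene is missing from Notes file. Please inform curator."
ROWS = [
    ["tetA(46)", "92.70", "1725/1725", "97.33", "1..1726", "NA", "NA..NA", NOTE, "HQ652506"],
    ["tetB(46)", "81.35", "1637/1737", "86.36", "1..1638", "NA", "NA..NA", NOTE, "HQ652506"],
    ["tetA(60)", "88.62", "1704/1740", "93.68", "1..1705", "NA", "NA..NA", NOTE, "KX000272"],
    ["tet(M)", "92.66", "1922/1920", "92.86", "1..1923", "NA", "NA..NA",
     "Tetracycline resistance", "X75073"],
]
SAMPLE = "\n".join("\t".join(cells) for cells in [HEADER] + ROWS)

flagged = genes_missing_notes(SAMPLE)
for gene, accession in flagged:
    print(gene, accession)
```

On the rows from the question this lists tetA(46), tetB(46) and tetA(60), which are the three genes carrying the warning.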

  2. Jessica Rowell reporter

    Ok, thank you so much! My confusion regarding tetA(46) and tetB(46) was more that they had the same accession ID (HQ652506). But after reading the associated paper, I see that this is a heterodimeric ABC transporter, tetAB(46), where both genes are required for tetracycline resistance. So it makes sense now. Thanks again!

  3. Jessica Rowell reporter

    Help text for defining "Match" in pheno_table.txt was incorrect. ResFinder team has indicated that it has been fixed.
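
To check the corrected reading of "Match" against the example in the question, here is a small Python sketch. It is not ResFinder's code. Level 2 follows the helpdesk's correction (identity below 100%, match length = ref length), and level 3 is taken from the original help text. Treating every other hit as level 1 is an assumption, since the corrected help text is not quoted in this thread. The names Hit, match_level and phenotype_match are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str
    identity: float  # percent identity from Resfinder_results_tab.txt
    aln_len: int     # alignment length
    ref_len: int     # reference gene length


def match_level(hit):
    """Match level for one hit under the assumptions stated above."""
    if hit.aln_len == hit.ref_len:
        # Full-length match: 3 at 100% identity, otherwise 2 (helpdesk's correction).
        return 3 if hit.identity >= 100.0 else 2
    # Assumption: any other hit counts as level 1.
    return 1


def phenotype_match(hits):
    """Highest level among the hits behind one phenotype; 0 when there are none."""
    return max((match_level(h) for h in hits), default=0)


# Values copied from the Resfinder_results_tab.txt rows in the question.
tetA46 = Hit("tetA(46)", 92.70, 1725, 1725)
tetB46 = Hit("tetB(46)", 81.35, 1637, 1737)
tetA60 = Hit("tetA(60)", 88.62, 1704, 1740)
tetM = Hit("tet(M)", 92.66, 1922, 1920)

print("tetracycline", phenotype_match([tetA46, tetB46, tetA60, tetM]))
print("minocycline", phenotype_match([tetM]))
print("tigecycline", phenotype_match([tetA46, tetB46, tetA60]))
```

Under these assumptions only tetA(46) reaches level 2, so tetracycline and tigecycline come out as 2 and minocycline as 1, in line with the pheno_table.txt values in the question.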
