ZeroDivisionError: float division by zero in key_terms_from_text

Issue #256 open
Anonymous created an issue

I got the following error while running key_terms_from_text:

{{{
  File "/usr/lib/python2.7/dist-packages/whoosh/", line 471, in key_terms_from_text
    return expander.expanded_terms(numterms, normalize=normalize)
  File "/usr/lib/python2.7/dist-packages/whoosh/", line 177, in expanded_terms
    norm = model.normalizer(maxweight, self.top_total)
  File "/usr/lib/python2.7/dist-packages/whoosh/", line 58, in normalizer
    return (maxweight * log((1.0 + f) / f) + log(1.0 + f)) / log(2.0)
ZeroDivisionError: float division by zero
}}}

Unfortunately I cannot reproduce the issue: the index has changed in the meantime, and the same string no longer triggers the bug.

Comments (4)

  1. Thomas Waldmann
    • changed status to open

    looks like in 177 it is calling normalizer with maxweight == 0 (it then computes f = 0 and divides by f, boom).

    that can happen if no score > 0 is found in 172.

    note: "maxweight" looks rather like "maxscore" (see 173).

    not sure about it, but to me it looks like lines 176..180 are pointless/harmful in the case of maxweight == 0 (btw, there is another division by zero issue lurking in 180) - so just put an "if maxweight > 0:" before that? the sorting is still needed to sort by x[1].

    fix should be done in 2.4x branch (IMHO).

    matt, what do you think?
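
The failure and the proposed guard can be sketched in isolation. This is a minimal illustration, not Whoosh's code: `bo1_style_normalizer` copies the expression from the traceback and assumes `f = maxweight / N`, as the comment above implies, and `rank_terms` is a hypothetical stand-in for lines 172..180.

```python
from math import log


def bo1_style_normalizer(maxweight, n):
    # Same expression as in the traceback. f = maxweight / n is an
    # assumption taken from the comment above ("it then computes f = 0").
    f = maxweight / float(n)
    return (maxweight * log((1.0 + f) / f) + log(1.0 + f)) / log(2.0)


def rank_terms(scored_terms, n):
    """Hypothetical stand-in for lines 172..180: normalize only when safe."""
    maxweight = max([score for _, score in scored_terms] or [0.0])
    if maxweight > 0:
        # With maxweight > 0 and n > 0 the normalizer is strictly positive,
        # so the division below cannot hit zero either.
        norm = bo1_style_normalizer(maxweight, n)
        scored_terms = [(term, score / norm) for term, score in scored_terms]
    # Sorting by x[1] is still needed whether or not we normalized.
    return sorted(scored_terms, key=lambda x: x[1], reverse=True)
```

Calling `bo1_style_normalizer(0.0, 10)` raises the same `ZeroDivisionError` as in the report, while `rank_terms` with all-zero scores skips normalization and returns the terms unscaled. The guard does not cover `n == 0`, which would still fail.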

  2. Thomas Waldmann

    all potential and real division by zero issues in the normalizers:

    maxweight == 0 crashes Bo1Model.normalizer and Bo2Model.normalizer (HAPPENS!)

    top_total == 0 crashes KLModel.normalizer

    self.N == 0 crashes Bo1Model.normalizer and Bo2Model.normalizer

    self.collection_total == 0 crashes Bo2Model.normalizer

    there are also quite a few such (potential) issues in the scorers.
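
One way to cover all of the zero cases listed above at the call site, rather than patching each model, is a small wrapper. This is only a sketch under stated assumptions: `safe_norm` and `make_bo1_style` are hypothetical names, and only the Bo1-style expression from the traceback is reproduced (with `f = maxweight / N` assumed); the KL and Bo2 formulas are not shown here.

```python
from math import log


def safe_norm(normalizer, maxweight, top_total):
    """Hypothetical wrapper: return None when normalization is not possible."""
    try:
        norm = normalizer(maxweight, top_total)
    except (ZeroDivisionError, ValueError):
        # ZeroDivisionError: a zero denominator; ValueError: log() of zero.
        return None
    # A zero normalizer would only move the division by zero to the caller.
    return norm if norm else None


def make_bo1_style(n):
    """Build a stand-in normalizer bound to a document count n (assumed N)."""
    def normalizer(maxweight, top_total):
        f = maxweight / float(n)
        return (maxweight * log((1.0 + f) / f) + log(1.0 + f)) / log(2.0)
    return normalizer
```

A caller would then divide the scores by the result only when it is not `None`. Catching the exception is the broader but blunter option; an explicit `if maxweight > 0:` check makes the intent clearer for the one case that was actually observed.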
