Matt Chaput / whoosh / Issues
Issue #305 resolved

Term calls field.to_text in matcher but not in estimate_size

Anonymous created an issue

I'm seeing an error when running a search: "coercing to Unicode: need string or buffer, int found."

I have a schema with a regular TEXT field and a NUMERIC field. I create a Term instance for each and then join them together with And, something like this:

    t1 = Term(numeric_field, 100)
    t2 = Term(text_field, "something")
    q = And([t1, t2])
    results = searcher.search(q)

I am also specifying some facet fields... I don't know if that's significant.

The error is raised in W2TermsReader.keycoder. During the search, this function is (eventually) called twice: once from Term.matcher and once from Term.estimate_size. When it is called from Term.matcher, it receives the "text" value that the matcher has already run through field.to_text. When it is called from Term.estimate_size, it receives self.text, which is still an int, and this causes keycoder to fail.

If Term accepts non-text values and is willing to convert them, I think it should convert them in all cases.
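
The mismatch described above can be shown in miniature without Whoosh itself. The following is a minimal sketch using hypothetical stand-in classes (ToyNumericField, ToyTermsReader, ToyTerm). Only the names to_text, keycoder, matcher, and estimate_size come from this report; their bodies, the field name "price", and the zero-padded encoding are assumptions made for illustration, not the library's code.

```python
class ToyNumericField:
    """Hypothetical stand-in for a NUMERIC field."""

    def to_text(self, value):
        # Assumed encoding; any string form would do for this sketch.
        return "%012d" % value


class ToyTermsReader:
    """Hypothetical stand-in for the terms reader that owns keycoder."""

    def keycoder(self, fieldname, text):
        # As in the report, this step only accepts strings.
        if not isinstance(text, str):
            raise TypeError("need string, %s found" % type(text).__name__)
        return (fieldname + ":" + text).encode("utf-8")


class ToyTerm:
    def __init__(self, fieldname, text):
        self.fieldname = fieldname
        self.text = text  # may be an int, as in Term(numeric_field, 100)

    def matcher(self, field, reader):
        # Converts first, so keycoder sees a string.
        text = field.to_text(self.text)
        return reader.keycoder(self.fieldname, text)

    def estimate_size(self, reader):
        # Skips the conversion, so keycoder sees the raw int.
        return reader.keycoder(self.fieldname, self.text)


field, reader = ToyNumericField(), ToyTermsReader()
term = ToyTerm("price", 100)

matcher_key = term.matcher(field, reader)  # succeeds
try:
    term.estimate_size(reader)
    estimate_error = None
except TypeError as exc:
    estimate_error = str(exc)  # the unconverted int is rejected
```

The same term succeeds on one code path and fails on the other, which is the inconsistency this issue is about.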

Comments (3)

  1. Willis Blackburn

    I entered this issue.

    It looks like calling text = field.to_text(text) in the Term constructor would work. The only downside is that the encoded value would then be printed when the Term is rendered as a string. So maybe keep the original value around in a separate attribute just for that.

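
The suggestion in the comment above (convert once in the constructor, keep the original value for display) might look roughly like the sketch below. ConvertingTerm and ToyNumericField are hypothetical names, and passing the field object into the constructor is an assumption of this sketch; the report does not say how the constructor would obtain the field.

```python
class ToyNumericField:
    """Hypothetical stand-in for a NUMERIC field."""

    def to_text(self, value):
        # Assumed encoding, for illustration only.
        return "%012d" % value


class ConvertingTerm:
    """Hypothetical Term variant that converts its value exactly once."""

    def __init__(self, fieldname, value, field):
        self.fieldname = fieldname
        self.original = value             # kept only for display
        self.text = field.to_text(value)  # converted form used everywhere else

    def matcher_key(self):
        # Stand-in for the lookup done on the matcher path.
        return (self.fieldname, self.text)

    def estimate_key(self):
        # Stand-in for the lookup done on the estimate_size path.
        return (self.fieldname, self.text)

    def __repr__(self):
        # Show the value the caller passed in, not the encoded form.
        return "%s(%r, %r)" % (
            type(self).__name__, self.fieldname, self.original)


term = ConvertingTerm("price", 100, ToyNumericField())
```

Because both paths read the same converted attribute, they can no longer disagree, and repr(term) still shows the original 100 rather than the encoded string.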