Every parsed boolean query that is not "true" is treated as False

Issue #336 resolved
cdar
created an issue
#!/usr/local/bin/python
# -*- coding: utf-8 -*-

from whoosh.fields import Schema, TEXT, BOOLEAN
from whoosh.filedb.filestore import RamStorage
from whoosh.qparser import MultifieldParser

schema = Schema(name=TEXT(stored=True), bool=BOOLEAN(stored=True))
storage = RamStorage()
ix = storage.create_index(schema)

writer = ix.writer()

writer.add_document(name=u'audi', bool=True)
writer.add_document(name=u'vw', bool=False)
writer.add_document(name=u'porsche', bool=False)
writer.add_document(name=u'ferrari', bool=True)
writer.add_document(name=u'citroen', bool=False)

writer.commit()

pq = MultifieldParser(['name', 'bool'], schema=schema).parse('query')
searcher = ix.searcher()
results = searcher.search(pq, terms=True)

for r in results:
    print("%s %s" % (r['name'], r.matched_terms()))

The output is:

vw [('bool', 'f')]
porsche [('bool', 'f')]
citroen [('bool', 'f')]

I think it should return 0 results.

Comments (3)

  1. Matt Chaput repo owner

    This is a bug, but the correct behavior is actually to return "audi" and "ferrari". Aside from special-cased strings such as "true", "false", "yes", "no", etc., the boolean field uses bool(queryobj) to determine which boolean value to search for, for consistency with Python. The non-empty string 'query' is truthy, so it should match the documents where bool is True.

    I think using a boolean field as one of the alternatives in a MultifieldParser is generally a bad idea; it leads to unintuitive results like this.
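
The intended rule described in the comment above can be sketched in plain Python. This is only an illustration, not Whoosh's actual code: the function names `to_boolean` and `matching`, the two string sets, and the `cars` dict are all hypothetical, and the real list of special-cased strings is longer than the four the comment names.

```python
# Hypothetical sketch of the rule described in the comment; not Whoosh's code.
# Only the four strings named in the comment are special-cased here.
TRUE_STRINGS = frozenset(["true", "yes"])
FALSE_STRINGS = frozenset(["false", "no"])


def to_boolean(queryobj):
    """Return the boolean value a query object should search for."""
    if isinstance(queryobj, str):
        lowered = queryobj.strip().lower()
        if lowered in TRUE_STRINGS:
            return True
        if lowered in FALSE_STRINGS:
            return False
    # Everything else falls back to ordinary Python truthiness.
    return bool(queryobj)


# Stand-in for the five indexed documents from the report.
cars = {
    "audi": True,
    "vw": False,
    "porsche": False,
    "ferrari": True,
    "citroen": False,
}


def matching(queryobj):
    """Names of the documents whose flag equals the parsed query value."""
    wanted = to_boolean(queryobj)
    return sorted(name for name, flag in cars.items() if flag is wanted)


print(matching("query"))
print(matching("no"))
```

Under this rule, 'query' is a non-empty string and therefore truthy, so it selects the documents stored with True, while "no" is special-cased and selects the documents stored with False.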
