Issue #254 resolved

Query excluding certain fields seems to resolve incorrectly.

melinath created an issue

I have a test [1] that passes fine on haystack's elasticsearch backend, but fails on whoosh. Looking through the code, the problem seems to manifest itself when iterating over search results. Specifically:

{{{
-> results = SmartSearchQuerySet().auto_query(query)
(Pdb) n
-> results = dict((unicode(r.pk), r) for r in results)
(Pdb) results.query.build_query()
u'NOT (feed:(1))'
(Pdb) len(results)
26
(Pdb) len([r for r in results])
20
}}}

Diving into the code a bit, it looks (to me) like the problem goes straight into whoosh. Specifically, {{{ searcher.search(parsed_query, limit=30) }}} always returns the top 20 results rather than the top 30. I am hesitant to dig any deeper, since I don't really understand how exactly whoosh works.
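
To make the two numbers in the pdb session concrete, here is a toy model of a result object whose len() reports every matching document while iteration only covers a limited page of hits. The ToyResults class and its limit of 20 are assumptions made up for illustration; this is not Whoosh's or Haystack's implementation, and it is not a diagnosis of where the 20 comes from.

```python
class ToyResults:
    """Toy stand-in for a search result object (hypothetical, not Whoosh's class)."""

    def __init__(self, matching_ids, limit):
        self._matching_ids = list(matching_ids)
        # Only the first `limit` hits are kept for iteration.
        self._top = self._matching_ids[:limit]

    def __len__(self):
        # Total number of documents that matched the query.
        return len(self._matching_ids)

    def scored_length(self):
        # Number of hits actually held in this page of results.
        return len(self._top)

    def __iter__(self):
        return iter(self._top)


# 26 matching documents, but only a page of 20 is available to iterate over.
results = ToyResults(matching_ids=range(26), limit=20)
total = len(results)
iterated = len([r for r in results])
print(total, iterated)
```

If the real result object behaves like this, a total of 26 next to 20 iterated hits would point at the limit that was actually applied rather than at the query itself.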

[1] https://github.com/pculture/mirocommunity/blob/develop/localtv/tests/unit/search/query.py#L222

Comments (7)

  1. Matt Chaput repo owner
    • changed status to open

    I can't reproduce this with a simple test case:

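    # Imports this snippet assumes (Whoosh 2.x on Python 2, nose for assertions)
    from nose.tools import assert_equal
    from whoosh import fields, qparser
    from whoosh.compat import u, xrange
    from whoosh.filedb.filestore import RamStorage

    def test_not_feed_1():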
        schema = fields.Schema(id=fields.ID(stored=True), feed=fields.NUMERIC)
        ix = RamStorage().create_index(schema)
        with ix.writer() as w:
            # Make 40 documents, with 26 of feed != 1
            for i in xrange(40):
                w.add_document(id=u(str(i)), feed=(0 if i < 26 else 1))
        with ix.searcher() as s:
            qp = qparser.QueryParser("id", schema)
            # Find documents where feed != 1
            q = qp.parse("NOT (feed:(1))")
            r = s.search(q, limit=30)
            assert_equal(len(r), 26)  # Total number of matched documents
            assert_equal(r.scored_length(), 26)  # Number of docs in the results

    The test you linked to makes me wonder... I don't know what your code or Haystack does with an empty query string.

    I'll see if I can set up your dev environment so I can run your tests myself. I haven't had much luck in the past building other people's Django projects...

  2. Matt Chaput repo owner

    Are the dependencies for mirocommunity listed somewhere? They're not in setup.py. I can't run the tests because I don't have the required packages. I figured out "mptt" and "compressor" but gave up after that.

  3. melinath reporter

    Yeah, we should do a better job of putting the requirements in setup.py. Here are the current installation instructions: http://readthedocs.org/docs/mirocommunity/en/latest/installation.html

    It looks like the test case moved a bit since I posted it. Here's where it should have pointed: https://github.com/pculture/mirocommunity/blob/a86cffde287cc0ac0d601ff6ee4731735cf342e9/localtv/tests/unit/search/query.py#L222

    Line 236 is what was failing (self.assertQueryResults('-feed:blender', expected)). Essentially, what's going on there is this: the Feed model instance with the name "blender" gets fetched, then a search is done which excludes everything with that instance's pk in the 'feed' field. Hence, NOT (feed:(1)).
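
    The translation from '-feed:blender' to NOT (feed:(1)) can be sketched roughly as follows. The FEED_PKS table and build_exclusion_query helper are hypothetical stand-ins for the Feed model lookup and the query building; the real code goes through Django and Haystack, so treat this only as an outline of the idea.

```python
# Hypothetical name -> pk table standing in for the Feed model lookup.
FEED_PKS = {"blender": 1}


def build_exclusion_query(term, feed_pks=FEED_PKS):
    """Turn a term like '-feed:blender' into a backend query string."""
    prefix = "-feed:"
    if not term.startswith(prefix):
        raise ValueError("expected a term of the form '-feed:<name>'")
    name = term[len(prefix):]
    pk = feed_pks[name]  # raises KeyError if the feed name is unknown
    # Exclude every document whose 'feed' field holds this pk.
    return "NOT (feed:(%d))" % pk


query = build_exclusion_query("-feed:blender")
print(query)
```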

    I'll see how the test case you posted behaves for me.
