Issue #386 resolved

KeyError: (<CompressedBytes.Reader>, 1L)

Anonymous created an issue

Hi,

I've been having issues with whoosh in combination with django-haystack. About once a week, the website running haystack and whoosh keeps returning errors and filling our Sentry feeds with the following error:

whoosh.util.cache in wrapper KeyError: (<CompressedBytes.Reader>, 1L)

(The 1L value changes between occurrences; full stack trace here: http://pastebin.com/cMRKhxpR )

I was wondering whether this is related to whoosh itself or to our implementation of it. If you have any idea how to fix this, it would be very helpful.

Comments (19)

  1. Manel Clos

    Hi, because I was having the same KeyError problem at cache.py:95, I tested different versions from 2.4.1 to 2.6.0. If you reload the search page very quickly, Apache starts to spawn processes, and searching in different tabs at the same time will (easily) trigger the error. Here is the testing summary, in testing order:

    • ...
    • 2.5.2 FAIL (2.5.1 index)
    • 2.5.1 fail: harder to reproduce
    • 2.4.1 rebuild, OK
    • 2.5 ok
    • 2.5.1 ok, rebuild, fail
    • 2.5 fail, rebuild, fail
    • 2.4.1 rebuild, ok
    • 2.5 ok

    I was not able to reproduce the problem with 2.4.1, so it looks like the regression was introduced in 2.5. Also, I was not able to reproduce it in 2.5 with the 2.4.1 index, so it may be related to how the index is constructed.

    Can you try version 2.4.1 to see if you can reproduce the problem?

  2. Mattias Fliesberg

    I tried that patch, but it doesn't work for me: I get the error in the lru_cache decorator, whereas the patch fixes random_cache. Everyone seems to be talking about getting it in lru_cache, though, so maybe that's just a mistake on the author's part?

  3. brainstorm

    I just applied the same changes to the function/lines affected and it seems to hold for now. Needs more testing and perhaps the searches become slower (lock strategy does not seem very optimal), but it is working for now :)
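For illustration, a minimal sketch of the kind of lock-guarded cache decorator being discussed. This is not the patch and not Whoosh's code; the name locked_lru_cache and its layout are hypothetical, and it assumes Python 3 and hashable positional arguments. It holds the lock only around cache bookkeeping, so a slow lookup does not block other threads.

```python
import functools
import threading
from collections import OrderedDict


def locked_lru_cache(maxsize=100):
    """Hypothetical bounded LRU cache decorator; not Whoosh's implementation."""
    def decorator(func):
        data = OrderedDict()      # key -> cached result, oldest entry first
        lock = threading.Lock()   # guards every read and write of `data`

        @functools.wraps(func)
        def wrapper(*args):
            with lock:
                if args in data:
                    data.move_to_end(args)   # mark as most recently used
                    return data[args]
            # Compute outside the lock so slow calls do not block other threads.
            result = func(*args)
            with lock:
                data[args] = result
                data.move_to_end(args)
                while len(data) > maxsize:
                    data.popitem(last=False)  # evict the least recently used entry
            return result

        return wrapper
    return decorator
```

Two threads may occasionally compute the same value at the same time with this design; that costs some duplicate work but never deletes a key twice.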

  4. Martin Cech

    Can confirm on 2.5.7. I am unable to reproduce it when I don't override the final method of my own Weighting class, though, so that might have something to do with it. No Django here, only an nginx proxy. Downgrading to 2.4.1 seems to help (but re-introduces some fixed bugs, so I had to build my own egg :( )

  5. aleray

    The fix for me was to configure Apache to be single threaded using the threads option, like this:

        WSGIDaemonProcess example.org python-path=/srv/data02/www/example.org/app:/srv/data02/www/example.org/venv/lib/python2.7/site-packages maximum-requests=1000 threads=1
    

    Full example:

    <VirtualHost *:8080>
        ServerName example.org
    
        <Directory /srv/data02/www/example.org/app/example/>
            <Files wsgi.py>    
                Require local
            </Files>
        </Directory>
    
        WSGIDaemonProcess example.org python-path=/srv/data02/www/example.org/app:/srv/data02/www/example.org/venv/lib/python2.7/site-packages maximum-requests=1000 threads=1
        WSGIProcessGroup example.org
        WSGIScriptAlias / /srv/data02/www/example.org/app/example/wsgi.py
    </VirtualHost>
    
  6. Matt Good

    This is a bad problem. Once this occurs, search is unusable until I restart my application. Yet after more than a year there is still no verified fix. Is Whoosh no longer an active project?

        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/searching.py", line 1510, in __getitem__
          if fieldname in self.fields():
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/searching.py", line 1398, in fields
          self._fields = self.searcher.stored_fields(self.docnum)
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/reading.py", line 1170, in stored_fields
          return self.readers[segmentnum].stored_fields(segmentdoc)
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/reading.py", line 685, in stored_fields
          sfs = self._perdoc.stored_fields(docnum)
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/codec/whoosh3.py", line 485, in stored_fields
          v = reader[docnum]
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/columns.py", line 1212, in __getitem__
          v = self._child[docnum]
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/columns.py", line 807, in __getitem__
          v = VarBytesColumn.Reader.__getitem__(self, docnum)
        File "/base/data/home/apps/s~style--dev/1.389338407435741255/libs.zip/whoosh/util/cache.py", line 95, in wrapper
          del data[k]
        KeyError: (<CompressedBytes.Reader>, 166L)
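One way a del data[k] like the one in the last frame could raise KeyError is if two threads pick the same key to evict before either of them deletes it. The sketch below replays that interleaving deterministically in a single thread. It is an assumption about the failure mode, not a confirmed diagnosis, and pick_victim and the dict layout are hypothetical rather than Whoosh's code.

```python
def pick_victim(data):
    """Choose the key to evict: here, the one with the smallest usage counter."""
    return min(data, key=data.get)


data = {"a": 1, "b": 2, "c": 3}   # key -> last-used counter

# Both "threads" choose a victim before either one deletes it.
victim_thread_1 = pick_victim(data)
victim_thread_2 = pick_victim(data)

del data[victim_thread_1]          # the first delete succeeds
try:
    del data[victim_thread_2]      # the second delete targets the same key
except KeyError as exc:
    second_delete_error = exc      # this is the error seen in the traceback
else:
    second_delete_error = None
```

With real threads the same ordering only happens occasionally, which would fit an error that shows up under concurrent load rather than on every request.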

  7. Matt Chaput repo owner

    Remove caching decorator from VarBytesColumn reader.

    My best guess is that issue #386 involves users sharing readers between threads, which is not supported. That's the only way I can see the cache trying to delete the same thing twice? I think the error in the cache is masking the real problem, that users are sharing readers. I'm going to remove the cache because it probably has a very small performance impact and the error is misleading.

    Fixes issue #386.

    → <<cset ae13d89b8227>>
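If readers shared between threads are indeed the cause, one way to avoid the sharing is to give each thread its own reader. A minimal sketch using threading.local follows; get_reader and open_reader are hypothetical names, and open_reader stands for any zero-argument factory (in a Whoosh application it would presumably wrap something like ix.searcher(), which is an assumption here).

```python
import threading

_local = threading.local()   # each thread sees its own attributes on this object


def get_reader(open_reader):
    """Return the calling thread's own reader, creating it on first use.

    `open_reader` is a hypothetical zero-argument factory supplied by the caller.
    """
    reader = getattr(_local, "reader", None)
    if reader is None:
        reader = open_reader()
        _local.reader = reader
    return reader
```

A reader kept this way will not see later index updates, so a real application would also need to reopen or refresh it from time to time; that part is left out of the sketch.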
