Pull requests

#50 Open
Source: sradack/whoosh-sr, branch default
Destination: mchaput/whoosh, branch default

fix LRU thread safety

Bitbucket cannot automatically merge this request due to conflicts.

Review the conflicts on the Overview tab. You can then either decline the request or merge it manually on your local system using the following commands:

hg update default
hg pull -r default https://bitbucket.org/sradack/whoosh-sr
hg merge db80f0945273
hg commit -m 'Merged in sradack/whoosh-sr (pull request #50)'
Author
  1. Steven Radack
Reviewers
Description

I was getting race conditions until I made this patch. Let me know what you think.

Comments (8)

  1. Matt Chaput repo owner

    Are you using this decorator in your own code, or are you getting race conditions using Whoosh? You're not supposed to share readers/searchers between threads, so there shouldn't ever be race conditions in Whoosh... if there are it's a really weird bug :)

    1. Steven Radack author

      I'm not explicitly sharing any readers. I'm working on replicating the issue in a small script. For now I've put a global lock around index access, which has resolved my problem.

      FYI here are the exceptions I'm getting:

      2014-02-04 17:44:05: (mod_fastcgi.c.2676) FastCGI-stderr: Traceback (most recent call last):
        File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 239, in process
          return self.handle()
        File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 230, in handle
          return self._delegate(fn, self.fvars, args)
        File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 420, in _delegate
          return handle_class(cls)
        File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 396, in handle_class
          return tocall(*args)
        File "/home/sradack/jura/web_interface.py", line 48, in GET
          return json.dumps(model.findProperties(**input))
        File "/home/sradack/jura/model.py", line 27, in findProperties
          return [r.fields() for r in results]
        File "/usr/local/lib/python2.7/dist-packages/whoosh/searching.py", line 1389, in fields
          self._fields = self.searcher.stored_fields(self.docnum)
        File "/usr/local/lib/python2.7/dist-packages/whoosh/reading.py", line 1225, in stored_fields
          return self.readers[segmentnum].stored_fields(segmentdoc)
        File "/usr/local/lib/python2.7/dist-packages/whoosh/reading.py", line 712, in stored_fields
          sfs = self._perdoc.stored_fields(docnum)
        File "/usr/local/lib/python2.7/dist-packages/whoosh/codec/whoosh3.py", line 489, in stored_fields
          v = reader[docnum]
        File "/usr/local/lib/python2.7/dist-packages/whoosh/columns.py", line 1207, in __getitem__
          v = self._child[docnum]
        File "/usr/local/lib/python2.7/dist-packages/whoosh/columns.py", line 807, in __getitem__
          v = VarBytesColumn.Reader.__getitem__(self, docnum)
        File "/usr/local/lib/python2.7/dist-packages/whoosh/util/cache.py", line 95, in wrapper
          del data[k]
      KeyError: (<CompressedBytes.Reader>, 4028L)

  2. pleong

    I also have the same error while running Whoosh 2.6.0 with Apache and fcgi.

    I reindexed and it seemed to temporarily fix the issue but the problem seems to come back.

    [16/May/2014 16:42:33] ERROR Internal Server Error: 
    Traceback (most recent call last):
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/django/core/handlers/base.py", line 111, in get_response
        response = callback(request, *callback_args, **callback_kwargs)
      File "/vol/digipal2/webroot/liv/django/digipal-django/digipal/views/search.py", line 165, in search_record_view
        set_search_results_to_context(request, context=context, show_advanced_search_form=True)
      File "/vol/digipal2/webroot/liv/django/digipal-django/digipal/views/search.py", line 292, in set_search_results_to_context
        context['results'] = type.build_queryset(request, term, not has_result)
      File "/vol/digipal2/webroot/liv/django/digipal-django/digipal/views/content_type/search_content_type.py", line 636, in build_queryset
        ret = self._build_queryset(request, term)
      File "/vol/digipal2/webroot/liv/django/digipal-django/digipal/views/content_type/search_content_type.py", line 738, in _build_queryset
        whoosh_dict[int(hit['id'])] = hit
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/searching.py", line 1501, in __getitem__
        if fieldname in self.fields():
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/searching.py", line 1389, in fields
        self._fields = self.searcher.stored_fields(self.docnum)
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/reading.py", line 712, in stored_fields
        sfs = self._perdoc.stored_fields(docnum)
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/codec/whoosh3.py", line 489, in stored_fields
        v = reader[docnum]
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/columns.py", line 1207, in __getitem__
        v = self._child[docnum]
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/columns.py", line 807, in __getitem__
        v = VarBytesColumn.Reader.__getitem__(self, docnum)
      File "/vol/digipal2/webroot/liv/django/envs/digipal-liv/lib/python2.6/site-packages/whoosh/util/cache.py", line 95, in wrapper
        del data[k]
    KeyError: (<CompressedBytes.Reader>, 1601L)
    
  3. Jérôme Thiard

    I have the exact same problem on a production site, using

    • django==1.5.8
    • django-haystack==2.2.0
    • Whoosh==2.6.0

    The patch does not solve the issue because it fixes the wrong decorator (random_cache instead of lru_cache). I applied the same fix to the lru_cache decorator (through monkeypatching), and it works fine (but I don't know whether the locking strategy will hold up under heavy load).

    Matt Chaput said

    You're not supposed to share readers/searchers between threads, so there shouldn't ever be race conditions in Whoosh... if there are it's a really weird bug :)

    I don't know if it's a weird bug, but even if <VarBytes.Reader> instances are not shared between threads, the lru_cache.cache variable is shared between many <VarBytes.Reader> instances... and so between many threads.
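
    To illustrate the point about shared state, here is a minimal sketch of a decorator-based LRU cache guarded by a lock. It is a simplified stand-in, not Whoosh's `lru_cache` and not the patch in this pull request; `locked_lru_cache` and the toy `Reader` class are made up for the example. Because the decorator runs once at class-definition time, every instance (and therefore every thread) goes through the same `data` dict, so an unguarded eviction could let two threads try to delete the same key.

```python
import functools
import threading


def locked_lru_cache(maxsize=100):
    """Simplified LRU cache decorator guarded by a lock (illustrative only)."""
    def decorating_function(user_function):
        data = {}       # shared by every caller of the decorated function
        lastused = {}   # key -> logical clock value of the last access
        clock = [0]
        lock = threading.Lock()

        @functools.wraps(user_function)
        def wrapper(*args):
            with lock:
                clock[0] += 1
                if args in data:
                    lastused[args] = clock[0]
                    return data[args]
            # Compute outside the lock so slow calls don't block other threads.
            result = user_function(*args)
            with lock:
                clock[0] += 1
                if args not in data and len(data) >= maxsize:
                    # Lookup and delete happen under the same lock, so no
                    # other thread can remove this key in between.
                    oldest = min(lastused, key=lastused.get)
                    del data[oldest]
                    del lastused[oldest]
                data[args] = result
                lastused[args] = clock[0]
            return result

        wrapper.cache = data
        return wrapper
    return decorating_function


class Reader(object):
    @locked_lru_cache(maxsize=2)
    def __getitem__(self, docnum):
        return docnum * 2


a, b = Reader(), Reader()
a[1]
b[1]
# Both instances populated the same dict: keys are (instance, docnum) tuples.
shared_keys = len(Reader.__getitem__.cache)
```

    A single lock per decorated function is the simplest choice; whether it holds up under heavy load would need measuring, as noted above.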