Issue #368 resolved

race condition removing temp directory in filestore

Daniel Black
created an issue

Calling writing.AsyncWriter(ix, writerargs={'procs': 2}) from 23 different threads at the same time to delete items produced the following backtrace in one of them.

  File "/home/dan/software_projects/infinite/app/ckeditor_cms/search.py", line 172, in _delete
    wr.delete_by_term('path', name)
  File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/writing.py", line 197, in __exit__
    self.commit()
  File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/writing.py", line 1037, in commit
    self.writer.commit(*args, **kwargs)
  File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/writing.py", line 935, in commit
    self._finish()
  File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/writing.py", line 886, in _finish
    self._tempstorage.destroy()
  File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/filedb/filestore.py", line 460, in destroy
    os.rmdir(self.folder)
OSError: [Errno 2] No such file or directory: 'ckeditor_cms/index/MAIN.tmp'

My suspicions are: in SegmentWriter._finish, the self._tempstorage.destroy() call should happen before the writelock is released

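and/or FileStore.destroy should catch an OSError on rmdir and ignore it when the directory is already gone, like:

    # needs "import errno" at the top of the module
    try:
        os.rmdir(self.folder)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # not found (Errno 2): someone else already removed it
            pass
        else:
            raise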

and/or mktemp should be used to create an unpredictable temp name in place of the fixed MAIN.tmp
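
The second suggestion above can be sketched as a standalone helper. This is only an illustration, not Whoosh's actual code: `remove_dir_if_present` is a hypothetical name, and it checks `errno.ENOENT`, the "No such file or directory" code (Errno 2) shown in the backtrace.

```python
import errno
import os
import tempfile


def remove_dir_if_present(folder):
    """Remove an empty directory; return False if it was already gone.

    Hypothetical helper for illustration, not part of Whoosh.
    """
    try:
        os.rmdir(folder)
    except OSError as e:
        # ENOENT (errno 2): another thread or process removed it first.
        if e.errno != errno.ENOENT:
            raise  # any other failure is still a real error
        return False
    return True


# Usage: the second call finds nothing to remove but does not raise.
folder = tempfile.mkdtemp(suffix=".tmp")
first = remove_dir_if_present(folder)
second = remove_dir_if_present(folder)
```

Swallowing only ENOENT keeps other failures (permissions, a non-empty directory) visible, but it hides the symptom rather than removing the race itself.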

Comments (5)

  1. Daniel Black reporter

    As an indication that the race condition can hit other parts of the storage code:

      File "/home/dan/software_projects/infinite/app/ckeditor_cms/search.py", line 170, in _delete
        ix = index.open_dir(self.indexpath)
      File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/index.py", line 123, in open_dir
        return FileIndex(storage, schema=schema, indexname=indexname)
      File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/index.py", line 421, in __init__
        TOC.read(self.storage, self.indexname, schema=self._schema)
      File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/index.py", line 623, in read
        stream = storage.open_file(tocfilename)
      File "/home/dan/software_projects/infinite/env/lib/python2.7/site-packages/whoosh/filedb/filestore.py", line 497, in open_file
        f = StructFile(open(self._fpath(name), "rb"), name=name, **kwargs)
    IOError: [Errno 2] No such file or directory: '/home/dan/software_projects/infinite/app/ckeditor_cms/index/_MAIN_39.toc'
    
  2. Daniel Black reporter

    Reordering the self._tempstorage.destroy() in SegmentWriter._finish seems to have fixed it for me.

    class SegmentWriter:
    ...
    ...
        def _finish(self):
            # destroy the temp storage first, while the write lock is still held
            self._tempstorage.destroy()
            if self.writelock:
                self.writelock.release()
            self.is_closed = True
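
Why the order matters can be shown with a toy model. This is a sketch under assumptions, not Whoosh code: `ToyWriter` is a hypothetical stand-in that shares one predictable directory name between threads, the way MAIN.tmp is shared in the backtrace.

```python
import os
import tempfile
import threading


class ToyWriter(object):
    """Hypothetical stand-in for a writer; not Whoosh's SegmentWriter."""

    def __init__(self, lock, folder):
        self.lock = lock
        self.folder = folder
        self.is_closed = False

    def run(self):
        self.lock.acquire()
        try:
            os.mkdir(self.folder)  # shared, predictable name
        except OSError:
            self.lock.release()
            raise
        self._finish()

    def _finish(self):
        try:
            # Remove the shared directory while the lock is still held...
            os.rmdir(self.folder)
        finally:
            # ...and only then let the next writer in.
            self.lock.release()
        self.is_closed = True


base = tempfile.mkdtemp()
shared = os.path.join(base, "MAIN.tmp")
lock = threading.Lock()
errors = []


def worker():
    try:
        ToyWriter(lock, shared).run()
    except OSError as e:
        errors.append(e)


threads = [threading.Thread(target=worker) for _ in range(23)]
for t in threads:
    t.start()
for t in threads:
    t.join()
os.rmdir(base)
```

Because the directory is both created and removed inside the locked region, no thread can see another thread's half-removed directory. If the lock were released first, a second writer could create or delete the same path in the gap.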
    
  3. pombredanne

    I think this is not resolved. I am still getting the error with 2.5.7 where the fix is applied.

    [Errno 2] No such file or directory: '/tmp/MAIN.tmp'

    The thing is that I have many independent processes creating temporary indices. '/tmp/MAIN.tmp' is sometimes deleted while other processes are still using it or are trying to delete it themselves. Using tempfile and a real temp directory would be much safer.
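
A minimal sketch of the tempfile approach suggested here, using the standard library's `tempfile.mkdtemp`. The helper name `make_private_temp_dir` and the `MAIN.` prefix are assumptions for illustration, not anything Whoosh provides.

```python
import os
import shutil
import tempfile


def make_private_temp_dir(parent=None):
    """Create a uniquely named temp directory.

    Hypothetical helper for illustration, not a Whoosh API.
    """
    # mkdtemp picks an unused name and creates the directory in one step,
    # so two callers never end up with the same path.
    return tempfile.mkdtemp(prefix="MAIN.", suffix=".tmp", dir=parent)


# Usage: two "writers" get separate directories, so removing one
# cannot pull the other out from under its owner.
a = make_private_temp_dir()
b = make_private_temp_dir()
distinct = a != b
shutil.rmtree(a)
b_survives = os.path.isdir(b)
shutil.rmtree(b)
```

With a private directory per writer, cleanup in one process no longer depends on what other processes are doing, which a lock held inside a single process cannot guarantee.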
