whoosh.writers.IndexWriter procs > 1 does not work in Windows

Issue #414 resolved
Blake VandeMerwe
created an issue

Python on Windows always seems to have a few "gotchas". I just want to document that you cannot run Whoosh with the multiprocessing implementation on Windows.

Indexing fails immediately with a LockError when the writer tries to take the write lock.

The simplest case is to obtain an index and grab its writer with procs=2:

idx = obtain_index(index_dest, schema, 'WARC-Feb-26-2015', force_new_index=True)
idx_writer = idx.writer(procs=2)

idx_writer.add_document(
    url = 'http://google.com'
)

idx_writer.commit()

Error:

(0, 0.00800013542175293)
Merging
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "app_main.py", line 581, in run_it
  File "<string>", line 1, in <module>
  File "C:\pypy-2.5.0-win32\lib-python\2.7\multiprocessing\forking.py", line 377, in main
    prepare(preparation_data)
  File "C:\pypy-2.5.0-win32\lib-python\2.7\multiprocessing\forking.py", line 492, in prepare
    '__parents_main__', file, path_name, etc
  File "D:\web_projects\sowing_seasons\blog\4-31-15\tests.py", line 102, in <module>
    index_writer = get_writer(idx)
  File "D:\web_projects\sowing_seasons\blog\4-31-15\tests.py", line 98, in get_writer
    return idx.writer(procs=2)
  File "C:\pypy-2.5.0-win32\site-packages\whoosh\index.py", line 461, in writer
    return MpWriter(self, procs=procs, **kwargs)
  File "C:\pypy-2.5.0-win32\site-packages\whoosh\multiproc.py", line 161, in __init__
    SegmentWriter.__init__(self, ix, **kwargs)
  File "C:\pypy-2.5.0-win32\site-packages\whoosh\writing.py", line 515, in __init__
    raise LockError
LockError
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "D:\web_projects\sowing_seasons\blog\4-31-15\tests.py", line 123, in <module>
    index_writer.commit(optimize=True)
  File "C:\pypy-2.5.0-win32\site-packages\whoosh\multiproc.py", line 252, in commit
    self._commit(mergetype, optimize, merge)
  File "C:\pypy-2.5.0-win32\site-packages\whoosh\multiproc.py", line 277, in _commit
    results.append(self.resultqueue.get(timeout=5))
  File "C:\pypy-2.5.0-win32\lib-python\2.7\multiprocessing\queues.py", line 132, in get
    raise Empty
Empty
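Reading the two tracebacks together: the first one comes from a spawned worker. It goes through multiprocessing's `prepare`, re-runs tests.py at module level, and reaches `idx.writer(procs=2)` again while the parent process still holds the write lock, so it raises LockError. The second one is the parent, whose `commit` times out with `Empty` because no worker ever reported back. Below is a minimal, stdlib-only model of that collision. The `LockError` class, the `write.lock` file name, and the helper functions are hypothetical stand-ins, not Whoosh's actual lock implementation, and the sketch assumes Python 3.

```python
import os
import tempfile


class LockError(Exception):
    """Hypothetical stand-in for whoosh.index.LockError."""


def acquire(lock_path):
    """Create the lock file exclusively; raise LockError if it already exists."""
    try:
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockError(lock_path)


def release(fd, lock_path):
    """Close and delete the lock file so the next writer can take it."""
    os.close(fd)
    os.remove(lock_path)


def second_writer_fails(index_dir):
    """Return True if a second acquire fails while the first lock is held."""
    lock_path = os.path.join(index_dir, "write.lock")  # made-up file name
    fd = acquire(lock_path)  # models the parent calling idx.writer(procs=2)
    try:
        try:
            acquire(lock_path)  # models the child re-running the same line
        except LockError:
            return True
        return False
    finally:
        release(fd, lock_path)


with tempfile.TemporaryDirectory() as index_dir:
    print(second_writer_fails(index_dir))
```

If this reading is right, the failure is not specific to Whoosh's locking but to module-level code being executed a second time in each child process on Windows.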

Comments (6)

  1. Matt Chaput repo owner

    That's odd, I didn't think anything had changed in that code since the last time I tried the tests on a Windows machine. I'll have to look at it. Thanks.

  2. Clint P. George

    I reinstalled Whoosh from the latest source in the Bitbucket repository by running

    python setup.py install 
    

    and

    python test_mpwriter.py
    

    works on my machine.

    But when I run

     writer = ix.writer(procs=4, limitmb=1024, multisegment=True)
     writer.add_document(...)
     writer.add_document(...)
     writer.add_document(...)
     writer.commit()
    

    I get the same LockError:

      writer = ix.writer(procs=4, limitmb=1024, multisegment=True)
      File "build\bdist.win32\egg\whoosh\index.py", line 461, in writer
      File "build\bdist.win32\egg\whoosh\multiproc.py", line 162, in __init__
      File "build\bdist.win32\egg\whoosh\writing.py", line 515, in __init__
    
    whoosh.index.LockError
    

    I have 32-bit Python on my Windows machine.

    Do I have to do anything more to make it work, or am I missing something? Any help is greatly appreciated. Thanks.

  3. mx2048

    I have the same error.

    Whoosh 2.7.4, Python 3.6.2, Windows 10, two cores, 8 GB RAM

    import os
    import sqlite3
    
    from whoosh.index import create_in
    from whoosh.fields import Schema, TEXT
    
    conn = sqlite3.connect('my_database.db')
    c = conn.cursor()
    
    if not os.path.exists("indexdir"):
        os.mkdir("indexdir")
    
    schema = Schema(title=TEXT(stored=True), content=TEXT(phrase=True))
    ix = create_in("indexdir", schema)
    writer = ix.writer(procs=2, limitmb=128)
    for row in c.execute("SELECT my_title, my_content FROM my_table"):
        writer.add_document(title=row[0], content=row[1])
    writer.commit()
    
    Both worker processes print the same traceback (shown once):

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Python36-32\lib\multiprocessing\spawn.py", line 105, in spawn_main
        exitcode = _main(fd)
      File "C:\Python36-32\lib\multiprocessing\spawn.py", line 114, in _main
        prepare(preparation_data)
      File "C:\Python36-32\lib\multiprocessing\spawn.py", line 225, in prepare
        _fixup_main_from_path(data['init_main_from_path'])
      File "C:\Python36-32\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
        run_name="__mp_main__")
      File "C:\Python36-32\lib\runpy.py", line 263, in run_path
        pkg_name=pkg_name, script_name=fname)
      File "C:\Python36-32\lib\runpy.py", line 96, in _run_module_code
        mod_name, mod_spec, pkg_name, script_name)
      File "C:\Python36-32\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\databases\whoosh_index.py", line 19, in <module>
        writer = ix.writer(procs=2, limitmb=128)
      File "C:\Python36-32\lib\site-packages\whoosh\index.py", line 461, in writer
        return MpWriter(self, procs=procs, **kwargs)
      File "C:\Python36-32\lib\site-packages\whoosh\multiproc.py", line 162, in __init__
        SegmentWriter.__init__(self, ix, **kwargs)
      File "C:\Python36-32\lib\site-packages\whoosh\writing.py", line 515, in __init__
        raise LockError
    whoosh.index.LockError
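The traceback above passes through `_fixup_main_from_path` in spawn.py with `run_name="__mp_main__"` and then executes whoosh_index.py line 19 again. So each spawned worker re-runs the script's module-level `ix.writer(procs=2, limitmb=128)` while the parent holds the lock. A likely fix, which I have not confirmed is how this issue was resolved, is the usual `if __name__ == "__main__":` guard that multiprocessing needs on Windows. Here is a sketch of the same script restructured that way. The function names `read_rows`, `build_index`, and `main` and the command-line arguments are my own assumptions; the Whoosh calls are the ones from the script above.

```python
import os
import sqlite3
import sys


def read_rows(db_path):
    """Yield (title, content) tuples from my_table in the given SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        for row in conn.execute("SELECT my_title, my_content FROM my_table"):
            yield row[0], row[1]
    finally:
        conn.close()


def build_index(rows, index_dir, procs=2):
    """Create the index and write all rows using a multiprocess writer."""
    # Whoosh is imported lazily so that importing this module stays
    # side-effect free in the spawned worker processes.
    from whoosh.fields import Schema, TEXT
    from whoosh.index import create_in

    os.makedirs(index_dir, exist_ok=True)
    schema = Schema(title=TEXT(stored=True), content=TEXT(phrase=True))
    ix = create_in(index_dir, schema)
    writer = ix.writer(procs=procs, limitmb=128)
    for title, content in rows:
        writer.add_document(title=title, content=content)
    writer.commit()


def main(argv):
    """Hypothetical entry point: whoosh_index.py DB_PATH INDEX_DIR."""
    if len(argv) != 2:
        print("usage: whoosh_index.py DB_PATH INDEX_DIR")
        return 2
    build_index(read_rows(argv[0]), argv[1])
    return 0


# Workers started with the spawn method import this file under the name
# "__mp_main__", so nothing below runs in them and the writer is only
# created once, in the parent process.
if __name__ == "__main__":
    main(sys.argv[1:])
```

Run it as `python whoosh_index.py my_database.db indexdir`. The important part is only that no index, writer, or database work happens at import time; how the work is split into functions is a matter of taste.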
    