Issue #264 new

whoosh pickling error on item save

Roger Haase
created an issue

Sometimes, after an Apache restart, every attempt to save an item results in a traceback (see attached PicklingError.txt).

Restarting Apache eliminates the problem.

Comments (7)

  1. Thomas Waldmann repo owner

    If you can reproduce this, it would also help to get a repr() of the schema, so we can see which function item it tries to pickle.

    Maybe also file this in the whoosh issue tracker, so that if we succeed in finding the cause, the whoosh modification that helps with debugging can be added there.
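
    One way to narrow down which schema entry fails is to try pickling each item on its own. The following is a minimal sketch, not whoosh code: the helper name `find_unpicklable` is hypothetical, and it assumes the schema can be iterated as (name, field) pairs, as `schema.items()` is used elsewhere in this thread.

```python
import pickle


def find_unpicklable(items):
    """Return (name, repr(value), repr(error)) for every value that fails to pickle.

    `items` is any iterable of (name, value) pairs, e.g. schema.items().
    """
    failures = []
    for name, value in items:
        try:
            pickle.dumps(value)
        except Exception as exc:  # PicklingError, AttributeError, TypeError, ...
            failures.append((name, repr(value), repr(exc)))
    return failures
```

    Calling it as `find_unpicklable(schema.items())` just before the failing statement and logging the result would show the offending field, if the failure is tied to a single field at all.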

  2. Roger Haase reporter

    This may be an Apache configuration error. Apache was running 4 applications with 2 virtual environments, with moin2 having its own virtual environment. The problem was reproduced by restarting Apache, accessing any of the 3 other applications first, and then modifying and attempting to save a moin2 item. The problem could be reproduced on both CentOS 5 and Windows 7.

    An easy workaround is to add a wget for a moin2 item to the scripts that restart Apache.
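
    Where wget is not available (for example on Windows), the same warm-up request could be made from Python. This is a minimal sketch under assumptions: the function name `warm_up`, the retry count, and the delay are all illustrative, and the URL passed in must be whatever address serves a moin2 item on the local setup.

```python
import time
from urllib.request import urlopen


def warm_up(url, fetch=urlopen, attempts=5, delay=1.0):
    """Request `url` until the server answers.

    Returns True on the first successful request, False if every attempt fails.
    `fetch` is injectable so the function can be exercised without a network.
    """
    for _ in range(attempts):
        try:
            response = fetch(url)
            response.close()
            return True
        except OSError:  # URLError and HTTPError are OSError subclasses
            time.sleep(delay)
    return False
```

    A restart script would call, for instance, `warm_up("http://localhost/moin2/Home")` right after Apache comes back up; that URL is a placeholder, not one taken from this report.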

    A fix for CentOS was to add a unique WSGIDaemonProcess for each application, similar to:

    WSGIDaemonProcess moin2_wsgi processes=1 threads=3 stack-size=524288
    WSGIScriptAlias /moin2/ /home/rockart/webapps/web/moin-2.0/moin2.wsgi/
    <Location /moin2>
        WSGIProcessGroup moin2_wsgi
    </Location>
    

    However, Apache on Windows does not support the WSGIDaemonProcess directive. I am still looking for a Windows configuration solution.

  3. Roger Haase reporter

    Still seeing a 100% failure rate when Apache is restarted on Windows, my other application is accessed first, and then an attempt is made to modify a moin2 item.

    Zero failures are seen if Apache is restarted and moin2 is accessed first.

    Zero failures have occurred on CentOS since setting the WSGIProcessGroup as described above.

    My virtual environments are set up per http://code.google.com/p/modwsgi/wiki/VirtualEnvironments.

    A workaround is to change the imports within whoosh/filedb/fileindex.py:

        #~ from whoosh.compat import pickle, string_type, xrange
        from whoosh.compat import string_type, xrange
        import pickle
    

    which replaces the cPickle module imported into whoosh.compat with the pickle module.

    Adding a print statement into fileindex.py before the failing statement shows os.environ and sys.path are identical in both the successful and failing Windows tests.

    Adding a "print 'schema.items()=%s' % schema.items()" before the failing statement yields the following (line endings added):

        [Sun Feb 17 10:39:54 2013] [error] schema.items()=[
        ('action', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('address', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('backendname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('comment', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('content', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('contenttype', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('hostname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemtype', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('language', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('mtime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('name', TEXT(format=Positions(boost=2.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('name_exact', ID(format=Existence(boost=3.0), vector=None, scorable=None, stored=False, unique=False)), 
        ('parentid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('ptime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('revid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=True)), 
        ('size', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('summary', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('tags', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('userid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('wikiname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False))]
    

    The above is identical to the output of the first call of a successful test:

        [Sun Feb 17 10:34:52 2013] [error] schema.items()=[
        ('action', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('address', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('backendname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('comment', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('content', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('contenttype', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('hostname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemtype', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('language', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('mtime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('name', TEXT(format=Positions(boost=2.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('name_exact', ID(format=Existence(boost=3.0), vector=None, scorable=None, stored=False, unique=False)), 
        ('parentid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('ptime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('revid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=True)), 
        ('size', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('summary', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('tags', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('userid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('wikiname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False))]
    
        [Sun Feb 17 10:34:52 2013] [error] schema.items()=[
        ('acl', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('action', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('address', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('assigned_to', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('backendname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('comment', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('content', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('contenttype', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('depends_on', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('difficulty', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('effort', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('email', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('hostname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=True)), 
        ('itemlinks', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemtransclusions', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('itemtype', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('language', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('mtime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('name', TEXT(format=Positions(boost=2.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('name_exact', ID(format=Existence(boost=3.0), vector=None, scorable=None, stored=False, unique=False)), 
        ('openid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('parentid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('priority', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('ptime', DATETIME(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('revid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=True)), 
        ('severity', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('size', NUMERIC(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('status', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('summary', TEXT(format=Positions(boost=1.0), vector=None, scorable=True, stored=True, unique=None)), 
        ('superseded_by', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('tags', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('userid', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False)), 
        ('wikiname', ID(format=Existence(boost=1.0), vector=None, scorable=None, stored=True, unique=False))]
    

    Related pages found via Google:

    http://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule

    http://stackoverflow.com/questions/11287455/how-do-i-avoid-this-pickling-error-and-what-is-the-best-way-to-parallelize-this
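
    For background, here is a minimal sketch of one generic way pickle raises a PicklingError: classes are pickled by reference (module plus name), and pickling fails when the name in the module no longer resolves to the same class object. Whether this is the mechanism behind the failure in this issue is not confirmed by the thread; the module name `demo_fields` and the classes below are made up for illustration.

```python
import pickle
import sys
import types

# A throwaway module holding a class, standing in for a schema field type.
mod = types.ModuleType("demo_fields")


class Field(object):
    pass


Field.__module__ = "demo_fields"
Field.__qualname__ = "Field"
mod.Field = Field
sys.modules["demo_fields"] = mod

field = Field()
data = pickle.dumps(field)  # works: demo_fields.Field is this very class


class Replacement(object):
    pass


# Simulate the module being populated a second time with a new class object.
mod.Field = Replacement

try:
    pickle.dumps(field)
    error = None
except pickle.PicklingError as exc:
    # The name lookup now finds a different object than type(field).
    error = exc
```

    The instance itself is unchanged between the two calls; only the object that the module-level name points to differs, which is consistent with the schema output looking identical in the passing and failing runs.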
