AttributeError on self._hash when unpickling query (serializer extension). Well, sometimes...

Issue #3918 resolved
Peter Sevens created an issue

My apologies for the long report but this issue seems quite dependent on exact circumstances and I wanted to provide as much relevant info as possible.

When unpickling a query (using the serializer extension) with a filter using .any() or .has() on a relationship to a model with its __table__ defined as an outerjoin between 2 Tables (see attached script) an AttributeError may get raised on the self._hash attribute, depending on some circumstances...

It seems to be dependent on some specific conditions for the same query-building code to trigger the bug or not, at least in our code (Pyramid app using PostgreSQL). E.g. on my local machine (single-process single-threaded webserver) I am only able to trigger the exception the first time the problematic query executes after the server process was (re)started, no matter if other queries where executed since the server started or not. After that first time the same query/code/request goes fine, until the server is (re)started once again. On our CI environment however, during our integration tests (UWSGI multi-process multi-threading) it triggered over 200 times during the full suite. On that same environment I can also fairly consistently reproduce it manually (about 1-2 in 10 attempts raise no exception) by repeating the same action (the same that only fails the first time after server-start on my local env). And in the attached test script it only happens after a str() call is done on the query before pickling... The query in our code is the same as in attached script and there's nothing obvious in our code that could get into a different state at all between subsequent runs of the code generating the query explaining this behavior.

After some digging around in the pickle module it seems to be that the Python unpickle code wants to put an 'empty' object (AnnotatedJoin in this case) inside a set. The object has an empty __dict__ at that point (it will be filled later on in the unpickling process I guess) and, as Annotated has a custom __hash__() function which uses an attribute, this leads to an error. I have some more context (including tracebacks) from a few of these debugging sessions if more info on this part is wanted.

As far as I could figure out it looks like this is a bug in Python when pickling classes with a custom __hash__() inside a set with a circular reference but I wanted to report it here too, especially since that Python bug is closed as WONTFIX so it's at least documented and searchable for others running into this, as I had a hard time finding any info at all on this issue. The Python bug mentions a workaround, though I don't know if it would be suitable for this case.

Using pickle protocol 2 but the same issue arises when using pickle protocol 1. Protocol 0 raises a different error.

The exact exception message differs between situations and if pickle or cPickle is used, though the traceback and error lines are the same (added all, for reference and search engines):

  • On Python 2.7, with pickle: AttributeError: 'AnnotatedJoin' object has no attribute '_hash'
  • On Python 2.7, with cPickle: AttributeError: <exception str() failed>
    • Same but running in a Celery worker: AttributeError: <unprintable AttributeError object>
  • On Python 3.6, with cPickle: AttributeError: 'AnnotatedJoin' object has no attribute '_hash'

Tested with:

  • SQLAlchemy 1.1.5, Python 2.7.13 (only tested with the attached script)
  • SQLAlchemy 1.1.5, Python 3.6.0 (only tested with the attached script)
  • SQLAlchemy 1.0.12, Python 2.7.13 (with both the original bug in our code and with the attached script)
  • All on Linux (x86_64), both Debian and OpenSUSE (13.2 and 42.2).
  • All (except the test script) using PostgreSQL with Psycopg2 (2.6.2)

Traceback:

  Traceback (most recent call last):
    File "/home/user/.virtualenvs/venv/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
      R = retval = fun(*args, **kwargs)
    File "/home/user/.virtualenvs/venv/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
      return self.run(*args, **kwargs)
    File "/home/user/dev/project/backend/realtime/registry.py", line 570, in store_query_watch
      query = deserialize_query(serialized_query)
    File "/home/user/dev/project/backend/realtime/util.py", line 37, in deserialize_query
      return serializer.loads(value, db_meta, db_session)
    File "/home/user/.virtualenvs/venv/lib/python2.7/site-packages/sqlalchemy/ext/serializer.py", line 159, in loads
      return unpickler.load()
    File "/home/user/.virtualenvs/venv/lib/python2.7/site-packages/sqlalchemy/sql/annotation.py", line 99, in __hash__
      return self._hash
  AttributeError: <unprintable AttributeError object>

N.b.: on our testing and staging environments (same UWSGI multi-process multi-threading setup) with a slightly older version of the code the same error, but on a different query, is also sometimes reproducible but doing the same action/request won't trigger it more than once and replaying the same actions and steps in the application under a different user won't reproduce it at all.

Comments (7)

  1. Mike Bayer repo owner

    so, the "serializer" extension is overall this pattern that in practice has turned out to be largely unworkable as there are many pickle issues like this that prevent it from working completely. I don't have any fix for this issue at the moment and as you've noted they can be very difficult to debug and repair, and sometimes impossible. This would have to be a low priority issue unless you want to continue working on it.

  2. Peter Sevens reporter

    I'm afraid I don't really have more time to spend on this issue. It took me some time to figure this much out already (though I learned a lot) and your comment here together with another one I found while researching this bug, combined with another bug in our code when unpickling queries that I haven't even found the code for that triggers the problem yet made me think it might not be such a good idea to keep using the serializer extension at all for our solution :)

    Especially considering the very specific circumstances under which problems may arise, making it possible for issues with it to slip through undetected before the code gets into production just because the exact right circumstances to trigger them didn't happen during testing.

  3. Mike Bayer repo owner

    I think warnings really need to be added to the serializer docs. It works great for Core expressions and simple ORM expressions, but when it gets involved with the deep structures of some of the ORM queries we hit these issues.

  4. Peter Sevens reporter

    Yeah, I was thinking about that too, looks like a good idea.

    It seems like a potentially big gotcha, or at least something to be aware of, with little information on it. At least as far as I could find. Especially considering texts like "Any SQLAlchemy query structure, either based on sqlalchemy.sql. or sqlalchemy.orm. can be used." (serializer docs), which seems slightly at odd with the actual situation (at least under some circumstances :)), and the subtleties of which complex structures can and can't be pickled without additional helping code (e.g. __setstate__/__getstate__).

  5. Mike Bayer repo owner

    Improve serializer behavior

    Fix an issue where the Annotated system needs to have a reduce method, also see why we can't default to HIGHEST_PROTOCOL. This latter part might not be a good idea until 1.2 for compatibility reasons.

    Change-Id: I0239e38259fc768c9e3b6c448c29161e271a969c Fixes: #3918

    → <<cset 86cf3eb71c3c>>

  6. Log in to comment