Very weird `KeyError` exception
Traceback (most recent call last):
File "site-packages/cherrypy/_cprequest.py", line 670, in respond
File "site-packages/cherrypy/lib/encoding.py", line 221, in __call__
File "site-packages/cherrypy/_cpdispatch.py", line 60, in __call__
File "sfutils/perf_stats.py", line 206, in wrapper
File "sfutils/base_service.py", line 461, in wrapper
File "sfutils/base_service.py", line 395, in call_f_and_return_json
File "sfvolumes/service.py", line 260, in request_save_volume
File "sfvolumes/service.py", line 229, in _save_volume
File "sfutils/perf_stats.py", line 206, in wrapper
File "sfvolumes/dao.py", line 206, in get_volume_by_name
File "sfvolumes/dao.py", line 190, in _get_volume_by_name
File "site-packages/sqlalchemy/orm/query.py", line 2814, in one
File "site-packages/sqlalchemy/orm/query.py", line 2784, in one_or_none
File "site-packages/sqlalchemy/orm/loading.py", line 90, in instances
File "site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
File "site-packages/sqlalchemy/orm/loading.py", line 75, in instances
File "site-packages/sqlalchemy/orm/loading.py", line 385, in _instance
File "site-packages/sqlalchemy/orm/identity.py", line 151, in get
KeyError: (<class 'sfutils.pg_api.alchemy_schema.Volume'>, (2L,))
It's very weird because code within orm/identity.py
looks like:
[148] def get(self, key, default=None):
[149] if key not in self._dict:
[150] return default
[151] state = self._dict[key]
First time I've though that session
is used in two threads but session is stored only within dao
and dao is a local parameter so it's not possible that more than one thread is using the same session.
The other strange thing is backtrace - can't explain why raise_from_cause
(from util.py) invoked instances
(from loading.py).
I've saw that weakref
is used so is it possible that default _identity_cls = identity.WeakInstanceDict
from session caused that fired garbage collector freed some objects and removed some key from self._dict
within identity_map? (just while thread was between identity.py:149
and identity.py:151
)?
We use sqlalchemy very intensively and that problem happened three times since last four months (so it seems to be very rare case).
Versions: SQLAlchemy (1.1.10) psycopg2 (2.7.1) postgresql 9.6+184.pgdg14.04+1 python 2.7.12
Unfortunately can't create any Minimal, Complete and Verifiable Example as it seems to be very rare case.
Comments (12)
-
repo owner -
repo owner -
repo owner -
reporter indeed, raising exception is expensive, my favorite is:
result = self._dict.get(key, default_value)
and later usage ofis
operator to verifyif result is default_value:
...) -
repo owner - changed status to resolved
Guard all indexed access in WeakInstanceDict
Added
KeyError
checks to all methods within :class:.WeakInstanceDict
where a check forkey in dict
is followed by indexed access to that key, to guard against a race against garbage collection that under load can remove the key from the dict after the code assumes its present, leading to very infrequentKeyError
raises.Change-Id: I881cc2899f7961d29a0549f44149a2615ae7a4ea Fixes:
#4030→ <<cset 3d8049e32d63>>
-
repo owner Guard all indexed access in WeakInstanceDict
Added
KeyError
checks to all methods within :class:.WeakInstanceDict
where a check forkey in dict
is followed by indexed access to that key, to guard against a race against garbage collection that under load can remove the key from the dict after the code assumes its present, leading to very infrequentKeyError
raises.Change-Id: I881cc2899f7961d29a0549f44149a2615ae7a4ea Fixes:
#4030(cherry picked from commit 1a75f676d75b5daf67455d6e85b7b4df6f2316dc)→ <<cset 676c49b51de2>>
-
We believe we are (still) having this sporadic problem with SQLAlchemy==1.1.13. We
merge
a (fairly complex) object into a session from a cache with every request to our server. We get a handful ofKeyError
s each day, out of dozens of requests per second, so it is very sporadic. The relevant parts of the stack trace areuser = session.merge(cachedUser, load=False) File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 157, in do return getattr(self.registry(), name)(*args, **kwargs) File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1849, in merge _resolve_conflict_map=_resolve_conflict_map) File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1875, in _merge merged = self.identity_map[key] File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/identity.py", line 89, in __getitem__ state = self._dict[key] KeyError: <snip>
Does this look related?
-
repo owner yup, I missed one, no idea how
-
repo owner hmmm, OK didn't miss it, because that method is supposed to raise a key error anyway.
-
... except for the fact that it's not the explicit
raise KeyError(key)
that is triggering, it's the dictionary lookup before it. -
repo owner this is
#4069specific to session.merge() -
OK, thanks for picking this up!
- Log in to comment
it's not so much threads as it is that gc just runs in the middle of virtually anything. Looking at WeakInstanceDict I'm a little surprised this pattern is present in quite a few methods, and even more that a couple of methods do guard against it.
The reason there's a lot of "if key in dict" is because if the item is expected to often not be inside the dict, this is a cheaper check than catching KeyError. However, the catch for KeyError should still be there to accommodate for this race.
Methods that have a defensive KeyError are:
methods that are checking "key in dict", then accessing dict[key] without anticipating KeyError (and also do not expect to throw KeyError) are:
above those all need to catch KeyError.