session.merge() can miss object gc'ed after checking in identity map

Issue #4069 resolved
Michael Bayer
repo owner created an issue

sample trace:

   user = session.merge(cachedUser, load=False)
 File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 157, in do
   return getattr(self.registry(), name)(*args, **kwargs)
 File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1849, in merge
   _resolve_conflict_map=_resolve_conflict_map)
 File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1875, in _merge
   merged = self.identity_map[key]
 File "/venv/lib/python2.7/site-packages/sqlalchemy/orm/identity.py", line 89, in __getitem__
   state = self._dict[key]
KeyError: <snip>

moved from #4030

Comments (8)

  1. David Feltell

    Great, thanks for the fast fix! I have put it on our staging server. There is much less traffic there than on production, but we do still see these KeyErrors once or twice per day.

  2. David Feltell

    Just a quick update: after leaving it overnight, so far so good! Your changes didn't break anything that I can tell, and there have been no KeyErrors. However, not enough time has passed to be conclusive.

  3. David Feltell

    After checking our logs, the longest period we have not seen this KeyError on our staging server is 6 days (on average we see them on 2 out of every 3 days). So it will be a while before we can conclude for certain that this fix solves the problem, by this measure at least.

  4. David Feltell

    I'm afraid my team need to update our staging server, so the patched sqlalchemy will be overridden.

    However, we've looked at the change in session.py and it looks fine, though we're curious why you did not patch the lower-level __getitem__?

    Anyway, we look forward to the next release with this fix included.

    Thanks for your rapid action on this.

  5. Michael Bayer reporter

    the behavioral contract of a Python __getitem__ is that if the object is not found, it raises KeyError. It may be odd looking but the first self._dict[key] is allowed to propagate its own KeyError, as our own __getitem__ is behaving like a proxy method. Only if the item is found, and an additional condition fails, e.g. that the weakref to the mapped object has been garbage collected, do we also consider that to be "key does not exist" so you see the explicit raise for KeyError there.

  6. Michael Bayer reporter

    Guard against KeyError in session.merge after check for identity

    Fixed bug in :meth:.Session.merge following along similar lines as that of 🎫4030, where an internal check for a target object in the identity map could lead to an error if it were to be garbage collected immediately before the merge routine actually retrieves the object.

    Change-Id: Ifecfb8b9d50c52d0ebd5a03e1bd69fe3abf1dc40 Fixes: #4069 (cherry picked from commit bfad032017fde2e519ad5eacc0011a71bf7a22a9)

    → <<cset 68953902dda9>>

  7. Michael Bayer reporter

    Guard against KeyError in session.merge after check for identity

    Fixed bug in :meth:.Session.merge following along similar lines as that of 🎫4030, where an internal check for a target object in the identity map could lead to an error if it were to be garbage collected immediately before the merge routine actually retrieves the object.

    Change-Id: Ifecfb8b9d50c52d0ebd5a03e1bd69fe3abf1dc40 Fixes: #4069

    → <<cset 1e21f9c4f898>>

  8. Log in to comment