Cannot rollback session when deadlock occurs within nested transaction

Issue #3680 resolved
Former user created an issue

I am getting an assertion error when rolling back a transaction following an OperationalError raised by a deadlock within a nested transaction:

File "lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 273, in _restore_snapshot assert self._is_transaction_boundary

I assume that this is caused by another error that was raised during the deadlock rollback attempt:

sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1305, 'SAVEPOINT sa_savepoint_1 does not exist') [SQL: u'ROLLBACK TO SAVEPOINT sa_savepoint_1']

This was executed automatically after the deadlock error.

When the session reaches this point, I cannot roll it back and can only re-use it by using session.close()

To reproduce:

class MyTable(Base):
    __tablename__ = 'mytable'

    id = Column(Integer, primary_key=True, autoincrement=True, nullable=False)
    name = Column(String(10), unique=True, nullable=False)

In session 1:

s = Session()
s.begin_nested()
s.add(MyTable(name='mitsos'))
s.flush()

In session 2:

s = Session()
s.begin_nested()
s.add(MyTable(name='mitsos'))
s.flush()

Back to session 1:

s.add(MyTable(name='mits'))
s.flush()

At this point session 2 will get the deadlock stacktrace and a failed rollback attempt because of the savepoint error.

After this, the session cannot be rolled back at all.

Comments (17)

  1. Mike Bayer repo owner

    this is a dupe of #2696. Please make sure you're on version 1.0.12 and the actual "cause" will be in your logs emitted as a warning. the ROLLBACK is not possible.

  2. Mike Bayer repo owner

    OK, so here's that turned into a real test:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    import threading
    import time
    
    Base = declarative_base()
    
    
    class MyTable(Base):
        __tablename__ = 'mytable'
    
        id = Column(Integer, primary_key=True, autoincrement=True, nullable=False)
        name = Column(String(10), unique=True, nullable=False)
    
    e = create_engine("mysql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(e)
    Base.metadata.create_all(e)
    
    s = Session(e)
    s.begin_nested()
    s.add(MyTable(name='mitsos'))
    s.flush()
    
    
    def go():
        s2 = Session(e)
        s2.begin_nested()
        s2.add(MyTable(name='mitsos'))
        s2.flush()
        time.sleep(1)
    
    t = threading.Thread(target=go)
    t.start()
    time.sleep(1)
    
    s.add(MyTable(name='mits'))
    s.flush()
    

    we can see the warning:

    #!
    
    
    /home/classic/dev/sqlalchemy/lib/sqlalchemy/util/langhelpers.py:68: SAWarning: An exception has occurred during handling of a previous exception.  The previous exception is:
     <class 'sqlalchemy.exc.OperationalError'> (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'INSERT INTO mytable (name) VALUES (%s)'] [parameters: ('mitsos',)]
    

    and that is what #2696 was all about. But now, that you can't rollback() on that session, that's another aspect here. #2696 is all over openstack but I guess they start with a new session.

  3. Mike Bayer repo owner

    and looking, yeah I don't know that s.rollback() of the topmost transaction is feasible. the bookkeeping we need to do after the savepoint is rolled back does not occur, which means the outermost bookkeeping would fail as well unless we went through the process of piecing together from the broken rollback. session.close() OTOH does do a database level rollback, but doesn't have to do the bookkeeping, so close() here is appropriate. perhaps a s.rollback(hard=True), which is basically the same as s.close(), would be clearer.

  4. Mike Bayer repo owner

    After this, the session cannot be rolled back at all.

    hello -

    can you please clarify exactly the behavior on that? Per my tests it is:

    #!
    
    /home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py:430: SAWarning: Session's state has been changed on a non-active transaction - this state will be discarded.
      "Session's state has been changed on "
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/usr/lib64/python2.7/threading.py", line 813, in __bootstrap_inner
        self.run()
      File "/usr/lib64/python2.7/threading.py", line 766, in run
        self.__target(*self.__args, **self.__kwargs)
      File "test.py", line 33, in go
        s2.rollback()
      File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 756, in rollback
        self.transaction.rollback()
      File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 433, in rollback
        boundary._restore_snapshot(dirty_only=boundary.nested)
      File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 272, in _restore_snapshot
        assert self._is_transaction_boundary
    AssertionError
    

    I can repair that aspect here.

  5. Mike Bayer repo owner

    test which rolls back failed transaction:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    import threading
    import time
    
    Base = declarative_base()
    
    
    class MyTable(Base):
        __tablename__ = 'mytable'
    
        id = Column(Integer, primary_key=True, autoincrement=True, nullable=False)
        name = Column(String(10), unique=True, nullable=False)
    
    e = create_engine("mysql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(e)
    Base.metadata.create_all(e)
    
    s = Session(e)
    s.begin_nested()
    s.add(MyTable(name='mitsos'))
    s.flush()
    
    
    def go():
        s2 = Session(e)
        s2.begin_nested()
        s2.add(MyTable(name='mitsos'))
        try:
            s2.flush()
        except Exception as err:
            print "exception:", err
            s2.rollback()
    
    t = threading.Thread(target=go)
    t.start()
    time.sleep(1)
    
    s.add(MyTable(name='mits'))
    s.flush()
    s.commit()
    

    patch:

    diff --git a/lib/sqlalchemy/orm/session.py b/lib/sqlalchemy/orm/session.py
    index 48ff09b..b43e944 100644
    --- a/lib/sqlalchemy/orm/session.py
    +++ b/lib/sqlalchemy/orm/session.py
    @@ -422,7 +422,7 @@ class SessionTransaction(object):
             sess = self.session
    
             if sess._enable_transaction_accounting and \
    -                not sess._is_clean():
    +                not sess._is_clean() and boundary._is_transaction_boundary:
    
                 # if items were added, deleted, or mutated
                 # here, we need to re-restore the snapshot
    

    can you confirm this patch resolves? thanks

  6. Former user Account Deleted reporter

    I suspected that there would need to be a lot of book-keeping to bring the session back to a workable state. A session.close and a retry is feasible in my situation so I'm happy to stick to that since it's the appropriate thing to do.

    Thanks

  7. Former user Account Deleted reporter

    Sorry there were more responses coming in while I was reading the previous ones, yes this is the exact issue I am getting.

  8. Mike Bayer repo owner

    yeah still leaning towards, "call close()" here. if a rollback fails, we have no idea why that happened and it means we've lost control of the connection's state. a close() as documented cancels any remaining transaction and discards all state, which is the only safe thing to do here. The Session is capable of handling multiple connections at once and I don't see how to reconcile restoring the correct state if some of those connections fail to rollback a savepoint and others do.

  9. Mike Bayer repo owner

    re: your comment, if i fix the assertion error, we're still stuck on the bad transaction, so it doesn't really fix the problem, however perhaps I should add that patch in any case since you at least continue to get the "cant rollback to savepoint" error consistently instead of the assertion failure.

  10. Mike Bayer repo owner

    OK I'm going to spend more time with this since the more I play with it the more I see the flow can be improved, but yes, solution for now is, "call close()" :).

  11. Mike Bayer repo owner
    • Further continuing on the common MySQL exception case of a savepoint being cancelled first covered in 🎫2696, the failure mode in which the :class:.Session is placed when a SAVEPOINT vanishes before rollback has been improved to allow the :class:.Session to still function outside of that savepoint. It is assumed that the savepoint operation failed and was cancelled. fixes #3680

    → <<cset 0a29071b16e1>>

  12. Log in to comment