DELETEs executed twice when using history_meta and subclassing models

Issue #3015 invalid
matt chisholm created an issue

When using history_meta and also subclassing from model classes (which we do in order to add additional query parameters in our API), SQLAlchemy 0.9.4 raises an error about the number of rows that were deleted not matching the number of rows intended to be deleted.

This bug is new in 0.9.4, it does not occur in 0.9.3 or 0.9.2.

Attached is a minimal test case.

There is also a hack in here which fixes another bug, the SubclassEquivalenceMixIn. If you remove that from the base classes for APIBar, you can see why we are using that mix-in.

Here is the traceback:

2014-04-04 18:55:38,671 INFO sqlalchemy.engine.base.Engine DELETE FROM foo_bar WHERE foo_bar.foo_id = %(foo_id)s AND foo_bar.bar_id = %(bar_id)s
2014-04-04 18:55:38,671 INFO sqlalchemy.engine.base.Engine {'foo_id': 1, 'bar_id': 1}
2014-04-04 18:55:38,672 INFO sqlalchemy.engine.base.Engine DELETE FROM foo_bar WHERE foo_bar.foo_id = %(foo_id)s AND foo_bar.bar_id = %(bar_id)s
2014-04-04 18:55:38,672 INFO sqlalchemy.engine.base.Engine {'foo_id': 1, 'bar_id': 1}
2014-04-04 18:55:38,672 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
  File "sqlalchemy-delete-bug.py", line 100, in <module>
    db.session.commit()
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 149, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 765, in commit
    self.transaction.commit()
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 370, in commit
    self._prepare_impl()
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 350, in _prepare_impl
    self.session.flush()
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1903, in flush
    self._flush(objects)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2021, in _flush
    transaction.rollback(_capture_exception=True)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 57, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1985, in _flush
    flush_context.execute()
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 370, in execute
    rec.execute(self)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 479, in execute
    self.dependency_processor.process_saves(uow, states)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1083, in process_saves
    secondary_update, secondary_delete)
  File "/Users/matt/.virtualenvs/sqlalchemy-delete-bug/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1104, in _run_crud
    result.rowcount)
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'foo_bar' expected to delete 1 row(s); Only 0 were matched.

I would not be the least bit surprised if either this code or flask-sqlalchemy were just doing something horribly stupid.

Comments (8)

  1. matt chisholm reporter

    Oh also, I've tried putting __mapper_args__ = {'confirm_deleted_rows': False} on my model classes and that doesn't help... but I think it is unrelated.

  2. Mike Bayer repo owner

    I don't have flask installed. If I remove the flask junk, it works fine, even that last line where you pretty dangerously modify the APIBar without making a strong reference to it, and also with the "enable_typechecks=False" lines, though they aren't needed here:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    Base = declarative_base()
    
    from examples.versioned_history.history_meta import Versioned
    from examples.versioned_history.history_meta import versioned_session
    
    versioned_session(Session)
    
    class SubclassEquivalenceMixIn(object):
        def __eq__(self, other):
            return isinstance(self, type(other)) and self.id == other.id
    
    foo_bar = Table('foo_bar', Base.metadata,
        Column(
            'foo_id', Integer,
            ForeignKey('foo.id', ondelete='CASCADE'),
            nullable=False, primary_key=True),
        Column(
            'bar_id', Integer,
            ForeignKey('bar.id', ondelete='CASCADE'),
            nullable=False, primary_key=True),
    )
    
    
    class Foo(Versioned, Base):
        __tablename__ = 'foo'
        id = Column(Integer, primary_key=True)
    
    
    class Bar(Versioned, Base):
        __tablename__ = 'bar'
        id = Column(Integer, primary_key=True)
    
        foos = relationship(
            'Foo',
            secondary=foo_bar,
            backref=backref('bars', enable_typechecks=False),
            enable_typechecks=False,
        )
    
    class APIBar(Bar, SubclassEquivalenceMixIn):
        """A subclass used for the API"""
        pass
    
    e = create_engine("mysql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(e)
    Base.metadata.create_all(e)
    
    session = Session(e)
    
    session.add(Bar(foos=[Foo()]))
    session.commit()
    session.close()
    
    session.query(APIBar).one().foos = []
    session.commit()
    

    make it fail without the flask junk and then we can look at what's happening.

  3. Mike Bayer repo owner

    oh and I'm using history meta from the examples/ folder, not whatever you have on bitbucket :).

  4. matt chisholm reporter

    Ah, I'd forgotten that we were pulling from our own fork of history_meta. Ok, I will try to construct a minimal example without Flask-SQLAlchemy.

  5. Mike Bayer repo owner

    nevermind, here is your issue:

    1. you're using postgresql (I thought it was mysql), and the change that is triggering your exception only in 0.9.4 is c01558ae. psycopg2 now supports the correct rowcount on executemany(), so this improvement is revealing where your script is doing the wrong thing (e.g., your app has been broken for many versions, back to 0.7 at least, emitting two DELETE statements). Here's 0.8.2:

    2014-04-06 14:06:58,956 INFO sqlalchemy.engine.base.Engine DELETE FROM foo_bar WHERE foo_bar.foo_id = %(foo_id)s AND foo_bar.bar_id = %(bar_id)s
    2014-04-06 14:06:58,956 INFO sqlalchemy.engine.base.Engine {'foo_id': 1, 'bar_id': 1}
    2014-04-06 14:06:58,956 INFO sqlalchemy.engine.base.Engine DELETE FROM foo_bar WHERE foo_bar.foo_id = %(foo_id)s AND foo_bar.bar_id = %(bar_id)s
    2014-04-06 14:06:58,957 INFO sqlalchemy.engine.base.Engine {'foo_id': 1, 'bar_id': 1}
    

    2. your history_meta.py is slightly different from the one in the current examples, around line 162 you have this:

    attributes.get_history(obj, prop.key)
    

    where we currently have it as this:

     attributes.get_history(obj, prop.key, passive=attributes.PASSIVE_NO_INITIALIZE)
    

    not a big deal, but using your version, there's an unconditional load of Foo.bars and Bar.foos. that is, the foo_bar row is represented in both directions in memory.

    3. The many-to-many logic has to guard against the case where a change in a bidirectional relationship triggers both Bar.foos and Foo.bars to record that as a DELETE. It does this by putting the states for what's been processed so far into a set, and checking if the thing we're about to operate upon is in that set. In this case it's not. Which is because your script is going through some extraordinary effort to break the identity map.

    4. I wondered why your script used that terrible flag, "enable_typechecks=False". Why on earth would anyone need this flag that I probably added because polymorphic loading sucked back in 0.4 and someone's 0.3 program probably wasn't working? The script doesn't seem to need it, but I can see the general thing that caused it, which is because you are representing the same row in two different ways. This is wrong:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    Base = declarative_base()
    
    class Bar(Base):
        __tablename__ = 'bar'
        id = Column(Integer, primary_key=True)
    
    class APIBar(Bar):
        pass
    
    e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(e)
    Base.metadata.create_all(e)
    
    session = Session(e)
    session.add(Bar())
    session.commit()
    
    a1 = session.query(APIBar).one()
    a2 = session.query(Bar).one()
    assert a1 is a2
    

    Depending on how we load a Bar, we get either an APIBar or a Bar, both of which represent the same row, but as far as the Session is concerned, have nothing to do with each other. The script when the condition occurs loads a Bar as a result of Foo.bars in that history check. So now there's a Bar and an APIBar in play, both with a "foos" collection that's been emptied out.

    Within the many-to-many, it knows it has to remove a Foo from Bar.foos (checking first that there is no Bar that's been removed from the corresponding Foo.bars), and an APIBar from Foo.bars (checking first that there is no Foo that's been removed from the corresponding Bar.foos). The first check for the "no Bar that's been removed from the corresponding Foo.bars" fails to report that well, actually there's an APIBar representing the same row that has been taken care of. So you get two deletes, the second delete matches no rows, you get an error. no bug here.

  6. matt chisholm reporter

    Thanks a lot for the detailed explanation. We are already fixing our version of history_meta. However, I am a bit confused about point 4 above. This seems to indicate that it's just plain impossible to inherit from a model class (as we are doing with APIBar) and expect things to work properly. This inheritance is why we are using enable_typechecks=False... without that flag, we cannot add or remove APIBar instances from a relation that's expecting instances of Bar instead.

    In other words, is this subclassing pattern just never going to work properly? Or is there something we can do to keep using it?

  7. Mike Bayer repo owner

    This seems to indicate that it's just plain impossible to inherit from a model class (as we are doing with APIBar) and expect things to work properly.

    that's not true. You just want to ensure that either A. everytime you refer to that particular row in the database, it is always an APIBar object (use single table inheritance for this), or B. if you are going to represent the same row using different classes, you don't mix the usage of these entirely different object into the same context, e.g. expecting the APIBar you loaded on one hand with a relationship that is looking for Bar objects.

  8. Log in to comment