post_update does not honor version_id_col

Issue #3496 resolved
Jon Rosebaugh created an issue

I'm looking at using the ORM's version counting system, but I've found a corner case that makes it difficult.

In the below example, I make a change to node1 to set its related node to node2 (previously None). Conceptually, this is a change to node1, and node1 shows up in session.dirty (node2 does not); however, due to the way the foreign key is implemented, SQLAlchemy needs to UPDATE the foreign key on node2's row.

This means that the updated_at column gets updated for node2, not node1. However, strangely, the revision counter doesn't increment for either object; the generated SQL is "UPDATE nodes SET related_node_id=1, updated_at=CURRENT_TIMESTAMP WHERE nodes.id = 2"

I think this may be a bug. The unwanted behavior does not occur if I also mutate the bonus_data column on node1. (However, updated_at still gets updated on node2, which is technically not accurate but is acceptable for my use case.)

import logging
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship

logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

engine = sa.create_engine('sqlite:///:memory:')
Base = declarative_base()


class Node(Base):
    __tablename__ = 'nodes'

    id = sa.Column(sa.Integer, primary_key=True)
    related_node_id = sa.Column(sa.Integer, sa.ForeignKey('nodes.id'))
    updated_at = sa.Column(sa.DateTime, default=sa.func.now(), onupdate=sa.func.now())
    revision = sa.Column(sa.Integer)
    bonus_data = sa.Column(sa.Unicode(20))

    related_node = relationship('Node', uselist=False, post_update=True)

    __mapper_args__ = {
        "version_id_col": revision
    }


Base.metadata.create_all(engine)

session_factory = sessionmaker(bind=engine)

session = session_factory()
node1 = Node()
node2 = Node()
session.add(node1)
session.add(node2)
session.flush()
node1_id = node1.id
node2_id = node2.id
session.commit()

node1 = session.query(Node).get(node1_id)
node2 = session.query(Node).get(node2_id)
assert node1.revision == 1
assert node2.revision == 1
node1.bonus_data = u'hello'
node2.bonus_data = u'world'
session.commit()

node1 = session.query(Node).get(node1_id)
node2 = session.query(Node).get(node2_id)
assert node1.revision == 2
assert node2.revision == 2
assert node1.updated_at == node2.updated_at

node1.related_node = node2
assert node1 in session.dirty
assert session.is_modified(node1)
assert node2 not in session.dirty
assert not session.is_modified(node2)
# node1.bonus_data = u"goodbye"
session.flush()
assert node1.revision == 3
assert node2.revision == 2
session.commit()

Comments (20)

  1. Jon Rosebaugh reporter

    I tried applying that patch and it seemed to have no effect; still no UPDATE for node1's row.

  2. Mike Bayer repo owner

    a plain many-to-one update is likely not going to impact version_id for the parent because no changes to that row are required. If you want that effect, I'd use an attribute event handler that bumps the parent manually.

  3. Jon Rosebaugh reporter

    Hm, my conception was that an object appearing in session.dirty would have revision incremented.

    Anyway, it does look like I can get the revision guard on the update when I manually increment revision. This just seems potentially error-prone.

  4. Mike Bayer repo owner

    OK. the purpose of version_id_col is to track changes to a row that may be concurrent elsewhere. There is a bug here, but given the following test:

    session_factory = sessionmaker(bind=engine)
    
    session = session_factory()
    node1 = Node()
    node2 = Node()
    session.add(node1)
    session.add(node2)
    session.commit()
    
    print "---------------"
    node1.related_node = node2
    session.commit()
    

    Above, the UPDATE is emitted against node2's row. that is the only row that must receive an UPDATE and that is where version_id_col should be emitted. The bug is basically that post_update does not take version_id_col into account. If we remove the post_update, then the behavior is correct:

    UPDATE nodes SET related_node_id=?, updated_at=CURRENT_TIMESTAMP, revision=? WHERE nodes.id = ? AND nodes.revision = ?
    INFO:sqlalchemy.engine.base.Engine:(1, 2, 2, 1)
    

    it seems you're asking for the opposite here, and that's not what version_id_col is used for.

  5. Jon Rosebaugh reporter

    It's certainly odd that neither row got a revision bump in my test, but since the revision system operates at the ORM level, and node1 is the one that was dirty (even though flushing it actually resulted in an update to node2's row), I would have expected node1 to get the revision bump.

  6. Mike Bayer repo owner

    the versioning system is intended to detect concurrent modifications to a target row by another transaction. It is part of the ORM but is doing something to ensure correctness at the SQL level, not the "business case versioning" level, which is something you should maintain on your own.

  7. Jon Rosebaugh reporter

    We're now doing that, by setting version_id_generator = False and manually incrementing revision whenever we want. We still use version_id_col for the addition to the WHERE clause on updates (unless there's another known way to get that behavior).

  8. Mike Bayer repo owner

    I think that's fine. as far as the use case of version_id-and-nothing-else bump on session.dirty, I think a simple before_flush() handler can get you that no ? it just means that even if the object has no net change whatsoever, you still get the version bump.

  9. Mike Bayer repo owner
    • changed milestone to 1.2

    starting to wrap up scope for 1.1, remaining behavioral changes being pushed out

  10. Mike Bayer repo owner

    coming back, OK, that's a lot of code and a lot of repetition. Why do we need to emit post_update when this is only an UPDATE? This should be a dupe of #1063 which would solve the whole problem.

  11. Mike Bayer repo owner

    Add all versioning logic to _post_update()

    An UPDATE emitted as a result of the :paramref:.relationship.post_update feature will now integrate with the versioning feature to both bump the version id of the row as well as assert that the existing version number was matched.

    Fixes: #3496 Change-Id: I865405dd6069f1c1e3b0d27a4980e9374e059f97

    → <<cset 64b0760faa45>>

  12. Log in to comment