Use column expressions with session.bulk_update_mappings, session.bulk_insert_mappings

Issue #3568 new
Fabio Sussetto created an issue
mappings = [
   {'id': 1, 'balance': User.balance + 10},
   {'id': 2, 'balance': User.balance + 20}
]
db.session.bulk_update_mappings(User, mappings)

On 1.0.9 this fails with the following error:

can't adapt type 'BinaryExpression' [SQL: 'UPDATE "user" SET balance=%(balance)s WHERE "user".id = %(user_id)s'] [parameters: ({'user_id': 1, 'balance': <sqlalchemy.sql.elements.BinaryExpression object at 0x10d153be0>}

Is this supported?

Thanks!

Comments (6)

  1. Mike Bayer repo owner

    the purpose of bulk_update_mappings is to get as much raw Python speed as possible while still having some degree of ORM features. This particular feature would require scanning through every value in every dictionary given and calling isinstance(value, ClauseElement) on it, which would really place a burden on that, unless we made some kind of rule that we only scan the first dictionary and assume the rest have that set up. If a subsequent dictionary doesn't have an expression there, it looks like it might send it out normally anyway just not in bulk.

    But you'd still get an error if you had a SQL expression in the middle of the list.

    Here's a patch

    diff --git a/lib/sqlalchemy/orm/persistence.py b/lib/sqlalchemy/orm/persistence.py
    index 1f9d0a2..0f9c7c9 100644
    --- a/lib/sqlalchemy/orm/persistence.py
    +++ b/lib/sqlalchemy/orm/persistence.py
    @@ -429,6 +429,9 @@ def _collect_update_commands(
    
         """
    
    +    if bulk:
    +        bulk_value_keys = None
    +
         for state, state_dict, mapper, connection, \
                 update_version_id in states_to_update:
    
    @@ -442,12 +445,27 @@ def _collect_update_commands(
             propkey_to_col = mapper._propkey_to_col[table]
    
             if bulk:
    +            if bulk_value_keys is None:
    +                bulk_value_keys = set([
    +                    key for key in state_dict
    +                    if isinstance(state_dict[key], sql.ClauseElement)
    +                ])
    +
                 params = dict(
                     (propkey_to_col[propkey].key, state_dict[propkey])
                     for propkey in
                     set(propkey_to_col).intersection(state_dict).difference(
    -                    mapper._pk_keys_by_table[table])
    +                    mapper._pk_keys_by_table[table]).
    +                difference(bulk_value_keys)
                 )
    +            if bulk_value_keys:
    +                value_params.update(
    +                    (propkey_to_col[propkey].key, state_dict[propkey])
    +                    for propkey in
    +                    set(propkey_to_col).intersection(state_dict).difference(
    +                        mapper._pk_keys_by_table[table]).
    +                    intersection(bulk_value_keys)
    +                )
             else:
                 params = {}
                 for propkey in set(propkey_to_col).intersection(
    

    with a test script as follows:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    
    def eq_(x, y):
        assert x == y
    
    Base = declarative_base()
    
    
    class A(Base):
        __tablename__ = 'a'
        id = Column(Integer, primary_key=True)
        x = Column(Integer)
        y = Column(Integer)
    
    e = create_engine("sqlite://", echo=True)
    Base.metadata.create_all(e)
    
    s = Session(e)
    s.add_all([A(id=1, x=1, y=1), A(id=2, x=1, y=1), A(id=3, x=1, y=1)])
    s.commit()
    
    s.bulk_update_mappings(
        A,
        [
            {"id": 1, "x": 2, "y": A.y + 1},
            {"id": 2, "x": 2, "y": A.y + 1},
            {"id": 3, "x": 2, "y": A.y + 1},
        ]
    )
    
    
    eq_(
        s.query(A.id, A.x, A.y).order_by(A.id).all(),
        [(1, 2, 2), (2, 2, 2), (3, 2, 2)]
    )
    
    
    s.bulk_update_mappings(
        A,
        [
            {"id": 1, "x": 3, "y": A.y + 1},
            {"id": 2, "x": 3, "y": 10},
            {"id": 3, "x": 3, "y": 11},
        ]
    )
    
    eq_(
        s.query(A.id, A.x, A.y).order_by(A.id).all(),
        [(1, 3, 3), (2, 3, 10), (3, 3, 11)]
    )
    
    
    s.bulk_update_mappings(
        A,
        [
            {"id": 1, "x": 3, "y": 12},
            {"id": 2, "x": 3, "y": 13},
            {"id": 3, "x": 3, "y": 14},
        ]
    )
    

    we see three sets of updates. The first shows the feature working:

    015-10-27 13:25:50,442 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=(a.y + ?) WHERE a.id = ?
    2015-10-27 13:25:50,443 INFO sqlalchemy.engine.base.Engine (2, 1, 1)
    2015-10-27 13:25:50,443 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=(a.y + ?) WHERE a.id = ?
    2015-10-27 13:25:50,443 INFO sqlalchemy.engine.base.Engine (2, 1, 2)
    2015-10-27 13:25:50,443 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=(a.y + ?) WHERE a.id = ?
    2015-10-27 13:25:50,443 INFO sqlalchemy.engine.base.Engine (2, 1, 3)
    

    the second shows that if subsequent params aren't expressions, it still "works":

    2015-10-27 13:25:50,445 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=(a.y + ?) WHERE a.id = ?
    2015-10-27 13:25:50,445 INFO sqlalchemy.engine.base.Engine (3, 1, 1)
    2015-10-27 13:25:50,446 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=? WHERE a.id = ?
    2015-10-27 13:25:50,446 INFO sqlalchemy.engine.base.Engine (3, 10, 2)
    2015-10-27 13:25:50,446 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=? WHERE a.id = ?
    2015-10-27 13:25:50,446 INFO sqlalchemy.engine.base.Engine (3, 11, 3)
    

    for contrast, a pure mass update looks like:

    2015-10-27 13:25:50,447 INFO sqlalchemy.engine.base.Engine UPDATE a SET x=?, y=? WHERE a.id = ?
    2015-10-27 13:25:50,447 INFO sqlalchemy.engine.base.Engine ((3, 12, 1), (3, 13, 2), (3, 14, 3))
    

    I guess though that bulk_update_mappings is still saving you the overhead of building/loading whole objects and all that so allowing SQL expressions may be worth it.

    But then we also have to support them for inserts.

    there's no plan to get into this anytime soon unless folks want to work on pull requests (with tests!!!)

  2. Mike Bayer repo owner

    this was never intended as a feature, will consider pull requests though would have to evaluate the performance impact (it is probably fine).

  3. Log in to comment