add an "opt in" flag to query.update() regarding when updating on polymorphic FROM?
Consider the following setup,
class User(Base):
__tablename__ = 'users'
user_id = Column(Integer, primary_key=True)
name = Column(String)
role = Column('role', Enum('user', 'super'))
__mapper_args__ = {'polymorphic_on': role, 'polymorphic_identity': 'user'}
class SuperUser(User):
__tablename__ = 'superusers'
__mapper_args__ = {'polymorphic_identity': "super"}
super_id = Column(Integer, primary_key=True)
users_user_id = Column(Integer, ForeignKey('users.user_id'))
email = Column('email', String)
If you try to perform an update on a query using the base class's primary key like so,
session.query(SuperUser).filter_by(user_id=some_id).update(
{'email': 'foo@example.com'}
)
The system will generate the following UPDATE
command.
UPDATE superusers SET email=? FROM users WHERE users.user_id = ?
This is trying to join superusers
with users
without any join criteria. That's obviously wrong.
To make it work, you have to either use the foreign key column name or fetch the original object and update on that.
session.query(SuperUser).filter_by(users_user_id=some_id).update(
{'email': 'foo@example.com'}
)
session.commit()
# or
su = session.query(SuperUser).get(some_id)
su.email = 'foo@example.com'
session.add(su)
session.commit()
This generates the correct command:
UPDATE superusers SET email=? WHERE superusers.super_id = ?
I have tested it with SQLAlchemy 0.8.2 against Postgres and SQLite, and SQLAlchemy 0.9.8 against SQLite.
In case of SQLite, the command immediately fails because it doesn't support JOIN
s in UPDATE
s. Here's the full trace:
Traceback (most recent call last):
File "bug.py", line 53, in <module>
{'email': 'foo@example.com'}
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2796, in update
update_op.exec_()
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 897, in exec_
self._do_exec()
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 995, in _do_exec
update_stmt, params=self.query._params)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 991, in execute
bind, close_with_result=True).execute(clause, params or {})
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 729, in execute
return meth(self, multiparams, params)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
compiled_sql, distilled_params
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 958, in _execute_context
context)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _handle_dbapi_exception
exc_info
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 951, in _execute_context
context)
File "$HOME/dev/py/sqlalchemy-bug-repro/env/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 436, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (OperationalError) near "FROM": syntax error u'UPDATE superusers SET email=? FROM users WHERE users.user_id = ?' ('foo@example.com', 2)
In case of Postgres, it ends up joining the tables and possibly updating everything (I haven't had a chance to confirm the last part).
Sample code to reproduce this attached to the report.
Comments (14)
-
reporter -
repo owner this behavior is expected and is documented, it is one of the key caveats listed in the documentation for query.update(), at http://docs.sqlalchemy.org/en/rel_0_9/orm/query.html?highlight=query.update#sqlalchemy.orm.query.Query.update.
The Query.update() method provides an entryway to more direct SQL, e.g. the UPDATE statement, and in that regard it is an optimization compared to the regular unit of work use (that is, session.add() + commit()).. However, in standard SQL, an UPDATE statement only involves one table; the various multi-table forms everyone takes for granted now are vendor specific extensions and not part of SQL; therefore at a basic level, applying an UPDATE to a row that spans multiple tables doesn't make much sense. So you have to specify this how you want it to be.
-
repo owner - changed status to wontfix
this is a documented limitation of the query.update() method.
-
reporter On backends that do support joins in updates, the generated statement can do some serious damage because no join criteria is specified. Would it be reasonable to throw an exception instead of executing the statement?
I just confirmed against Postgres that the generated statement
UPDATE superusers SET email=? FROM users WHERE users.user_id = ?
will indeed change the email address for all rows in thesuperusers
table.$ SELECT * FROM users; user_id | name | role ---------+-------------+------- 1 | Bruce Wayne | user 2 | Clark Kent | super (2 rows) $ SELECT * FROM superusers; super_id | users_user_id | email ----------+---------------+-------------------- 1 | 2 | superman@yahoo.com 2 | 1 | bruce@wayne.com (2 rows) $ SELECT name, role, email FROM users JOIN superusers ON superusers.users_user_id = users.user_id; name | role | email -------------+-------+-------------------- Clark Kent | super | superman@yahoo.com Bruce Wayne | user | bruce@wayne.com (2 rows) $ UPDATE superusers SET email = 'totally.not.superman@gmail.com' FROM users WHERE users.user_id = 2; $ SELECT name, role, email FROM users JOIN superusers ON superusers.users_user_id = users.user_id; name | role | email -------------+-------+-------------------------------- Clark Kent | super | totally.not.superman@gmail.com Bruce Wayne | user | totally.not.superman@gmail.com (2 rows)
-
reporter - changed status to open
-
repo owner it's not really feasible. You could just as easily be emitting an UPDATE that has a WHERE clause that joins the tables correctly (in fact this is more common). It would be complicated and error prone to search within the whole statement and try to guess if semantically it does a particular thing SQL-wise, all just so that a warning may be emitted. Consider that this really means SQLAlchemy has to anticipate the behavior of a SQL statement on the Python side without sending it to the database. That means we're beginning to build pieces of the database itself on our end.
the alternative is that we simply throw an exception when any polymorphic class is passed to update(), but that is a heavy handed approach to an operation that merely has caveats. This is how it was before, and then of course some folks actually wanted to be able to use query.update() in the way that it is documented now.
The question is, how exactly did you come across knowing about query.update() without seeing the caveats listed in the documentation? This particular issue is even written in boldface. SQLAlchemy of course likes to warn for detectable issues where the user might be going wrong but there are limits to that. SQL queries should always be tested.
-
reporter Fair enough. I wasn't sure how technically feasible this would be. I agree that disallowing polymorphic classes completely is probably a bad idea.
I came across the method in a response to someone's StackOverflow question. Its usage seemed straightforward enough that I did not think to check for caveats.
I'll switch to using the workaround. Thanks.
-
reporter - changed status to wontfix
-
repo owner nick coghlan agrees with one of my proposals: https://twitter.com/ncoghlan_dev/status/534451843724615680?cn=cmVwbHk%3D that is an "opt in" flag which when not set, emits a warning when a polymorphic class is the target. im loathe to add more small things to 1.0 but there you go...
-
repo owner - changed title to add an "opt in" flag to query.update() regarding when updating on polymorphic FROM?
- marked as proposal
- changed milestone to 1.0
-
repo owner - changed status to open
look into just adding a flag
-
reporter That sounds like a nice solution. Thanks!
-
repo owner -
assigned issue to
-
assigned issue to
-
repo owner - changed status to resolved
I'd rather not box update() in to this limitation by making it part of the API. I've improved the documentation and added
#3271for the potential feature of trying to scan the UPDATE tables to determine if the criteria can be added. - Log in to comment