query.join() is silently ignored within query.update(), query.delete()
I noticed that _no_select_modifiers
and _no_criterion_assertion
in orm/query.py
differ only in checking self._statement
and self._from_obj
. It seems that _no_criterion_assertion
should be equivalent to self._statement is None and _no_select_modifiers
; it is surprising that _from_obj
is omitted, as this can result in a very dangerous situation when doing bulk updates, since _from_obj
is ignored entirely.
For example, someone may mistakenly use a join instead of a filter:
(session.query(File)
.join(File.folder)
.filter(Folder.owner_id == 1)
.update({'content': 'USER 1'}, synchronize_session=False))
results in UPDATE file SET content='USER 1' FROM folder WHERE folder.owner_id = 1
which will touch every file instead of just ones belong to user 1.
The suggested fix is to add: ('_from_obj', 'join()', ())
in _no_select_modifiers
, which will error, as demonstrated in attached file.
Comments (13)
-
repo owner -
repo owner - changed title to query.join() is silently ignored within query.update(), query.delete()
- changed milestone to 1.0
I've altered the title here, because that is the bug. For the moment, BulkUD will need to inspect the from_obj of the Query and such to make sure nothing is there, because at the moment it isn't being taken into account when it formulates the update object. it also probably needs to be only a warning for now.
-
reporter Hm, the only call to
_no_select_modifiers
is in the__init__
ofBulkUpdate
andBulkDelete
, whose_do_exec
method ignores_from_obj
. Why is this not the appropriate place to do the check?Also, according to https://dev.mysql.com/doc/refman/5.0/en/update.html LIMIT and ORDER BY are also valid for UPDATE's in MySQL, so I don't see why a JOIN should be handled differently.
What severity does something need to be an error? It seems too dangerous to be a warning. I ran into issues in production code where we accidentally updated every single row in a table because of this.
-
repo owner Hm, the only call to _no_select_modifiers is in the init of BulkUpdate and BulkDelete, whose _do_exec method ignores _from_obj. Why is this not the appropriate place to do the check?
because the FROM clause isn't what's considered there as a "select modifier". Truth be told, it seems that method isn't used anywhere else and I'd rather just whack it, and localize the checks to the BulkUD classes.
Also, according to https://dev.mysql.com/doc/refman/5.0/en/update.html LIMIT and ORDER BY are also valid for UPDATE's in MySQL, so I don't see why a JOIN should be handled differently.
If this were release 0.1 of the query.update() and query.delete() methods, then yes. But it's not, they've been out for a few years now. So the chance that applications are silently relying on this without issues at the moment is high. That's why we use warnings when we find things like this that have been out in the wild for a long time. We will often push those warnings to exceptions in subsequent major releases but in this case 1.0 is already at beta5 so I'd still be concerned it's a surprise for some.
What severity does something need to be an error?
this should have been an error, but it's too late at the moment to just spring it out.
It seems too dangerous to be a warning
query.update() and delete() are super-widely used and nobody has reported this issue all throughout the 0.8 or 0.9 series, and there are many many ways to write code that emits SQL that is not what's expected, even though it succeeds without raising. So it's a bug for sure but when I spring new errors on people, it really very often produces lots more headaches for everyone. For backwards compatibility on behaviors that may be silently succeeding for people, we use a warning.
I ran into issues in production code where we accidentally updated every single row in a table because of this.
I'm sorry for that, but the truth is there's lots of ways you can write an UPDATE statement that produces an unexpected syntax, not to mention encounter unexpected behaviors with the unit of work, the database, and everything else, and you should be unit and functional testing your code to ensure it works as expected. The echo=True option is extremely prominent in SQLAlchemy as well in order to encourage the developer to be very aware of how the application is acting at the level of SQL interaction. At least when putting code into production, it is reasonable to expect that the developers have observed the warnings that their code produces beforehand and I think if this warning were there, you would have seen it.
-
reporter because the FROM clause isn't what's considered there as a "select modifier". Truth be told, it seems that method isn't used anywhere else and I'd rather just whack it, and localize the checks to the BulkUD classes.
I see. Would an adequate fix then be to rename and move
_no_select_modifiers
with something similar to my correction (with a warning) intoorm/persistence.py
?this should have been an error, but it's too late at the moment to just spring it out.
Ah, that makes sense.
you should be unit and functional testing your code to ensure it works as expected
I completely agree... sadly we had forgotten to test that the update touched nothing more than the expected models. Fortunately, the data was entirely recoverable. As a sanity check, is there a nice way to configure SQLAlchemy to log a warning -- maybe even raise an error -- if an update/delete touches more than say, 10000 rows?
-
repo owner yeah you can for example make an after_cursor_execute event, and then test cursor.rowcount. emit the warning from within the event.
-
repo owner - changed status to resolved
- :class:
.Query
doesn't support joins, subselects, or special FROM clauses when using the :meth:.Query.update
or :meth:.Query.delete
methods; instead of silently ignoring these fields if methods like :meth:.Query.join
or :meth:.Query.select_from
has been called, a warning is emitted. In 1.0.0b5 this will raise an error. Partial cherry pick from 5302bcebb8e18fdad7448ffb60d2a2017eab26c8. fixes#3349 - don't realy need _no_select_modifiers anymore
→ <<cset 01a1c3256aee>>
-
repo owner - :class:
.Query
doesn't support joins, subselects, or special FROM clauses when using the :meth:.Query.update
or :meth:.Query.delete
methods; instead of silently ignoring these fields if methods like :meth:.Query.join
or :meth:.Query.select_from
has been called, an error is raised. In 0.9.10 this only emits a warning. fixes#3349 - don't needlessly call _compile_context() and build up a whole statement that we never need. Construct QueryContext as it's part of the event contract, but don't actually call upon mapper attributes; use more direct systems of determining the update or delete table.
- don't realy need _no_select_modifiers anymore
→ <<cset fe1922764151>>
- :class:
-
Hello there.
I spent some time fighting a query before Phazorx pointed me to this issue on IRC. He suggested I poke you about it because the error message might be informative enough.
Here is a brief recap of the problem and solution https://gist.github.com/EvaSDK/1f1f097300b691d80269.
-
repo owner hi Gilles -
it looks like 1.0's error "resolved" this for you, right?
-
Well it did expose hidden failure so I guess this is better :-). It is our fault for not following SQLAlchemy releases close enough hence we haven't seen the warning before. I just wanted to hint that maybe the message could more explicitely say that this is not supported in SQL and/or not yet available in drivers that could support it like postgres with WITH.
-
repo owner oh. well if you want to suggest text / send a PR. SQLAlchemy is pretty heavy on assuming SQL knowledge of the user.
-
Sure.
- Log in to comment
i doubt anything would be done at the level of Query. JOIN is valid for some forms of UPDATE such as on MySQL. The fact that update() misinterprets the join without raising a compiler error is more the issue. this would be a core issue.