validation for primary target of UPDATE, INSERT, DELETE - must be a Table or compatible, add update_from()

Issue #3645 new
immerrr again created an issue

It is probably unexpected that it actually works, but MySQL can do that (and StackOverflow suggests that MS SQL can, too). Here's a script to reproduce:

import sqlalchemy as sa

metadata = sa.MetaData()


t1 = sa.Table(
    't1', metadata,
    sa.Column('key1', sa.Integer),
    sa.Column('ref1', sa.Integer),
    sa.Column('val1', sa.Integer),
    sa.ForeignKeyConstraint(['ref1'], ['t2.key2']),
)


t2 = sa.Table(
    't2', metadata,
    sa.Column('key2', sa.Integer),
    sa.Column('val2', sa.Integer),
)

engine = sa.create_engine('mysql+mysqlconnector://')


def to_str(stmt):
    return str(stmt.compile(bind=engine,
                            compile_kwargs={'literal_binds': True}))

In the end I see

>>> to_str(
...     sa.update(t1.join(t2),
...               values={t1.c.val1: 'foo'},
...               whereclause=(t2.c.val2 == 'foobar')))
...
'UPDATE t1 INNER JOIN t2 ON t2.key2 = t1.ref1, t2 SET t1.val1=%(val1)s WHERE t2.val2 = %(val2_1)s'

Note, that there's a second t2 table in the table_references part of the query.

This patch seems to have fixed my case, but I can't seem to run tests out of the box, so I don't know if it breaks* anything:

index 7b506f9..fe9f20b 100644
--- a/lib/sqlalchemy/sql/dml.py
+++ b/lib/sqlalchemy/sql/dml.py
@@ -766,7 +766,7 @@ class Update(ValuesBase):
         # TODO: this could be made memoized
         # if the memoization is reset on each generative call.
         froms = []
-        seen = set([self.table])
+        seen = set(_from_objects(self.table))

         if self._whereclause is not None:
             for item in _from_objects(self._whereclause):

Comments (13)

  1. immerrr again reporter

    Another question is if it is ok that the compiled statement does not include the literals? I see that sometimes on other queries, too, e.g. OFFSET/LIMIT parameters:

    >>> to_str(t1.select().where(t1.c.key1 == 'foobar').offset(10).limit(5))
    "SELECT t1.key1, t1.ref1, t1.val1 \nFROM t1 \nWHERE t1.key1 = 'foobar' \n LIMIT %(param_1)s, %(param_2)s"
    
  2. immerrr again reporter

    Actually, found a way in which my patch breaks things: table prefix for the updated column is lost, i.e. SET t1.val1 = becomes SET val1, because mysqlconnector compiler checks stmt._extra_froms to see if the update is multi-table and after the patch that property returns an empty list.

  3. Mike Bayer repo owner

    this just came up on the list regarding aliased(), I think if you send anything other than a Table to update(), delete() or insert() as the primary target it should probably raise an error.

  4. immerrr again reporter

    Well, that's a bit disappointing, but explainable. Given that quite a few DBs support joins in one form or the other, what would be the easiest way to support those "custom" update expressions (Update class seems quite heavy, could you roughly estimate how much has to be "overridden" or copy-pasted)?

  5. immerrr again reporter

    I know what multi-table update is. I'm trying to say that some DBs actually allow using foo JOIN bar ON foo.x = bar.x as a table to update, like MySQL or MS SQL. In the majority of situations multi-table request is equivalent, including my particular scenario, but right now from the top of my head I cannot come up with a way to emulate LEFT JOIN with a multi-table query that does not involve sub-queries.

  6. Mike Bayer repo owner

    to my knowledge, SQL Server does not allow the UPDATE to actually update more than one table, see http://stackoverflow.com/a/15116036/34549. So while lots of DBs support an UPDATE...FROM style of syntax like SQL Server / Postgresql, only MySQL has the wacky twist of actually updating two tables at once rather than referring to multiple tables in a FROM / WHERE clause..

    In this case, the "self.table" variable of the UPDATE statement is assumed all over the place to be a Table and I don't think it's appropriate to change that. Instead, we're looking for a way for the FROM to be a JOIN, which applies to PG and SQL Server also. If for example we wanted to support this: http://stackoverflow.com/a/23556885/34549 - "a" is still the "target" table, there's just other things happening in the FROM clause that cause a JOIN to occur.

    So perhaps the feature add of .update_from() to Update would be the way to go here, providing way of explicitly controlling UPDATE..FROM, and on MySQL would just not use the FROM keyword.

  7. immerrr again reporter

    Ah, yes, sorry, I skimmed through stackoverflow when investigating the issue and admittedly didn't pay much attention to the description, being satisfied with joins I saw in the example queries. update_from seems a nice idea!

  8. Log in to comment