- attached sqlalchemy-correlate-asfrom.diff
allow correlation of FROM elements to any SELECT which encloses via WHERE, columns, ORDER BY (but not FROM list)
This is related to the mailing list post: https://groups.google.com/forum/?fromgroups#!topic/sqlalchemy/Ld0iuCUphHU
It seems like #2668 has universally disabled correlation for subquery inside a FROM clause. The situations where such expression is needed could be quite rare, but it is sometimes necessary to do so, especially for lesser databases such as SQL Server. An example would be this:
Another example would be a subquery that returns a couple of computed values, with an outer query consists entirely of a CASE statement in its select list that makes use of the values.
(Note that these examples are used in the context of hybrid / column property)
I think it would be good to add a flag to select()
to enforce the correlation when necessary.
Comments (13)
-
reporter -
repo owner - marked as major
- marked as enhancement
- changed milestone to 0.9.0
well the funny thing here is that we've had a long-standing comment in the compiler, regarding the fact that the list of "Existing froms" we pass in is only from the immediate enclosing SELECT, not the SELECT that's enclosing that one. I think at least we know we can take that out, since that only applies to auto-correlation and I think we can agree auto-correlation out of the FROM clause is not necessary. This is the case where you say, "omit these elements from the FROM clause, unconditionally".
I'd rather make this specific to each FROM for the "correlate" case. See the attached patch.
t1 = table('t1', column('a')) t2 = table('t2', column('a'), column('b')) s = select( [t2.c.b](t2.c.b), t1.c.a == t2.c.a, ).correlate_except(t2, froms=True).alias('s') s2 = select([func.foo(s.c.b)](func.foo(s.c.b))).as_scalar() s3 = select([t1](t1), order_by=s2) print s3
-
repo owner dont know what the flag should be named though. maybe "within_from_ok".
-
reporter Thanks, patch works great. "within_from_ok" sounds about right to me (disclaimer: I suck at naming things).
-
repo owner See also
#2668which we'll close as a dupe here, we need to consider propagating the correlation list recursively:--- a/lib/sqlalchemy/sql/compiler.py +++ b/lib/sqlalchemy/sql/compiler.py @@ -1241,6 +1241,8 @@ class SQLCompiler(engine.Compiled): froms = select._get_display_froms(existingfroms, asfrom=asfrom) correlate_froms = set(sql._from_objects(*froms)) + if existingfroms: + correlate_froms = correlate_froms.union(existingfroms) self.stack.append({'from': correlate_froms,
here's the test case for that:
from sqlalchemy.sql import table, column, select, exists p = table('parent', column('id')) c = table('child', column('id'), column('parent_id'), column('pos')) s = c.select().where(c.c.parent_id == p.c.id).order_by(c.c.pos).limit(1) s = s.correlate(p, froms=True) s = exists().select_from(s).where(s.c.id == 1) s = select([p](p)).where(s) print s
we want:
SELECT parent.id FROM parent WHERE EXISTS (SELECT * FROM (SELECT child.id AS id, child.parent_id AS parent_id, child.pos AS pos FROM child WHERE child.parent_id = parent.id ORDER BY child.pos LIMIT :param_1) WHERE id = :id_1)
and not:
SELECT parent.id FROM parent WHERE EXISTS (SELECT * FROM (SELECT child.id AS id, child.parent_id AS parent_id, child.pos AS pos FROM child, parent WHERE child.parent_id = parent.id ORDER BY child.pos LIMIT :param_1) WHERE id = :id_1)
I'm a little concerned about propagating the FROMs all the way in like that but no tests fail, and I'm getting the impression that when you have "SELECT ( SELECT ( SELECT ) ) ", those selects are typically going to be nested in the FROM clause and this new "froms" flag is a nice way of ensuring it doesn't happen by default.
updated patch is attached and i think we have to make this 0.8 also since people from 0.7 are getting hit by it.
-
repo owner from that change we also can fix correlate_except, which also doesn't do the right thing here:
from sqlalchemy.sql import table, column, select, exists t1 = table('t1', column('x')) t2 = table('t2', column('y')) t3 = table('t3', column('z')) s = select([t1](t1)).where(t1.c.x == t2.c.y).where(t2.c.y == t3.c.z).correlate_except(t1) print s
we want:
SELECT t1.x FROM t1, t2, t3 WHERE t1.x = t2.y AND t2.y = t3.z
and not
SELECT t1.x FROM t1 WHERE t1.x = t2.y AND t2.y = t3.z
a similar pattern to the latter is called "backwards correlated" in test_compiler...but its wrong, and I'd like to be able to add other FROM clauses to an
any()
orhas()
.This is getting pretty changy. I don't think there's a real-world use for "backwards correlated" and correlate_except() is fairly new so hopefully this is safe for 0.8 still.
-
repo owner this is in the ticket_2746 branch for now and also in the attached patch.
-
repo owner - marked as bug
-
repo owner - changed title to allow correlation of FROM elements to any SELECT which encloses via WHERE, columns, ORDER BY (but not FROM list)
good news. We don't need a flag. the latest patch has the compiler figure out what the legal "correlations" should be, so all these scenarios just work.
this is a pretty big change but 0.8's current behavior is pretty broken.
-
reporter Just tested 1b6b2fb8d2ea and everything just works indeed. Nice.
-
repo owner I'm going to try taking it one step further, and make it so that if you dont actually use correlate()/correlate_except(), the default auto-correlation will work the old way, i.e. just for the immediate enclosing SELECT via WHERE/etc. This is to minimize the chance of someone's query behaving differently since I want this in 0.8 - it will act very close to the 0.7 behavior.
-
repo owner - changed status to resolved
f76cae4bc92da640c1 master b4697e9e187571b06 0.8
-
repo owner - removed milestone
Removing milestone: 0.8.xx (automated comment)
- Log in to comment
A diff to add flag
correlate_asfrom
(default to False) toselect()