detect illegal usage of exported columns being passed back into generative methods especially with_only_columns; document expected usage

Issue #2319 resolved

Former user created an issue 2011-11-08

I've got a moderately complex 3-way join against a sort of EAV table.

It was working fine with sqlalchemy 0.7.2, with both sqllite and mysql.

When I deployed it to a production setting, I started getting the dreaded 'Every derived table must have its own alias' error from mysql.

It turns out that 0.7.3 produces very different SQL. I hope this is not by design, but I can't really be sure.

I isolated it in sareport.py ( https://gist.github.com/1349445 and attached); run it a la: python -m doctest sareport.py. It passes with 0.7.2 and fails with 0.7.3.

I took a quick look at the 0.7.3 release notes, but nothing jumped out at me that would explain what's going on.

sareport.py

Comments (11)

Former user Account Deleted
- attached sareport.py
SQLAlchemy 3-way join query that behaves differently in 0.7.2 and 0.7.3
- 2011-11-08T17:29:25+00:00
Mike Bayer repo owner
- changed milestone to 0.7.4
- assigned issue to
- changed component to orm
- marked as blocker
This is a slightly insane amount of code for me to break down and understand. As I'd like to save 90 minutes of intense focusing to understand the issue, any chance you can just illustrate, very simply, what a "three way join" is exactly, remove all the superfluous things here, like group bys, and_s, func.counts, with_only_columns if that isn't needed etc. ? Basically, remove elements of complexity, re-run, verify that erroneous behavior remains, repeat. This seems like a structural issue so a couple of columns and some simple selects should be sufficient, and this process will also identify where things go wrong.

A regression between 0.7.2 and 0.7.3 is also a serious issue if confirmed.
- 2011-11-08T17:39:34+00:00

Mike Bayer repo owner

for example. Here is a script which produces different results on 0.7.2 and 0.7.3:

from sqlalchemy import *

rd = Table('rd', MetaData(),
    Column('project_id', Integer),
    Column('record', Integer)
)

a = rd.alias('a')
b = rd.alias('b')

j = a.join(b, a.c.record == b.c.record).alias('c').select()

print j.with_only_columns((j.c.a_record,))

so there's some change in behavior of with_only_columns. So far it looks like 0.7.3 is doing the right thing as you are selecting from an alias.

2011-11-08T18:03:03+00:00

Mike Bayer repo owner

changed title to detect illegal usage of exported columns being passed back into generative methods
marked as critical

So there's not really a bug here, though there's a section where it's hard to detect that you're doing the wrong thing. Basically with_only_columns is against the columns you pass to the select(), not the columns exported by the select. Unfortunately there's no quick way for me to detect that, will try. You get a better query, identical in both versions, if you do it correctly:

    j = decision.join(candidate,
                      and_(candidate.c.record == decision.c.record,
                           candidate.c.project_id == decision.c.project_id)
                      ).outerjoin(dt_exp, and_(
            dt_exp.c.record == decision.c.record,
            dt_exp.c.project_id == decision.c.project_id)
                                  ).alias('cdwho')

    j2 = j.select()
    cdwho = j2.with_only_columns((j.c.cd_record.label('record'),
                                 j.c.cd_decision.label('decision'),
                                 j.c.who_userid.label('candidate'),
                                 j.c.expire_dt_exp.label('dt_exp')))

2011-11-08T18:27:20+00:00

Mike Bayer repo owner
- marked as major
- 2011-11-08T18:27:29+00:00

Mike Bayer repo owner

another interaction illustrating 0.7.3 is consistent with itself:

>>> from sqlalchemy.sql import table, column, select
>>> a = table('a', column('x'))
>>> b = table('b', column('y'))

>>> # a join
>>> j = a.join(b, a.c.x == b.c.y)
>>> print j
a JOIN b ON a.x = b.y

>>> # a select of the join
>>> j2 = j.select()
>>> print j2
SELECT a.x, b.y 
FROM a JOIN b ON a.x = b.y

>>> # a select of the select of the join
>>> print select([j2.c.x](j2.c.x))
SELECT x 
FROM (SELECT a.x AS x, b.y AS y 
FROM a JOIN b ON a.x = b.y)

>>> # a select from j2 as well as j at the same time
>>> print select([j2.c.x](j2.c.x)).select_from(j)
SELECT x 
FROM (SELECT a.x AS x, b.y AS y 
FROM a JOIN b ON a.x = b.y), a JOIN b ON a.x = b.y

>>> # equivalent operation using with_only_columns
>>> print j2.with_only_columns([j2.c.x](j2.c.x))
SELECT x 
FROM (SELECT a.x AS x, b.y AS y 
FROM a JOIN b ON a.x = b.y), a JOIN b ON a.x = b.y

>>> # with_only_columns can be used to replace the "inside",
>>> # need to send it those
>>> print j2.with_only_columns([j.c.a_x](j.c.a_x))
SELECT a.x 
FROM a JOIN b ON a.x = b.y

2011-11-08T19:10:21+00:00

Mike Bayer repo owner
- changed milestone to 0.7.xx
this one is very tough as when with_only_columns is called, it's exceedingly difficult to figure out that those are the same columns coming in.
- 2011-11-16T00:19:22+00:00
Mike Bayer repo owner
- changed title to detect illegal usage of exported columns being passed back into generative methods especially with_only_columns
- 2011-11-17T16:14:49+00:00
Mike Bayer repo owner
- changed title to detect illegal usage of exported columns being passed back into generative methods especially with_only_columns; document expected usage
rolling in ~~#2330~~ here. document how the method behaves.
- 2011-11-19T21:35:11+00:00
Mike Bayer repo owner
- changed component to sql
- changed status to resolved
- changed milestone to 0.7.4
with c0c42af4e0ef8acd651cc66e84ec636c14ab53a5 I'm satisfied with_only_columns() works as intended and a test is added. We're not going to try to detect this particular case, since it does behave consistently with the advertised usage contract now. Lots of documentation added.
- 2011-11-19T23:29:35+00:00
Mike Bayer repo owner
- removed milestone
Removing milestone: 0.7.4 (automated comment)
- 2014-02-23T18:14:54+00:00
Log in to comment

Assignee: Mike Bayer

Type: bug

Priority: major

Status: resolved

Component: sql

Milestone: –

Votes: 0

Watchers: 0