column targeting failure with any() + joinedload + join to joined inh + limit

Issue #2419 resolved
Mike Bayer repo owner created an issue

another one where joinedload + limit subquerying screws up targeting

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

class A(Base):
    __tablename__ = "a"

    id = Column(Integer, primary_key=True)

# test both here
#class B(A):
class B(Base):
    __tablename__ = "b"

#    id = Column(Integer, ForeignKey('a.id'), primary_key=True)
    id = Column(Integer, primary_key=True)
    ds = relationship("D")
    es = relationship("E")

class C(A):
    __tablename__ = "c"

    id = Column(Integer, ForeignKey('a.id'), primary_key=True)
    b_id = Column(Integer, ForeignKey('b.id'))
    b = relationship("B", primaryjoin=b_id==B.id)

class D(Base):
    __tablename__ = "d"

    id = Column(Integer, primary_key=True)
    b_id = Column(Integer, ForeignKey('b.id'))

class E(Base):
    __tablename__ = 'e'
    id = Column(Integer, primary_key=True)
    b_id = Column(Integer, ForeignKey('b.id'))


e = create_engine("sqlite://", echo=True)

Base.metadata.create_all(e)

s = Session(e)

s.add_all([   C(
        b=B(ds=[D()](
))
    )

])

s.commit()

q = s.query(B, B.ds.any(D.id==1)).options(joinedload_all("es"))
q = q.join(C, C.b_id==B.id)
q = q.limit(5)
print q.all()

Comments (9)

  1. Mike Bayer reporter

    it appears I can generate the identical query and identical result without the error using outerjoin() + contains_eager:

    ea = aliased(E)
    q = s.query(B, B.ds.any(D.id==1)).options(contains_eager("es", alias=ea))
    q = q.join(C, C.b_id==B.id)
    q = q.limit(5).from_self().outerjoin(ea, B.es)
    print q.all()
    
    
    
    SELECT e_1.id AS e_1_id, e_1.b_id AS e_1_b_id, anon_1.b_id AS anon_1_b_id, anon_1.anon_2 AS anon_1_anon_2 
    FROM (SELECT b.id AS b_id, EXISTS (SELECT 1 
    FROM d 
    WHERE b.id = d.b_id AND d.id = ?) AS anon_2 
    FROM b JOIN (SELECT a.id AS a_id, c.id AS c_id, c.b_id AS c_b_id 
    FROM a JOIN c ON a.id = c.id) AS anon_3 ON anon_3.c_b_id = b.id
     LIMIT ? OFFSET ?) AS anon_1 LEFT OUTER JOIN e AS e_1 ON anon_1.b_id = e_1.b_id
    

    vs.

    SELECT anon_1.b_id AS anon_1_b_id, anon_1.anon_2 AS anon_1_anon_2, e_1.id AS e_1_id, e_1.b_id AS e_1_b_id 
    FROM (SELECT b.id AS b_id, EXISTS (SELECT 1 
    FROM d 
    WHERE b.id = d.b_id AND d.id = ?) AS anon_2 
    FROM b JOIN (SELECT a.id AS a_id, c.id AS c_id, c.b_id AS c_b_id 
    FROM a JOIN c ON a.id = c.id) AS anon_3 ON anon_3.c_b_id = b.id
     LIMIT ? OFFSET ?) AS anon_1 LEFT OUTER JOIN e AS e_1 ON anon_1.b_id = e_1.b_id
    
  2. Mike Bayer reporter

    OK next trick, "anon_2.anon_1" isn't the name it's looking for. It's got anonymous names in there but when the exception is raised it calls str() on it, so where the anon_X goes is based on that.

  3. Mike Bayer reporter

    so if we take the problem to be, that Query should be aliasing correctly and not relying on key_fallback, we can disable key_fallback (maybe that should be an execution option ?) and then make the failure very simple:

    # fails
    q = s.query(B).from_self(B.ds.any(D.id==1))
    
    # works
    q = s.query(B).from_self(B.id)
    

    pdbing into _ColumnEntity.row_processor() leads to very complex situations with labels coming up that aren't in result_map, not yet clear exactly what that's about.

  4. Mike Bayer reporter

    note that changing "C" to not be joined inheritance resolves. It seems like a corresponding column issue due to the fact that the join to "C" subquery runs the whole select through an adaption. if you turn off require_embedded or use adapt_on_names in the adapter used in _ColumnEntity.row_processor, then it matches. both are not options as they each cause failures in some of those very obscure name collision tests.

  5. Mike Bayer reporter

    patch:

    diff -r fd9dd2280003707c97cf9e928ac474d2d8f9c20d lib/sqlalchemy/sql/expression.py
    --- a/lib/sqlalchemy/sql/expression.py  Mon Feb 27 11:14:46 2012 -0500
    +++ b/lib/sqlalchemy/sql/expression.py  Tue Feb 28 14:22:09 2012 -0500
    @@ -2466,6 +2466,12 @@
    
             """
    
    +        def embedded(proxy_set, target_set):
    +            for t in target_set:
    +                if not set(_expand_cloned([t](t))).intersection(_expand_cloned(proxy_set)):
    +                    return False
    +            return True
    +
             # dont dig around if the column is locally present
             if self.c.contains_column(column):
                 return column
    @@ -2476,7 +2482,7 @@
                 i = target_set.intersection(itertools.chain(*[                    for p in c.proxy_set](p._cloned_set
    )))
                 if i and (not require_embedded
    -                      or c.proxy_set.issuperset(target_set)):
    +                      or embedded(c.proxy_set, target_set)):
                     if col is None:
    
                         # no corresponding column yet, pick this one.
    

    this liberalizes the "embedded" test by considering any clone of each item in target_set to count as "contained" within c.proxy_set. basically, column elements are cloned out on three axes (!), cloned, proxy, and annotated. annotated is taken care of via __hash__(). corresponding_column needs specificity regarding proxies. But clones should also be considered equivalent (right?). need to test this out in sql.test_generative. everything passes though.

  6. Mike Bayer reporter

    here's a refinement:

    diff -r fd9dd2280003707c97cf9e928ac474d2d8f9c20d lib/sqlalchemy/sql/expression.py
    --- a/lib/sqlalchemy/sql/expression.py  Mon Feb 27 11:14:46 2012 -0500
    +++ b/lib/sqlalchemy/sql/expression.py  Tue Feb 28 14:35:30 2012 -0500
    @@ -2466,6 +2466,13 @@
    
             """
    
    +        def embedded(expanded_proxy_set, target_set):
    +            for t in target_set.difference(expanded_proxy_set):
    +                if not set(_expand_cloned([t](t))
    +                            ).intersection(expanded_proxy_set):
    +                    return False
    +            return True
    +
             # dont dig around if the column is locally present
             if self.c.contains_column(column):
                 return column
    @@ -2473,10 +2480,10 @@
             target_set = column.proxy_set
             cols = self.c
             for c in cols:
    -            i = target_set.intersection(itertools.chain(*[                   for p in c.proxy_set](p._cloned_set
    -)))
    +            expanded_proxy_set = set(_expand_cloned(c.proxy_set))
    +            i = target_set.intersection(expanded_proxy_set)
                 if i and (not require_embedded
    -                      or c.proxy_set.issuperset(target_set)):
    +                      or embedded(expanded_proxy_set, target_set)):
                     if col is None:
    
                         # no corresponding column yet, pick this one.
    
  7. Log in to comment