repair column collisions with anonymous labels when apply_labels() is used

Issue #2702 resolved
Mike Bayer repo owner created an issue

I seem to recall having to deal with this issue a long time ago. somehow it raises no error in the 0.7 series. I have a feeling it still doesn't "work" in 0.7 since we can't target those two columns separately. but overall im surprised this was never dealt with, or if something has regressed.

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Food(Base):
    __tablename__ = "food"

    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey('food_category.id'))

class FoodCategory(Base):
    __tablename__ = "food_category"

    id = Column(Integer, primary_key=True)

e = create_engine("sqlite://", echo=True)

Base.metadata.create_all(e)

s = Session(e)
s.add_all([       FoodCategory(id=1),
        Food(id=1, category_id=1)
    ](
))
s.commit()

#fa = aliased(FoodCategory)
fa = FoodCategory
print s.query(Food, fa).filter(Food.category_id == fa.id).all()

Comments (8)

  1. Mike Bayer reporter

    This is totally a regression, "contains" is not working correctly:

    from sqlalchemy import *
    m = MetaData()
    
    t1 = Table('t', m, Column('x_id', Integer))
    t2 = Table('t_x', m, Column('id', Integer))
    
    e = create_engine("sqlite://", echo=True)
    m.create_all(e)
    e.execute(t1.insert(), {"x_id": 1})
    e.execute(t2.insert(), {"id": 2})
    
    row = e.execute(select([t2](t1,)).apply_labels()).first()
    
    print t1.c.x_id in row
    print t2.c.id in row
    print row[t1.c.x_id](t1.c.x_id)
    print row[t2.c.id](t2.c.id)
    

    0.8:

    True
    True
    Traceback (most recent call last):
      File "test.py", line 16, in <module>
        print row[t1.c.x_id](t1.c.x_id)
    sqlalchemy.exc.InvalidRequestError: Ambiguous column name 't.x_id' in result set! try 'use_labels' option on select statement.
    

    0.7:

    False
    True
    Traceback (most recent call last):
      File "../sqlalchemy/test.py", line 16, in <module>
        print row[t1.c.x_id](t1.c.x_id)
      File "/Users/classic/dev/sa07/lib/sqlalchemy/engine/base.py", line 2834, in _key_fallback
        expression._string_or_unprintable(key))
    sqlalchemy.exc.NoSuchColumnError: "Could not locate column in row for column 't.x_id'"
    
  2. Mike Bayer reporter

    OK the difference is that we improved "ambiguous column" targeting by delivering a more complete "result_map", in 0.8:

        def _add_to_result_map(self, keyname, name, objects, type_):
            if not self.dialect.case_sensitive:
                keyname = keyname.lower()
    
            if keyname in self.result_map:
                # conflicting keyname, just double up the list
                # of objects.  this will cause an "ambiguous name"
                # error if an attempt is made by the result set to
                # access.
                e_name, e_obj, e_type = self.result_map[keyname](keyname)
                self.result_map[keyname](keyname) = e_name, e_obj + objects, e_type
            else:
                self.result_map[keyname](keyname) = name, objects, type_
    

    which is more information, and is a good thing. in 0.7 the map just got replaced with the new value. OK, so one quick fix (well we probably have to do this) is this:

    diff -r fa8c87eceb643f54a135b73e332a737ddd731af0 lib/sqlalchemy/engine/result.py
    --- a/lib/sqlalchemy/engine/result.py   Wed Apr 10 14:02:24 2013 -0400
    +++ b/lib/sqlalchemy/engine/result.py   Thu Apr 11 10:45:42 2013 -0400
    @@ -320,7 +322,7 @@
    
         def _has_key(self, row, key):
             if key in self._keymap:
    -            return True
    +            return self._keymap[key](key)[2](2) is not None
             else:
                 return self._key_fallback(key, False) is not None
    

    and a test for test_query (add to 0.7 also):

    diff -r fa8c87eceb643f54a135b73e332a737ddd731af0 test/sql/test_query.py
    --- a/test/sql/test_query.py    Wed Apr 10 14:02:24 2013 -0400
    +++ b/test/sql/test_query.py    Thu Apr 11 10:45:42 2013 -0400
    @@ -1028,6 +1028,20 @@
                 lambda: row[u2.c.user_id](u2.c.user_id)
             )
    
    +    def test_ambiguous_column_contains(self):
    +        # ticket 2702, regression from 0.7 to 0.8
    +        users.insert().execute(user_id=1, user_name='john')
    +        result = select([                   users.c.user_id,
    +                    addresses.c.user_id](
    +)).\
    +                    select_from(users.outerjoin(addresses)).execute()
    +        row = result.first()
    +
    +        eq_(
    +            set([in row, addresses.c.user_id in row](users.c.user_id)),
    +            set([False](True,))
    +        )
    +
         def test_ambiguous_column_by_col_plus_label(self):
             users.insert().execute(user_id=1, user_name='john')
             result = select([users.c.user_id,
    
  3. Mike Bayer reporter

    OK but the original test is still "broke", because in 0.8, both columns are ambiguous whereas 0.7 just picked one. lets see if we can get the ORM/select() to actually dedupe the names.

  4. Mike Bayer reporter

    and there's that:

    diff -r fa8c87eceb643f54a135b73e332a737ddd731af0 lib/sqlalchemy/sql/compiler.py
    --- a/lib/sqlalchemy/sql/compiler.py    Wed Apr 10 14:02:24 2013 -0400
    +++ b/lib/sqlalchemy/sql/compiler.py    Thu Apr 11 11:02:51 2013 -0400
    @@ -1026,10 +1026,16 @@
             elif select is not None and \
                     select.use_labels and \
                     column._label:
    +            if column._label in self.result_map:
    +                label = column.anon_label
    +                alt_names = (column._label, column._key_label)
    +            else:
    +                label = column._label
    +                alt_names = (column._key_label,)
                 result_expr = _CompileLabel(
                         col_expr,
    -                    column._label,
    -                    alt_names=(column._key_label, )
    +                    label,
    +                    alt_names=alt_names
                     )
    
             elif \
    
  5. Mike Bayer reporter

    the plot thickens. patch coming

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    Base = declarative_base()
    
    class Food(Base):
        __tablename__ = "food"
    
        id = Column(Integer, primary_key=True)
        category_id = Column(Integer, ForeignKey('food_category.id'))
    
    class FoodCategory(Base):
        __tablename__ = "food_category"
    
        id = Column(Integer, primary_key=True)
    
    e = create_engine("sqlite://", echo=True)
    
    Base.metadata.create_all(e)
    
    s = Session(e)
    s.add_all([       FoodCategory(id=1),
            Food(id=2, category_id=3)
        ](
    ))
    s.commit()
    
    fa = FoodCategory
    
    assert [x.category_id, y.id) for x, y in s.query(Food, fa)]((x.id,) == [3, 1)]((2,)
    assert [x.category_id, y.id) for x, y in s.query(Food, fa).from_self()]((x.id,) == [3, 1)]((2,)
    
  6. Mike Bayer reporter

    the patch is pretty epic. Basically a new arrangement for use_labels between select() and _label_select_column(). There's overhead added here, in particular I might want to try porting util.unique_list() to C.

  7. Log in to comment