Non-simple relationship between two joined table subclasses is queried incorrectly

Issue #3364 resolved
Scott Torborg created an issue

The test case below explains it better, but basically, creating a relationship between two joined table subclasses which includes a parent class column in the join conditions causes the relationship to be queried incorrectly.

Admittedly, this case does raise a warning, so it may be unintended usage:

/.../sqlalchemy/lib/sqlalchemy/orm/relationships.py:2368: SAWarning: Non-simple
column elements in primary join condition for property Cheese.published_pizzas
- consider using remote() annotations to mark the remote side.

However, it used to work, and I'm not sure this particular example can actually be fixed with annotations--so I think this may indicate a regression which could cause other problems as well.

Specifically, it works up until 0611baa889198421afa932f2af1524bd8826ed7d, at which point the mapper initialization starts breaking (with TypeError: 'dict_values' object does not support indexing). Then, 610e0594e249cd0bb28cb2bd4a7624f63f4510bb "fixes" that exception, after which the script below runs, but the assertion at the end fails.

Here's the test:

from sqlalchemy import MetaData, Column, ForeignKey, types
from sqlalchemy.orm import create_session, relationship
from sqlalchemy.ext.declarative import declarative_base


metadata = MetaData('sqlite://')
metadata.bind.echo = True
Base = declarative_base(metadata=metadata)


class Food(Base):
    __tablename__ = 'foods'
    id = Column(types.Integer, primary_key=True)
    name = Column(types.Unicode(255), nullable=False)
    published = Column(types.Boolean, nullable=False, default=False)


class Pizza(Food):
    __tablename__ = 'pizzas'
    food_id = Column(None, ForeignKey('foods.id'), primary_key=True)
    cheese_id = Column(None, ForeignKey('cheeses.food_id'), nullable=False)

    cheese = relationship('Cheese',
                          primaryjoin='Pizza.cheese_id == Cheese.food_id',
                          backref='pizzas')


class Cheese(Food):
    __tablename__ = 'cheeses'
    food_id = Column(None, ForeignKey('foods.id'), primary_key=True)

    published_pizzas = relationship(
        'Pizza',
        primaryjoin=('and_(Pizza.cheese_id == Cheese.food_id,'
                     'Pizza.published == True)'))


metadata.create_all()
sess = create_session()

cheese = Cheese(name='Mozzarella', published=True)

pizza1 = Pizza(name='Plain', published=True, cheese=cheese)
pizza2 = Pizza(name='Pepperoni', published=False, cheese=cheese)
pizza3 = Pizza(name='Vegetarian', published=True, cheese=cheese)

sess.add_all([cheese, pizza1, pizza2, pizza3])
sess.flush()
sess.expunge_all()

mozzarella = sess.query(Cheese).first()

names = set(pizza.name for pizza in mozzarella.published_pizzas)
# Pepperoni isn't published, it shouldn't be returned
assert 'Pepperoni' not in names

Comments (8)

  1. Mike Bayer repo owner

    worksforme:

    works fine if remote() is used fully as is asked for, works (and is needed) for lazyload, joinedload, etc.:

    class Cheese(Food):
        __tablename__ = 'cheeses'
        food_id = Column(None, ForeignKey('foods.id'), primary_key=True)
    
        published_pizzas = relationship(
            'Pizza',
            primaryjoin=('and_(remote(Pizza.cheese_id) == Cheese.food_id,'
                         'remote(Pizza.published) == True)'))
    

    can you confirm please? thanks!

  2. Mike Bayer repo owner
    • changed status to open

    I can actually make this work without the remote() so let's see if we can do that for 1.0

  3. Mike Bayer repo owner

    like this

    diff --git a/lib/sqlalchemy/orm/relationships.py b/lib/sqlalchemy/orm/relationships.py
    index e36a644..ace1051 100644
    --- a/lib/sqlalchemy/orm/relationships.py
    +++ b/lib/sqlalchemy/orm/relationships.py
    @@ -2329,12 +2329,20 @@ class JoinCondition(object):
                 binary.right, binary.left = proc_left_right(binary.right,
                                                             binary.left)
    
    +        fully_self_referential = self.prop.mapper is self.prop.parent
    +
             def proc_left_right(left, right):
                 if isinstance(left, expression.ColumnClause) and \
                         isinstance(right, expression.ColumnClause):
                     if self.child_selectable.c.contains_column(right) and \
                             self.parent_selectable.c.contains_column(left):
                         right = right._annotate({"remote": True})
    +            elif not fully_self_referential and \
    +                    right._annotations.get('parentmapper') is self.prop.mapper:
    +                right = right._annotate({"remote": True})
    +            elif not fully_self_referential and \
    +                    left._annotations.get('parentmapper') is self.prop.mapper:
    +                left = left._annotate({"remote": True})
                 else:
                     self._warn_non_column_elements()
    
  4. Mike Bayer repo owner
    • Made a small improvement to the heuristics of relationship when determining remote side with semi-self-referential (e.g. two joined inh subclasses referring to each other), non-simple join conditions such that the parententity is taken into account and can reduce the need for using the remote() annotation; this can restore some cases that might have worked without the annotation prior to 0.9.4 via 🎫2948. fixes #3364

    → <<cset 35a488ae5f41>>

  5. Mike Bayer repo owner
    • changed milestone to 1.0

    this commit just allows it to figure out "remote" automatically in this case but using remote() in 0.9 is fine as well.

  6. Log in to comment