OR-ed relation.contains() filters produce incorrect query

Issue #2177 resolved
Former user created an issue

(From StackOverflow question)

session.query(Foo).filter(Foo.bars1.contains(bar)|Foo.bars2.contains(bar))

leads to the following SQL:

SELECT "Foo".id AS "Foo_id" 
FROM "Foo", foos_to_bars1 AS foos_to_bars1_1, foos_to_bars2 AS foos_to_bars2_1 
WHERE "Foo".id = foos_to_bars1_1.foo AND ? = foos_to_bars1_1.bar OR 
"Foo".id = foos_to_bars2_1.foo AND ? = foos_to_bars2_1.bar

It works as expected when both secondary table have at least on row since join conditions are OR-ed. But it always produce empty result when one of secondary tables is empty.

The full test case (fails with 0.6.7 at least):

from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship, aliased

metadata  = MetaData()

Base = declarative_base(metadata=metadata, name='Base')

class Bar(Base):
    __tablename__ = 'Bar'
    id = Column(Integer, primary_key=True)

foos_to_bars1 = Table(
    'foos_to_bars1', metadata,
    Column('foo', ForeignKey('Foo.id'), primary_key=True),
    Column('bar', ForeignKey('Bar.id'), primary_key=True),
)

foos_to_bars2 = Table(
    'foos_to_bars2', metadata,
    Column('foo', ForeignKey('Foo.id'), primary_key=True),
    Column('bar', ForeignKey('Bar.id'), primary_key=True),
)

class Foo(Base):
    __tablename__ = 'Foo'
    id = Column(Integer, primary_key=True)
    bars1 = relationship(Bar, secondary=foos_to_bars1, collection_class=set)
    bars2 = relationship(Bar, secondary=foos_to_bars2, collection_class=list)


engine = create_engine('sqlite://', echo=True)
metadata.create_all(engine)

session = sessionmaker(bind=engine)()

o = Foo(bars1=set([Bar()](Bar())))
session.add(o)
session.commit()

session.expunge_all()

bar = session.query(Bar).first()

# Produces incorrect result when one of secondary tables is empty
assert session.query(Foo).filter(Foo.bars1.contains(bar)|Foo.bars2.contains(bar)).count()==1

Comments (3)

  1. Mike Bayer repo owner
    • changed component to orm
    • changed milestone to 0.8.0
    • changed status to wontfix

    That's the correct SQL. contains() would otherwise have to generate subqueries - if you want that, you're better off using any(). any() uses EXISTS which is not the most efficient approach, so you really need to use outerjoins to do this correctly.

    if someone would like to propose a radical change for contains(), clearly an 0.8 thing, feel free to reopen with a full behavioral proposal.

  2. Log in to comment