One-to-one mapping issues extra query when accessing parent_obj.child.parent_obj

Issue #3277 resolved
Priit Laes created an issue

We've been investigating an issue where we have a one-to-one relationship between A and B, where A->B is lazy=joined relationship.

When querying A, both A and its child B are fetched, but accessing A.B.A (I even tried setting backref lazy='joined' to backref) issues another query.

Eventual workaround/fix was to use options(contains_eager('b.a')) which also results in cleaner query.

Now there's also with another issue with NULL check when passing nested path to contains_eager, but I'm not sure whether this is user or orm responsibility ;)

Comments (7)

  1. Mike Bayer repo owner

    have you tried setting join_depth? the ORM can't keep eager joining from A->B->A->B->A without end so it checks for this recursive situation. you might have to use explicit joinedload() directives.

  2. Priit Laes reporter

    Setting join_depth=2 doesn't seem to be making any difference:

        b = relationship("B", backref=backref("a", uselist=False, lazy='joined'), lazy='joined', join_depth=2)
    

    Setup

    class A:
      id = int
      b_id = int, foreign key to B.id
      b = relationship(B, backref=(backref(a), uselist=False), lazy=joined)
      foo = column()
    
    class B:
      id = int
    
      @property
      def xxx(self):
        return self.a.foo
    

    Default Case

    First, query, where we look up A (and also A.b, because it's relationship arguments are lazy='joined')

    session.query(A).first()
    

    which creates query:

    SELECT table_a.id AS table_a_id, table_a.b_id AS table_a_b_id, table_b_1.id AS table_b_1_id 
    FROM table_a LEFT OUTER JOIN table_b AS table_b_1 ON table_b_1.id = table_a.b_id 
     LIMIT %(param_1)s
    

    Now, trying to access a via backref (via A.b.xxx) causes extra query:

    SELECT table_a.id AS table_a_id, table_a.b_id AS table_a_b_id 
    FROM table_a 
    WHERE %(param_1)s = table_a.b_id
    

    Workaround case (with contains_eager)

    Now, when I query A like this:

    a = session.query(A).options(contains_eager("b.a")).first()
    

    which generates SQL like this:

    SELECT table_b.id AS table_b_id, table_a.id AS table_a_id, table_a.b_id AS table_a_b_id 
    FROM table_b, table_a 
     LIMIT %(param_1)s
    

    Now, when I try to access A.b.xxx, the already existing a is found in the session and no query is needed.

    So, was just wondering whether the session could automatically "optimize" the one-to-one relationship where both sides have load='joined', into the workaround case (as the data is already in the session). Or the relationship docs could be improved to mention contains_eager.

  3. Mike Bayer repo owner

    So, was just wondering whether the session could automatically "optimize" the one-to-one relationship where both sides have load='joined', into the workaround case (as the data is already in the session). Or the relationship docs could be improved to mention contains_eager.

    right, my original suggestions were off the mark because really, you're hoping that the ORM can figure out that this is one-to-one so hey, we already have A.

    Unfortunately it cannot. Attribute loading of a relationship does not consider the existing state of any other attribute. Put another way, the "backref" concept is not consulted in any way during loading. When the ORM is trying to do some_b.a, it knows nothing about that some_b is already being pointed at by an A, and that because this is uselist=False, we can make the actually quite large assumption that this backref works perfectly in both directions.

    If you want to understand this better, erase "uselist=False" and "one-to-one" from your mind; this is a many to one, referring to a one to many. SQLAlchemy makes no assumptions about b.a because that can be an unlimited number of "A" rows. uselist=False goes as far as, "only load the first row and complain if there's a second", and that's pretty much it.

    In practice, one-to-ones aren't a great idea and I try to avoid them in relational design.

  4. Mike Bayer repo owner

    here is a better way to make that load happen. but see, putting this in the docs is risky, b.c. when super beginners are given a little bit of stuff they go nuts and get all tangled. maybe a little green box somewhere:

    from sqlalchemy import Integer, ForeignKey, create_engine, Column
    from sqlalchemy.orm import Session, relationship, backref
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import event
    from sqlalchemy.orm import attributes
    
    Base = declarative_base()
    
    
    class A(Base):
        __tablename__ = 'a'
        id = Column(Integer, primary_key=True)
        b_id = Column(ForeignKey('b.id'))
        b = relationship("B", backref=backref("a", uselist=False), lazy='joined')
    
    
    class B(Base):
        __tablename__ = 'b'
        id = Column(Integer, primary_key=True)
    
    
    @event.listens_for(A, "load")
    def load_b(target, context):
        if 'b' in target.__dict__:
            attributes.set_committed_value(target.b, 'a', target)
    
    e = create_engine("sqlite://", echo=True)
    Base.metadata.create_all(e)
    
    s = Session(e)
    s.add(A(b=B()))
    s.commit()
    
    a1 = s.query(A).first()
    
    print "----------------------"
    
    assert a1.b.a is a1
    
  5. Log in to comment