don't use joined-inh optimized get for mapping to a custom join()

Issue #2697 resolved
YKdvd created an issue

I'm having a weird glitch where I retrieve some objects with a query, commit (even without any changes), and when subsequently accessing the retrieved records (which should get expired and reloaded automatically on access), see incorrect information. This only seems to be the case when the query is activated with "all()", and my model is a little weird. It is the product of a "join()", and then subclassed.

I've attached a demonstration which can be run standalone (with the doStandalone and doSQLite variables set True), which loops through 4 classes. AbstractUser doesn't have any sort of join, and works, as does the minimal subclass AbstractUser. AbstractMerge is created with a "join" relationship, but still works. However, a simple subclass "User" fails, seemingly only because of the subclassing.

What should happen is that all three User objects should be loaded, and we print out some information from each. We then do commit() on the session, and do the display loop again. The objects would have been expired by the commit, but should reload as they are accessed again. But in the failure case, where the query loads all(), the second printout has incorrect information - all records seem to be the same (wrong) record, likely the first one in the database. If all() is avoided (I use a limit(999) instead), there is no failure.

Looking at the SQL, what seems to be happening is that the SELECT for these records is normally a single query which includes:

FROM users LEFT OUTER JOIN users_links ON users.id = users_links.user_id

as well as appropriate WHERE and LIMIT clauses. However, when the "all()" is used, the initial retrieval is the same, but on refresh there seems to be a query for each record, and they have a plain "FROM users, users_links" clause, with no WHERE/LIMIT clauses at all. This seems to pull either the first record or all records from the table, and each object gets loaded up with wrong data.

I can certainly get around this by avoiding the subclassing, but even though this is probably a weird corner of the SQLA code, this bug may shine some light on something interesting to fix.

Comments (7)

  1. Mike Bayer repo owner

    here's a refined test:

    from sqlalchemy import *
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import *
    
    Base = declarative_base()
    
    users = Table('users', Base.metadata,
        Column('id', Integer, primary_key=True),
        Column('username', String),
        )
    
    userlinks = Table('users_links', Base.metadata,
            Column("user_id", Integer, ForeignKey('users.id'), primary_key=True)
        )
    
    class AbstractUser(Base):
        __table__ = users
    
        ulink = relationship("AbstractLink", uselist=False)
    
    class AbstractLink(Base):
        __table__ = userlinks
    
    class AbstractMerge(Base):
        __table__ = join(users, userlinks)
    
        id = column_property(users.c.id, userlinks.c.user_id)
    
    #User = AbstractMerge   # <- works
    class User(AbstractMerge):   # <- fails
        pass
    
    engine = create_engine('sqlite://', echo=True)
    
    Base.metadata.create_all(bind=engine)
    
    sess = Session(engine)
    
    users = [   AbstractUser(username='A', ulink=AbstractLink()),
        AbstractUser(username='B', ulink=AbstractLink()),
        AbstractUser(username='C', ulink=AbstractLink()),
    ](
    )
    sess.add_all(users)
    
    sess.commit()
    
    users = sess.query(User).all()
    
    sess.expire(users[1](1))  # <- fails
    
    assert [for u in users](u.username) == ['B', 'C']('A',), [for u in users](u.username)
    

    the issue is in the optimized get:

    diff -r 6bdd3bb93fd18a4aec54ee2a836875a922dcaab3 lib/sqlalchemy/orm/loading.py
    --- a/lib/sqlalchemy/orm/loading.py Mon Apr 01 15:41:57 2013 -0400
    +++ b/lib/sqlalchemy/orm/loading.py Sun Apr 07 20:06:13 2013 -0400
    @@ -558,7 +558,7 @@
    
         result = False
    
    -    if mapper.inherits and not mapper.concrete:
    +    if False: #mapper.inherits and not mapper.concrete:
             statement = mapper._optimized_get_statement(state, attribute_names)
             if statement is not None:
                 result = load_on_ident(
    
  2. Mike Bayer repo owner

    OK the optimized get should only apply to mappers that are against straight tables as the local table, so just check for that.

    diff -r 6bdd3bb93fd18a4aec54ee2a836875a922dcaab3 lib/sqlalchemy/orm/mapper.py
    --- a/lib/sqlalchemy/orm/mapper.py  Mon Apr 01 15:41:57 2013 -0400
    +++ b/lib/sqlalchemy/orm/mapper.py  Sun Apr 07 20:18:29 2013 -0400
    @@ -1886,6 +1886,7 @@
             only those tables to minimize joins.
    
             """
    +
             props = self._props
    
             tables = set(chain(
    @@ -1933,6 +1934,8 @@
                 for mapper in reversed(list(self.iterate_to_root())):
                     if mapper.local_table in tables:
                         start = True
    +                elif not isinstance(mapper.local_table, expression.TableClause):
    +                    return None
                     if start and not mapper.single:
                         allconds.append(visitors.cloned_traverse(
                                                     mapper.inherit_condition,
    
  3. Log in to comment