eager loader options / parameters for unexpire and refresh

Issue #1763 open
Mike Bayer repo owner created an issue

this patch begins to provide this behavior. However a lazyloader from an expired object emits a second unnecessary load (or an unnecessary join, depending on how you look at it), and polymorphic loading breaks entirely, probably due to an incompatibility with the partial joined inheritance loader.

diff -r cfdc41c75ce2e2228d71733212bbaca0fe17ce35 lib/sqlalchemy/orm/state.py
--- a/lib/sqlalchemy/orm/state.py   Mon Apr 05 13:16:29 2010 -0400
+++ b/lib/sqlalchemy/orm/state.py   Mon Apr 05 15:43:40 2010 -0400
@@ -262,9 +262,10 @@
         if kw.get('passive') is attributes.PASSIVE_NO_FETCH:
             return attributes.PASSIVE_NO_RESULT

-        toload = self.expired_attributes.\
-                        intersection(self.unmodified)
-        
+        toload = self.expired_attributes.intersection(self.unmodified).union(
+            self.unloaded.intersection(self.expire_on_missing)
+        )
+
         self.manager.deferred_scalar_loader(self, toload)

         # if the loader failed, or this 
@@ -277,6 +278,10 @@
         return ATTR_WAS_SET

     @property
+    def expire_on_missing(self):
+        return set(key for key in self.manager if self.manager[key](key).impl.expire_missing)
+        
+    @property
     def unmodified(self):
         """Return the set of keys which have no uncommitted changes"""

diff -r cfdc41c75ce2e2228d71733212bbaca0fe17ce35 test/orm/test_expire.py
--- a/test/orm/test_expire.py   Mon Apr 05 13:16:29 2010 -0400
+++ b/test/orm/test_expire.py   Mon Apr 05 15:43:40 2010 -0400
@@ -425,17 +425,7 @@
         assert len(u.addresses) == 2

     @testing.resolve_artifact_names
-    def test_joinedload_props_dontload(self):
-        # relationships currently have to load separately from scalar instances.
-        # the use case is: expire "addresses".  then access it.  lazy load
-        # fires off to load "addresses", but needs foreign key or primary key
-        # attributes in order to lazy load; hits those attributes, such as
-        # below it hits "u.id".  "u.id" triggers full unexpire operation,
-        # joinedloads addresses since lazy='joined'.  this is all wihtin lazy load
-        # which fires unconditionally; so an unnecessary joinedload (or
-        # lazyload) was issued.  would prefer not to complicate lazyloading to
-        # "figure out" that the operation should be aborted right now.
-
+    def test_joinedload_props_load(self):
         mapper(User, users, properties={
             'addresses':relationship(Address, backref='user', lazy='joined'),
             })
@@ -444,11 +434,30 @@
         u = sess.query(User).get(8)
         sess.expire(u)
         u.id
-        assert 'addresses' not in u.__dict__
+        assert 'addresses' in u.__dict__
         u.addresses
         assert 'addresses' in u.__dict__

     @testing.resolve_artifact_names
+    def test_joinedload_props_load_two(self):
+        mapper(User, users, properties={
+            'addresses':relationship(Address, backref='user', lazy='joined'),
+            })
+        mapper(Address, addresses)
+        sess = create_session()
+        u = sess.query(User).get(8)
+        sess.expire(u)
+
+        # current bug: u.addresses unexpires attributes,
+        # eagerlaods u.addresses.   lazyloader for u.addresses
+        # then runs anyway.
+        def go():
+            u.addresses
+            assert 'addresses' in u.__dict__
+            assert 'id' in u.__dict__
+        self.assert_sql_count(testing.db, go, 1)
+        
+    @testing.resolve_artifact_names
     def test_expire_synonym(self):
         mapper(User, users, properties={
             'uname': sa.orm.synonym('name')

Comments (13)

  1. Mike Bayer reporter

    excess complexity for the "unexpire" use case, not clear that this case benefits from supplied eager loaders firing off and this would add to the maintenance burden.

  2. Mike Bayer reporter
    • changed status to open

    two dupes now, so putting this out for a long milestone in case we want to see if there's an expedient solution for it w/ our current architecture.

  3. Mike Bayer reporter
    • changed milestone to 1.2
    • edited description

    Let's look at what might be involved for 1.2, including that an in-place unexpire as well as get() would work here. For refresh(), that might be diffferent if I recall correctly. Re-target out to 1.3 if it's not a quick fix at this point.

  4. Mike Bayer reporter
    • changed milestone to 1.3
    • edited description

    non-critical, expect this to continue moving out on milestones until we have more space for nice-to-haves

  5. Mike Bayer reporter

    OK, we need to figure out just what's being asked here. Because a basic test of session.refresh() shows the eager loaders do get applied now:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    Base = declarative_base()
    
    
    class A(Base):
        __tablename__ = 'a'
        id = Column(Integer, primary_key=True)
        bs = relationship("B", lazy="joined")
    
    
    class B(Base):
        __tablename__ = 'b'
        id = Column(Integer, primary_key=True)
        a_id = Column(ForeignKey('a.id'))
    
    e = create_engine("sqlite://", echo=True)
    Base.metadata.create_all(e)
    
    
    s = Session(e)
    s.add(A(bs=[B(), B()]))
    s.commit()
    
    a1 = s.query(A).first()
    assert 'bs' in a1.__dict__
    
    s.expire(a1)
    assert 'bs' not in a1.__dict__
    
    s.refresh(a1)
    assert 'bs' in a1.__dict__
    

    now what we can't do yet is allow query options in there, which we should, e.g. session.query(A).options(joinedload(A.bs)).refresh(some_a) - that is make refresh mainly part of Query.

    the mechanism by which the eager loaders run on a refresh go back to at least 0.9, so not sure what's new here.

  6. Mike Bayer reporter

    oh. OK, this is explicitly, "unexpire", as in, attribute access after a full expiration:

    a1.id  # instead of "refresh"
    assert 'bs' in a1.__dict__
    

    that we don't do. this also kind of refers to a larger issue is that if you do in fact do session.commit() and have the expiration happen, you lose all the loading you've done and it can be much more expensive to restore. but it's not clear how aggressive the "unexpire" use case should be vs. trying to not produce more expensive queries.

  7. Mike Bayer reporter

    here's what this needs. subquery and selectin loading need to convert to "immediateload" when they are in the "refresh" case. there's no point using a loader that is meant to aggregate across many items. also refresh loading should be more configurable - a second-select loader is going to be surprising if it fires in all cases for a refresher.

    we need to consider here:

    1. session.refresh(obj)

    2. session -> loader options -> refresh() ? session.refresh(obj, options=[joinedload(xyz)]) ?

    3. implicit refresh on attribute access

    4. relationship - suppose 'lazy='joined'' - separate config 'refresh_strategy='immediate' (emit a SELECT) / 'joined' (use joined eager loading) / 'none' (dont unexpire) / 'inherit' (use whatever lazy=, or the query options that were set up for this object) ?

    5. state.load_options when we do refresh() as well as attribute access - 'immediate' as before?

  8. Mike Bayer reporter

    so here's another one:

    http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#what-actually-loads

    " Column-oriented attributes, even if expired, will not load as part of this operation, and instead will load when any column-oriented attribute is accessed. relationship()- mapped attributes will not load in response to expired column-based attributes being accessed."

    so actually I think that's wrong, if the object is fully expired, a lazy load needs those column attributes to know what it's loading, so it does unexpire the object and emit two SELECT statements. but it would be nice to formalize this. in attributes.get(), we had to alter the check for "if key is expired" to not include relationships for the above gerrit. but if we could somehow consider an expired relationship (which is kind of a new concept as of the above gerrit) to be just another attribute to unexpire, and we have state._load_expired() handle it somehwat more intelligently along with the above strategy options.

  9. Log in to comment