default_strategy(), overrides lazy=X settings

Issue #2351 resolved
Mike Bayer repo owner created an issue

patch is attached.

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

class Parent(Base):
    __tablename__ = "parent"

    id = Column(Integer, primary_key=True)
    children = relationship("Child", lazy='joined')

class Child(Base):
    __tablename__ = "child"

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

s = Session()

print s.query(Parent)

print s.query(Parent).options(default_strategy('select'))

output:

SELECT parent.id AS parent_id, child_1.id AS child_1_id, child_1.parent_id AS child_1_parent_id 
FROM parent LEFT OUTER JOIN child AS child_1 ON parent.id = child_1.parent_id

SELECT parent.id AS parent_id 
FROM parent

Consider carefully how this may interact with #1418, if this option should also be used there, or if we should be naming it differently.

Comments (32)

  1. Former user Account Deleted

    I'm curious about why you indexed context.attributes by a tuple ('loaderstrategy', (self.strategy_wildcard_key,)) instead of just a symbol (or string)? Is this just a "name space" solution within attributes, or does it serve another purpose?

  2. Former user Account Deleted

    Thank you very much for your time on Saturday even! Is the head revision stable to your knowledge? This might get me to update sqlalchemy finally....

  3. Mike Bayer reporter

    Well usually the form is ('loaderstrategy', MyClass.some_attribute.property), so not a string. Also even if the second part was always a string I tend to use tuples instead of concatenated strings. I likely picked that habit up early on in Python when I saw GVR's "five minute multimethods" recipe http://www.artima.com/weblogs/viewpost.jsp?thread=101605, where coming from Perl/Java it was like, holy crap, you can put things that aren't huge string concatenations, and aren't big thick MySpecialKey.hashCode() types of objects as keys in dicts ! Tuples ! why bother concatenating/parsing strings when you can just leave it at a tuple.

    But yeah there's normally two parts to that key.

  4. Former user Account Deleted

    Is the head revision stable to your knowledge?

    Also, if I were to write some test cases: * Which test module (where would I add them)? * Do you typically assert against rendered SQL (even though that means you can't easily change the format in the future of the rendered sql)? (Would like a newline before UNION, but that's off topic...) * What coverage do you want?

  5. Mike Bayer reporter

    the tip is stable, sure, nothing much is going on.

    some generic tests for how "options" work are in test/orm/test_mapper.py:OptionsTest. These are mostly really old tests. These particular tests use the technique of asserting a loaded object graph is equivalent to another (which ensures various relationships are loaded), then counting how many SQL calls it took. We have other tests for options elsewhere which assert the SQL but if we put tests over here we'd probably maintain the "count the SQL" approach.

    usually with tests I focus on ensuring that the intended effect is achieved, before I focus on "coverage". So in this case checking that lazy="joined", maybe lazy="subquery" are downgraded to "select", maybe for fun trying it the other way, if lazy="select"/True can be upgraded to default_strategy(lazy="joined"), and then for a case nobody will ever use to see if lazy=None participates in this too (lazy=None can become default_strategy(lazy="select")).

  6. Mike Bayer reporter

    oh also, if you try it out ahead of time, then find some other issue that we fix, we'd want to make sure that specific issue is also tested.

  7. Former user Account Deleted

    is there an easy way to turn engine echo on during nosetests so I can see what's going on?

  8. Former user Account Deleted

    Also, I'm slowly working it out through the code, but is there a brief explanation of assert_sql_count() and sql_count_()?

  9. Mike Bayer reporter

    do --log-debug=sqlalchemy.engine for echoing.

    assert_sql_count() accepts a callable function, runs it, and asserts that the number of execute calls matches the given number. sql_count_() seems to be an underused helper for assert_sql_count().

  10. Former user Account Deleted

    Here is the problem, can you explain? After using

    .options(sa.orm.default_strategy('select'))
    

    Subsequent loading should default back to the mapper defaults, but with subsequent loads (no longer part of the query), ('loaderstrategy', (self.strategy_wildcard_key,)) is still in context.attributes, so the original mapper defaults are effectively permanently overwritten.

    See last test case with #PROBLEM comment, compared to the previous test case with .enable_eagerloads(False):

    diff -U10 -r sqlalchemy-default.untarred/test/orm/test_mapper.py sqlalchemy-default/test/orm/test_mapper.py
    --- sqlalchemy-default.untarred/test/orm/test_mapper.py 2011-12-15 11:42:50.000000000 -0500
    +++ sqlalchemy-default/test/orm/test_mapper.py  2011-12-29 12:01:49.000000000 -0500
    @@ -1797,20 +1797,100 @@
    
             sess = create_session()
             l = (sess.query(User).
                  order_by(User.id).
                  options(sa.orm.lazyload('addresses'))).all()
    
             def go():
                 eq_(l, self.static.user_address_result)
             self.sql_count_(4, go)
    
    +    def test_default_joined_option(self):
    +        """An eager,joined relationship can be defaulted to a lazy relationship."""
    +
    +        users, Keyword, items, order_items, orders, Item, User, Address, keywords, item_keywords, Order, addresses = (self.tables.users,
    +                                self.classes.Keyword,
    +                                self.tables.items,
    +                                self.tables.order_items,
    +                                self.tables.orders,
    +                                self.classes.Item,
    +                                self.classes.User,
    +                                self.classes.Address,
    +                                self.tables.keywords,
    +                                self.tables.item_keywords,
    +                                self.classes.Order,
    +                                self.tables.addresses)
    +
    +        mapper(Address, addresses)
    +
    +        mapper(Keyword, keywords)
    +
    +        mapper(Item, items, properties=dict(
    +            keywords=relationship(Keyword, secondary=item_keywords,
    +                              lazy='subquery',
    +                              order_by=item_keywords.c.keyword_id)))
    +
    +        mapper(Order, orders, properties=dict(
    +            items=relationship(Item, secondary=order_items, lazy='subquery',
    +                           order_by=order_items.c.item_id)))
    +
    +        mapper(User, users, properties=dict(
    +            addresses=relationship(Address, lazy='joined',
    +                               order_by=addresses.c.id),
    +            orders=relationship(Order, lazy='joined',
    +                            order_by=orders.c.id)))
    +
    +        sess = create_session()
    +
    +        # first test mapper defaults, 3 queries (2 subquery loads).
    +        # there are no keywords in this static data
    +        l = [       def go():
    +            l[:](]
    +) = sess.query(User).order_by(User.id).all()
    +            eq_(l, self.static.user_all_result)
    +        self.assert_sql_count(testing.db, go, 3)
    +
    +        # verify all the item keywords were subqeury loaded, with no more sql
    +        def go():
    +            f = util.flatten_iterator
    +            for i in f([for o in f([u.orders for u in l](o.items))]):
    +                i.keywords
    +        self.assert_sql_count(testing.db, go, 0)
    +
    +        sess.expunge_all()
    +
    +        # demonstrate that enable_eagerloads loads with only 1 sql
    +        def go():
    +            l[:](:) = sess.query(User).enable_eagerloads(False).order_by(User.id).all()
    +        self.assert_sql_count(testing.db, go, 1)
    +
    +        # User[0](0) has orders, which need to 'select' load, and 2 subquery: 3 total
    +        def go():
    +            l[0](0).orders
    +        self.assert_sql_count(testing.db, go, 3)
    +
    +        sess.expunge_all()
    +
    +        # now test 'default_strategy' option shuts off 'orders' load, only 1 sql
    +        def go():
    +            l[:](:) = sess.query(User)\
    +                .options(sa.orm.default_strategy('select'))\
    +                .options(sa.orm.joinedload(User.addresses))\
    +                .order_by(User.id).all()
    +        self.assert_sql_count(testing.db, go, 1)
    +
    +        def go():
    +#            eq_(l, self.static.user_all_result)
    +            # PROBLEM: 'items' relationship is no longer subquery load.. why?
    +            l[0](0).orders
    +        self.assert_sql_count(testing.db, go, 3)
    +
         def test_option_propagate(self):
             users, items, order_items, Order, Item, User, orders = (self.tables.users,
                                     self.tables.items,
                                     self.tables.order_items,
                                     self.classes.Order,
                                     self.classes.Item,
                                     self.classes.User,
                                     self.tables.orders)
    
             mapper(User, users, properties=dict(
    
  11. Mike Bayer reporter

    There's a flag on EagerLazyOption called propagate_to_loaders, which if True tells QueryContext to stick the option into its propagate_options collection (line 3212 of query.py). This collection then gets carried along with each InstanceState (mapper.py line 2554). Then it goes back into the subsequent lazyload queries via _conditional_options (strategies.py line 538).

    The answer is turn that flag off for this option ! Inside of default_strategy() pass propagate_to_loaders=False to the new EagerLazyOption.

  12. Former user Account Deleted

    Wait, do we need that to make sure subquerys that are specified as part of the original query with .options(default_strategy('select')), when loading, pass on the .options(default_strategy('select')) option to the subquery?

  13. Mike Bayer reporter

    I'm going to totally guess and say no. the subquery loader should take the options along from the immediate Query and use that in the new one it's creating. let's check. yup, line 833 of strategies.py:

            # propagate loader options etc. to the new query.
            # these will fire relative to subq_path.
            q = q._with_current_path(subq_path)
            q = q._conditional_options(*orig_query._with_options)
    
  14. Former user Account Deleted

    Is the orm clever enough to skip a subquery load of a collection's attribute if all the objects in the collection were already loaded and already had that attribute loaded? Seems like a reasonable explanation for what I'm seeing, but wasn't expecting.

  15. Mike Bayer reporter

    Attributes aren't populated in the usual case if the parent object was already in the identity map and the attribute is either unexpired or is relationship-based. The firing of the subqueryload query is a product of running the "row processor" function for at least one row. So if none of the objects in the result get the attribute populated, then it won't fire. But if just one row does represent a new value, then the "row processor" runs and runs the whole subquery. The rows that were already represented are still not affected, however.

  16. Mike Bayer reporter

    calls initialize(), so yes whatever the AttributeImpl does there which is a None/empty collection.

  17. Former user Account Deleted

    Let me know if this is acceptable and gets controlled.

    Is there a better way to get rid of the missing *.pyc files diff output for a patch file than grep -v? :

    diff -U5 -r sqlalchemy-default.untarred sqlalchemy-default | grep -v "^Only in sqlalchemy-default\(\/\|:\)" > ~/2351.kb.patch
    
  18. Mike Bayer reporter

    Usually you'd just commit to your local hg repository and do "hg export -r<start>:<end>". or "hg diff".

  19. Mike Bayer reporter

    there's a structural change I'd like to make to those tests. while you've done a great job of following the conventions exactly in test_mapper.py, newer tests I've been doing a few things differently - getting each test to test exactly one thing, and moving setup code at least into a separate fixture method that's shared by multiple tests. So here the changes would be:

    1. put the tests here into a new class, called something like DefaultOptionsTest.

    2. each of the blocks that set up mapper(cls, table) goes into separate methods, like def _lazy_subquery_fixture(), def _joined_fixture(), something like that. The access to "self.tables" would be only here.

    3. break up each test_XXX method as much as possible. Instead of having four or five go() calls in a row with the occasional session.expunge_all() in the middle, use separate test methods. You can still have multiple go() inside of one test if they're related of course, but we just want to make each test as small as possible as this makes it much easier to figure out what's going on when tests fail.

    4. the tests can still call out the classes from the "self.classes" collection, that's not a big deal.

    the document that's inspiration for this style are the "Just One Thing" section, and to a lesser degree the "Share setup via helper methods" section (basically, use helper methods, but not necessarily the "don't use self" part), of: http://docs.pylonsproject.org/en/latest/community/testing.html

  20. Former user Account Deleted

    Consider carefully how this may interact with #1418, if this option should also be used there, or if we should be naming it differently.

    #1418 probably can't/shouldn't also be named "default_strategy()" unless we burden it with extra arguments to distinguish relationships from columns. The use cases are different. Generally, I'd want to fully load the objects (with no deferred columns), but not necessarily all relationships.

    Since "strategy" could refer to column loading ''or'' relationship loading, one might argue the present ticket shouldn't use "strategy" alone (i.e. "relationship_strategy()" vs. "column_strategy()" instead of default_strategy), but its association with relationships seems stronger to me, so probably fine?

    For #1418, there are two approaches. * Its current defer_except() approach:

    sess.query(Order).options(
           joinedload("items"),
           defer_except("id", "customerid"),
           defer_except("id", "price",  path=(Order.items),
       ).all()
    
    • Or the present ticket's approach, using the existing undefer() option:

      #!python sess.query(Order).options( joinedload("items"), undefer("id"), undefer("customerid"), undefer("items.id"), undefer("items.price"), column_strategy('defer'), ).all()

      #!python # this would allow for another (albeit less practical use case): # get all the columns marked as defer in the mapper, except one sess.query(Order).options( joinedload("items"), defer("some_calculated_column"), column_strategy('undefer'), ).all()

    I suspect primary key columns don't need to be specified in either case and must be loaded.

  21. Former user Account Deleted

    What about this ticket? Before I patch this in, are you considering a function name change (comment:24) or will it be default_strategy()?

  22. Former user Account Deleted

    Replying to zzzeek:

    one more idea:

    {{{ #!python options(joinedload("")) options(lazyload("")) options(defer("*")) }}}

    I gotta admit, I like that the best of all. (if someone says both joinedload("") and lazyload("") for same query we leave behavior undefined?)

    Is it clear what's going on? (Just as clear as relationship_strategy("select"), I think, while default_strategy("select") is slightly clearer.) (my $0.02 worth is "lazy" should be a synonym for "select" btw, just to plant that idea in your subconscious. )

  23. Log in to comment