LazyLoader wrongly replaces column names from aliased tables with lazy bind params.

Issue #404 resolved
Former user created an issue

The test below demonstrates the problem I'm seeing. In short, even though I use a table alias, the columns from that alias are being replaced by lazy bind parameters. Possibly I'm just setting up the relation wrongly, but it does seem more serious than that. I think it stems from LazyLoader._create_lazy_clause.visit_binary but I'm not confident enough in my understanding of the logic here to be able to attempt a patch.

Incidentally, isn't there also a possibility that LazyLoader._create_lazy_clause.bind_label might accidentally bind two parameters with the same name, seeing as it depends on random numbers? Is there any reason not to use a simple incrementing counter?

George

from sqlalchemy import *
from datetime import date, timedelta
import unittest

# Change this to something appropriate
testDbs = 'postgres://tester:testing@localhost/bills_test'

# A simple model: there are items each of which has a date;
# there are also checkpoints that are dated. The items
# corresponding to a checkpoint are those whose dates
# are more recent than the previous checkpoint.

class Item(object):
    pass

class Checkpoint(object):
    pass

md = MetaData()
itemTable = Table('Item', md,
    Column('itemId', Integer, primary_key=True),
    Column('date', Date)
    )
checkpointTable = Table('Checkpoint', md,
    Column('checkpointId', Integer, primary_key=True),
    Column('date', Date)
    )

curCheckpoint = alias(checkpointTable)
prevCheckpoint = alias(checkpointTable)

itemMapping = mapper(Item, itemTable)
checkpointMapping = mapper(Checkpoint, checkpointTable,
    properties={
        'items' : relation(Item, viewonly=True, foreignkey=itemTable.c.date, primaryjoin=
            (itemTable.c.date <= curCheckpoint.c.date)
            & not_(exists([curCheckpoint.c.checkpointId](curCheckpoint.c.checkpointId),
                 (prevCheckpoint.c.date < curCheckpoint.c.date)
                 & (itemTable.c.date < prevCheckpoint.c.date))))
        })

class ORMTest(unittest.TestCase):
    def testSelfAliasedAssociation(self):

        db = create_engine(testDbs, encoding='utf-8')
        md.create_all(db)

        # I think this is the heart of the bug: the lazywhere property
        # seems to have too many lazy_ bind parameters in its subselect.
        strat = orm.class_mapper(Checkpoint).props['items']('items').create_strategy()
        strat.init()
        print strat.lazywhere

        item1 = Item()
        checkpoint1 = Checkpoint()
        item2 = Item()
        checkpoint2 = Checkpoint()

        allObjects = item1, checkpoint1, item2, checkpoint2

        # Date the sequence of objects. The first item should correspond
        # to the first checkpoint, and the second item should correspond
        # to the second checkpoint.
        d = date.today()
        for x in allObjects:
            x.date = d
            d = d + timedelta(1)

        session = create_session(bind_to=db)

        # Save and update everything.
        for table in itemTable, checkpointTable:
            session.execute(itemMapping, delete(table), None)
        for x in allObjects:
            session.save(x)
        session.flush()
        for x in allObjects:
            session.update(x)

        # And check...
        self.assertEqual([item1](item1), checkpoint1.items)
        self.assertEqual([item2](item2), checkpoint2.items)

if __name__ == '__main__':
    ORMTest('testSelfAliasedAssociation').debug()

Comments (2)

  1. Mike Bayer repo owner

    this is really not the intended usage of lazy loaders, and even before this example, i was seriously considering removing the "viewonly" flag which I definitely stuck in there prematurely without thinking it through. the lazyloading was never designed to have a whole host of aliases in there and stuff like that...only to define the actual relationships represented by the database schema.

    theres no reason I can think of for which you need to use the relation() keyword to implement the lookup you are looking for, it should be implemented strictly as a method on your class, which just uses a session (via object_session(self)) and uses the Query object to return results. like this (i didnt get the query to be correct but this is the general idea):

    class Checkpoint(object):
        def _get_items(self):
             return object_session(self).query(Item).select(
                 (itemTable.c.date <= curCheckpoint.c.date)
                & not_(exists([curCheckpoint.c.checkpointId](curCheckpoint.c.checkpointId),
                     (prevCheckpoint.c.date < curCheckpoint.c.date)
                     & (itemTable.c.date < prevCheckpoint.c.date))))
        items = property(_get_items)
    

    let me know if theres something you cant accomplish with the above method that a viewonly relation() could, because i dont think there is (other than "clean"...i think my approach is "cleaner" as the Query should be the single point of user-defined lookups).

  2. Mike Bayer repo owner

    the docs have since been updated to say that viewonly is OK to use, but if it doesnt work for a more complex join then you have to resort to other methods. we do have some ongoing work with relationships that may improve upon things like the above but generally i cant support an unbounded complexity of join conditions in relation()s....so closing this.

  3. Log in to comment