support self-referential auto-joins when the foreign key joins to itself

Issue #1401 resolved
Former user created an issue

See attached test case. Basically, we have a class called Item with a relation on itself called group and we try to do: session.query(Item).filter(~Item.group.any(id = 2))

The generated sql is:

SELECT item.id AS item_id, item.group_id AS item_group_id FROM item WHERE NOT (EXISTS (SELECT 1 FROM item AS item_1 WHERE item.group_id = item.group_id AND item_1.id = :id_1))

Note that it has the tautological WHERE item.group_id = item.group_id

This should be WHERE item_1.group_id = item.group_id

Comments (20)

  1. Mike Bayer repo owner

    I didn't notice that the trick here is you're equating "group_id" to itself, so this is not an adjacency list model. This is way out of scope for what our current self-referential relations do, although its interesting that the "lazy load" feature does function. Unfortunately all of our self referential joining/aliasing logic relies upon the fact that the two columns are different columns of the table, so for this use case you'd have to build your own @comparator that does what you need.

  2. Mike Bayer repo owner

    lets look at what we might do:

    diff -r b918c4876103315227cd1f5f54b309f1cec6ceca lib/sqlalchemy/orm/properties.py
    --- a/lib/sqlalchemy/orm/properties.py  Wed May 04 19:06:01 2011 -0400
    +++ b/lib/sqlalchemy/orm/properties.py  Fri May 06 11:49:58 2011 -0400
    @@ -1206,17 +1206,17 @@
                 else:
                     if dest_selectable is not None:
                         primary_aliasizer = ClauseAdapter(dest_selectable,
    -                            exclude=self.local_side,
    +                            exclude_fn=lambda c: "local_side" in c._annotations,
                                 equivalents=self.mapper._equivalent_columns)
                         if source_selectable is not None:
                             primary_aliasizer.chain(
                                 ClauseAdapter(source_selectable,
    -                                exclude=self.remote_side,
    +                                exclude_fn=lambda c: "remote_side" in c._annotations,
                                     equivalents=self.parent._equivalent_columns))
                     elif source_selectable is not None:
                         primary_aliasizer = \
                             ClauseAdapter(source_selectable,
    -                            exclude=self.remote_side,
    +                            exclude_fn=lambda c: "remote_side" in c._annotations,
                                 equivalents=self.parent._equivalent_columns)
                     secondary_aliasizer = None
                 primaryjoin = primary_aliasizer.traverse(primaryjoin)
    
    1. get "local_side" and "remote_side" into the annotations of all the primaryjoin conditions
    2. everything still has to work despite the presence of annotations, "c is c" equivalence is gone, not sure what would happen there. perhaps need to use the "annotated" pj just for joins.
    3. find everywhere else we're doing local/remote exclusion
    4. get tests that are against relationship._create_joins directly
  3. Mike Bayer repo owner

    OK here's another approach that is much simpler, though it wouldn't work for situations where the join condition has a more complex relationship than "col == col", perhaps it can be further improved:

    diff -r 132f5c7e0437fb62237ab33bb9dea3befd5ab233 lib/sqlalchemy/orm/properties.py
    --- a/lib/sqlalchemy/orm/properties.py  Wed Feb 01 10:14:28 2012 -0500
    +++ b/lib/sqlalchemy/orm/properties.py  Wed Feb 01 11:04:50 2012 -0500
    @@ -1477,6 +1477,7 @@
         def _create_joins(self, source_polymorphic=False, 
                                 source_selectable=None, dest_polymorphic=False, 
                                 dest_selectable=None, of_type=None):
    +
             if source_selectable is None:
                 if source_polymorphic and self.parent.with_polymorphic:
                     source_selectable = self.parent._with_polymorphic_selectable
    @@ -1539,7 +1540,8 @@
                         secondary_aliasizer.traverse(secondaryjoin)
                 else:
                     primary_aliasizer = ClauseAdapter(dest_selectable,
    -                        exclude=self.local_side,
    +                        exclude=self.local_side.difference(self.remote_side),
    +                        half_include = self.remote_side.intersection(self.local_side),
                             equivalents=self.mapper._equivalent_columns)
                     if source_selectable is not None:
                         primary_aliasizer.chain(
    diff -r 132f5c7e0437fb62237ab33bb9dea3befd5ab233 lib/sqlalchemy/sql/expression.py
    --- a/lib/sqlalchemy/sql/expression.py  Wed Feb 01 10:14:28 2012 -0500
    +++ b/lib/sqlalchemy/sql/expression.py  Wed Feb 01 11:04:50 2012 -0500
    @@ -3342,9 +3342,13 @@
         def _from_objects(self):
             return self.left._from_objects + self.right._from_objects
    
    -    def _copy_internals(self, clone=_clone, **kw):
    +    def _copy_internals(self, clone=_clone, half_include=None, **kw):
    +        do_half = half_include is not None and \
    +                    self.left in half_include and \
    +                    self.right is self.left
             self.left = clone(self.left, **kw)
    -        self.right = clone(self.right, **kw)
    +        if not do_half:
    +            self.right = clone(self.right, **kw)
    
         def get_children(self, **kwargs):
             return self.left, self.right
    diff -r 132f5c7e0437fb62237ab33bb9dea3befd5ab233 lib/sqlalchemy/sql/util.py
    --- a/lib/sqlalchemy/sql/util.py    Wed Feb 01 10:14:28 2012 -0500
    +++ b/lib/sqlalchemy/sql/util.py    Wed Feb 01 11:04:50 2012 -0500
    @@ -669,8 +669,8 @@
           s.c.col1 == table2.c.col1
    
         """
    -    def __init__(self, selectable, equivalents=None, include=None, exclude=None, adapt_on_names=False):
    -        self.__traverse_options__ = {'stop_on':[selectable](selectable)}
    +    def __init__(self, selectable, equivalents=None, include=None, exclude=None, half_include=None, adapt_on_names=False):
    +        self.__traverse_options__ = {'stop_on':[selectable](selectable), 'half_include':half_include}
             self.selectable = selectable
             self.include = include
             self.exclude = exclude
    
  4. Mike Bayer repo owner

    just to experiment with annotations in the primaryjoin:

    diff -r 132f5c7e0437fb62237ab33bb9dea3befd5ab233 lib/sqlalchemy/orm/properties.py
    --- a/lib/sqlalchemy/orm/properties.py  Wed Feb 01 10:14:28 2012 -0500
    +++ b/lib/sqlalchemy/orm/properties.py  Wed Feb 01 11:19:26 2012 -0500
    @@ -1052,6 +1052,9 @@
                         "'secondaryjoin' is needed as well."
                         % self)
    
    +        from sqlalchemy.sql import util as sql_util
    +        self.primaryjoin = sql_util._deep_annotate(self.primaryjoin, {'_nothing':True}, set())
    +
         def _columns_are_mapped(self, *cols):
             """Return True if all columns in the given collection are 
             mapped by the tables referenced by this :class:`.Relationship`.
    

    what fails here, at least one thing, is _create_lazy_clause() calculates improperly when it checks the "binds" collection. We might want to make some small tests of annotated columns to see if they behave correctly in hash lookups (we seem to have this in test_selectable, though still seeing odd behavior here).

  5. Mike Bayer repo owner
    • changed milestone to 0.8.0

    this would be a great one to get in. if using helpers like local(mycol) == remote(mycol) means we can remove _local_remote_columns, we finally have relationship() figured out totally. if we go to a totally annotated approach it can even be somecolumn == foreign(someothercolumn) - annotations could be used to express the entire thing, unambiguously. consider reworking the whole of RelationshipProperty to work from annotated primaryjoin/secondaryjoin all the way through.

  6. Mike Bayer repo owner

    ongoing effort with the refactoring is at https://bitbucket.org/zzzeek/sqlalchemy_1401. Most of the guts of RelationshipProperty will move to a new module called relationships where we encapsulate all the primaryjoin/secondaryjoin stuff. The system represents everything as a fully annotated join condition, then everything else is derived from that. The idea is that a pre-annotated join condition can be passed to relationship(), removing the need for "foreign_keys", "remote_side", "local_remote_pairs", as well as supporting a wider range of joins such as that of #610.

  7. Log in to comment