from_self() with contains_eager, confirm usage pattern and add documentation notes
The test case below generates the following query:
SELECT
label_alias.id,
label_alias.name,
anon_1.venue_id,
anon_1.venue_name
FROM
label AS label_alias, -- this should come from the inner query
(SELECT
venue.id AS venue_id,
venue.name AS venue_name
FROM
venue
JOIN
venue_to_label AS venue_to_label_1
ON venue.id = venue_to_label_1.venue_id
JOIN
label AS label_alias
ON label_alias.id = venue_to_label_1.label_id
WHERE
label_alias.name IN (
%(
name_1
)s
)) AS anon_1
However, based on the alchemy query, I'd expect the label_alias to come from the inner sql query instead of from the outer. Is there an easy way to resolve this?
To give some background: We're using this strategy generically with joinedload instead of contains_eager to serialize all our queries to json. It works really well, however now the requirement came up to only return "certain joins from the joinedload".
I've tested the issue with SQLAlchemy==1.0.8 Flask==0.10.1 Flask-SQLAlchemy==2.0
Test case:
import unittest
from sqlalchemy import (
Table, Column, Integer, ForeignKey, String, func)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import (relationship, aliased, contains_eager)
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
# -- create all the database models
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = ("sqlite://")
db = SQLAlchemy(app)
Base = declarative_base()
venue_to_label = Table(
'venue_to_label', db.metadata,
Column('venue_id', Integer, ForeignKey('venue.id'), primary_key=True),
Column('label_id', Integer, ForeignKey('label.id'), primary_key=True)
)
class Label(db.Model):
__tablename__ = 'label'
id = Column(Integer, primary_key=True, nullable=False)
name = Column(String(254))
class Venue(db.Model):
__tablename__ = 'venue'
id = Column(Integer, primary_key=True, nullable=False)
name = Column(String(254))
labels = relationship(Label, secondary=venue_to_label)
db.drop_all()
db.create_all()
class TestContainsEager(unittest.TestCase):
def test_contains_eager(self):
query = Venue.query
# filtered join
label_alias = aliased(Label, name="label_alias")
query = query.join(label_alias, Venue.labels)
query = query.filter(label_alias.name.in_(["label1"]))
# together with a windowing function this allows to correctly apply
# limit and order to the query, needed since labels is a *-to-many join
query = query.from_self()
# define to load from the inner query
query = query.options(
contains_eager(
'labels', alias=label_alias
).load_only('id', 'name')
)
# contains_eager does not pick up the alias from the inner query
import sqlalchemy.dialects.postgresql as postgresql
print query.statement.compile(dialect=postgresql.dialect())
Comments (16)
-
repo owner -
repo owner - changed component to documentation
- marked as task
- changed milestone to 1.1
- changed title to from_self() with contains_eager, confirm usage pattern and add documentation notes
-
reporter Ah, I tried something like that before but used joinedload instead of add_columns. This now works sort of (we're making progress!).
Unfortunately now when I run "query.all()" I get a separate venue per label. So a venue with two labels would be returned as two venues. Is there a way to force the merge?
Edit: I can provide a more elaborate test case if that helps. Please let me know.
Edit: Just experimenting with manual de-duplication, will report here once I know more.
-
reporter So something like:
result = {r[0].id: r[0] for r in query.all()}.values()
seems to do the job! Thank you for the help! Note that his doesn't preserve the order.
-
repo owner we can do better. Call from_self() like this:
query = query.from_self(Venue)
then you get a query that's just against Venue again, not those columns.
-
reporter Hah, wish I had known that one before :)
-
repo owner again, it's obscure. the whole from_self() thing in general is pretty unknown :)
-
reporter I'm now trying to in-cooperate this into the monolith class we're using. Might be back if I find more issues. Thank you again for your quick help!
-
reporter So I gave up for now. Was running into trouble with a deferred column_property that has a bindparam and correlated_except in it. Might come back to this later.
Here is my question: Other than using contains_eager is there a way to filter and order by "joinedload" table columns? Basically a "WHERE" and "ORDER BY" after the joinedload options have been applied. I'm willing to monkey patch.
-
repo owner filtering by "joinedload" is a little weird, at least if you're dealing with a related collection, because the collection would only contain what you filtered on, not everything that's in that collection. For a many-to-one relationship it makes more sense.
anyway, the equation only works one way - either you want to eager load out of the SQL you've already written, or you want eager loading to be transparently bolted on to the query.
the only resort after that is to load the two types of objects separately and construct the relationships manually afterward; you'd want to use set_committed_value() for this, see the example at https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/DisjointEagerLoading
-
reporter Let me try to clarify my question with an example. The following sql statement is generated by the test case at the bottom of this post:
SELECT anon_1.venue_id, anon_1.venue_name, label_1.id, label_1.name FROM (SELECT venue.id AS venue_id, venue.name AS venue_name FROM venue JOIN venue_to_label AS venue_to_label_1 ON venue.id = venue_to_label_1.venue_id JOIN label ON label.id = venue_to_label_1.label_id WHERE label.name IN ( %( name_1 )s )) AS anon_1 LEFT OUTER JOIN ( venue_to_label AS venue_to_label_2 JOIN label AS label_1 ON label_1.id = venue_to_label_2.label_id ) ON anon_1.venue_id = venue_to_label_2.venue_id
My question now is, is there a way to append "WHERE" and "ORDER BY" to that query? Basically if I could append
WHERE label_1.name IN (...)
to the query, that would solve my issue.
Here is the test case:
import unittest from sqlalchemy import ( Table, Column, Integer, ForeignKey, String) from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import ( relationship, aliased, contains_eager, joinedload) from flask import Flask from flask.ext.sqlalchemy import SQLAlchemy # -- create all the database models app = Flask(__name__) app.config['SQLALCHEMY_DATABASE_URI'] = ("sqlite://") db = SQLAlchemy(app) Base = declarative_base() venue_to_label = Table( 'venue_to_label', db.metadata, Column('venue_id', Integer, ForeignKey('venue.id'), primary_key=True), Column('label_id', Integer, ForeignKey('label.id'), primary_key=True) ) class Label(db.Model): __tablename__ = 'label' id = Column(Integer, primary_key=True, nullable=False) name = Column(String(254)) class Venue(db.Model): __tablename__ = 'venue' id = Column(Integer, primary_key=True, nullable=False) name = Column(String(254)) labels = relationship(Label, secondary="venue_to_label") db.drop_all() db.create_all() class TestContainsEager(unittest.TestCase): def test_joined_load(self): query = Venue.query query = query.join(Label, Venue.labels) query = query.filter(Label.name.in_(["label1"])) query = query.from_self() # define to load from the inner query query = query.options( joinedload('labels').load_only('id', 'name') ) import sqlalchemy.dialects.postgresql as postgresql print query.statement.compile(dialect=postgresql.dialect())
-
repo owner if you truly want the two JOINs, one on the inside and another on the outside, after you do your from_self() you'd emit outerjoin() again on the Query, and then to make the new columns become part of the collection you'd use contains_eager.
-
reporter Solving this will require more time than I can currently spend. I'll come back once I have implemented this or run into more trouble :) Thanks you very much for the help!
-
repo owner - changed status to resolved
- write a real inline documentation section for from_self(),
building on the example of query.join(). Ideally all the methods
in Query would have this kind of thing. fixes
#3544
→ <<cset 8fa62d652b8d>>
-
repo owner - write a real inline documentation section for from_self(),
building on the example of query.join(). Ideally all the methods
in Query would have this kind of thing. fixes
#3544
(cherry picked from commit 8fa62d652b8d1834e23ba36e56c7bd711cc37e37)
→ <<cset fb16fe1b1942>>
- write a real inline documentation section for from_self(),
building on the example of query.join(). Ideally all the methods
in Query would have this kind of thing. fixes
-
repo owner - write a real inline documentation section for from_self(),
building on the example of query.join(). Ideally all the methods
in Query would have this kind of thing. fixes
#3544
(cherry picked from commit 8fa62d652b8d1834e23ba36e56c7bd711cc37e37)
→ <<cset 2506d7202f49>>
- write a real inline documentation section for from_self(),
building on the example of query.join(). Ideally all the methods
in Query would have this kind of thing. fixes
- Log in to comment
so what's happening here is that once you do a "from_self", the subquery that's going to be wrapped is now "baked"; its structure won't change. But in this case, contains_eager() has the desire to add additional columns to select from, to the top of the SELECT, in order to populate the collection. When it does so, the inner subquery can't provide what it needs, because it doesn't supply the columns from Label needed on the outside.
Now, this is very easy to fix, however note that I had no idea what to do with this for about ten minutes, so this is not at all intuitive until you see it, then it seems easy, but as far as users knowing to do this, I'd not expect anyone to come up with this without some serious thinking. So there is definitely a place here that we might want to add a documentation note to contains_eager(). Because if I don't even think it's possible at first, well, I wrote the thing so there you go :).
Anyway - we just add those columns to the inner query so that they are available:
then we get the right statement:
if you can confirm that this actually works with data and returns the right results and all that, that would be great.