optimize subqueries to not render columns which aren't referred to on the outside
The following program outputs:
SELECT b.b_id AS b_b_id,
b.a_id AS b_a_id,
c.id AS c_id,
c.c_name AS c_c_name,
c.b_id AS c_b_id
FROM c JOIN (
SELECT a.id AS a_id,
a.a_name AS a_a_name,
b.id AS b_id,
b.b_name AS b_b_name
FROM a JOIN b ON a.id = b.id
) AS b ON b.b_id = c.b_id
Note how b.b_name
is in the subselect, but not used outside. It's a minor issue because it will most probably be optimized out by the database, but it still makes the query longer than it needs to be.
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class A(Base):
__tablename__ = 'a'
id = Column(Integer, primary_key=True)
a_name = Column(String)
class B(A):
__tablename__ = 'b'
id = Column(Integer, ForeignKey("a.id"), primary_key=True)
b_name = Column(String)
class C(Base):
__tablename__ = 'c'
id = Column(Integer, primary_key=True)
c_name = Column(String)
b_id = Column("b_id", Integer, ForeignKey("b.id"))
b = relationship(B)
e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
b = B(id=1, a_name="foo", b_name="bar")
s.add(b)
c = C(id=1, b_id=b.id, c_name="boo")
s.add(c)
s.flush()
alias = aliased(B, name="b")
print s.query(C).join(alias).options(
contains_eager(C.b, alias=alias).load_only(alias.id)
)
Comments (3)
-
repo owner -
repo owner - changed milestone to 1.x.xx
- changed component to sql
- changed title to optimize subqueries to not render columns which aren't referred to on the outside
- marked as enhancement
this would only be feasible if the Python overhead of figuring this out were not prohibitive. it also means that a text-specified column expression would no longer work though I dont think that's a common use. The system here would be drawing upon similar mechanisms as that of http://docs.sqlalchemy.org/en/latest/changelog/migration_10.html#order-by-and-group-by-are-special-cases.
-
repo owner - changed status to wontfix
this is a very complicated feature without too much of a use case. A subquery can be built now to include the specific columns needed and the "flat" argument allows the "auto subquery" thing to not take place; the "flat" alias needs to be publicized some more.
- Log in to comment
the subquery in the first place makes the query longer and vastly less efficient than any effect the column within has. The inner column is there because we are "selecting from" the "selectable" formed by this mapping, which is rendered as is. An optimization that caused subselects to only render columns that are called upon on the outside would be intricate, at this point might be doable, but IMHO not really worth it, because usually you want to just get rid of the subselect.
which leads to how you should be doing this query, which is doing the alias as "flat":
see http://docs.sqlalchemy.org/en/rel_0_9/changelog/migration_09.html#many-join-and-left-outer-join-expressions-will-no-longer-be-wrapped-in-select-from-as-anon-1 for caveats on SQLite.