load_only omits primary keys when not specified
example attached below; this happens in 0.9.4
if trying to load_only a column of the child relationship and not passing in the primary_key, sqlalchemy isn't happy
the sql generated is fine and runs -- it pulls the specified column and uses query string as an anonymous column ( sidenote: is that really necessary ?)
but the ORM is unhappy and raises an error.
adding the primary key works.
using a joinedload instead of subqueryload works as well.
and hey, i made a proper test case!
CATCH_KNOWN_EXCEPTION = False
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
import sqlalchemy.exc
Base = declarative_base()
class TableA(Base):
__tablename__ = 'table_a'
id = Column(Integer, primary_key=True)
name = Column(String(50))
bs = relationship("TableB", primaryjoin="TableB.table_a_id==TableA.id")
cs = relationship("TableC", primaryjoin="TableC.table_a_id==TableA.id")
class TableB(Base):
__tablename__ = 'table_b'
id = Column(Integer, primary_key=True)
name = Column(String(30))
table_a_id = Column(Integer, ForeignKey('table_a.id'), nullable=False)
class TableC(Base):
__tablename__ = 'table_c'
id = Column(Integer, primary_key=True)
id_alt = Column(Integer, primary_key=True)
name = Column(String(30))
table_a_id = Column(Integer, ForeignKey('table_a.id'), nullable=False)
e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add_all([
TableA(id=1, name='a',),
TableA(id=2, name='aa',),
TableA(id=3, name='aaa',),
TableA(id=4, name='aaaa',),
TableB(id=1, name='b', table_a_id=1),
TableB(id=2, name='bb', table_a_id=2),
TableB(id=3, name='bbb', table_a_id=3),
TableB(id=4, name='bbbbb', table_a_id=4),
TableC(id=1, id_alt=10, name='c', table_a_id=1),
TableC(id=2, id_alt=11, name='cc', table_a_id=2),
TableC(id=3, id_alt=12, name='ccc', table_a_id=3),
TableC(id=4, id_alt=13, name='cccc', table_a_id=4),
])
s.commit()
q = s.query( TableA ).filter( TableA.id == 2 )
## this passes
try:
q_good = q.options( subqueryload('bs').load_only('id', 'name') )
result = q_good.first()
print "expected - can load `bs` with primary_key + desired column"
except:
raise
try:
q_bad = q.options( subqueryload('bs').load_only('name') )
result = q_bad.first()
except sqlalchemy.exc.NoSuchColumnError, e:
print "expected - CAN NOT LOAD `bs` with only desired column"
if not CATCH_KNOWN_EXCEPTION:
raise
try:
joinedload_works = q.options( joinedload('bs').load_only('name') )
result = joinedload_works.first()
print "expected - can load `bs` with joinedload only using the target column"
except:
raise
try:
q_bad = q.options( subqueryload('cs').load_only('name') )
result = q_bad.first()
except sqlalchemy.exc.NoSuchColumnError, e:
print "expected - CAN NOT LOAD `cs` with only desired column"
if not CATCH_KNOWN_EXCEPTION:
raise
## this also fails, only one key
try:
q_bad = q.options( subqueryload('cs').load_only('id', 'name') )
result = q_bad.first()
except sqlalchemy.exc.NoSuchColumnError, e:
print "expected - CAN NOT LOAD `cs` with only 1/2 primary keys + desired column"
if not CATCH_KNOWN_EXCEPTION:
raise
try:
q_good = q.options( subqueryload('cs').load_only('id', 'id_alt', 'name'))
result = q_good.first()
print "expected - can load `cs` with subqueryload using the target column + BOTH primary keys"
except:
raise
s.close()
Comments (9)
-
repo owner -
reporter Yes. Works on joinedload as expected. That's even in the test case.
-
reporter oh wait, that's not in the testcase I uploaded. but it's in mine
joinedload_works = q.options( joinedload('bs').load_only('name') ) print "trying joinedload_works" result = joinedload_works.first()
-
reporter - edited description
edited the testcase : * show ALL primary keys are required for subqueryload * added
CATCH_KNOWN_EXCEPTION
boolean to disable known exceptions (default is False) -
repo owner - changed title to load_only omits primary keys when not specified
- changed component to orm
this is much more basic than this, and I should have seen this. I think nobody is using load_only() yet because it is almost unusable:
q = s.query(TableA).options(load_only('name')) print q.all() # boom
-
repo owner - changed status to resolved
- Modified the behavior of :func:
.orm.load_only
such that primary key columns are always added to the list of columns to be "undeferred"; otherwise, the ORM can't load the row's identity. Apparently, one can defer the mapped primary keys and the ORM will fail, that hasn't been changed. But as load_only is essentially saying "defer all but X", it's more critical that PK cols not be part of this deferral. fixes#3080
→ <<cset 33cb84945f52>>
-
repo owner - Modified the behavior of :func:
.orm.load_only
such that primary key columns are always added to the list of columns to be "undeferred"; otherwise, the ORM can't load the row's identity. Apparently, one can defer the mapped primary keys and the ORM will fail, that hasn't been changed. But as load_only is essentially saying "defer all but X", it's more critical that PK cols not be part of this deferral. fixes#3080
→ <<cset 023d818f21ad>>
- Modified the behavior of :func:
-
Hello - I am trying to use the load_only functionality, but my primary keys have the same name across tables.
from sqlalchemy import create_engine from sqlalchemy import Column, Integer, String, ForeignKey from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import Load, relationship, load_only Base = declarative_base() class Table1(Base): __tablename__ = 'table1' global_id = Column(Integer, primary_key=True) table1_val = Column(String) r1 = relationship('Table2', backref = 'r2') class Table2(Base): __tablename__ = 'table2' global_id = Column(Integer, ForeignKey('table1.global_id'), primary_key=True) table2_val = Column(String) from sqlalchemy.orm import sessionmaker some_engine = create_engine('postgresql://scott:tiger@localhost/') Session = sessionmaker(bind=some_engine) session = Session() query = session.query(Table1, Table2).join(Table2).options(Load(Table1).load_only('table1_val'), Load(Table2).load_only('table2_val')) import pandas as pd df = pd.read_sql(query.statement, query.session.bind)
Results in this error: ValueError: Encountered duplicate field name: 'global_id'
More info can be found at (http://stackoverflow.com/questions/35966988/load-subset-of-joined-columns-in-sqlalchemy)
-
repo owner hello -
please post your issue either on the mailing list or if you can produce a complete test case as a new issue. Please provide a complete test that does not require pandas using the guidelines at http://stackoverflow.com/help/mcve.
Below, your code with a simple persist and a query produces no error:
from sqlalchemy import create_engine from sqlalchemy import Column, Integer, String, ForeignKey from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import Load, relationship, load_only Base = declarative_base() class Table1(Base): __tablename__ = 'table1' global_id = Column(Integer, primary_key=True) table1_val = Column(String) r1 = relationship('Table2', backref = 'r2') class Table2(Base): __tablename__ = 'table2' global_id = Column(Integer, ForeignKey('table1.global_id'), primary_key=True) table2_val = Column(String) from sqlalchemy.orm import sessionmaker some_engine = create_engine('sqlite://') Base.metadata.create_all(some_engine) Session = sessionmaker(bind=some_engine) session = Session() session.add(Table1(r1=[Table2()])) query = session.query(Table1, Table2).join(Table2).options( Load(Table1).load_only('table1_val'), Load(Table2).load_only('table2_val')) print query.all()
output:
[(<__main__.Table1 object at 0x7f98c2271ad0>, <__main__.Table2 object at 0x7f98c2271f10>)]
- Log in to comment
is that only subquery loading?