topological.organize_as_tree may create an overly deep tree
this script illustrates topological.organize_as_tree creating a very deep tree as a result of many dependencies that are not related to each other; this within unitofwork _sort_circular_dependencies
. The script can cause a recursion overflow during the flush process, but its very very heisenbuggy; running it on different versions/platforms will have the bug appear, then not -in particular it seems to require both dont_generate_error
and generate_error
to run fully in order to trip it.
But the main idea is that organize_as_tree
needs to be more intelligent about this kind of graph so that the tree is not very deep.
import sqlalchemy as sa
from sqlalchemy import create_engine, MetaData, orm
from sqlalchemy import Column, ForeignKey
from sqlalchemy import Integer, String
from sqlalchemy.orm import mapper
class Object(object):
pass
class Q(Object):
pass
class A(Object):
pass
class C(Object):
pass
class WC(C):
pass
engine = create_engine('sqlite:///:memory:', echo=True)
sm = orm.sessionmaker(autoflush=True, transactional=True, bind=engine)
SA_Session = orm.scoped_session(sm)
SA_Metadata = MetaData()
object_table = sa.Table('Object',
SA_Metadata,
Column('ObjectID', Integer,primary_key=True),
Column('Type', String(1), nullable=False))
q_table = sa.Table('Q',
SA_Metadata,
Column('QID', Integer, ForeignKey('Object.ObjectID'),primary_key=True))
c_table = sa.Table('C',
SA_Metadata,
Column('CID', Integer, ForeignKey('Object.ObjectID'),primary_key=True))
wc_table = sa.Table('WC',
SA_Metadata,
Column('WCID', Integer, ForeignKey('C.CID'), primary_key=True))
a_table = sa.Table('A',
SA_Metadata,
Column('AID', Integer, ForeignKey('Object.ObjectID'),primary_key=True),
Column('QID', Integer, ForeignKey('Q.QID')),
Column('CID', Integer, ForeignKey('C.CID')))
mapper(Object, object_table, polymorphic_on=object_table.c.Type, polymorphic_identity='O')
mapper(Q, q_table, inherits=Object, polymorphic_identity='Q')
mapper(C, c_table, inherits=Object, polymorphic_identity='C')
mapper(WC, wc_table, inherits=C, polymorphic_identity='W')
mapper(A, a_table, inherits=Object, polymorphic_identity='A',
properties = {
'Q' : orm.relation(Q,primaryjoin=a_table.c.QID==q_table.c.QID,
backref='As'
),
'C' : orm.relation(C,primaryjoin=a_table.c.CID==c_table.c.CID,
backref='A',
uselist=False)
}
)
SA_Metadata.create_all(engine)
def generate_error():
q = Q()
for j in range(400): #at 306 the error does not pop out (depending on recursion depth)
a = A()
a.Q = q
a.C = WC()
SA_Session.save(q)
SA_Session.commit() #here the error pops out
def dont_generate_error():
q = Q()
SA_Session.save(q)
for j in range(600):
a = A()
a.Q = q
a.C = WC()
SA_Session.commit()
dont_generate_error()
generate_error()
If you comment out dont_generate_error()
, and place change the number of iterations in generate_error()
from 400 to 10, and then place a pdb.set_trace()
on line 734 of unitofwork.py, you get results like these for tuples
and head
:
(Pdb) [y.class_) for x, y in tuples]((x.class_,)
['__main__.Q'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>), (<class '__main__.Q'>, <class '__main__.A'>), (<class '__main__.WC'>, <class '__main__.A'>)]((<class)
above you can see that each A
is dependent on the Q
as well as each WC
. But that's it; a tree of these dependencies should place all the A
s as a straight down list after the WC
s and single Q
. But looking at the head
which is the tree:
(Pdb) head
(<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b550>, [[(<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b690>, [](],), [object at 0x2ac75ca95150>, []((<sqlalchemy.orm.attributes.InstanceState), [object at 0x2ac75ca9b390>, []((<sqlalchemy.orm.attributes.InstanceState), [object at 0x2ac75ca95350>, []((<sqlalchemy.orm.attributes.InstanceState), [])](]))), (<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b7d0>, [[(<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b810>, [](],), [])](]))), (<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b410>, [[(<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b450>, [](],), [])](]))), (<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b590>, [[](],)), (<sqlalchemy.orm.attributes.InstanceState object at 0x2ac75ca9b6d0>, [[](],))])])])
(Pdb) head[2](2)[0](0)[2](2)[0](0)[2](2)[0](0)[2](2)
[object at 0x2ac75ca95350>, []((<sqlalchemy.orm.attributes.InstanceState), [])](]))
(Pdb)
the structure is deeply nested. When the number of objects is upped to the hundreds we get recursion overflows when this tree is acted upon.
Comments (12)
-
reporter -
reporter or convert
UOWExecutor
to not be recursive. Some more recursiveness has been removed in 5b99a7ccc8d306ce198d1d684763b861ef2a96c7. -
reporter at this point the error is in UOWExecutor. A little trickier to make that non recursive.
-
reporter - changed milestone to 0.7.0
the entire tree notion could be taken out based on the idea at
#1742. If we're able to proceed with that then this ticket is implicitly fixed. -
reporter the above test works just fine using http://bitbucket.org/zzzeek/uow_rethink/. on 0.6 I can get it to overflow if I bump the 400 up to 1800. I think at the very least this will be a good test to compare callcounts with across the two approaches.
-
Trying the branch with my codebase I get a ton of this:
File "uow_rethink/lib/sqlalchemy/orm/query.py", line 1343, in all return list(self) File "uow_rethink/lib/sqlalchemy/orm/query.py", line 1450, in __iter__ self.session._autoflush() File "uow_rethink/lib/sqlalchemy/orm/session.py", line 879, in _autoflush self.flush() File "uow_rethink/lib/sqlalchemy/orm/session.py", line 1347, in flush self._flush(objects) File "uow_rethink/lib/sqlalchemy/orm/session.py", line 1425, in _flush flush_context.execute() File "uow_rethink/lib/sqlalchemy/orm/unitofwork.py", line 198, in execute for rec in cycles File "uow_rethink/lib/sqlalchemy/orm/unitofwork.py", line 198, in <genexpr> for rec in cycles File "uow_rethink/lib/sqlalchemy/orm/unitofwork.py", line 371, in per_state_flush_actions False): File "uow_rethink/lib/sqlalchemy/orm/mapper.py", line 1312, in _per_state_flush_actions for rec in prop.per_state_flush_actions(uow, states_for_prop, isdelete): File "uow_rethink/lib/sqlalchemy/orm/properties.py", line 1222, in per_state_flush_actions for rec in self._dependency_processor.per_state_flush_actions(uow, state, isdelete): TypeError: 'NoneType' object is not iterable
-
reporter apparently you're using mappings that aren't covered in any tests. The issue you're having is most likely solved in the latest tip of that branch. The new uow is more deliberate about various contingencies and requires more test coverage than the old version.
-
reporter -
reporter - changed milestone to 0.6.1
-
reporter - changed milestone to 0.6.0
-
reporter - changed status to resolved
#1742is now fixed. -
reporter - removed milestone
Removing milestone: 0.6.0 (automated comment)
- Log in to comment
an alternative here is to do away with the whole "tree" thing and modify the way
UOWExecutor
works so that it is not using nested calls to handle "cyclical" dependencies. Or just inliningUOWExecutor
. But it would be nicer to do away with the "tree" concept completely.