- attached validator_bug.py
reentrant attribute access wtihin validators/AttributeExtension affects history accounting
If we have a self-referential, many-to-many relation (e.g. network of nodes), when we define a validator on one side of the symmetrical relationship (namely children), append works or doesn't work correctly depending on whether the data is commited or not. Works fine if there's no validator or changes are commited at the very end of operation.
An obvious workaround is to change the append to append to the other side of the relationship (ie. parent.children.append instead of children.parents.append), so I've set the ticket priority to low. However, something is definitely wrong with the code and this can be a manifestation of some nastier bug, ergo the ticket.
Self-contained test case is attached. My setup: SQLAlchemy 0.5.5, Python 2.5 on win32
Comments (10)
-
Account Deleted -
repo owner - marked as major
- changed title to add "enable_backrefs=False" to @validator(), document single-direction nature of them
- changed milestone to 0.5.6
backrefs don't fire off for unloaded collections, and it would be seriously inefficient if they did for many-to-many relations. every append() of an item would potentially result in a full load of a new collection of arbitrary length !
I think this is more of a documentation issue - validators only apply to in-python operations, not "de facto" collection modifications that occur due to database operations resulting from a related collection in the other direction.
the solution would be to validate on each side where you expect an append() operation in python to occur.
the only thing bugging me here is that yes, if both sides are loaded, then both event handlers fire off. so this may be a behavioral change I'd want to set up for 0.6.0 - that @validator doesn't fire off for a backref operation (or, better yet, lets just add a flag to @validator("foo", enable_backrefs=False))
-
Account Deleted Now I'm confused. You say that backrefs don't fire off for unloaded collections. However, a validator defined for 'children' will fire (and work correctly) when doing a node.parents.append(). The test case demonstrates this (eg. trying to add a duplicate node will raise an exception).
What doesn't work is that append() silently fails, and only when intermidiate commits() are made. In pseudocode:
This works:
create node1 create node2 create node3 node2.parents.append(node1) node3.parents.append(node2) commit #node1.children would now be [node3](node2,)
But this doesn't:
create node1 commit create node2 commit create node3 commit node2.parents.append(node1) node3.parents.append(node2) commit #node1.children would now be [node3](node3) #node2 silently failed to be appended!
But this works again:
create node1 commit create node2 commit create node3 commit #Accessing node1.children magically makes code work again assert node1.children == [] node2.parents.append(node1) node3.parents.append(node2) commit
-
repo owner - changed title to reentrant attribute access wtihin validators/AttributeExtension affects history accounting
sorry, I read "doesn't work", saw that you were expecting it to work primarily from the reverse end, and assumed it meant your validator wasn't working as expected. But in fact you're hitting "self.children" which is loading the collection unconditionally. That is actually the source of the issue here, changing the state of the collection within the enclosing operation that is trying to do the same thing is resetting the "dirty" flag on the item and the collection isn't being flushed.
While the use case pictured in this test case is a lot easier to achieve by just using a
set()
, it is a bug though I won't know if it's resolvable until I more closely pinpoint the mechanics of this condition. -
Account Deleted The actual use case is a bit more complicated. The validator is used to prevent circulairty in the graph and therefore must traverse parents/children in a non-trivial way.
However, as there is a workaround (in this test case, using {{{node1.children.append(node2)}}} instead of {{{node2.parents.append(node1)}}}), the issue is not too critical. I just wanted to point out the bug, to make sure its not a manifestation of some deeper problem.
-
repo owner Here's a patch that addresses at least trapping where the bookkeeping goes wrong:
Index: lib/sqlalchemy/orm/attributes.py =================================================================== --- lib/sqlalchemy/orm/attributes.py (revision 6320) +++ lib/sqlalchemy/orm/attributes.py (working copy) @@ -700,6 +700,7 @@ collection = self.get_collection(state, dict_, passive=passive) if collection is PASSIVE_NO_RESULT: value = self.fire_append_event(state, dict_, value, initiator) + assert self.key not in dict_, "Collection was loaded during event handling." state.get_pending(self.key).append(value) else: collection.append_with_event(value, initiator) @@ -711,6 +712,7 @@ collection = self.get_collection(state, state.dict, passive=passive) if collection is PASSIVE_NO_RESULT: self.fire_remove_event(state, dict_, value, initiator) + assert self.key not in dict_, "Collection was loaded during event handling." state.get_pending(self.key).remove(value) else: collection.remove_with_event(value, initiator)
I think this + the aforementioned enable_backrefs=False flag to avoid having the error be raised is at least the way to go for now. there are ways the bookkeeping can try to "adjust" inside the block there but the sequence of events gets a little more intricate.
-
Account Deleted - attached backref_validator_test.py
A test that fails when backref validators are fired and passes when only the correct validator is fired
-
Account Deleted I confirm that the changes throw an assert where append() used to silently fail.
I think that it's better for validators not to fire for the backref side of the relation, since it's redundant. Then the programmer would explicitly need to use {{{enable_backrefs=True}}} if he really knows what he's doing (and the funcionality has been fully implemented, which at the moment it isn't for self-referential relations).
Since this is a lower-priority issue, maybe the following course of action would be the best:
- Disble firing validators on the backref side of a relation.
- Explicitly note this in the section on validators in the documentation.
- One day, when a clear use case is found and the funcionality has been implemented correctly, add enable_backrefs flag, False by default. Setting it to True would cause the backref validator to be fired also.
Attached: a test case that fails when backref validators are fired and passes when only the correct validator is fired.
-
repo owner - changed status to resolved
assertion is in f83c9a3959e25e5817bae6f0ca0015f9054baf8d 83eafb241349c06c02f6db799186869e3ee540cb. the validates flag is
#1535. -
repo owner - removed milestone
Removing milestone: 0.5.6 (automated comment)
- Log in to comment
Test case showing the bug