Query.one() fails to raise MultipleResultsFound

Issue #1688 resolved
Former user created an issue

G'day,

I've got a situation where a query results in multiple rows, but calling one() does not result in a MultipleResultsFound exception being raised.

I am using polymorphic mapping and then joining the 'base' table with one of the 'identity' tables. (Please see the attached unit test which I believe illustrates the problem clearly).

I can confirm that two rows are returned by the database, but these rows are put into an IdentitySet and happen to be equal due to the query only selecting columns from the base class (this is why the problem only manifests when using polymorphic mapping). This causes the call to one() to return a single row even though each row results in a different object once mapped polymorphically. I am aware that the query is not well constrained (it should explicitly join on the two tables where Node.node_id==BranchOffice.node_id) but I guess that the assertion aspect of using one() should catch the problem in my query.

I've also attached a patch containing the fix that I have deployed here, but I am by no means certain that it is the best solution.

Please let me know if I can provide any extra information, try another solution or whatever.

Regards, David Jagoe

Comments (8)

  1. Mike Bayer repo owner

    Using query.count() is not an option since that would add 100% overhead to every one() operation, and count() in any case is not reliable in a generic sense (only works for simple queries).

    query.one() counts the number of distinct object identities. Here, your query is returning only one unique object identity. So even though the query is returning the unintended results, one() is functioning as designed.

  2. Former user Account Deleted

    Ok, agreed that query.count() is no good.

    I also understand that query.one() counts distinct object identities, but when using polymorphic mapping I think we can get into trouble. Let me try to explain a little better what I have seen:

    This is what the node table looks like in my fixture:

    sqlite> select * from node;
    1||company
    2|1|branch_office
    3||company
    4|3|branch_office
    

    This is the query that results from the query in my unit test (except notice that I have left off the 'LIMIT 2 OFFSET 0' that results from the line that does list(self[0:2](0:2))

    sqlite> SELECT node.node_id AS node_node_id, node.parent_id AS node_parent_id, node.type AS node_type 
       ...> FROM node, branchoffice
       ...> WHERE node.type = 'branch_office' AND branchoffice.name = 'Bristol';
    2|1|branch_office
    2|1|branch_office
    4|3|branch_office
    4|3|branch_office
    

    If I understand correctly, there are actually 2 identity objects here. So my original patch actually just removes the slice so that line 1321 in query.py (SQLAlchemy 0.5.8) just reads

    ret = list(self)
    

    I presume that the slice is an optimisation? Is it necessary given that the caller to one() assumes that there is only one result? Maybe there are cases where lots of rows result in the same object identity?

    But I became concerned of a deeper problem... what if there were only two rows returned at the database level, like this:

    2|1|branch_office
    2|1|branch_office
    

    then at first glance it seems that will result in two objects of the same identity. Indeed if we were not using polymorphic mapping then that is true, but with polymorphic mapping I don't know which mapped object I'm going to get back. Is it the branch office that belongs to Road Runner or the one that belongs to Coyote? Since there is more than one answer to my query when using polymorphic mapping, I assume that one() should raise an exception. Or perhaps I could just contribute some documentation in the area?

  3. Mike Bayer repo owner

    the list(self) is something I will think about. the limit is an optimization. I am considering the case when a result is huge, and the library will appear to "hang" instead of quickly raising, or when the result is huge but genuinely contains only one identity, though im not sure what that use case really is.

    In your second case, the two rows are identical. It doesn't matter if there is polymorphic mapping, those rows represent the same type of object. The process of loading objects is only about what the rows actually state, not the intention of the query.

    The documentation here could be changed to say that one() counts unique identities returned in the result set, not actual number of rows.

  4. Mike Bayer repo owner

    I'm hoping you'll be satisfied with the changes in 4e9a2307a719667ffe27157146fa6c3dabcdb65b, which includes documentation details as well as opening up one() to fully fetch all rows. A very basic join() can elicit the same problem so it was worth changing. your attached test case passes.

    this does not guard against a result that unintentionally returns rows that are all of the same identity, however. detecting that sort of condition is out of scope.

  5. Former user Account Deleted

    Great, the change looks good. I also understand fully understand your points about the polymorphic mapping now.

    Cheers, David

  6. Log in to comment