topological sort relies on hefty recursion at the end

Issue #423 resolved
Mike Bayer repo owner created an issue

we have to rewrite the _create_batched_tree method to not use recursion the way its doing - very easy for it to run out and its inefficient.

        tuples = [       for i in range(0,1500, 2):
            tuples.append((i, i+1))
        head = DependencySorter(tuples, [](]
)).sort()

== kaboom

Comments (19)

  1. Former user Account Deleted

    As this is apparently a source of heisenbugs I tested this very thoroughly. Ensured by hand that the code is isomorphic to original, ran some manually engineered testcases and lots of randomly constructed tests. Other than some valid variations of tree structure when dictionary ordering changed in topological sort, the output is identical.

  2. Former user Account Deleted

    Added an early out for nodes without dependencies, which should be a pretty common case. For lists of independent nodes the asymptotic complexity is now O(n) instead of O(n^2^). Worst case, which is the case in the bug description, is still O(n^2^), although about 7 times faster than before this patch. I don't see a good way to make that any faster either.

  3. Mike Bayer reporter

    im sure its great, the issue was mostly the recursion overflow....im a little short on time this weekend so ill try to review this when i get a chance.

    can you post your test cases ? have you been able to run through all SA unit test ? (in particular test/orm/alltests.py , with a good strict database like postgres ?)

  4. Former user Account Deleted

    I have two unit tests failures, but they are exactly the same as before the patch. Namely:

    ======================================================================
    FAIL: test_recycle (engine.pool.PoolTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/ants/work/sqlalchemy/trunk/test/engine/pool.py", line 167, in test_recycle
        assert id(c3.connection) != c_id
    AssertionError
    
    ======================================================================
    FAIL: testmanytoonepolymorphic (orm.inheritance5.RelationTest4)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/ants/work/sqlalchemy/trunk/test/orm/inheritance5.py", line 296, in testmanytoonepolymorphic
        assert str(usingProperty) == "Engineer E4, status X"
    AssertionError
    
    ----------------------------------------------------------------------
    

    My own tests were hand crafted nonautomatic tests that I used to verify correctness. The random test generated random graphs and compared output of old algorithm and new algorithm and alerted when they were not exactly the same. I verified for all exceptions that the differences were trivial reorderings due to different topological sort order. The difficult part isn't generating the testcases, but verifying the results. I suppose I could make a testcase processor that can verify that the produced tree is valid by validating that all dependency constraints hold. If you deem it necessary I will create it, but it's going to take a bit as I have to do a nasty database migration in the next week.

  5. Former user Account Deleted

    Forgot to mention, those results are on Python 2.5, Postgres 8.1.5, x86_64, SVN revision 2255. Just tried with 2267, same failures + 3 errors from orm.inheritance5.

  6. Mike Bayer reporter

    what OS are you on, linux ? (its frequent that tests which pass on OSX fail on linux, usually due to differences in dictionary ordering). im curious if the latest trunk (2271) fixes those particular test failures....even though I dont believe they are related to the topological sorting (although its remotely possible).

  7. Former user Account Deleted

    It doesn't fix those particular issues but there are lots more now. I'll attach the test log. Anyway, this has got nothing to do with this patch as test log is identical with or without the patch.

  8. Mike Bayer reporter

    i know, i apologize for the tangent. as far as the topological patch, im pretty sure its OK, i just want to make sure I understand it.

    that log output looks very much like the tests are being run off the wrong version of the code...particularly the quoting test, i cant see how that one would screw up unless youre running against the older version of the code. ill try running all tests on a linux box later today.

  9. Former user Account Deleted

    Ok, this is embarrasing. I for some reason figured that the tests would augment the path with ../lib automatically. So the previous runs were against my SQLAlchemy install, not against trunk. Reran the unittests with correct path and found a bug in my code. it was throwing an exception on empty sequences. If I fixed that everything but the test_recycle in engine.pool.PoolTest ran OK. Sorry for the confusion.

  10. Mike Bayer reporter

    the unit tests do prepend './lib/' to the sys.path, its at the top of test/testbase.py, which is generally needed since easy_install favors installed .eggs over paths in PYTHONPATH (except on my system, where i added an init file to override that).

  11. Mike Bayer reporter

    i think you are good for now, this is a patch where I need to find the time to really pore through the algorithm and make sure i understand it (since I will usually be the one supporting it if new errors arise). i have some time constraints including traveling currently but this is on my high priority list of items and hopefully will be in the next release.

  12. Former user Account Deleted

    Maybe I can making understanding the algorithm simpler. It's actually exactly the same algorithm as before, only transformed into iterative form, so there exists a 1:1 mapping between the old version and the new version. Here are some of the mappings: The ''sort(index + 1, l2)'' recursion is transformed into ''for node in reversed(nodes):'' loop. The ''l'' and ''l2'' lists are mapped to ''independets'' list. ''for n in l2:'' loop is transformed into the ''for index in xrange(len(independents)-1,-1,-1):'' loop. Because we don't rebuild the independents list on each recursion/iteration, the ''if previous_dependent_on_node: add_previous_to_nodes_children, else: add_previous_to_new_independents_list'' part changes to ''if previous_dependent_on_node: add_previous_to_nodes_children, remove_previous_from_independents''.

    The search_dep function underwent a slightly larger transformation. Previously each recursive dependency check meant checking recursively for each node in the tree, 1) if that node is in the potential parents dependency list or 2) in any of the potential parents cycles dependency lists or if 3) any of that nodes cycles is in the potential parents dependency list. This is replaced by for each independent subtree collecting a set of all nodes in that subtree. Then checking 1) and 2) maps to checking the intersection of that set and the set of all the dependencies of the parent node and its cycles (the new all_deps function). For checking 3) we also collect a set of all the cycles in a subtree and check whether it interesects with the parent nodes dependencies. Note that there is a slight unorthogonality here as the childs cycles aren't checked against parents cycles dependencies. I'm not sure whether it should be so or not, but it was so in the old algorithm. If it's a bug, it would actually simplify the algorithm a bit as the subtree and cycle sets could be merged.

    I hope that this makes understanding the algorithm simpler. My suggestion is to merge the patch as soon as you make the decision to put it in. Leaves more time for testing to find potential bugs.

  13. Mike Bayer reporter

    youre being pushy, but youre expending tremendous effort to do so so thats OK :). so fine, its in, changeset:2312, it works extremely well as I ran the unit tests with session logging on into a text file and did a diff of the same run against both versions, and the log outputs are more or less identical. and of course the test in this ticket passes.

    also i didnt really understand the code that I had already written for that section anyway, like i understood the general idea but i found it just as hard to follow as the new version (im not a deep algorithm guy), so i have an equal chance of understanding either one at this point. thanks for all the comments.

  14. Log in to comment