Distributed testing can lock up and never finish

Issue #72 resolved
memedough created an issue

Hi Holger,

During distributed testing, py.test can lock up and never finish.

The lock-up happens at the end of the test run, on the host running py.test. I reproduced it a number of times. Hitting Ctrl-C shows the stack trace (I posted three, but I ran many more tests):

http://pastebin.com/m266c2c39

I also tracked down, to some extent, the conditions under which this occurs.

There is a module-level setup fixture. During this setup, the subprocess module is used to start a server needed for testing.

This server is still up when the testing finishes, since the teardown function does not get called for distributed testing. What appears to happen is that the Python processes on the remote host finish up and exit, leaving the server process running (which is fine, as we can work around it). However, py.test on the original host running the command appears to wait forever. Even though there is a timeout parameter in the stack trace, I have never seen it time out; maybe it's just very long.

I have verified that if the server process is stopped in the setup function, then all the processes on the remote host exit and the py.test command on the original host exits correctly, showing which tests failed.
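
The setup described above can be sketched roughly as follows. This is a hypothetical minimal test module, not the reporter's actual code: a sleeping child process stands in for the real server, `setup_module` and `teardown_module` are the module-level hooks py.test looks for, and `_server`, `server_alive` and `test_server_is_running` are made-up names for illustration.

```python
import subprocess
import sys

_server = None  # hypothetical handle to the stand-in server process


def setup_module(module):
    """Start a long-running child process before any test in this module."""
    global _server
    # Stand-in for the real server: a child Python that just sleeps.
    _server = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def teardown_module(module):
    """Stop the child. Per the report, this is not called in distributed runs."""
    global _server
    if _server is not None:
        _server.terminate()
        _server.wait()  # reap the terminated child
        _server = None


def server_alive():
    """Return True while the stand-in server process is still running."""
    # poll() returns None for as long as the child has not exited.
    return _server is not None and _server.poll() is None


def test_server_is_running():
    assert server_alive()
```

If `teardown_module` is skipped, the sleeping child outlives the test process, which matches the condition under which the lock-up was observed.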

So it would appear that when the Python processes on the remote host decide to exit, they should communicate back to the original host that they are done and exiting, even if a subprocess is still running.

:)

Comments (3)

  1. Holger Krekel repo owner
    • changed status to open

    The timeout parameter is None, coming from group.terminate(timeout=None), which means it waits indefinitely. Indeed a very long timeout :)

    I am not sure yet how the subprocess you start relates to the problem. If you are able to isolate the problem in a simple test file, that'd be cool, as I could use it to test against.
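
To illustrate why `timeout=None` never gives up, here is a small sketch using the standard library's `subprocess.Popen.wait` (Python 3.3 or later), which follows the same convention: `None` blocks until the process exits, while a number raises `subprocess.TimeoutExpired` once it elapses. It does not use execnet's `group.terminate`, and `wait_or_kill` is a hypothetical helper introduced only for this example.

```python
import subprocess
import sys


def wait_or_kill(proc, timeout):
    """Wait up to `timeout` seconds for proc; kill it if it is still running.

    Returns True if the process exited on its own, False if it was killed.
    With timeout=None this blocks until the process exits, however long.
    """
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed process
        return False
```

Calling `wait_or_kill(proc, timeout=None)` on a process that never exits would hang in the same way as the wait seen in the stack trace, whereas a finite timeout bounds it.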

  2. Holger Krekel repo owner

    Can you try again with py-1.2 and pytest-xdist and see if the problem persists? The underlying execnet and distribution code received a cleanup, particularly related to termination handling, but I am not sure it properly covers your case. Thanks, holger

  3. Holger Krekel repo owner

    I believe this issue should be fixed already, so I am closing it. If not, please try with recent versions and re-open.
