Bugfix: Socket timeouts in wsgi server not caught (eventlet issue #143)

#33 Declined
  1. pauloppenheim-liveloop

This patch for issue #143 catches socket.timeout from the traceback in there the highest point in the eventlet.wsgi.server code it can, at process_request. I could have gone one further and caught in the bare _spawn_n_impl, but that feels like over-generalizing and also I'd rather avoid changing exception handling in that method right now. I don't want to poke that bear — for instance, why is traceback.print_exc being called directly instead of server.log.write(traceback.format_exc())? etc. Seems simpler to keep the bugfix in the wsgi server.

It's been a long time, and I'm not sure who the current maintainer is (hello!), but I'm probably not in the Linden Contributor Agreement list, but I am an ex-Linden employee and so should have enough paperwork already in. If that's not how things work anymore, this patch is also MIT licensed or whatever you need to merge it in.

Comments (18)

  1. Floris Bruynooghe

    Could you add a test for this as well please?

    AFAIK there is no one from Linden active anymore. As for legalities I think that as long as people are happy to contribute using the same MIT license all is fine, no idea what Linden wanted for paperwork.

  2. pauloppenheim-liveloop author

    Tests didn't work on windows, I've patched my fork. I'm guessing from your mention of lack of Linden involvement that the build farm isn't running tests on this one anymore. I'll sniff around more for info on that.

    Once that's patched:

    Ran 576 tests in 17.426s
    FAILED (SKIP=220, errors=224, failures=20)

    It's mostly related to signal.SIGALRM being nonexistant on windows, but part of block detection, ala

    ERROR: test_wrap_tuple (tests.tpool_test.TestTpool)
    Traceback (most recent call last):
      File "c:\Users\poppy\Workspace\velour\venv\src\eventlet\tests\tpool_test.py", line 43, in tearDown
      File "c:\Users\poppy\Workspace\velour\venv\src\eventlet\eventlet\tpool.py", line 292, in killall
      File "c:\Users\poppy\Workspace\velour\venv\src\eventlet\eventlet\greenthread.py", line 272, in kill
      File "c:\Users\poppy\Workspace\velour\venv\src\eventlet\eventlet\hubs\hub.py", line 225, in run
      File "c:\Users\poppy\Workspace\velour\venv\src\eventlet\eventlet\hubs\hub.py", line 93, in block_detect_pre
        tmp = signal.signal(signal.SIGALRM, alarm_handler)
    AttributeError: 'module' object has no attribute 'SIGALRM'

    Aside from windows currently being soaked in failures, I've got the beginnings of a test, but I believe I have found a deeper problem with said test. Hopefully it's worth committing soon.

  3. pauloppenheim-liveloop author

    Eventlet works so well on windows so far, I'm surprised to hear your surprise. Admittedly, this is not in production yet. You should be more surprised I'm using windows! (safely, in a VM on a linux system)

  4. pauloppenheim-liveloop author

    Yeah... I wanted to write something that both appeared the same as the bug to the end user, and also didn't rely on timers / time.time() to help reduce test runtime. Monkeypatch wrappers seemed the quickest way to make the insides blow up visibly to the outsides. But, yeah, not the most straightforward code I've ever written! Generally when your comments have to include ascii art...

  5. Sergey Shepelev

    @pauloppenheim-liveloop Hello, Paul. I am the current maintainer.

    No contributor paper is required since 2008, as far as I know. We also have an official Github mirror if you prefer it: https://github.com/eventlet/eventlet

    Thanks for your patch. I have a few minor questions:

    • Could you describe circumstances under which the resource module is not available?
    • While I appreciate creativeness too, have you considered using a separate process approach, like in env_test.py, fork_test.py? I'm all for simplest possible code.
    • class ExplodingSocketFile(socket_fileobject): Any specific reason to not inherit from socket._fileobject without temporary variable?
  6. pauloppenheim-liveloop author

    Hello @Sergey Shepelev,

    Sorry for the delay in response, I lost track of this. Yes, I will use the github mirror in the future; their notifications are much more verbose. As far as your questions,

    • resource is not available in Windows - it is Unix only. http://docs.python.org/2/library/resource.html

    • the socket_fileobject = socket._fileobject is leftover from experiments. Deleted. Thanks for noticing!

    • re: multiple processes, like env_test and fork_test - The test is already inscrutable, and I feel like having the test written as a string that gets dumped to a file and ran adds an large and unnecessary level of inscrutability. I do agree that the isolation of another process gives some comfort, but we're talking about tests - they should already be running isolated from production code. Despite monkeypatching, I don't appear to have done anything that will change system objects beyond the lifespan of the test, so other tests should be fine. If desired, I could add a step to dir(socket) before and after to verify they are equal.

    Also, after looking into fork_test and env_test, I noticed that ProcessBase.setUp doesn't call super(self.__class__, self).setUp() - doesn't it need to for actual timeouts?

  7. Sergey Shepelev
    • On inscrutability. You don't have to put them into string. Separate files in repository work even better.
    • we're talking about tests - they should already be running isolated from production code That is not the problem. The problem is isolation of one test from another. In absence of such isolation, basically we can't trust any test result. Right? don't appear and should be fine is not the level of insurance I would like to get from running tests.
    • Simulate server connection socket timeout without actually waiting. maybe actually waiting would simplify it all. There is quite a lot of places in tests where we actually wait some reasonable small time, like 10-50-100ms. It's even more acceptable if tests run in parallel.
    • On ProcessBase.setUp - thank you, I will add the call. BTW, super(self.__class__, self) won't work in all cases, don't write that, there is a reason one must pass exact class as first argument.
  8. Sergey Shepelev

    @pauloppenheim please have a look here https://github.com/eventlet/eventlet/commits/wsgi-fix-socket-timeout this is a branch with your work ready to be merged

    I merged resource and setUp/tearDown fixes separately. They're already in master.

    In that branch I extracted common code from ProcessBase so you don't have to copy file. And removed dependency on _TestBase in wsgi_test_conntimeout.py because I believe these subprocess tests must be as small and simple as possible.

    If you say OK, I will merge it all.

  9. pauloppenheim-liveloop author

    I really like what you've done. I initially wrote wsgi_test_conntimeout.py without using unittest, but felt weird copying things over. Your take on it is better than mine was. Also, the run_python helper is a much better way of going about running than a subclass of ProcessBase! I thought about doing that, but thought it would make my already-large pull request too big.

    Using assert instead of the unittest assert helpers is a little weird (that's a lot of code to replace assertRaises) but that's just a problem in the way asserts are structured in the unittest helpers. There's a bug on python testing that has a suggestion that asserts should be separate, but that's not the world we live in yet.

    I'm happy with your take on this, go ahead and merge. Thank you for your effort, @Sergey Shepelev!

    As far as your questions about bug #109, I will post there.