Don't lose data when doing non-blocking I/O

#12 Declined
  1. Stefano Rivera

Continuing on from 3f96afe7cdc2, I looked for all other possible data loss, when EAGAIN's could be thrown.

The common scenario is that many layers of the stream stack temporarily cache read data, while read()ing in a loop, and then return the data, concatenated. This breaks when you receive an EAGAIN.

I started by auditing the code to find possible problems, then writing tests that could expose them, then fixing the bugs. There isn't a strict relationship between tests and patches, rather, vaguley-comprehensive tests.

The tests are a bit ugly, I can't see any way to avoid massive code duplication or multiple asserts per test. Suggestions welcome :)

Comments (11)

    1. Stefano Rivera author

      Yes, but where can I define that function. One can't call neighbor methods from the test methods. (I've never met pytest before, and it seems rather magicky. Esp with the multiple objectspaces in pypy)

  1. Amaury Forgeot d'Arc

    - Could you also add tests that run on a translated pypy-c? It's important, because python 3 uses a very different code (the _io module) and your changes are likely to be lost.

    - How does CPython (2.7 and 3.2 behave in this area?)

    1. Stefano Rivera author

      I did my best to copy cpython (2.7) and general C behavior. Basically, any read() can throw EAGAIN if there's nothing to be read. And a readline() can return a partial line if it runs out of data. Of course, this isn't quite as easy to test in cpython, because we can't play with internals, but I get the same results when running bursty jobs in subprocess.

      In fact, the reason I did any of this at all was so that we could get pygame's test suite to run under Pypy. Without this patch, it randomly looses data that subprocesses output. Here's a slightly modified example of that test suite's problem (the sleep may need tweaking between cpython and pypy to get nice bursty reads):

    2. Stefano Rivera author

      Could you also add tests that run on a translated pypy-c?

      Don't think so. Could create one or two tests, showing the type of problem, but all the stream classes need to be tested, and one can't create them all from app-space (or simulate this problem with all the types of stream that one *can* create, from app space).