Issue #1314 resolved

Unconsumed iterated responses are not closed properly

Allan Crooks
created an issue

I've attached two files to show this - "gensomebytes.py" which is the CherryPy server and "readsomebytes.py" which will read a kilobyte of data and just exit.

The server module uses an iterator to return a response, and it logs when the iterator is instantiated, finalised, and explicitly closed, and when the generator starts and stops being consumed.

If you start up the server and run the script "readsomebytes.py", it connects to the server, reads some data, and then exits. If I modify the script to request only a small amount of data (so that it reads the whole response), you get log messages like these:

[20/Apr/2014:02:30:36]  Instantiated <__main__.StreamingResource object at 0x7f1358031d90>.
127.0.0.1 - - [20/Apr/2014:02:30:36] "GET /resource/20 HTTP/1.1" 200 - "" ""
[20/Apr/2014:02:30:36]  Beginning iteration of <__main__.StreamingResource object at 0x7f1358031d90>.
[20/Apr/2014:02:30:36]  Finishing iteration of <__main__.StreamingResource object at 0x7f1358031d90>.
[20/Apr/2014:02:30:36]  Finalizing <__main__.StreamingResource object at 0x7f1358031d90>.

and if you increase the size of the response data so that the script doesn't read everything:

[20/Apr/2014:02:30:44]  Instantiated <__main__.StreamingResource object at 0x7f1358031e50>.
127.0.0.1 - - [20/Apr/2014:02:30:44] "GET /resource/200 HTTP/1.1" 200 - "" ""
[20/Apr/2014:02:30:44]  Beginning iteration of <__main__.StreamingResource object at 0x7f1358031e50>.

Even though the script terminates and the socket closes, no explicit closing or garbage collection takes place. If I repeat the request enough times, I eventually hit a threshold where these unconsumed iterators are cleared up in one fell swoop. Is there something we can do to clean up these iterators when the socket disconnects?

Comments (3)

  1. Allan Crooks reporter

    OK, this isn't CherryPy's fault, but it is something we can't ignore.

    I think the fact that the generator isn't explicitly closed means we end up in a position where the garbage collector has trouble determining whether it is able to finalize it. Other posts that I've read online seem to indicate that this is the expected behaviour - though I'll have to find that later.

    If I modify the attached server module to not execute the CherryPy server upon import, I can demonstrate the problem from a standard Python prompt:

    >>> import gensomebytes
    >>> a = gensomebytes.StreamingResource(20)
    [20/Apr/2014:18:33:43]  Instantiated <gensomebytes.StreamingResource object at 0x19cca10>.
    >>> del a
    >>> import gc
    >>> gc.collect()
    4
    >>>
    >>> a = gensomebytes.StreamingResource(20)
    [20/Apr/2014:18:34:42]  Instantiated <gensomebytes.StreamingResource object at 0x7f454122c450>.
    >>> a.iterator.close()
    >>> 
    >>> del a
    [20/Apr/2014:18:34:49]  Finalizing <gensomebytes.StreamingResource object at 0x7f454122c450>.
    

    As far as I can tell, the instance object refers to a generator, and that generator holds a reference back to the instance because it is aware of "self", so the two form a reference cycle that plain reference counting cannot free.
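
A minimal sketch of the cycle described above, assuming CPython. The `StreamingResource` class here is a hypothetical stand-in, not the one from the attached "gensomebytes.py", and `alive_after_del` is a helper invented for the illustration. It checks whether an instance survives `del` when the cyclic garbage collector is switched off, with and without closing the partly consumed generator first.

```python
import gc
import weakref


class StreamingResource:
    """Hypothetical stand-in for the class in gensomebytes.py."""

    def __init__(self, size):
        self.size = size
        # The generator's frame keeps `self` alive, and `self` keeps the
        # generator alive: a reference cycle.
        self.iterator = self._generate()

    def _generate(self):
        for _ in range(self.size):
            yield b"x" * 1024


def alive_after_del(close_first):
    """Return True if the instance survives `del` without a GC pass."""
    gc.disable()  # rule out an automatic collection during the check
    try:
        resource = StreamingResource(20)
        next(resource.iterator)  # start consuming, like a real response
        if close_first:
            # Closing finishes the generator's frame, which drops its
            # reference to `self` and so breaks the cycle.
            resource.iterator.close()
        probe = weakref.ref(resource)
        del resource
        return probe() is not None
    finally:
        gc.enable()
        gc.collect()  # clean up whatever the cycle left behind


if __name__ == "__main__":
    print("alive without close:", alive_after_del(close_first=False))
    print("alive after close:  ", alive_after_del(close_first=True))
```

Without the explicit `close()`, the object lingers until a cyclic collection runs, which would match the delayed, batched clean-up seen on the server.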

    So I think CherryPy needs to do three things: 1) Call "close" on generators when errors occur. 2) Have iterator wrappers delegate closing to their internal iterator. 3) Provide some mechanism to tell non-generator-based iterators to close.
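
The three steps proposed above could be sketched roughly as follows. This is not CherryPy code: `ClosingIteratorWrapper`, `close_if_possible`, and `stream` are hypothetical names chosen for the illustration. The only convention assumed is that an iterator may expose an optional `close()` method, which is what generators do and what the WSGI specification expects servers to call on a response iterable.

```python
def close_if_possible(obj):
    """Call obj.close() if it exists (covers generators and custom iterators)."""
    close = getattr(obj, "close", None)
    if callable(close):
        close()


class ClosingIteratorWrapper:
    """Hypothetical wrapper that delegates close() to the wrapped iterator."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)

    def close(self):
        # Step 2: pass the close request down to the inner iterator.
        close_if_possible(self._iterator)


def stream(body, write):
    """Write each chunk of `body`, closing it even if writing fails."""
    try:
        for chunk in body:
            write(chunk)
    finally:
        # Steps 1 and 3: always close, whether or not an error occurred,
        # and whether or not the body is a generator.
        close_if_possible(body)


if __name__ == "__main__":
    events = []

    def chunks():
        try:
            yield b"a"
            yield b"b"
        finally:
            events.append("closed")

    def failing_write(chunk):
        # Simulates the client disconnecting mid-response.
        raise ConnectionError("client went away")

    try:
        stream(ClosingIteratorWrapper(chunks()), failing_write)
    except ConnectionError:
        pass
    print(events)
```

Using `getattr` rather than a type check means any iterator that opts in by defining `close()` gets cleaned up, not just generators.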
