Long polling cripples cherrypy

shapeshifter created an issue

I'm having trouble making long polling work correctly with cherrypy:

Here's a test case. There's a boolean "self.done" which can be switched using "/set_done" and "/set_pending". While it's False, "/long_poll" will be kept open and when calling "/set_done", it will return a response saying "Done!".

The problem: after calling long_poll a number of times (about 5-10), the server becomes very unresponsive. It gets even worse when the long_poll connections are *killed* from the client side; in that case, the server stops responding entirely. This happens even though I've set the thread pool limit to -1.

cherrypy_config:

[global]
tree.test = test.app
server.socket_host = '0.0.0.0'
server.socket_port = 8090
server.thread_pool_max = -1
tools.encode.encoding = 'utf-8'
tools.encode.on = True 
log.error_file = "/tmp/cp_errors.log"

[/]
tools.sessions.on = False

test.py:

# -*- coding: utf-8 -*-

import time
import cherrypy

class Test(object):
    def __init__(self):
        self.done = False
    
    @cherrypy.expose
    def long_poll(self):
        cherrypy.response.timeout = 7200
        while True:
            if self.done == False:
                time.sleep(0.5)
            else:
                return "Done!\n"

    @cherrypy.expose
    def set_done(self):
        self.done = True
        return "set done\n"

    @cherrypy.expose
    def set_pending(self):
        self.done = False
        return "set pending\n"

    @cherrypy.expose
    def test(self):
        return "still working OK\n"

app = cherrypy.Application(Test())
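
For comparison, the polling loop in long_poll wakes every 0.5 seconds and holds its thread until "done" flips. Below is a minimal sketch of the same flag built on threading.Event with a bounded wait. It is plain Python without CherryPy, and the names DoneFlag and wait_done, the "pending" reply, and the 30-second default are hypothetical choices for illustration, not part of the test case:

```python
import threading


class DoneFlag(object):
    """Hypothetical stand-in for the self.done boolean, built on threading.Event."""

    def __init__(self):
        self._event = threading.Event()

    def set_done(self):
        # Wakes every thread currently blocked in wait_done().
        self._event.set()

    def set_pending(self):
        self._event.clear()

    def wait_done(self, timeout):
        # Blocks for at most `timeout` seconds; returns True if the flag was set.
        return self._event.wait(timeout)


def long_poll(flag, timeout=30.0):
    """Return "Done!\n" if the flag is set in time, otherwise a retry marker."""
    if flag.wait_done(timeout):
        return "Done!\n"
    # Assumed convention: the client simply issues another long_poll.
    return "pending\n"
```

With a bounded wait, a handler whose client has gone away gives its thread back after at most `timeout` seconds instead of looping until "done" is set.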

Here's a sample run using just wget.

Start the server:

[straydog@saskatoon testcase]$ cherryd -c cherrypy_config
[07/Jan/2012:15:16:25] ENGINE Mounted: cherrypy._cptree.Application(<test.Test object at 0x96b6d8c>, '') on /
[07/Jan/2012:15:16:25] ENGINE Listening for SIGHUP.
[07/Jan/2012:15:16:25] ENGINE Listening for SIGTERM.
[07/Jan/2012:15:16:25] ENGINE Listening for SIGUSR1.
[07/Jan/2012:15:16:25] ENGINE Bus STARTING
[07/Jan/2012:15:16:25] ENGINE Started monitor thread '_TimeoutMonitor'.
[07/Jan/2012:15:16:25] ENGINE Started monitor thread 'Autoreloader'.
[07/Jan/2012:15:16:25] ENGINE Serving on 0.0.0.0:8090
[07/Jan/2012:15:16:25] ENGINE Bus STARTED

Initial test:

[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/test -O -
still working OK

Start one long_poll in the background to demo the functionality and then call set_done. Everything works as expected:

[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[1] 1794
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/set_done -O -
set done
[shapeshifter@bluerabbit cherrypylongpolltest]$ Done!

[1]+  Done                    wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O -

Set pending again, and then open several long_poll connections in the background:

[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/set_pending -O -
set pending
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/test -O -
still working OK
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[1] 1798
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[2] 1799
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[3] 1800
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[4] 1801
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[5] 1802
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[6] 1803
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[7] 1804
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[8] 1805
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[9] 1806
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[10] 1807
[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/long_poll -O - &
[11] 1808

Now even calling "test", which should just return a message, hangs forever:

[shapeshifter@bluerabbit cherrypylongpolltest]$ wget -q http://saskatoon.icu.uzh.ch:8090/test -O -
#not working anymore!

At this point, even "killall wget" doesn't help; if anything, it makes the problem worse.

I have no idea how to remedy this, or whether this behavior is to be expected. With a thread pool limit of -1, though, I'd expect cherrypy to always be able to answer requests, even with several open connections.

Comments (1)

  1. shapeshifter

    Actually, I just discovered what might be the actual problem:

    changing the __init__ and long_poll methods as follows:

    class Test(object):
        def __init__(self):
            self.done = False
            self.count = 0
        
        @cherrypy.expose
        def long_poll(self):
            self.count += 1                          # number each new connection
            this_id = self.count
            cherrypy.response.timeout = 30
            print(cherrypy.server.thread_pool_max)   # the configured limit
            while not self.done:
                print(this_id)                       # shows which handlers are still running
                time.sleep(0.5)
            return "Done!\n"
    

    adds a counter for each new long_poll connection and also prints out the thread_pool_max setting. Now it's easy to see that A) at most 10 connections are ever served and B) the threads keep running even after the clients disconnect. With more than 10 connections open, the output looks like this:

    10
    2
    5
    1
    4
    9
    3
    8
    7
    6
    10
    2
    1
    5
    4
    9
    3
    8
    7
    6
    10
    2
    1
    5
    4
    9
    3
    8
    7
    6
    10
    2
    5
    1
    4
    9
    3
    8
    7
    10
    6
    2
    5
    1
    4
    9
    3
    8
    7
    10
    6
    2
    

    even when all wget instances have been killed, or new ones started. An 11 never appears, and past 10 connections the server is unresponsive. This happens even though each new connection first prints

    -1
    

    which indicates that the thread limit is unlimited. No idea what I'm doing wrong.
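
    A possible explanation, not confirmed anywhere in this report: CherryPy has a separate server.thread_pool setting (the number of worker threads, 10 by default), while server.thread_pool_max is only an upper bound. If the pool does not grow on its own (an assumption here), ten blocked long_poll handlers would occupy every worker, and that would match the ids 1-10 above. The sketch below reproduces that kind of starvation with the standard library only. It is not CherryPy code, and POOL_SIZE = 3 and the function names are stand-ins for illustration:

```python
import concurrent.futures
import threading

POOL_SIZE = 3  # small stand-in for a fixed number of worker threads

done = threading.Event()


def long_poll():
    done.wait()  # occupies one worker until done is set
    return "Done!\n"


def quick():
    return "still working OK\n"


def run_demo():
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=POOL_SIZE)
    # Fill every worker with a blocked long poll.
    polls = [pool.submit(long_poll) for _ in range(POOL_SIZE)]
    # The quick request is queued because no worker is free.
    probe = pool.submit(quick)
    try:
        probe.result(timeout=0.5)
        starved = False
    except concurrent.futures.TimeoutError:
        starved = True
    done.set()  # equivalent of calling /set_done
    answer = probe.result(timeout=5)
    pool.shutdown()
    return starved, answer, [p.result() for p in polls]
```

    In this sketch the quick request gets its answer only after the event is set and a worker becomes free, which mirrors "test" hanging once enough long polls are open.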
