- attached test_mysql_disconnect.py
restarting mysql can cause connections to not be returned to the pool
If you restart the MySQL server while a query is being executed against it, this can result in the connection not being properly returned to the pool.
Attached is an example script which just loops, executing a query against MySQL and then asserting that no connections remain checkout out of the pool. AFAICT this is an appropriate invariant of the engine/pool - when no queries are being run, there should be nothing checked out.
If you run this script, then restart MySQL while it's running, a connection will often remain checked out after e.execute() returns, triggering the assertion.
As near as I can tell, what happens is the following:
- MySQL gets the signal to restart while a query is being executed
- It terminates the running query, producing error "1317 Query Execution was interrupted"
- The Connection._handle_dbapi_exception() method is invoked to deal with the error
- This error does not signal a disconnect, so we attempt to execute self._autorollback() before returning the connection to the pool
- MySQL has stopped, so this raises another error, now "2006 MySQL server has gone away"
- Connection._handle_dbapi_exception() is invoked again, but immediately re-raises the error since self._reentrant_error is now set to True
- The error propagates up to the calling code, and the connection is not returned to the pool
The connection is dead at this point, so it should be invalidated. Instead it stays around as a "zombie" connection, not referenced from anywhere but still marked as checked out by the pool.
Comments (7)
-
Account Deleted -
repo owner the attached patch is against 0.8 tip. It basically moves up the disconnect check to the top and sets a connection-wide one-way trigger. the dispose is then done in the finally. This might be a way to do this with minimal changes to the existing flow, care to try it out? (also "SHOW DATABASES" runs in like 50 ms here, I'm assuming that query is supposed to be long running?)
-
Account Deleted Great, the patch seems to have done the trick! Thanks for a speedy response.
I wonder if the should_close_with_result logic should also be moved into the finally block, in case of re-entrant errors that are not connection related?
(also "SHOW DATABASES" runs in like 50 ms here, I'm assuming that query is supposed to be long running?)
Yeah, a long-running query would have been better, but even with just "SHOW DATABASES" the bug could be quite reliably reproduced on my machine.
-
repo owner right now this code probably assumes a reentrant error is connection related, since it's doing an unconditional DBAPIError wrap in that case. At the moment I think if we're experiencing some case where each of cursor.close(), rollback(), connection.close() all may throw some new kind of exception each time, we're in a pretty catastrophic situation anyway. the next attachment here puts the autorollback in another try/finally.
If this patch is OK are you OK with 0.8 or are you stuck on 0.7?
-
repo owner - marked as blocker
-
repo owner - changed status to resolved
-
repo owner - removed milestone
Removing milestone: 0.8.xx (automated comment)
- Log in to comment
test script to demonstrate the error