Memory Leak 0.7.5

Issue #2427 resolved

Iuri Diniz created an issue 2012-03-09

Hello folks,

I have a same program running forever using SQLAlchemy thought many of my servers.

On a brand new server I've detected that my program increase usage memory quickly (12 hours) until 2.0GB and the only solution for me is kill it. The same program is running on old servers without any problem. For example, it is using 254MB since 2012-02-14.

Each server uses a separated virtual-env. So I have detect that old servers running SQLAlchemy<=0.7.4 has no problem and the new server using 0.7.5 has no problem.

problematic server is using: (see full list problematic-pip-freeze.txt) MySQL-python==1.2.3 SQLAlchemy==0.7.5 Python==2.6.6

no problematic server is using (see full list unproblematic-pip-freeze.txt) MySQL-python==1.2.3 SQLAlchemy==0.7.4 Python==2.5.2

Comments (17)

Iuri Diniz reporter
- attached no-problematic-pip-freeze.txt
NO Problematic server packages
- 2012-03-09T14:50:10+00:00
Iuri Diniz reporter
- attached problematic-pip-freeze.txt
Problematic server packages
- 2012-03-09T15:04:29+00:00
Iuri Diniz reporter
Also, a downgrade to 0.7.4 solves my problem
- 2012-03-09T15:06:49+00:00
Mike Bayer repo owner
- assigned issue to
- changed component to orm
Any chance you can step through versions of SQLAlchemy to narrow it down ? that is, "hg up -r XYZ" ?

Here are all the revisions generated by hg log -rrel_0_7_4:rel_0_7_5 -v -b default:

http://paste.pocoo.org/show/563523/
- 2012-03-09T16:54:26+00:00
Mike Bayer repo owner
and of course a test case would be best, thanks !
- 2012-03-09T16:54:42+00:00
Iuri Diniz reporter
Ok,

I will try to make a test and make a binary bug search across the versions.
- 2012-03-09T22:29:16+00:00
Iuri Diniz reporter
- attached test_memleak.py
do something and print gc size
- 2012-03-10T07:49:10+00:00

Iuri Diniz reporter

First version of test_memleak.py

0.7.4 output:

sqlalchemy.__version__ = 0.7.4
len(gc.get_objects())=16206
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489

0.7.5 output:

sqlalchemy.__version__ = 0.7.5
len(gc.get_objects())=16239
len(gc.get_objects())=16557
len(gc.get_objects())=16592
len(gc.get_objects())=16627
len(gc.get_objects())=16662
len(gc.get_objects())=16697
len(gc.get_objects())=16732
len(gc.get_objects())=16767
len(gc.get_objects())=16802
len(gc.get_objects())=16837
len(gc.get_objects())=16872
len(gc.get_objects())=16907

Is it sufficient?

2012-03-10T07:52:11+00:00

Iuri Diniz reporter
Better version: https://gist.github.com/2011352

0.7.4 output:
```
sqlalchemy.__version__ = 0.7.4, leak=False
```
0.7.5 output:
```
sqlalchemy.__version__ = 0.7.5, leak=True
```
- 2012-03-10T08:14:07+00:00

Iuri Diniz reporter

binary search done:

(sqlalquemy-hg)iuri@moeda:~/Sources/SQLALCHEMY/src$ hg bisect -g
The first bad revision is:
changeset:   8022:09a503e49724b61a8119f0b7855a990a29fc1521
user:        Mike Bayer <mike_mp@zzzcomputing.com>
date:        Sun Jan 22 14:04:20 2012 -0500
summary:     - [bug](bug) Fixed bug whereby a table-bound Column

(sqlalquemy-hg)iuri@moeda:~/Sources/SQLALCHEMY/src$

done by using bisect extension:

hg clone http://hg.sqlalchemy.org/sqlalchemy src
cd src

export PYTHONPATH=$PYTHONPATH:$PWD/build/lib.linux-x86_64-2.7/

# rel_0_7_5                       8072:667249f7b19325e1512f1a9ece29c5894e679508 (BAD)
# rel_0_7_4                       7976:5421b51a479a4c04da5175b58c8d8557ba2bc51c (GOOD)

# reset
hg bisect -r

# going to a bad revision
hg up rel_0_7_5

# build and test
rm -rf build && python setup.py build && python ../tests/test_memleak.py
# this revision is bad
hg bisect -b

# revision 7976 is good
hg bisect -g 7976

# build and test again
# ...

2012-03-10T08:41:46+00:00

Mike Bayer repo owner
- marked as blocker
- assigned issue to
- changed component to cextensions
- changed milestone to 0.7.6
yes, the bug is in the C extensions. Delete the .so files for the moment within sqlalchemy/ to alleviate your leak.
- 2012-03-10T11:31:27+00:00

Mike Bayer repo owner

so this seems like a basic reference counting bug, related to the tuple created in resultproxy. If I apply this patch to r0822:

diff -r 09a503e49724b61a8119f0b7855a990a29fc1521 lib/sqlalchemy/engine/base.py
--- a/lib/sqlalchemy/engine/base.py Sun Jan 22 14:04:20 2012 -0500
+++ b/lib/sqlalchemy/engine/base.py Sat Mar 10 14:38:50 2012 -0800
@@ -2596,7 +2596,7 @@
             processor = type_._cached_result_processor(dialect, coltype)

             processors.append(processor)
-            rec = (processor, obj, i)
+            rec = (processor, None, i)

             # indexes as keys. This is only needed for the Python version of
             # RowProxy (the C version uses a faster path for integer indexes).
@@ -2608,7 +2608,7 @@
                 # columns colliding by name is not a problem as long as the
                 # user does not try to access them (ie use an index directly,
                 # or the more precise ColumnElement)
-                keymap[name.lower()](name.lower()) = (processor, obj, None)
+                keymap[name.lower()](name.lower()) = (processor, None, None)

             if dialect.requires_name_normalize:
                 colname = dialect.normalize_name(colname)

then the bug is gone. So I need some help understanding reference counters here - the C ext doesn't refer to this second element of the tuple at all. I tried adding Py_DECREF(record) in there but that just breaks things.

I am able to fix it by pulling out that second element from the "record" and calling Py_DECREF on it (see below). But this seems like the wrong place to do this - it's not clear to me what activity causes the reference increment on this object - is it the PyDict_GetItem that increments the reference on it (seems very unlikely)? Is it because we have a reference to "keymap"? Seems weird we'd need to access references that we otherwise haven't accessed at all.

diff -r 09a503e49724b61a8119f0b7855a990a29fc1521 lib/sqlalchemy/cextension/resultproxy.c
--- a/lib/sqlalchemy/cextension/resultproxy.c   Sun Jan 22 14:04:20 2012 -0500
+++ b/lib/sqlalchemy/cextension/resultproxy.c   Sat Mar 10 14:43:12 2012 -0800
@@ -242,7 +242,7 @@
 {
     PyObject *processors, *values;
     PyObject *processor, *value;
-    PyObject *row, *record, *result, *indexobject;
+    PyObject *row, *record, *result, *indexobject, *tokenobject;
     PyObject *exc_module, *exception;
     char *cstr_key;
     long index;
@@ -278,7 +278,11 @@
                 return NULL;
         }

+        tokenobject = PyTuple_GetItem(record, 1);
+        Py_DECREF(tokenobject);
+
         indexobject = PyTuple_GetItem(record, 2);
+
         if (indexobject == NULL)
             return NULL;

@@ -302,7 +306,6 @@
                     "try 'use_labels' option on select statement.", cstr_key);
             return NULL;
         }
-
         index = PyInt_AsLong(indexobject);
         if ((index == -1) && PyErr_Occurred())
             /* -1 can be either the actual value, or an error flag. */

2012-03-10T17:49:49+00:00

Mike Bayer repo owner
- changed status to resolved
pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !
- 2012-03-10T19:19:41+00:00
Former user Account Deleted
- changed watchers to kb@retailarchitects.com
Replying to zzzeek:

pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !

Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?
- 2012-03-12T13:07:14+00:00
Mike Bayer repo owner
Replying to kentbower:

Replying to zzzeek:

pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !

Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?

that sounds ominous, what part might be missing ?
- 2012-03-12T13:17:22+00:00
Former user Account Deleted
Replying to zzzeek:

Replying to kentbower:

Replying to zzzeek:

pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !

Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?

that sounds ominous, what part might be missing ?

No, no...I haven't looked; I was just going to apply that patch but wanted to make sure it was self contained, but your reply is as good as a "yes"
- 2012-03-12T13:27:59+00:00
Mike Bayer repo owner
- removed milestone
Removing milestone: 0.7.6 (automated comment)
- 2014-02-23T18:14:54+00:00
Log in to comment

Assignee: –

Type: bug

Priority: blocker

Status: resolved

Component: cextensions

Milestone: –

Votes: 0

Watchers: 0