- attached no-problematic-pip-freeze.txt
Memory Leak 0.7.5
Hello folks,
I have the same program, built on SQLAlchemy, running continuously across many of my servers.
On a brand-new server I've noticed that the program's memory usage grows quickly, reaching 2.0GB within about 12 hours, and the only remedy is to kill it. The same program runs on the older servers without any problem; for example, one instance has been using 254MB since 2012-02-14.
Each server uses a separate virtualenv. I have determined that the old servers running SQLAlchemy<=0.7.4 show no problem, while the new server running 0.7.5 does have the problem.
The problematic server is using (see full list in problematic-pip-freeze.txt): MySQL-python==1.2.3, SQLAlchemy==0.7.5, Python 2.6.6
The non-problematic server is using (see full list in unproblematic-pip-freeze.txt): MySQL-python==1.2.3, SQLAlchemy==0.7.4, Python 2.5.2
Comments (17)
-
reporter -
reporter - attached problematic-pip-freeze.txt
Problematic server packages
-
reporter Also, downgrading to 0.7.4 solves the problem.
-
repo owner - assigned issue to
- changed component to orm
Any chance you can step through versions of SQLAlchemy to narrow it down ? that is, "hg up -r XYZ" ?
Here are all the revisions between the two releases, generated by:
hg log -rrel_0_7_4:rel_0_7_5 -v -b default
repo owner and of course a test case would be best, thanks !
-
reporter OK,
I will put together a test case and do a binary search across the revisions.
-
reporter - attached test_memleak.py
do something and print gc size
-
reporter First version of test_memleak.py
0.7.4 output:
sqlalchemy.__version__ = 0.7.4
len(gc.get_objects())=16206
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
len(gc.get_objects())=16489
0.7.5 output:
sqlalchemy.__version__ = 0.7.5
len(gc.get_objects())=16239
len(gc.get_objects())=16557
len(gc.get_objects())=16592
len(gc.get_objects())=16627
len(gc.get_objects())=16662
len(gc.get_objects())=16697
len(gc.get_objects())=16732
len(gc.get_objects())=16767
len(gc.get_objects())=16802
len(gc.get_objects())=16837
len(gc.get_objects())=16872
len(gc.get_objects())=16907
Is it sufficient?
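The gc-based check above generalizes into a small self-contained sketch. This is not the attached test_memleak.py (which exercises SQLAlchemy); the leaky/steady workloads here are stand-ins, and the warm-up pass exists because caches legitimately fill on the first few calls, as the 0.7.4 output above shows:

```python
import gc

def grows_unbounded(fn, warmup=3, samples=5):
    """Call fn repeatedly and report whether the number of gc-tracked
    objects keeps climbing after a warm-up period (which lets caches fill)."""
    for _ in range(warmup):
        fn()
    counts = []
    gc.collect()
    baseline = len(gc.get_objects())
    for _ in range(samples):
        fn()
        gc.collect()
        counts.append(len(gc.get_objects()))
    # a leak shows up as a strictly increasing object count
    return all(b > a for a, b in zip([baseline] + counts, counts))

leaked = []

def leaky():
    # simulates a leak: every call retains a new container forever
    leaked.append([object() for _ in range(10)])

def steady():
    # allocates the same objects but releases them all
    _ = [object() for _ in range(10)]

print("leak=%s" % grows_unbounded(leaky))
print("leak=%s" % grows_unbounded(steady))
```

Note that gc.get_objects() only reports gc-tracked container objects, so the check is a heuristic: leaks held entirely in untracked objects (or at the C level) won't show up this way.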
-
reporter Better version: https://gist.github.com/2011352
0.7.4 output:
sqlalchemy.__version__ = 0.7.4, leak=False
0.7.5 output:
sqlalchemy.__version__ = 0.7.5, leak=True
-
reporter binary search done:
(sqlalquemy-hg)iuri@moeda:~/Sources/SQLALCHEMY/src$ hg bisect -g
The first bad revision is:
changeset:   8022:09a503e49724b61a8119f0b7855a990a29fc1521
user:        Mike Bayer <mike_mp@zzzcomputing.com>
date:        Sun Jan 22 14:04:20 2012 -0500
summary:     - [bug](bug) Fixed bug whereby a table-bound Column
(sqlalquemy-hg)iuri@moeda:~/Sources/SQLALCHEMY/src$
done by using bisect extension:
hg clone http://hg.sqlalchemy.org/sqlalchemy src
cd src
export PYTHONPATH=$PYTHONPATH:$PWD/build/lib.linux-x86_64-2.7/
# rel_0_7_5 8072:667249f7b19325e1512f1a9ece29c5894e679508 (BAD)
# rel_0_7_4 7976:5421b51a479a4c04da5175b58c8d8557ba2bc51c (GOOD)
# reset
hg bisect -r
# going to a bad revision
hg up rel_0_7_5
# build and test
rm -rf build && python setup.py build && python ../tests/test_memleak.py
# this revision is bad
hg bisect -b
# revision 7976 is good
hg bisect -g 7976
# build and test again
# ...
-
repo owner - marked as blocker
- assigned issue to
- changed component to cextensions
- changed milestone to 0.7.6
yes, the bug is in the C extensions. For the moment, delete the .so files within sqlalchemy/ to alleviate the leak.
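A hedged sketch of that workaround. The demo below operates on a throwaway directory standing in for the installed sqlalchemy/ package directory; in the 0.7 series the compiled extension modules are named cprocessors.so and cresultproxy.so, and SQLAlchemy falls back to the pure-Python implementations when they are absent:

```shell
# Scratch directory standing in for site-packages/sqlalchemy/.
PKG=$(mktemp -d)
touch "$PKG/cprocessors.so" "$PKG/cresultproxy.so" "$PKG/resultproxy.py"

# Remove only the compiled C extensions; the pure-Python modules remain
# and are used on the next import.
find "$PKG" -name '*.so' -print -delete

ls "$PKG"
```

Against a real installation you would run the find/delete against the actual package directory instead of the scratch one (and remember the files come back on reinstall).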
-
repo owner so this seems like a basic reference-counting bug, related to the tuple created in the result proxy. If I apply this patch to r8022:
diff -r 09a503e49724b61a8119f0b7855a990a29fc1521 lib/sqlalchemy/engine/base.py
--- a/lib/sqlalchemy/engine/base.py	Sun Jan 22 14:04:20 2012 -0500
+++ b/lib/sqlalchemy/engine/base.py	Sat Mar 10 14:38:50 2012 -0800
@@ -2596,7 +2596,7 @@
                 processor = type_._cached_result_processor(dialect, coltype)
                 processors.append(processor)
-                rec = (processor, obj, i)
+                rec = (processor, None, i)
             # indexes as keys. This is only needed for the Python version of
             # RowProxy (the C version uses a faster path for integer indexes).
@@ -2608,7 +2608,7 @@
                 # columns colliding by name is not a problem as long as the
                 # user does not try to access them (ie use an index directly,
                 # or the more precise ColumnElement)
-                keymap[name.lower()] = (processor, obj, None)
+                keymap[name.lower()] = (processor, None, None)
             if dialect.requires_name_normalize:
                 colname = dialect.normalize_name(colname)
then the bug is gone. So I need some help understanding the reference counting here - the C extension doesn't refer to this second element of the tuple at all. I tried adding Py_DECREF(record) in there, but that just breaks things.
I am able to fix it by pulling that second element out of the "record" and calling Py_DECREF on it (see below). But this seems like the wrong place to do it - it's not clear to me what activity causes the reference increment on this object. Is it the PyDict_GetItem that increments the reference on it (seems very unlikely)? Is it because we hold a reference to "keymap"? It seems weird that we'd need to touch references we otherwise haven't accessed at all.
diff -r 09a503e49724b61a8119f0b7855a990a29fc1521 lib/sqlalchemy/cextension/resultproxy.c
--- a/lib/sqlalchemy/cextension/resultproxy.c	Sun Jan 22 14:04:20 2012 -0500
+++ b/lib/sqlalchemy/cextension/resultproxy.c	Sat Mar 10 14:43:12 2012 -0800
@@ -242,7 +242,7 @@
 {
     PyObject *processors, *values;
     PyObject *processor, *value;
-    PyObject *row, *record, *result, *indexobject;
+    PyObject *row, *record, *result, *indexobject, *tokenobject;
     PyObject *exc_module, *exception;
     char *cstr_key;
     long index;
@@ -278,7 +278,11 @@
         return NULL;
     }
+    tokenobject = PyTuple_GetItem(record, 1);
+    Py_DECREF(tokenobject);
+
     indexobject = PyTuple_GetItem(record, 2);
+
     if (indexobject == NULL)
         return NULL;
@@ -302,7 +306,6 @@
                 "try 'use_labels' option on select statement.", cstr_key);
         return NULL;
     }
-
     index = PyInt_AsLong(indexobject);
     if ((index == -1) && PyErr_Occurred())
         /* -1 can be either the actual value, or an error flag. */
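One relevant fact for the ownership question above: per the CPython C API, both PyTuple_GetItem and PyDict_GetItem return borrowed references, so they never increment the item's refcount. The extra Py_DECREF in the patch therefore releases a reference owned by the tuple, not by the caller - which is presumably why it feels like the wrong place: it compensates for an increment acquired somewhere else. The ownership rule itself can be observed from Python with sys.getrefcount (an illustration only, not SQLAlchemy code):

```python
import sys

obj = object()
base = sys.getrefcount(obj)      # includes the temporary ref getrefcount holds

rec = ("processor", obj, 0)      # stands in for the (processor, obj, index) record
assert sys.getrefcount(obj) == base + 1   # the tuple owns one reference to obj

del rec                          # dropping the tuple releases its reference
assert sys.getrefcount(obj) == base
```

If C code had Py_DECREF'd the borrowed item before `del rec`, the tuple's teardown would then over-release it.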
-
repo owner - changed status to resolved
pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !
-
Account Deleted - changed watchers to kb@retailarchitects.com
Replying to zzzeek:
> pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !
Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?
-
repo owner Replying to kentbower:
> Replying to zzzeek:
> > pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !
> Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?
that sounds ominous, what part might be missing ?
-
Account Deleted Replying to zzzeek:
> Replying to kentbower:
> > Replying to zzzeek:
> > > pjenvey and I got it, fixed in 5448f6129cd0487c3d06324385cc2ef0701b5815. Thanks for the effort reporting this !
> > Just double checking: 5448f6129cd0487c3d06324385cc2ef0701b5815 contains the entire fix?
> that sounds ominous, what part might be missing ?
No, no... I haven't looked; I was just going to apply that patch and wanted to make sure it was self-contained. Your reply is as good as a "yes".
-
repo owner - removed milestone
Removing milestone: 0.7.6 (automated comment)
Non-problematic server packages