Possible memory leak?

Issue #4083 closed
Levon Saldamli created an issue

I have an application running with sqlalchemy which periodically imports some archives and creates database model objects, where objects of type BindParameter keeps increasing each time by some numbers. But I can't find any references to these objects that also increase. I'm using pdb and objgraph to find references, and as far as I can see using those tools, only BinaryExpression objects refer to BindParameter objects, but there is also a cyclic reference from Comparator objects (also not increasing in number). Is this a memory leak?

See attached graph of the references. At the moment the graph was dumped, number of objects in memory are: Comparator=1342, BinaryExpression=1396, BindParameter=1450. The strange this is that for each iteration where models are created, Comparator and BinaryExpression remain constant but number of BindParameter objects increase by the same amount (186 in my application).

Comments (8)

  1. Mike Bayer repo owner

    Python has a garbage collector that collects unreferenced cycles, so the presence of a cycle does not indicate a memory leak. in your graph, I notice the BinaryExpression is inside a tuple of some kind, to discover the source of memory growth you'd need to move up that tuple and to see what is referring to it (I'd gather some SQL expression I'm not recalling uses a tuple) to see further up the chain to where it links to objects associated with global state.

  2. Levon Saldamli reporter

    Since the number of BinaryExpression objects was constant, I didn't bother to check where they are referenced. Is it expected that number of BindParameter objects is increasing while Comparator and BinaryExpression are constant? I see that the reference in BinaryExpression is named "left" which sounds like a single object, but the name of the reference from Comparator is not shown.

  3. Mike Bayer repo owner

    BindParameter objects are associated with virtually every SQL expression generated, including ones that are generated by the ORM when it persists objects.

    The area where it can seem like leaking is occurring (but technically is not) is: the ORM uses caching for some persistence queries (e.g. INSERT / UPDATE / DELETE) meaning there may be a small amount of growth in the presence of these objects, each of which will have new BindParameter objects associated with them. This cache will use the dialect associated with the engine as part of the cache key. One effect of this is that if you had a program or test suite that generated an unlimited amount of engine objects using create_engine(), this would directly defeat the purpose of the cache, and the cache would fill up to its limit of around 150 entries, however that is 150 entries per mapper. if you are using the 1.2 beta series, there are more caches of this nature associated with relationship and inheritance loaders, and they also would grow to about 150 entries per mapper if the application constantly makes new Engine objects.

    If your application is calling create_engine() many times, like per test, or per request, or something like that, this will cause ORM related caches to grow to their max size.

    If there's any theme here, it's that there is nothing special about BindParameter in particular that would cause these to be the only object you see growing. These objects as used by SQLAlchemy are always associated utimately with a Core expression element and/or a Compiled object.

  4. Mike Bayer repo owner

    some important questions:

    1. is this the pypy interpreter?

    2. python 2 or 3?

    3. SQLAlchemy version?

    4. database driver?

    5. interpreter is not run with special GC like gc.disable() ?

  5. Levon Saldamli reporter

    Thanks for the detailed explanation, and sorry for the false alarm. You are correct that the application I was debugging was calling create_engine() many times, separate from the engine and session I was monitoring. Also, the sessions that were created were not being closed. I've resolved those issues now, and now I don't see any objects left around after sessions are closed and garbage collection is done.

    For the record, I was using the cypthon interpreter, python 3.6.2, sqlalchemy 1.1.11., psycopg2 2.6.1.

  6. Log in to comment