Possible Memory Leak on relationships

Issue #3251 resolved
Francisco Fernández Castaño created an issue

Hi, I've the tables shown in the code mapped to some classes and I need to run a query using a custom selectable that gives me augmented information about Accounts through FollowAssociation depending on a given user_id and then join to Broadcast orm object this custom selectable defined on runtime, creating a new relationship with the same name. But with this solution, that probably is not sqlalchemy idiomatic I'm getting memory leaks. The RelationshipProperty objects are never freed so memory don't stop growing. I dind't be able to find an alternative way to achieve this goal and probably is my fault, but just in case here is a runnable snippet that shows how memory grows over time.

sqlalchemy==0.9.8

from sqlalchemy import create_engine, Column, Integer, ForeignKey, DateTime, case, and_
from sqlalchemy.orm import sessionmaker, relationship, aliased, mapper
from sqlalchemy.ext.declarative import declarative_base
import datetime
import random
from guppy import hpy


engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
Base = declarative_base()
session = Session()


class FollowAssociation(Base):
    __tablename__ = 'followers_followees'

    follower_id = Column(Integer, ForeignKey('accounts.id'), primary_key=True)
    followee_id = Column(Integer, ForeignKey('accounts.id'), primary_key=True)
    created_at = Column(DateTime, default=datetime.datetime.now)


ali = aliased(FollowAssociation)


class Account(Base):
    __tablename__ = 'accounts'

    id = Column(Integer, primary_key=True)

    followers = relationship('Account', secondary='followers_followees',
                             primaryjoin=(FollowAssociation.followee_id == id),
                             secondaryjoin=(FollowAssociation.follower_id == id),
                             backref='followees')


class Broadcast(Base):
    __tablename__ = 'broadcasts'

    id = Column(Integer, primary_key=True)

    account_id = Column(Integer, ForeignKey('accounts.id'))


Base.metadata.create_all(engine)
a1 = Account()
a2 = Account()
a3 = Account()
b = Broadcast()
a3.followers.append(a2)
session.add_all([a1, a2, a3, b])
session.commit()
h = hpy()


def custom_query(user_id):
    query = session.query(
        Account,
        case([(FollowAssociation.follower_id == None, False)], else_=True).label('follower'),
        case([(ali.follower_id == None, False) ], else_=True).label('followee')
    ).outerjoin(
        FollowAssociation,
        and_(Account.id == FollowAssociation.followee_id, FollowAssociation.follower_id == user_id)
    ).outerjoin(ali, and_(Account.id == ali.follower_id, ali.followee_id == user_id))

    class CustomAccount(object):
        pass

    CustomAccount = type('CustomAccount', (object,), {})

    mapper(CustomAccount, query.statement.alias())

    return CustomAccount


def execute():
    for i in range(200000):
        custom_account = custom_query(random.randint(1, 200))
        Broadcast.account_with_follow_info = relationship(custom_account)
        session.query(Broadcast).first()
        if i % 1000 == 0:
            print i
            print h.heap()
            print h.heap().more

execute()

Comments (4)

  1. Mike Bayer repo owner

    well yes, I'm trying to see exactly which registry leaks here, but the pattern you're doing is really, really wrong. you absolutely should not be changing a mapping destructively ever (e.g. in this case the previous "account_with_follow_info" is expected to be discarded). You'll note it raises a warning as well, so in that the warning says "the old property will be discarded", if in fact I can't get this leak to resolve I'll have to change that warning to " and may leak memory".

    there is no reason at all you should be trying to use this pattern in the way that's being done here. the Broadcast class is an application level object, so even if all memory issues were resolved, changing it's definition constantly will not work correctly under concurrency and will also break other Broadcast objects already present.

  2. Mike Bayer repo owner
    • Fixed a leak which would occur in the unsupported and highly non-recommended use case of replacing a relationship on a fixed mapped class many times, referring to an arbitrarily growing number of target mappers. A warning is emitted when the old relationship is replaced, however if the mapping were already used for querying, the old relationship would still be referenced within some registries. fixes #3251

    → <<cset 026449c15ff3>>

  3. Mike Bayer repo owner
    • Fixed a leak which would occur in the unsupported and highly non-recommended use case of replacing a relationship on a fixed mapped class many times, referring to an arbitrarily growing number of target mappers. A warning is emitted when the old relationship is replaced, however if the mapping were already used for querying, the old relationship would still be referenced within some registries. fixes #3251

    → <<cset 533f9b6b8a30>>

  4. Log in to comment