relationship() defined in child class does not work as expected.

Issue #3452 resolved
Paul Conilione created an issue

This maybe a feature, rather than a bug. However, I was stuck with a strange problem with merge() that had me stumped for a long time as the issue was obfuscated by the use of the get_one_or_create() function from http://skien.cc/blog/2014/01/15/sqlalchemy-and-race-conditions-implementing/ which I had modified to use the created = session.modify(created) if a boolean was passed to it. Otherwise it would just add() the created object to the session.

I've pin pointed the core problem to creating a relationship in the child class, rather than the parent class. As detailed in the "Merge tip" section of http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#merge-tips by default, adding an object that is already in the session to a new object that you wish to merge() will result in the error:

New instance Y with identity key X conflicts with persistent instance Z

As stated in the docs, adding the property "cascade_backrefs=False" should resolve the problem. Which it does if the relationship() is defined in the parent and not the child. However, if you define the relationship in the child, then this directive is ignored and the error above is present.

Using the "Many to one" example given in "http://docs.sqlalchemy.org/en/rel_1_0/orm/basic_relationships.html#many-to-one" as a basis. Instead of defining the relationship between the User (parent) and the Address (child) using the relationship() function in the User class, I defined it in the Address class.

from sqlalchemy import *
from sqlalchemy.orm import *

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base();

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, Sequence('user_id_seq'), primary_key=True)
    name = Column(String(50))
    fullname = Column(String(50))
    password = Column(String(12))

    # This works as described in http://docs.sqlalchemy.org/en/rel_1_0/orm/basic_relationships.html#many-to-one
    #addresses = relationship("Address", backref=backref('user', order_by=id), cascade_backrefs=False)


    def __repr__(self):
        return "<User(id='{0}', name='{1}', fullname='{2}', password='{3}')>".format(
                        self.id, self.name, self.fullname, self.password)


class Address(Base):
    __tablename__ = 'addresses'

    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey('users.id'))
    comment = Column(String)

    # This does not works as expected.
    user = relationship("User", backref=backref('addresses', order_by=id), cascade_backrefs=False)

    def __repr__(self):
        return "<Address(id='{0}', email_address='{1}', user='{2}', comment='{3}')>".format(self.id, self.email_address, self.user, self.comment)


engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(engine)

session = sessionmaker(bind=engine)()


print '## Create new objects ##'
user1 = User()
user1.name = 'Fred'
user1.fullname = 'Fred A'
user1.password = 'abc123'

user1 = session.merge(user1)

print 'commit()'
session.commit()

address1 = Address()
address1.user = user1
address1.email_address = 'user1@abc.com.au'
address1.comment = 'hello world'

address1 = session.merge(address1)

print 'commit()'
session.commit()


print '## New Objects in session. ##'
print user1
print address1


print '## Updating Address ##'

address1_update = Address()
address1_update.id = address1.id
address1_update.user = user1 # Error due to assignment of object already in session.
address1_update.email_address = 'user2@abc.com.au'
address1_update.comment = 'hello world'

address1 = session.merge(address1_update)

print 'commit()'
session.commit()

print '## Updated Objects in session ##'
print user1
print address1

The output is:

## Updating Address ##

<<SNIP>>

Traceback (most recent call last):
  File "example_declarative_relationship_1tomany_childside.py", line 106, in <module>
    address1 = session.merge(address1_update)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 1689, in merge
    self._autoflush()
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 1282, in _autoflush
    self.flush()
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 2004, in flush
    self._flush(objects)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 2122, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 2086, in _flush
    flush_context.execute()
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
    rec.execute(self)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 532, in execute
    uow
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/persistence.py", line 149, in save_obj
    base_mapper, states, uowtransaction
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/persistence.py", line 301, in _organize_states_for_save
    state_str(existing)))
sqlalchemy.orm.exc.FlushError: New instance <Address at 0x2323090> with identity key (<class '__main__.Address'>, (1,)) conflicts with persistent instance <Address at 0x224f690>

From the documentation (http://docs.sqlalchemy.org/en/rel_1_0/orm/backref.html), it says that defining the relationship using backref is a shortcut for defining the bi-directional relationship between parent and child. Hence, I assumed that it should also work in reverse.

Other info:

Python and module versions

$ /usr/bin/python
Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy
>>> sqlalchemy.__version__
'1.0.5'

Comments (6)

  1. Mike Bayer repo owner

    thanks for posting a complete test case, as this is extremely helpful for me to understand immediately.

    In the "it works" example you have is:

        # This works as described in http://docs.sqlalchemy.org/en/rel_1_0/orm/basic_relationships.html#many-to-one
        #addresses = relationship("Address", backref=backref('user', order_by=id), cascade_backrefs=False)
    

    e.g., you want User.addresses not to cascade from a backref. great.

    The equivalent version of this on User Address is:

        user = relationship("User", backref=backref('addresses', order_by=id, cascade_backrefs=False))
    

    where again, we want User.addresses to have the directive on it.

    We will be trying to push the documentation away from the "backref" concept in any case and towards the more verbose, but probably easier to understand, "back_populates" model: http://docs.sqlalchemy.org/en/rel_1_0/orm/backref.html.

    Overall the heavy reliance on merge() here is not necessary and won't perform as well as doing straight add(), and it is more prone to issues like these.

  2. Paul Conilione reporter

    Thanks Mike, that solved the issue.

    I was using merge() frequently as I was updating some records in the database I'm working with.

    This whole problem would have been avoided if there was someway of detecting the inconsistency between the definition of the relationship in the Parent and Child classes.

    Where the relationship in the Parent works ok if defined as:

    addresses = relationship("Address", backref=backref('user', order_by=id), cascade_backrefs=False)
    

    Whilst defining the relationship in the Child class using the same syntax does not work:

    user = relationship("User", backref=backref('addresses', order_by=id), cascade_backrefs=False)
    
    # this also does not work.
    user = relationship("User", backref='addresses', cascade_backrefs=False )
    

    yet your solution does work:

    user = relationship("User", backref=backref('addresses', order_by=id, cascade_backrefs=False))
    

    where the cascade_backrefs=False is set in the backref() function instead of the relationship() function.

    I prefer to define the relationship in the child class as the child is storing a unit of work from a parent. Defining the relationships in the parent could be very messy if it has many children classes/tables.

    Thanks for your help.

  3. Mike Bayer repo owner

    there's no bug here as that's how the API works, and it's not an option to make the back_populates flag take effect in both directions simultaneously, as that would mean it's not possible to set it for just one direction at time and would be catastrophically backwards-incompatible in any case. I'd recommend using just back_populates instead of backref() to eliminate ambiguity.

  4. Paul Conilione reporter

    Thanks Mike,

    Maybe a note in the documentation about the subtle difference between the use of backref would be helpful. However, the use of the back_populates flag is the way to go.

    Cheers

  5. Log in to comment