IntegrityError when saving new objects that have cyclical relations

Issue #362 resolved
somedood created an issue

Where there are cyclical relations in objects, and a new object is created, sometimes the session will try to save the same object twice which results in an IntegrityError exception since the same id is used on both INSERTS. The test uses activemapper to make the table declarations smaller, though as far as I can tell it is unrelated to the extension.

from sqlalchemy import *
from sqlalchemy.ext.activemapper import *
import sys


_engine = create_engine("postgres://****:****@localhost/issuetracker")
metadata.connect(_engine)


class Answer(ActiveMapper):
    class mapping:
        __table__ = 'answer'
        id = column(Integer, primary_key=True)
        label = column(String(80), default=None)
        fromquestion_id = column(Integer, foreign_key='question.id', default=None)
        fromquestion = relationship('Question', primaryjoin=['answer.fromquestion_id','question.id']('answer.fromquestion_id','question.id'), uselist=False)
        toquestion_id = column(Integer, foreign_key='question.id', default=None)
        toquestion = relationship('Question', primaryjoin=['answer.toquestion_id','question.id']('answer.toquestion_id','question.id'), uselist=False)

class Issue(ActiveMapper):
    class mapping:
        __table__ = 'issue'
        id = column(Integer, primary_key=True)
        name = column(String, default='')
        description = column(String, default='')
        provider_service_id = column(Integer, foreign_key='provider_service.id')
        providerservice = relationship('ProviderService', uselist=False)
        firstquestion_id = column(Integer, foreign_key='question.id', default=None)
        firstquestion = relationship('Question', primaryjoin=['issue.firstquestion_id','question.id']('issue.firstquestion_id','question.id'), uselist=False)
        questions = one_to_many('Question', primaryjoin=['issue.id','question.issue_id']('issue.id','question.issue_id'))

class Provider(ActiveMapper):
    class mapping:
        __table__ = 'provider'
        id = column(Integer, primary_key=True)
        firstquestion_id = column(Integer, foreign_key='question.id', default=None)
        firstquestion = relationship('Question', primaryjoin=['provider.firstquestion_id','question.id']('provider.firstquestion_id','question.id'), uselist=False)
        questions = one_to_many('Question', primaryjoin=['provider.id','question.provider_id']('provider.id','question.provider_id'))
        providerservices = one_to_many('ProviderService')

class ProviderService(ActiveMapper):
    class mapping:
        __table__ = 'provider_service'
        id = column(Integer, primary_key=True)
        name = column(String(80), default='')
        provider_id = column(Integer, foreign_key='provider.id')
        provider = relationship('Provider', uselist=False)
        issues = one_to_many('Issue')

class Question(ActiveMapper):
    class mapping:
        __table__ = 'question'
        id = column(Integer, primary_key=True)
        issue_id = column(Integer, foreign_key='issue.id', default=None)
        issue = relationship('Issue', primaryjoin=['question.issue_id','issue.id']('question.issue_id','issue.id'), uselist=False)
        provider_id = column(Integer, foreign_key='provider.id', default=None)
        provider = relationship('Provider', primaryjoin=['question.provider_id','provider.id']('question.provider_id','provider.id'), uselist=False)
        provider_service_id = column(Integer, foreign_key='provider_service.id', default=None)
        providerservice = relationship('ProviderService', uselist=False)
        description = column(String(250))
        answers = one_to_many('Answer', primaryjoin=['question.id','answer.fromquestion_id']('question.id','answer.fromquestion_id'))
        wikilink = column(String(80))


import logging
logging.getLogger('sqlalchemy').setLevel(logging.DEBUG)
logging.getLogger('sqlalchemy').addHandler(logging.FileHandler("c:\\temp\\sqalog.txt","w"))

q = Question(description="testtest")
q.flush()

After a new question is created, and then flush() is called, an IntegrityError exception is raised. If I remove one of the classes, such as Issue (and it's relating columns \ relations) I don't get the error. And example of Issue being removed:

from sqlalchemy import *
from sqlalchemy.ext.activemapper import *
import sys


_engine = create_engine("postgres://****:****@localhost/issuetracker")
metadata.connect(_engine)


class Answer(ActiveMapper):
    class mapping:
        __table__ = 'answer'
        id = column(Integer, primary_key=True)
        label = column(String(80), default=None)
        fromquestion_id = column(Integer, foreign_key='question.id', default=None)
        fromquestion = relationship('Question', primaryjoin=['answer.fromquestion_id','question.id']('answer.fromquestion_id','question.id'), uselist=False)
        toquestion_id = column(Integer, foreign_key='question.id', default=None)
        toquestion = relationship('Question', primaryjoin=['answer.toquestion_id','question.id']('answer.toquestion_id','question.id'), uselist=False)

class Provider(ActiveMapper):
    class mapping:
        __table__ = 'provider'
        id = column(Integer, primary_key=True)
        firstquestion_id = column(Integer, foreign_key='question.id', default=None)
        firstquestion = relationship('Question', primaryjoin=['provider.firstquestion_id','question.id']('provider.firstquestion_id','question.id'), uselist=False)
        questions = one_to_many('Question', primaryjoin=['provider.id','question.provider_id']('provider.id','question.provider_id'))
        providerservices = one_to_many('ProviderService')

class ProviderService(ActiveMapper):
    class mapping:
        __table__ = 'provider_service'
        id = column(Integer, primary_key=True)
        name = column(String(80), default='')
        provider_id = column(Integer, foreign_key='provider.id')
        provider = relationship('Provider', uselist=False)
        issues = one_to_many('Issue')

class Question(ActiveMapper):
    class mapping:
        __table__ = 'question'
        id = column(Integer, primary_key=True)
        provider_id = column(Integer, foreign_key='provider.id', default=None)
        provider = relationship('Provider', primaryjoin=['question.provider_id','provider.id']('question.provider_id','provider.id'), uselist=False)
        provider_service_id = column(Integer, foreign_key='provider_service.id', default=None)
        providerservice = relationship('ProviderService', uselist=False)
        description = column(String(250))
        answers = one_to_many('Answer', primaryjoin=['question.id','answer.fromquestion_id']('question.id','answer.fromquestion_id'))
        wikilink = column(String(80))


import logging
logging.getLogger('sqlalchemy').setLevel(logging.DEBUG)
logging.getLogger('sqlalchemy').addHandler(logging.FileHandler("c:\\temp\\sqalog.txt","w"))

q = Question(description="testtest")
q.flush()

The test succeeds with that class removed. It also works if Provider, or ProviderService is removed (along with the relations that point to them). The log for when the IntegrityError is raised is as follows:

Dependency sort:
Mapper|ProviderService|provider_service (cycles: [Mapper|Question|question, Mapper|Provider|provider](Mapper|ProviderService|provider_service,))
  Mapper|Issue|issue (cycles: [Mapper|Question|question](Mapper|Issue|issue,))
    Mapper|Answer|answer

Task dump:

 UOWTask(0xc96230, ProviderService/provider_service/None) (contains cyclical sub-tasks) (save/update phase) 
   |- Save (placeholder)
   |
   |- UOWTask(0xc96230->0xc9de90, Question/question/None) (save/update phase) 
   |   |- Save Question(0xc8a7b0)
   |   |----
   |
   |   |- Process Question(0xc8a7b0).answers
   |
   |- UOWTask(0xc967d0, Issue/issue/None) (contains cyclical sub-tasks) (save/update phase) 
   |   |- Save (placeholder)
   |   |
   |   |- UOWTask(0xc967d0->0xca1310, Question/question/None) (save/update phase) 
   |   |   |- Save Question(0xc8a7b0)
   |   |   |----
   |   |
   |   |   |- Process Question(0xc8a7b0).answers
   |   |
   |   |- UOWTask(0xc9def0, Answer/answer/None) (save/update phase) 
   |   |   |- (empty task)
   |   |
   |   |----
   |
   |
   |- UOWTask(0xc967d0, Issue/issue/None) (contains cyclical sub-tasks) (delete phase) 
   |   |
   |   |- UOWTask(0xc9def0, Answer/answer/None) (delete phase) 
   |   |   |- (empty task)
   |   |
   |   |
   |   |- UOWTask(0xc967d0->0xca1310, Question/question/None) (delete phase) 
   |   |   |----
   |   |
   |   |----
   |
   |
   |- UOWTask(0xc96230->0xc9de90, Question/question/None) (delete phase) 
   |   |----
   |
   |----


(ProviderService|provider_service) save_obj() start, batched
Created new connection <connection object at 0x00BACF90; dsn: 'dbname=issuetracker host=localhost user=**** password=****', closed: 0>
Connection <connection object at 0x00BACF90; dsn: 'dbname=issuetracker host=localhost user=**** password=****', closed: 0> checked out from pool
BEGIN
(Question|question) save_obj() start, batched
(Question|question) save_obj() table 'question' instance Question@0xc8a7b0 identity (<class '__main__.Question'>, (None,), None)
select nextval('"question_id_seq"')
None
INSERT INTO question (description, id, issue_id, provider_id, provider_service_id, wikilink) VALUES (%(description)s, %(id)s, %(issue_id)s, %(provider_id)s, %(provider_service_id)s, %(wikilink)s)
{'provider_id': None, 'description': 'testtest', 'provider_service_id': None, 'issue_id': None, 'wikilink': None, 'id': 2628L}
(Issue|issue) save_obj() start, batched
(Question|question) save_obj() start, batched
(Question|question) save_obj() table 'question' instance Question@0xc8a7b0 identity (<class '__main__.Question'>, (2628L,), None)
INSERT INTO question (description, id, issue_id, provider_id, provider_service_id, wikilink) VALUES (%(description)s, %(id)s, %(issue_id)s, %(provider_id)s, %(provider_service_id)s, %(wikilink)s)
{'provider_id': None, 'description': 'testtest', 'provider_service_id': None, 'issue_id': None, 'wikilink': None, 'id': 2628L}
ROLLBACK
Connection <connection object at 0x00BACF90; dsn: 'dbname=issuetracker host=localhost user=**** password=****', closed: 0> being returned to pool

The log shows the the UOWTask is doing:

Save Question(0xc8a7b0)

Twice in this case, and both times saving the Question with the same id. I haven't checked out the UOW code at all, and have simplified the test as much as possible. If more information is needed, fell free to contact me at warren at serverplus dot com.

Comments (8)

  1. somedood reporter

    More simplified test withthout the Issue class, and without using my modifications to activemapper (to allow primaryjoins)

  2. Former user Account Deleted

    Update : The tests in my original post won't work with the standard SQLAlchemy 0.3.0 library. I made some modifications to activemapper so i could test using primaryjoins inside of its relationship() function, but it's pretty inflexible. Anyway, I updated and attached the test (and the log output) to just define those relationships after the classes and tables are created by activemapper the way it would normally be done. I also was able to remove the Issue class and relations to it to simplify the test a bit more while still duplicating the same problem. Sorry about not attaching them in the first place and making one huge ticket :)

  3. Mike Bayer repo owner
    • changed milestone to 0.3.1

    ive seen this error before, and its almost certainly because the relationships are set up improperly. ill try to look into this when i have time but it would be extremely helpful if you could provide a non-activemapper version of this. the problem might be in activemapper's setup of relationships (or that you are using OneToMany when ManyToOne is required, for example).

  4. Former user Account Deleted

    I'm attaching test2.py, which is the same object \ table structure but without activemapper. The errors are the same, however, I wonder if this has something to do with the order in which the relations are being created? I've noticed that /sometimes/ the error doesn't come up when using activemapper, but only in rare cases. The only thing i can think of is because activemapper uses a dictionary to store the relations to-be-processed until all of them can be created (when the tables they depend on exist). Since, from what I understand, python dicts don't always get iterated in the same order in which they were created that could explain something about it... This should be a 'bare-bones' test, and should be easier to diagnose.

    I'll see if some particular order of relation-creation makes the problem go away.

  5. Mike Bayer repo owner

    to create that table, you have to use 'use_alter' :

    question_table = Table('question', meta, 
            Column('id', Integer, primary_key=True), 
            Column('provider_id', Integer, ForeignKey('provider.id', name='pidfkey', use_alter=True), default=None), 
            Column('provider_service_id', Integer, ForeignKey('provider_service.id', name="psidfkey", use_alter=True), default=None), 
            Column('description', String(250))
    )
    

    the topological issue is fixed in changeset:2085

  6. Log in to comment