maintain duplicates in history lists

Issue #1232 resolved
Former user created an issue

This problem occurs in sqlalchemy 0.4.7.1, using oracle 9.

Case scenario: I had a job table, a task table, and a jobs2tasks table to create a many-to-many relation. I created a job and a task, and added the job twice to the task (using a relation that has a secondary). len(task.jobs) returned 2. Then I saved the task and committed. I was expecting an exception to be thrown, but none was. When I looked at jobs2tasks in SQLNavigator, I saw that only one entry had been entered.

The problem is that this behaviour is not what I expected. When I try to enter, for example, a task that already exists in the db (with the same task_id), sqlalchemy throws a constraint exception. I think this is exactly the same case, only with a composite primary key.

Reported by Kobi Perl.

Comments (6)

  1. Former user Account Deleted

    There is no "second case". What I tried to say is that if sqlalchemy throws a constraint exception when I enter two tasks with the same id, it should do the same when I enter two "jobs2tasks" entries with the same composite primary key - THROW AN EXCEPTION that a constraint was broken.

    I can use collection_class=set to ensure that I'm really working with a set. This is a fine solution (and, BTW, I think set should be the default rather than list).
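    For example, a mapping along these lines (a hypothetical sketch reusing the job/task names from the description above, not the actual code) silently absorbs the duplicate:

    # with collection_class=set the relation holds an instrumented set, so
    # adding the same Job twice leaves only one entry in task.jobs
    mapper(Task, tasks_table, properties={
        'jobs': relation(Job, secondary=jobs2tasks_table, collection_class=set)
    })

    j = Job()
    t = Task()
    t.jobs.add(j)        # sets use add() rather than append()
    t.jobs.add(j)
    # Print 1
    print len(t.jobs)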

    But still, it is logical to throw an exception when I do choose to use a list and a task ends up with two jobs with the same primary key in its jobs list...

    When is the current behaviour useful? When does it make sense? Why should one be able to do something like:

    j = Job()
    t = Task()
    t.jobs.append(j)
    t.jobs.append(j)
    # Print 2
    print len(t.jobs)
    session.save(t)
    session.commit()
    t_id = t.task_id
    session.clear()  # start with a fresh identity map so the task is re-loaded from the db
    t = session.query(Task).get(t_id)
    # Print 1
    print len(t.jobs)
    
  2. Mike Bayer repo owner

    sqlalchemy relies upon the database to enforce constraints - we don't duplicate that behavior because (a) the database does it much better, (b) we try not to intrusively get in the way of the relational dialogue the database produces, and (c) it would impact performance negatively for all instrumented lists to have a backing set with constraint-checking logic going on, especially when that logic would be completely redundant.

    If your jobs2task table specifies the primary key of each related table as a composite primary key, or otherwise has a UNIQUE constraint for those two columns, a constraint exception will be thrown.
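    For instance, an association table declared along these lines (hypothetical names mirroring the reporter's jobs/tasks schema) lets the database reject a second identical (job_id, task_id) pair:

    jobs2tasks = Table('jobs2tasks', metadata,
        # the two foreign key columns together form a composite primary key,
        # so a second INSERT of the same pair violates it
        Column('job_id', Integer, ForeignKey('jobs.job_id'), primary_key=True),
        Column('task_id', Integer, ForeignKey('tasks.task_id'), primary_key=True),
    )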

    as far as set being the default, it should have been, but the original behavior of relation() was that collections were automatically ordered, and "set" was also not as widely used in the Python 2.3 series, so there is some history here. a better solution, perhaps one for 0.6, would be for collection_class to be a required argument for all collection-holding relations.

  3. Mike Bayer repo owner

    My apologies - the many-to-many table itself only encounters one insert() statement, so the constraint violation is not raised.

    (if you had attached a test case, that would have communicated the issue more clearly. this is why it's in big red letters...)

    Here's a patch which maintains duplicates in the history list:

    Index: lib/sqlalchemy/orm/attributes.py
    ===================================================================
    --- lib/sqlalchemy/orm/attributes.py    (revision 5330)
    +++ lib/sqlalchemy/orm/attributes.py    (working copy)
    @@ -1427,11 +1427,18 @@
                 elif original is NEVER_SET:
                     return cls((), list(current), ())
                 else:
    -                collection = util.OrderedIdentitySet(current)
    -                s = util.OrderedIdentitySet(original)
    -                return cls(list(collection.difference(s)),
    -                           list(collection.intersection(s)),
    -                           list(s.difference(collection)))
    +                collection = util.IdentitySet(current)
    +                s = util.IdentitySet(original)
    +
    +                added = collection.difference(s)
    +                unchanged = collection.intersection(s)
    +                deleted = s.difference(collection)
    +                # maintain duplicates
    +                return cls(
    +                    [x for x in current if x in added],
    +                    [x for x in current if x in unchanged],
    +                    [x for x in original if x in deleted]
    +                )
             else:
                 if current is NO_VALUE:
                     if original not in (None, NEVER_SET, NO_VALUE):
    

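    The idea, in a minimal standalone sketch (a plain Python set stands in for util.IdentitySet here): membership is still decided via set difference/intersection, but the returned lists are rebuilt by filtering the ordered current/original sequences, so an object appended twice shows up twice in the "added" history instead of being collapsed:

    class Obj(object):
        pass

    o = Obj()
    current, original = [o, o], []
    added = set(current).difference(original)
    # filtering the ordered sequence keeps the duplicate append;
    # list(added) would have collapsed it to a single entry
    print [x for x in current if x in added]   # -> [o, o]
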
    test case

    from sqlalchemy import *
    from sqlalchemy.orm import *
    
    engine = create_engine('sqlite://', echo=True)
    metadata = MetaData(engine)
    
    t1 = Table('t1', metadata,
        Column('id', Integer, primary_key=True),
        Column('data', String)
    )
    
    t2 = Table('t2', metadata,
        Column('id', Integer, primary_key=True),
        Column('data', String)
    )
    
    t3 = Table('t3', metadata,
        Column('t1id', Integer, ForeignKey('t1.id'), primary_key=True),
        Column('t2id', Integer, ForeignKey('t2.id'), primary_key=True),
    )
    
    metadata.create_all()
    
    class T1(object):
        pass
    
    class T2(object):
        pass
    
    mapper(T1, t1, properties={
        't2s':relation(T2, secondary=t3)
    })
    mapper(T2, t2)
    
    a = T1()
    b = T2()
    
    sess = create_session()
    a.t2s.append(b)
    a.t2s.append(b)
    sess.add(a)
    sess.flush()
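    # with the duplicate preserved in the collection history, the flush above
    # is expected to emit two identical INSERTs into t3, so the composite
    # primary key on t3 raises an integrity error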
    

    I'm not sure if I want to add this into 0.5 as we were done with release candidates and this could be a surprise for some. will have to test some more.
