ORM primary key retreival can populate unrelated column in mappers used for inheritance

Issue #3041 duplicate
Elise created an issue

When in an inheritance relationship, SA can overwrite a child table's column when setting the parent's primary key (as grabbed from the database) with the parent's primary key if the the child's column's attribute name matches the parent's attribute name.

See attached bug2.py for example. Sample output follows:

$ python bug2.py
INFO:sqlalchemy.engine.base.Engine:
CREATE TABLE bug_parent (
        discriminator VARCHAR(50), 
        id INTEGER NOT NULL, 
        parent_attr INTEGER, 
        PRIMARY KEY (id)
)


INFO:sqlalchemy.engine.base.Engine:()
INFO:sqlalchemy.engine.base.Engine:COMMIT
INFO:sqlalchemy.engine.base.Engine:
CREATE TABLE bug_child_a (
        id INTEGER NOT NULL, 
        parent_id INTEGER NOT NULL, 
        child_attr INTEGER, 
        PRIMARY KEY (id),         UNIQUE (parent_id), 
        FOREIGN KEY(parent_id) REFERENCES bug_parent (id)
)

INFO:sqlalchemy.engine.base.Engine:()
INFO:sqlalchemy.engine.base.Engine:COMMIT
INFO:sqlalchemy.engine.base.Engine:
CREATE TABLE bug_child_b (
        id INTEGER NOT NULL,         parent_id INTEGER NOT NULL, 
        child_attr INTEGER, 
        PRIMARY KEY (id), 
        UNIQUE (parent_id), 
        FOREIGN KEY(parent_id) REFERENCES bug_parent (id)
)
INFO:sqlalchemy.engine.base.Engine:()
INFO:sqlalchemy.engine.base.Engine:COMMIT
INFO:sqlalchemy.engine.base.Engine:
CREATE TABLE bug_child_c (
        some_primary_key INTEGER NOT NULL, 
        id INTEGER, 
        parent_id INTEGER NOT NULL, 
        child_attr INTEGER, 
        PRIMARY KEY (some_primary_key), 
        UNIQUE (parent_id), 
        FOREIGN KEY(parent_id) REFERENCES bug_parent (id)
)


INFO:sqlalchemy.engine.base.Engine:()
INFO:sqlalchemy.engine.base.Engine:COMMIT


INFO:__main__:The following SQL should not be setting bug_child_a.id (incorrect)
INFO:sqlalchemy.engine.base.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_parent (discriminator, parent_attr) VALUES (?, ?)
INFO:sqlalchemy.engine.base.Engine:('child_a', 100)
DEBUG:__main__:Validating ChildA.id=1
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_child_a (id, parent_id, child_attr) VALUES (?, ?, ?)
INFO:sqlalchemy.engine.base.Engine:(1, 1, 200)


INFO:__main__:The following SQL does not set bug_child_b.id (debatably correct)
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_parent (discriminator, parent_attr) VALUES (?, ?)
INFO:sqlalchemy.engine.base.Engine:('child_b', 300)
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_child_b (parent_id, child_attr) VALUES (?, ?)
INFO:sqlalchemy.engine.base.Engine:(2, 400)
DEBUG:__main__:Validating ChildB.child_extra_id=1


INFO:__main__:The following SQL sets bug_child_c.id to None (correct)
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_parent (discriminator, parent_attr) VALUES (?, ?)
INFO:sqlalchemy.engine.base.Engine:('child_c', 500)
INFO:sqlalchemy.engine.base.Engine:INSERT INTO bug_child_c (id, parent_id, child_attr) VALUES (?, ?, ?)
INFO:sqlalchemy.engine.base.Engine:(None, 3, 600)

NFO:sqlalchemy.engine.base.Engine:COMMIT
Testing objects
INFO:sqlalchemy.engine.base.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.base.Engine:SELECT bug_parent.discriminator AS bug_parent_discriminator, bug_child_a.id AS bug_child_a_id, bug_parent.id AS bug_parent_id, bug_parent.parent_attr AS bug_parent_parent_attr, bug_child_a.parent_id AS bug_child_a_parent_id, bug_child_
a.child_attr AS bug_child_a_child_attr 
FROM bug_parent JOIN bug_child_a ON bug_parent.id = bug_child_a.parent_id 
WHERE bug_parent.parent_attr = ?
INFO:sqlalchemy.engine.base.Engine:(100,)
INFO:sqlalchemy.engine.base.Engine:SELECT bug_child_b.id AS bug_child_b_id, bug_parent.discriminator AS bug_parent_discriminator, bug_parent.id AS bug_parent_id, bug_parent.parent_attr AS bug_parent_parent_attr, bug_child_b.parent_id AS bug_child_b_parent_id, bug_child_
b.child_attr AS bug_child_b_child_attr 
FROM bug_parent JOIN bug_child_b ON bug_parent.id = bug_child_b.parent_id 
WHERE bug_parent.parent_attr = ?
INFO:sqlalchemy.engine.base.Engine:(300,)
INFO:sqlalchemy.engine.base.Engine:SELECT bug_child_c.id AS bug_child_c_id, bug_parent.discriminator AS bug_parent_discriminator, bug_parent.id AS bug_parent_id, bug_parent.parent_attr AS bug_parent_parent_attr, bug_child_c.some_primary_key AS bug_child_c_some_primary_k
ey, bug_child_c.parent_id AS bug_child_c_parent_id, bug_child_c.child_attr AS bug_child_c_child_attr 
FROM bug_parent JOIN bug_child_c ON bug_parent.id = bug_child_c.parent_id 
WHERE bug_parent.parent_attr = ?
INFO:sqlalchemy.engine.base.Engine:(500,)
{'ChildA.parent_attr': 100, 'ChildA.child_attr': 200, 'ChildA._discriminator': u'child_a', 'ChildA.id': 1, 'ChildA.parent_id': 1}
{'ChildB._discriminator': u'child_b', 'ChildB.parent_id': 2, 'ChildB.parent_attr': 300, 'ChildB.id': 2, 'ChildB.child_attr': 400, 'ChildB.child_extra_id': 1}
{'ChildC.id': 3, 'ChildC.child_attr': 600, 'ChildC._discriminator': u'child_c', 'ChildC.parent_attr': 500, 'ChildC.some_primary_key': 1, 'ChildC.child_extra_id': None, 'ChildC.parent_id': 3}
Traceback (most recent call last):
  File "bug2.py", line 124, in <module>
    assert cb.child_extra_id == None, "SQLA set bug_child_b.id to a non-null value"
AssertionError: SQLA set bug_child_b.id to a non-null value

Comments (4)

  1. Mike Bayer repo owner

    issue 1: the bug here is that an exception is not raised. #3042 is added which will ensure that the exception normally raised for "implicitly combining two same-named columns under the same attribute" will raise in the case of class inheritance as well as for a single class mapping. This exception was originally raised for the case where you map a class to a join(), where the join the produces multiple columns of the same name that are then not linked to their attributes explicitly (see http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#mapping-a-class-against-multiple-tables). When this exception was added, I omitted joined-inheritance setups, probably in the interests of making the change less backwards-incompatible, as a lot of folks expect that setting up an attribute of the same name should mean the two columns are mirrored. It's a long time since that change so it is probably time to lock this down some more. The patch illustrated at #3042 makes the exception case much more specific and only omits the error for the case where the two same-named columns are either definitely the same "actual" column in the DB or are already assigned to the "inherit condition" as equated. There is no mechanism in SQLAlchemy by which the attributes Parent.id and ChildA.id can maintain two independent values; the attributes must be differently named for this to be possible. In the case where Parent.id and ChildA.id are to be mirrored and are otherwise unrelated, it's probably time that people start specifying this explicitly via id = column_property(Column('id', Integer, primary_key=True), Parent.id).

    issue 2: no bug. The bug_child_b.id column is assigned no value, because the ORM will omit sending NULL to any non-specified column which has a server-generated default value or is part of the primary key; as a PK column (even one part of a composite) can't have NULL in any case, we assume that an unspecified PK column must have some kind of server-side value generation scheme, so we leave the column omitted.

    issue 3: no bug. bug_child_c.id is a regular integer column and receives NULL as the attribute is un-set. The sending of NULL here vs. non in case #2 is in general not that important; technically NULL doesn't need to be sent to any missing columns in SQL as a missing column in an INSERT is usually the same as sending NULL and I'm pretty sure it's just historical that we send NULL for all missing columns.

  2. Elise reporter

    Thanks for the clarification of each case listed here. I was originally bitten by issue 1 and then decided to experiment to figure out exactly what was going on. Your knowledge of the history of this behavior to give context is much appreciated.

    In your explanation "in the case where Parent.id and ChildA.id are to be mirrored...", do you mean that in this scenario (after your future patches are applied), I should explicitly be specifying the column_property mirror to make both id properties point to the same column, or that that would be desired when they are "otherwise unrelated"?

  3. Mike Bayer repo owner

    if you want Parent.id and ChildA.id to be "syncrhonized", in order to avoid the warning I'll be checking in, you'd map like this:

    class ChildA(Parent):
       # ...
    
       id = column_property(Column('id', Integer, primary_key=True), Parent.id)
    

    however for now the above is the same as just having "Parent.id" and "ChildA.id" mapped to different columns on the same attribute name.

    if you want Parent.id and ChildA.id to be tracked separately, then you need to name the attributes differently:

    class ChildA(Parent):
        #...
    
       child_id = Column("id", integer, primary_key=True)
    
  4. Log in to comment