join object is a little stupid about figuring correct primary key

Issue #185 resolved
Mike Bayer repo owner created an issue

this impacts mappers with joined table inheritance, since people like to relate the primary key of table B directly to the primary key of table A, and therefore theres effectively only one primary key column for the mapper.

a join like this:

table = Table('table1', db, 
    Column('col1', Integer, primary_key=True),
    Column('col2', String(20)),
    Column('col3', Integer),
)

table2 = Table('table3', db, 
    Column('col1', Integer, ForeignKey('table1.col1'), primary_key=True),
    Column('col2', Integer),
    Column('col3', String(20)),
    )

a = join(table, table3)

should have only [table.c.col1](table.c.col1) as its primary key, not [table2.c.col1](table.c.col1,). however, if those two tables are joined together with any other ON clause, then the two columns should be there. since the ON clause can be explicitly specified, will need to dig into the ON clause and determine if this pattern is present.

really, the pattern may be, column A is a primary key for the join iff it is not present in the on clause being equated to another primary key already marked as a "primary key" for this join. if one of the columns in the binary comparison is a foreign key as well, then the non-foreign key column takes precedence as the choice for the primary key of the join.

Comments (10)

  1. Mike Bayer reporter

    the specific problem this causes is this:

    class A(object):pass
    class B(A):pass
    
    mapper(A, sometable)
    mapper(B, someothertable, inherits=A)
    
    session.query(B).get(1)  # fails
    session.query(B).get([1,1](1,1)) # succeeds
    

    ive committed a hack in changeset:1508 that addresses this on the query.get() side, which just repeats the first identifier if not enough identifiers are located. its possible this could be a "feature", i.e. "for a repeated identifier, just put the first one....."...though thats a little weird (and more than one way to do it...)

  2. Mike Bayer reporter
    • changed milestone to 0.3.0

    heres the pattern:

    for col in join.cols:
       if col.primary_key:
           if not col.has_foreign_key or col.foreign_key.column not in the join condition:
              primary_key.append(col)
    

    if the column has a foreign key, and the foreign key's column is in the join condition, then it doesnt get added as part of the primary key.

  3. Former user Account Deleted

    Just wanted to add that the fix should also account for the primary key having a different column name than the foreign key it "inherits". For instance, if in the ticket code table2's primary key were named 'col4' but table's remained 'col1'. Not sure if the fix already accounts for this, but just making sure it's not overlooked.

  4. Mike Bayer reporter

    this is fixed, better late than never, in changeset:2449. and if you look at our column arithmetic throughout SA youll see that we always use the targeted ForeignKey objects on Column to match up column references, never based on name. we have almost no name-matching going on between columns (a little bit in mapper.py as a fallback in the case that columns cant be found between derived selectables, that should probably be taken out too).

  5. Mike Bayer reporter
    • changed milestone to 0.3.9
    • marked as blocker
    • changed status to open
    • removed status

    this is still broke. the primary key of the join should be the column that does not have any foreign keys on it. also, you still get two columns back for a join of three tables.

    from sqlalchemy import *
    
    meta = MetaData()
    
    a = Table('a', meta, Column('id', Integer, primary_key=True))
    b = Table('b', meta, Column('id', Integer, ForeignKey('a.id'), primary_key=True))
    c = Table('c', meta, Column('id', Integer, ForeignKey('b.id'), primary_key=True))
    
    print a.join(b).primary_key
    print a.join(b).join(c).primary_key
    
  6. Log in to comment