join, sorting and limit/offset issue

Issue #369 resolved
Former user created an issue

There is a very strange behavior with table joins and limit/offset: example:

from sqlalchemy import *

db = create_engine("mysql://tezzt:tezzt@localhost/tezzt")
m = BoundMetaData(db)
db.engine.echo = True

class T1(object):pass
class T2(object):pass


t1 = Table("a", m, Column("id", Integer, primary_key=True), Column("b_id", Integer, ForeignKey("b.id")))
t2 = Table("b", m, Column("id", Integer, primary_key=True), Column("name", String(10)))

T1.mapper = mapper(T1, t1, properties={"b":relation(T2, lazy=False, order_by=None)})
T2.mapper = mapper(T2, t2)

m.drop_all()
m.create_all()


session = create_session()

for a in xrange(1,10):
    aa = T1()
    bb = T2()
    bb.name = "B%d"%a
    aa.b = bb
    session.save(aa)
    session.save(bb)

session.flush()
session.clear()

for a in session.query(T1).select(order_by=[T2.c.name](T2.c.name), limit=10, offset=0):
    print a.id, a.b.name


for a in session.query(T1).select(order_by=[asc(T2.c.name)](asc(T2.c.name)), limit=10, offset=0):
    print a.id, a.b.name

results:

2006-11-15 17:20:26,450 INFO sqlalchemy.engine.base.Engine.0x..d4 SELECT b_1b29.id AS b_1b29_id, b_1b29.name AS b_1b29_name, a.id AS a_id, a.b_id AS a_b_id
FROM (SELECT a.id AS a_id, b.name AS b_name
FROM a, b ORDER BY b.name
 LIMIT 10 OFFSET 0) AS tbl_row_count, a LEFT OUTER JOIN b AS b_1b29 ON b_1b29.id = a.b_id
WHERE a.id = tbl_row_count.a_id ORDER BY b_1b29.name
2006-11-15 17:20:26,452 INFO sqlalchemy.engine.base.Engine.0x..d4 [B1
2 B2
3 B3
4 B4
5 B5
6 B6
7 B7
8 B8
9 B9
2006-11-15 17:20:26,512 INFO sqlalchemy.engine.base.Engine.0x..d4 SELECT b_1b29.id AS b_1b29_id, b_1b29.name AS b_1b29_name, a.id AS a_id, a.b_id AS a_b_id
FROM (SELECT a.id AS a_id, b.name AS b_name
FROM a, b ORDER BY b.name ASC
 LIMIT 10 OFFSET 0) AS tbl_row_count, a LEFT OUTER JOIN b AS b_1b29 ON b_1b29.id = a.b_id
WHERE a.id = tbl_row_count.a_id ORDER BY tbl_row_count.b_name ASC
2006-11-15 17:20:26,514 INFO sqlalchemy.engine.base.Engine.0x..d4 [](]
1)
3 B3
6 B6
9 B9
2 B2
5 B5
8 B8
1 B1
4 B4
7 B7

I have expected to see same results in the both cases.

Comments (5)

  1. Mike Bayer repo owner

    I ran on MySQL5, and did not reproduce the issue; both results ordered correctly. However, running on postgres did reproduce the error.

    i think the issue is that you are adding an ordering criterion that is on a related table, without giving the query some context as to what "t2"'s relationship is to "t1" (since it wont guess). youll notice its shoving the "t2" table in there without a join condition to t1, so in the inner query you get arbitrary rows from "t2":

    test=> select a.id as a_id, b.name as b_name from a, b order by b.name asc limit 10 offset 0;
     a_id | b_name 
    ------+--------
        6 | B1
        1 | B1
        9 | B1
        2 | B1
        8 | B1
        3 | B1
        7 | B1
        4 | B1
        5 | B1
        6 | B2
    (10 rows)
    

    what its doing is limiting the total cartesian product of A and B, and you hardly get through the T2 table's rows.

    So adding some specificity to the query seems to work on both databases (and sqlite too):

    for a in session.query(T1).select(order_by=[asc(T2.c.name)](asc(T2.c.name)), limit=10, offset=0, from_obj=[t1.outerjoin(t2)](t1.outerjoin(t2))):
        print a.id, a.b.name
    

    producing the query:

    SELECT b_6fdd.id AS b_6fdd_id, b_6fdd.name AS b_6fdd_name, a.id AS a_id, a.b_id AS a_b_id 
    FROM (SELECT a.id AS a_id, b.name AS b_name 
    FROM a LEFT OUTER JOIN b ON b.id = a.b_id ORDER BY b.name ASC 
     LIMIT 10 OFFSET 0) AS tbl_row_count, a LEFT OUTER JOIN b AS b_6fdd ON b_6fdd.id = a.b_id 
    WHERE a.id = tbl_row_count.a_id ORDER BY tbl_row_count.b_name ASC
    
    1 B1
    2 B2
    3 B3
    4 B4
    5 B5
    6 B6
    7 B7
    8 B8
    9 B9
    

    remember that the eager loading should have no impact on your result set !

  2. Log in to comment