- changed status to wontfix
Oracle constraint reflection optimization
The current oracle reflection uses an outer join to the all_cons_columns
view when reflecting primary key and foreign key referential constraints:
@reflection.cache
def _get_constraint_data(self, connection, table_name, schema=None,
dblink='', **kw):
rp = connection.execute(
sql.text("""SELECT
ac.constraint_name,
ac.constraint_type,
loc.column_name AS local_column,
rem.table_name AS remote_table,
rem.column_name AS remote_column,
rem.owner AS remote_owner,
loc.position as loc_pos,
rem.position as rem_pos
FROM all_constraints%(dblink)s ac,
all_cons_columns%(dblink)s loc,
all_cons_columns%(dblink)s rem
WHERE ac.table_name = :table_name
AND ac.constraint_type IN ('R','P')
AND ac.owner = :owner
AND ac.owner = loc.owner
AND ac.constraint_name = loc.constraint_name
AND ac.r_owner = rem.owner(+)
AND ac.r_constraint_name = rem.constraint_name(+)
AND (rem.position IS NULL or loc.position=rem.position)
ORDER BY ac.constraint_name, loc.position""" % {'dblink': dblink}),
table_name=table_name, owner=schema)
constraint_data = rp.fetchall()
return constraint_data
The issue is: ''Oracle (especially older Oracle) is not very good at outer joins of views.''
With a large database (with many fk references), the above query is fairly slow, and when reflecting an entire schema, it can take several minutes to complete because sqlalchemy issues this query for each table.
We get the exact same results with no outer join by using a UNION of primary key with referential constraints because in the referential constraint SELECT we want an inner join:
@reflection.cache
def _get_constraint_data(self, connection, table_name, schema=None,
dblink='', **kw):
rp = connection.execute(
sql.text("""SELECT
ac.constraint_name,
ac.constraint_type,
loc.column_name AS local_column,
CAST(NULL AS CHAR) AS remote_table,
CAST(NULL AS CHAR) AS remote_column,
CAST(NULL AS CHAR) AS remote_owner,
loc.position as loc_pos,
CAST(NULL AS NUMBER) as rem_pos
FROM all_constraints%(dblink)s ac,
all_cons_columns%(dblink)s loc
WHERE ac.table_name = :table_name
AND ac.constraint_type = 'P'
AND ac.owner = :owner
AND ac.owner = loc.owner
AND ac.constraint_name = loc.constraint_name
UNION ALL
SELECT
ac.constraint_name,
ac.constraint_type,
loc.column_name AS local_column,
rem.table_name AS remote_table,
rem.column_name AS remote_column,
rem.owner AS remote_owner,
loc.position as loc_pos,
rem.position as rem_pos
FROM all_constraints%(dblink)s ac,
all_cons_columns%(dblink)s loc,
all_cons_columns%(dblink)s rem
WHERE ac.table_name = :table_name
AND ac.constraint_type = 'R'
AND ac.owner = :owner
AND ac.owner = loc.owner
AND ac.constraint_name = loc.constraint_name
AND ac.r_owner = rem.owner
AND ac.r_constraint_name = rem.constraint_name
AND loc.position=rem.position
ORDER BY constraint_name, loc_pos""" % {'dblink': dblink}),
table_name=table_name, owner=schema)
constraint_data = rp.fetchall()
return constraint_data
Note that the CAST(NULL AS CHAR)
is required for Oracle 8. (The above query runs for all oracle 8 and up, tested on 8.1.7.4.0, 9.2.0.4.0, 10.2.0.1.0 and 11.2.0.1.0)
It is interesting that for smaller databases (not so many fk references) with new oracle versions (10.2.0), the outer join actually performs better. I don't know whether that is due to oracle optimizations or my dev database not being as large as the production ones.
The scalability seems very poor however, since some of our production databases (both oracle 8 and 9) take minutes to reflect several schemas.
I ran the exact same reflection test before and after this patch on a large 8i database: [BR] Before: 162.7 s [BR] After: 12.2 s
My test of oracle 9.2.0 running above versus below query: [BR] Before: 41 s [BR] After: 9 s
(An even better optimization would be to gather all table constraint information in one query instead of issuing once for each table... in that case the outer join would be fine, but that is a lot more effort.)
Comments (2)
-
Account Deleted -
repo owner right..clearly the target is more recent oracles (are they up to 12 yet ?). if you feel like making the patch change what SQL it uses per oracle version feel free.
- Log in to comment
Nevermind... after more research, this is also very related to the optimization method being used so I can't recommend changing the select for everyone without more research than I have time for, but I will say that it seems ''certainly'' the right thing for oracle 8 which is very slow with these outer joins of views.