Oracle constraint reflection optimization

Issue #2543 resolved
Former user created an issue

The current oracle reflection uses an outer join to the all_cons_columns view when reflecting primary key and foreign key referential constraints:

    @reflection.cache
    def _get_constraint_data(self, connection, table_name, schema=None,
                            dblink='', **kw):

        rp = connection.execute(
            sql.text("""SELECT
             ac.constraint_name,
             ac.constraint_type,
             loc.column_name AS local_column,
             rem.table_name AS remote_table,
             rem.column_name AS remote_column,
             rem.owner AS remote_owner,
             loc.position as loc_pos,
             rem.position as rem_pos
           FROM all_constraints%(dblink)s ac,
             all_cons_columns%(dblink)s loc,
             all_cons_columns%(dblink)s rem
           WHERE ac.table_name = :table_name
           AND ac.constraint_type IN ('R','P')
           AND ac.owner = :owner
           AND ac.owner = loc.owner
           AND ac.constraint_name = loc.constraint_name
           AND ac.r_owner = rem.owner(+)
           AND ac.r_constraint_name = rem.constraint_name(+)
           AND (rem.position IS NULL or loc.position=rem.position)
           ORDER BY ac.constraint_name, loc.position""" % {'dblink': dblink}),
            table_name=table_name, owner=schema)
        constraint_data = rp.fetchall()
        return constraint_data

The issue is: ''Oracle (especially older Oracle) is not very good at outer joins of views.''

With a large database (with many fk references), the above query is fairly slow, and when reflecting an entire schema, it can take several minutes to complete because sqlalchemy issues this query for each table.

We get the exact same results with no outer join by using a UNION of primary key with referential constraints because in the referential constraint SELECT we want an inner join:

    @reflection.cache
    def _get_constraint_data(self, connection, table_name, schema=None,
                            dblink='', **kw):

        rp = connection.execute(
            sql.text("""SELECT
             ac.constraint_name,
             ac.constraint_type,
             loc.column_name AS local_column,
             CAST(NULL AS CHAR) AS remote_table,
             CAST(NULL AS CHAR) AS remote_column,
             CAST(NULL AS CHAR) AS remote_owner,
             loc.position as loc_pos,
             CAST(NULL AS NUMBER) as rem_pos
           FROM all_constraints%(dblink)s ac,
             all_cons_columns%(dblink)s loc
           WHERE ac.table_name = :table_name
           AND ac.constraint_type = 'P'
           AND ac.owner = :owner
           AND ac.owner = loc.owner
           AND ac.constraint_name = loc.constraint_name
        UNION ALL
        SELECT
             ac.constraint_name,
             ac.constraint_type,
             loc.column_name AS local_column,
             rem.table_name AS remote_table,
             rem.column_name AS remote_column,
             rem.owner AS remote_owner,
             loc.position as loc_pos,
             rem.position as rem_pos
           FROM all_constraints%(dblink)s ac,
             all_cons_columns%(dblink)s loc,
             all_cons_columns%(dblink)s rem
           WHERE ac.table_name = :table_name
           AND ac.constraint_type = 'R'
           AND ac.owner = :owner
           AND ac.owner = loc.owner
           AND ac.constraint_name = loc.constraint_name
           AND ac.r_owner = rem.owner
           AND ac.r_constraint_name = rem.constraint_name
           AND loc.position=rem.position
        ORDER BY constraint_name, loc_pos""" % {'dblink': dblink}),
            table_name=table_name, owner=schema)
        constraint_data = rp.fetchall()
        return constraint_data

Note that the CAST(NULL AS CHAR) is required for Oracle 8. (The above query runs for all oracle 8 and up, tested on 8.1.7.4.0, 9.2.0.4.0, 10.2.0.1.0 and 11.2.0.1.0)

It is interesting that for smaller databases (not so many fk references) with new oracle versions (10.2.0), the outer join actually performs better. I don't know whether that is due to oracle optimizations or my dev database not being as large as the production ones.

The scalability seems very poor however, since some of our production databases (both oracle 8 and 9) take minutes to reflect several schemas.

I ran the exact same reflection test before and after this patch on a large 8i database: [BR] Before: 162.7 s [BR] After: 12.2 s

My test of oracle 9.2.0 running above versus below query: [BR] Before: 41 s [BR] After: 9 s

(An even better optimization would be to gather all table constraint information in one query instead of issuing once for each table... in that case the outer join would be fine, but that is a lot more effort.)

Comments (2)

  1. Former user Account Deleted

    Nevermind... after more research, this is also very related to the optimization method being used so I can't recommend changing the select for everyone without more research than I have time for, but I will say that it seems ''certainly'' the right thing for oracle 8 which is very slow with these outer joins of views.

  2. Mike Bayer repo owner

    right..clearly the target is more recent oracles (are they up to 12 yet ?). if you feel like making the patch change what SQL it uses per oracle version feel free.

  3. Log in to comment