Oracle reflection for get_indexes returns an index representing the primary_key if the column name is lowercase in Oracle

Issue #1867 resolved
Former user created an issue

Apparently, Oracle's columns are not exclusively stored in uppercase.
I have a column named 'group'. In ALL_IND_COLUMNS (and, I imagine, in all other Oracle views/tables) it is stored as lower-case, while all other column names are upper-case. Presumably this is because 'group' is a keyword.

SQL> select column_name from all_tab_columns where table_name = 'SITEGROUPS';

COLUMN_NAME
------------------------------
group
DESCRIPTION
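
For context, here is a rough paraphrase of the case convention the dialect relies on (a simplified sketch, not the actual 0.6 source; normalize_name/denormalize_name are real dialect methods, but the bodies below are condensed). A name like 'group', stored lower-case because it had to be quoted, falls outside the all-upper-case assumption, which is what breaks the primary key comparison in get_indexes:

# simplified sketch of the Oracle dialect's case convention (paraphrased)
def normalize_name(name):
    # Oracle stores unquoted identifiers in upper case, so an all-upper-case
    # catalog name is treated as case-insensitive and lower-cased for Python
    if name is None:
        return None
    if name.upper() == name:
        return name.lower()
    # a lower/mixed-case catalog name (like "group") was quoted at CREATE
    # time, so it is kept verbatim
    return name

def denormalize_name(name):
    # the reverse direction: plain lower-case Python names go back to upper case
    if name is None:
        return None
    if name.lower() == name:
        return name.upper()
    return name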

Here is the patch to fix the particular reflection problem I was having: get_indexes was returning a redundant index representing the primary key constraint, even though reflection already reports the primary key separately as a PrimaryKeyConstraint:

--- sqlalchemy/dialects/oracle/base.py  2010-08-03 06:37:52.000000000 -0400
+++ ../SQLAlchemy-0.6.1.4/sqlalchemy/dialects/oracle/base.py    2010-08-03 06:36:49.000000000 -0400
@@ -875,21 +875,21 @@
         last_index_name = None
         pkeys = self.get_primary_keys(connection, table_name, schema,
                                       resolve_synonyms=resolve_synonyms,
                                       dblink=dblink,
                                       info_cache=kw.get('info_cache'))
         uniqueness = dict(NONUNIQUE=False, UNIQUE=True)

         oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
         for rset in rp:
             # don't include the primary key columns
-            if rset.column_name in [s.upper() for s in pkeys]:
+            if rset.column_name.upper() in [s.upper() for s in pkeys]:
                 continue
             if rset.index_name != last_index_name:
                 index = dict(name=self.normalize_name(rset.index_name), column_names=[])
                 indexes.append(index)
             index['unique'] = uniqueness.get(rset.uniqueness, False)

             # filter out Oracle SYS_NC names.  could also do an outer join
             # to the all_tab_columns table and check for real col names there.
             if not oracle_sys_col.match(rset.column_name):
                 index['column_names'].append(self.normalize_name(rset.column_name))

The bigger issue

I imagine this is a can of worms since the assumption was probably universally made that oracle stores its column names in UPPER case... (so I will probably rename my columns to avoid problems).

  • Kent

Comments (17)

  1. Former user Account Deleted

    I see now that SQLAlchemy has requested that this be in lower-case by surrounding it in quotes:

    CREATE TABLE sitegroups (
            "group" NVARCHAR2(3) NOT NULL,
            description NVARCHAR2(40),
            PRIMARY KEY ("group")
    )
    

    Again, because it is a keyword, no doubt.
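
    For what it's worth, the quoting decision can be reproduced from the dialect's identifier preparer (a quick sketch; it only assumes OracleDialect and its preparer are importable as shown):

    from sqlalchemy import Column, Unicode
    from sqlalchemy.dialects.oracle.base import OracleDialect

    preparer = OracleDialect().identifier_preparer

    # reserved words are quoted, so Oracle stores them exactly as written (lower case)
    print preparer.format_column(Column("group", Unicode(3)))         # "group"
    # ordinary lower-case names are emitted unquoted, so Oracle folds them to
    # upper case in its catalog views (DESCRIPTION in all_tab_columns)
    print preparer.format_column(Column("description", Unicode(40)))  # description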

  2. Former user Account Deleted

    Found another issue with oracle reflection (postgres reflection processes this correctly):

    If there is more than one index on a column (for example, the primary key spans two columns and there is another index on one of those primary key columns), then reflection ignores ''both'' indexes instead of just the primary key index.

    Attached is a .py script demonstrating.
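
    Since the script itself isn't reproduced here, its setup is roughly the following (a sketch only; the connection URL is a placeholder and the printing differs from the script's actual output below):

    from sqlalchemy import MetaData, Table, Column, Unicode, create_engine
    from sqlalchemy.engine import reflection

    engine = create_engine('oracle://scott:tiger@localhost/xe')   # placeholder DSN
    metadata = MetaData(engine)

    # two-column primary key, plus a separate unique index on one of the pk columns
    Table('sometable', metadata,
        Column('id_a', Unicode(255), primary_key=True),
        Column('id_b', Unicode(255), primary_key=True, unique=True),
    )
    metadata.create_all()

    insp = reflection.Inspector.from_engine(engine)
    print insp.get_primary_keys('sometable')
    # before the patch this prints [] -- every column of both indexes is a
    # primary key column, so the pk's backing index and the id_b index are
    # both skipped
    print insp.get_indexes('sometable')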

    Script output before patch below:

    type: PrimaryKeyConstraint
    name: sys_c0049060
    columns: [u'id_a', u'id_b']
    
    (u'SYS_C0049060', u'ID_A', u'UNIQUE')
    (u'SYS_C0049060', u'ID_B', u'UNIQUE')
    (u'SYS_C0049061', u'ID_B', u'UNIQUE')
    

    Correct script output after patch below:

    type: PrimaryKeyConstraint
    name: sys_c0049060
    columns: [u'id_a', u'id_b']
    
    type: Index
    name: sys_c0049061
    columns: [u'id_b']
    
    (u'SYS_C0049060', u'ID_A', u'UNIQUE')
    (u'SYS_C0049060', u'ID_B', u'UNIQUE')
    (u'SYS_C0049061', u'ID_B', u'UNIQUE')
    

    This patch makes the patch in the description of this ticket ''(above)'' obsolete:

    --- lib/sqlalchemy/dialects/oracle/base.py      2010-07-02 01:25:26.000000000 -0400
    +++ ../SQLAlchemy-0.6.1.6kb/lib/sqlalchemy/dialects/oracle/base.py      2010-08-03 14:42:52.000000000 -0400
    @@ -873,34 +873,46 @@
                                     schema=self.denormalize_name(schema))
             indexes = []
             last_index_name = None
             pkeys = self.get_primary_keys(connection, table_name, schema,
                                           resolve_synonyms=resolve_synonyms,
                                           dblink=dblink,
                                           info_cache=kw.get('info_cache'))
             uniqueness = dict(NONUNIQUE=False, UNIQUE=True)
    
             oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
    +
    +        def sort_and_upper(names):
    +            return sorted([i.upper() for i in names])
    +
    +        pk_names = sort_and_upper(pkeys)
    +
    +        def remove_if_primary_key(index):
    +            # don't include the primary key index
    +            if index is not None and \
    +               sort_and_upper(index['column_names']) == pk_names:
    +                indexes.pop()
    +
    +        index = None
             for rset in rp:
    -            # don't include the primary key columns
    -            if rset.column_name in [s.upper() for s in pkeys]:
    -                continue
                 if rset.index_name != last_index_name:
    +                remove_if_primary_key(index)
                     index = dict(name=self.normalize_name(rset.index_name), column_names=[])
                     indexes.append(index)
                 index['unique'] = uniqueness.get(rset.uniqueness, False)
    
                 # filter out Oracle SYS_NC names.  could also do an outer join
                 # to the all_tab_columns table and check for real col names there.
                 if not oracle_sys_col.match(rset.column_name):
                     index['column_names'].append(self.normalize_name(rset.column_name))
                 last_index_name = rset.index_name
    +        remove_if_primary_key(index)
             return indexes
    
         @reflection.cache
         def _get_constraint_data(self, connection, table_name, schema=None,
                                 dblink='', **kw):
    
             rp = connection.execute(
                 sql.text("""SELECT
                  ac.constraint_name,
                  ac.constraint_type,
    

    (Could get fancier and stop checking for the primary key index once it is found...)

    Thanks,

    Kent

    P.S. This ticket is closely related to #1868 (same file is patched)

  3. Mike Bayer repo owner
    • changed milestone to 0.6.4

    do you think you might try working these tests, for here and #1868, into test/dialect/test_oracle.py? there are a lot of details here that you seem to have under your belt.

  4. Former user Account Deleted

    Sure, but I'd like your input:

    I found out oracle won't allow this:

    CREATE TABLE sometable (
            id_a NVARCHAR2(255) NOT NULL,
            id_b NVARCHAR2(255) NOT NULL,
            col_c NVARCHAR2(255),
            PRIMARY KEY (id_a, id_b),
            UNIQUE (id_a, id_b)
    )
            UNIQUE (id_a, id_b)
            *
    ERROR at line 6:
    ORA-02261: such unique or primary key already exists in the table
    

    However, it does allow it if the order is changed (I wonder if this is unintended by oracle):

    CREATE TABLE sometable (
            id_a NVARCHAR2(255) NOT NULL,
            id_b NVARCHAR2(255) NOT NULL,
            col_c NVARCHAR2(255),
            PRIMARY KEY (id_a, id_b),
            UNIQUE (id_b, id_a)
    )
    
    Table created.
    

    Upon inspection, oracle has created two constraints, a 'U' and a 'P', but both use the same index.

    SQL> select CONSTRAINT_NAME, CONSTRAINT_TYPE,  STATUS, INDEX_NAME from user_constraints;
    
    CONSTRAINT_NAME                C STATUS   INDEX_NAME
    ------------------------------ - -------- ------------------------------
    SYS_C0049092                   C ENABLED
    SYS_C0049093                   C ENABLED
    SYS_C0049094                   P ENABLED  SYS_C0049094
    SYS_C0049095                   U ENABLED  SYS_C0049094
    
    
    
    SQL> select INDEX_NAME, TABLE_NAME, UNIQUENESS from user_indexes;
    
    INDEX_NAME                     TABLE_NAME                     UNIQUENES
    ------------------------------ ------------------------------ ---------
    SYS_C0049094                   SOMETABLE                      UNIQUE
    

    The issue is that we only reflect constraints where "constraint_type IN ('R','P')", and we use indexes to determine our UniqueConstraints for reflection...

    After my patch, reflection finds the PrimaryKeyConstraint

    type: PrimaryKeyConstraint
    name: sys_c0049094
    columns: [u'id_a', u'id_b']
    

    but technically there should also be a UniqueConstraint listed with a different order of columns (but Oracle isn't really honoring the order of columns anyway, since it is using the same index).

    Do we even want to attempt "correct" reflection support for this scenario?

    -Kent

  5. Former user Account Deleted

    Also, I'm only familiar with cvs and svn (familiar with hg conceptually), and my Oracle base.py is from 0.6.1. If I submit patches like the ones here, is that fine, or should I download an hg client, etc.?

  6. Former user Account Deleted

    Replying to guest:

    but technically there should also be a UniqueConstraint listed with a different order of columns (but Oracle isn't really honoring the order of columns anyway, since it is using the same index).

    Do we even want to attempt "correct" reflection support for this scenario?

    Wait a second, we don't even reflect UniqueConstraints, do we? Only Indexes, so maybe this ambiguity goes away. (Of course, it brings up whether we ''should'' reflect UniqueConstraints, but that would be a different topic.)

    Input?

  7. Mike Bayer repo owner

    Replying to guest:

    Also, I'm only familiar with cvs and svn (familiar with hg conceptually), and my Oracle base.py is from 0.6.1. If I submit patches like the ones here, is that fine, or should I download an hg client, etc.?

    you don't have to worry about Mercurial; while I encourage you to learn a DVCS someday, I only need a unified diff file. It's best if it's against the latest tip, which you can download without a VC client: http://hg.sqlalchemy.org/sqlalchemy/archive/default.tar.gz (this is off the downloads page).

  8. Mike Bayer repo owner

    Replying to guest:

    Wait a second, we don't even reflect UniqueConstraints, do we? Only Indexes, so maybe this ambiguity goes away. (Of course it brings up whether we ''should'' reflect uniqueconstraints, but that would be a different topic.

    at the moment, unique constraints aren't part of what the inspector interface pulls in. But certainly it's something the inspector would have eventually - the point of the inspector is that it's a place for pretty much everything about a database to be introspected. We'd probably add a method called "get_constraints" to the inspector that pulls them all in. But that's a new feature and wouldn't have to be part of this bug report.
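
    (Purely for illustration - nothing like this exists yet, and the name and return shape below are invented - such a method might eventually look like:)

    # hypothetical future Inspector method -- does not exist in 0.6
    def get_constraints(self, table_name, schema=None, **kw):
        """Return primary key, unique, foreign key and check constraints
        for ``table_name`` as a list of dictionaries."""
        raise NotImplementedError()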

  9. Former user Account Deleted

    Replying to guest:

    Do we even want to attempt "correct" reflection support for this scenario?

    There is the notion that the inspector should only reflect the pk constraint, not also its index. (I can agree with that.) However, the issue is that, as I've shown, one index can enforce several constraints. It would be nice for the inspector to report the index for a pk if it isn't strictly for the pk (i.e. it is ''also'' for another constraint). Also, it is possible for two indexes to have the same columns, and then we don't know which index to ignore (and Oracle 8.0's all_constraints doesn't have index_name).

    Should we make it simple and ignore all indexes that have the same set of columns as the pk? (Even if that misses indexes it shouldn't?) (And ignore the implications this has on the future plans for the inspector?) Any other ideas?
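
    For concreteness, the simple rule being floated here is just the following (a sketch of the idea only; the helper name is made up):

    def looks_like_pk_index(index_column_names, pk_column_names):
        # drop a reflected index only when its column set is exactly the
        # primary key's column set, compared case-insensitively
        return set(c.upper() for c in index_column_names) == \
               set(c.upper() for c in pk_column_names)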

  10. Mike Bayer repo owner

    so to let it take in all the indexes:

    diff -r 9d6cb72791502e7a704a244093d6daca9697dd3c lib/sqlalchemy/dialects/oracle/base.py
    --- a/lib/sqlalchemy/dialects/oracle/base.py    Fri Aug 06 11:03:05 2010 -0400
    +++ b/lib/sqlalchemy/dialects/oracle/base.py    Sat Aug 07 13:05:09 2010 -0400
    @@ -888,8 +888,8 @@
             oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
             for rset in rp:
                 # don't include the primary key columns
    -            if rset.column_name in [s.upper() for s in pkeys]:
    -                continue
    +#            if rset.column_name in [s.upper() for s in pkeys]:
    +#                continue
                 if rset.index_name != last_index_name:
                     index = dict(name=self.normalize_name(rset.index_name), column_names=[])
                     indexes.append(index)
    

    Pretty much just one test fails for real, reflecting the table and then trying to recreate it - "ORA-01408: such column list already indexed". Another fails because a warning is raised regarding the unexpected index but that is harmless and the test can be made to ignore the warning.

    So reflecting the additional index here is not a big deal, just slightly less compatible with how all the other databases do it, and it means you can't issue create_all() on a set of reflected metadata unless we figure something out (which....we probably should).
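
    (The "round trip" in question is just: reflect everything into a fresh MetaData and recreate it elsewhere; a sketch with placeholder URLs:)

    from sqlalchemy import MetaData, create_engine

    src = create_engine('oracle://scott:tiger@localhost/xe')     # placeholder
    dest = create_engine('oracle://other:tiger@localhost/xe')    # placeholder

    m = MetaData()
    m.reflect(bind=src)
    # fails with ORA-01408 if a reflected Index duplicates the pk's column list
    m.create_all(bind=dest)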

  11. Mike Bayer repo owner

    so let me just add, then, regarding the original patch here which addresses this: I only care about the "round trip" capability. If your patch removes only indexes that have the identical columns, then that is probably fine. We'd need a "round trip" test added to test/dialect/test_oracle.py.

    class RoundTripTest(TestBase):
        __only_on__ = 'oracle'
    
        @testing.provide_metadata
        def test_basic(self):
            t1 = Table('t1', metadata,
                Column('id1', Integer, primary_key=True),
                Column('id2', Integer, primary_key=True),
                Column('c3', Integer),
            )
    
            t2 = Table('t2', metadata,
                Column('id1', Integer, primary_key=True),
                Column('id2', Integer, primary_key=True),
                Column('c3', Integer),
            )
            # index that covers a subset of PKs, or 
            # whatever it is we need to test here
            Index('some_index', t2.c.id2, t2.c.c3)
    
            metadata.create_all()
    
            m2 = MetaData(testing.db)
            m2.reflect()
    
            t1_reflected = m2.tables['t1']
            t2_reflected = m2.tables['t2']
    
            # do some assertions here, that 
            # there's an index, etc.
            # assert xyz in t1_reflected.indexes
    
            m2.drop_all()
            m2.create_all()
    
  12. Former user Account Deleted

    Patch

    Disclaimer

    The inspector knowingly may not reflect all indexes if an index shares the exact same columns as the primary key constraint's index. This should be very rare... I can only imagine a use case with some optimization where having two indexes in a different column order affects some binary tree search or something, but it is not a normal situation...

    --- sqlalchemy-default/lib/sqlalchemy/dialects/oracle/base.py   2010-08-05 12:28:01.000000000 -0400
    +++ SQLAlchemy-0.6.3.1kb/lib/sqlalchemy/dialects/oracle/base.py 2010-08-07 18:02:23.000000000 -0400
    @@ -886,11 +886,22 @@
             uniqueness = dict(NONUNIQUE=False, UNIQUE=True)
    
             oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
    +
    +        def upper_name_set(names):
    +            return set([i.upper() for i in names])
    +
    +        pk_names = upper_name_set(pkeys)
    +
    +        def remove_if_primary_key(index):
    +            # don't include the primary key index
    +            if index is not None and \
    +               upper_name_set(index['column_names']) == pk_names:
    +                indexes.pop()
    +
    +        index = None
             for rset in rp:
    -            # don't include the primary key columns
    -            if rset.column_name in [s.upper() for s in pkeys]:
    -                continue
                 if rset.index_name != last_index_name:
    +                remove_if_primary_key(index)
                     index = dict(name=self.normalize_name(rset.index_name), column_names=[])
                     indexes.append(index)
                 index['unique'] = uniqueness.get(rset.uniqueness, False)
    @@ -900,6 +911,7 @@
                 if not oracle_sys_col.match(rset.column_name):
                     index['column_names'].append(self.normalize_name(rset.column_name))
                 last_index_name = rset.index_name
    +        remove_if_primary_key(index)
             return indexes
    
         @reflection.cache
    @@ -932,7 +944,6 @@
             constraint_data = rp.fetchall()
             return constraint_data
    
    -    @reflection.cache
         def get_primary_keys(self, connection, table_name, schema=None, **kw):
             """
    
    @@ -943,7 +954,10 @@
                 dblink
    
             """
    +        return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]
    
    +    @reflection.cache
    +    def _get_primary_keys(self, connection, table_name, schema=None, **kw):
             resolve_synonyms = kw.get('oracle_resolve_synonyms', False)
             dblink = kw.get('dblink', '')
             info_cache = kw.get('info_cache')
    @@ -953,6 +967,7 @@
                                               resolve_synonyms, dblink,
                                               info_cache=info_cache)
             pkeys = []
    +        constraint_name = None
             constraint_data = self._get_constraint_data(connection, table_name,
                                             schema, dblink,
                                             info_cache=kw.get('info_cache'))
    @@ -962,8 +977,18 @@
                 (cons_name, cons_type, local_column, remote_table, remote_column, remote_owner) = \
                     row[0:2] + tuple([self.normalize_name(x) for x in row[2:6]])
                 if cons_type == 'P':
    +                if constraint_name is None:
    +                    constraint_name = self.normalize_name(cons_name)
                     pkeys.append(local_column)
    -        return pkeys
    +        return pkeys, constraint_name
    +
    +    def get_pk_constraint(self, connection, table_name, schema=None, **kw):
    +        cols, name = self._get_primary_keys(connection, table_name, schema=schema, **kw)
    +
    +        return {
    +            'constrained_columns':cols,
    +            'name':name
    +        }
    
         @reflection.cache
         def get_foreign_keys(self, connection, table_name, schema=None, **kw):
    diff -u -r sqlalchemy-default/test/dialect/test_oracle.py SQLAlchemy-0.6.3.1kb/test/dialect/test_oracle.py
    --- sqlalchemy-default/test/dialect/test_oracle.py      2010-08-05 12:28:01.000000000 -0400
    +++ SQLAlchemy-0.6.3.1kb/test/dialect/test_oracle.py    2010-08-07 18:58:47.000000000 -0400
    @@ -1168,4 +1168,64 @@
                 t.select(for_update=True).limit(2).offset(3).execute
             )
    
    -
    +
    +class RoundTripIndexTest(TestBase):
    +    __only_on__ = 'oracle'
    +
    +    def test_basic(self):
    +        engine = testing.db
    +        metadata = MetaData(engine)
    +
    +        table=Table("sometable", metadata,
    +            Column("id_a", Unicode(255), primary_key=True),
    +            Column("id_b", Unicode(255), primary_key=True, unique=True),
    +            Column("group", Unicode(255), primary_key=True),
    +            Column("col", Unicode(255)),
    +            UniqueConstraint('col','group'),
    +        )
    +
    +        normalind = Index('tableind', table.c.id_b, table.c.group) # "group" is a keyword, so lower case
    +
    +        # create
    +        metadata.create_all()
    +        try:
    +            # round trip, create from reflection
    +            mirror = MetaData(engine)
    +            mirror.reflect()
    +            metadata.drop_all()
    +            mirror.create_all()
    +
    +            # inspect the reflected creation
    +            inspect = MetaData(engine)
    +            inspect.reflect()
    +
    +            def obj_definition(obj):
    +            return (obj.__class__, tuple([c.name for c in obj.columns]), getattr(obj, 'unique', None))
    +
    +            # find what the primary k constraint name should be
    +            primaryconsname = engine.execute(
    +               text("""SELECT constraint_name
    +                   FROM all_constraints
    +                   WHERE table_name = :table_name
    +                   AND owner = :owner
    +                   AND constraint_type = 'P' """),
    +               table_name=table.name.upper(),
    +               owner=engine.url.username.upper()).fetchall()[0][0]
    +
    +            reflectedtable = inspect.tables[table.name]
    +
    +            # make a dictionary of the reflected objects:
    +            reflected = dict([(obj_definition(i), i) for i in reflectedtable.indexes | reflectedtable.constraints])
    +
    +            # assert we got primary key constraint and its name, Error if not in dict
    +            assert reflected[(PrimaryKeyConstraint, ('id_a','id_b','group',), None)].name.upper() == primaryconsname.upper()
    +            # Error if not in dict
    +            assert reflected[(Index, ('id_b','group',), False)].name == normalind.name
    +            assert (Index, ('id_b',), True) in reflected
    +            assert (Index, ('col','group',), True) in reflected
    +            assert len(reflectedtable.constraints) == 1
    +            assert len(reflectedtable.indexes) == 3
    +
    +        finally:
    +            metadata.drop_all()
    +
    

    You may need to change the test to get it to run correctly (for example, owner=engine.url.username.upper() might not work). In the future, I'd like to figure out how to kick off the test suite properly instead of running the Python script manually.

    Note that several of the asserts, as well as the create_all(), failed before the patch; the patch also fixes and tests #1868.

  13. Former user Account Deleted

    Important

    I made a mistake when I pasted line 957:

    return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]
    

    should be:

    --- SQLAlchemy-0.6.3.1kb/lib/sqlalchemy/dialects/oracle/base.py 2010-08-07 18:02:23.000000000 -0400
    +++ /home/rarch/site-packages/SQLAlchemy-0.6.3.1dev-py2.6-linux-x86_64.egg/sqlalchemy/dialects/oracle/base.py   2010-08-08 10:15:06.000000000 -0400
    @@ -954,7 +954,7 @@
                 dblink
    
             """
    -        return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]
    +        return self._get_primary_keys(connection, table_name, schema, **kw)[0]
    
         @reflection.cache
         def _get_primary_keys(self, connection, table_name, schema=None, **kw):
    

    Oops. Sorry!

  14. Mike Bayer repo owner

    I'll try to have a look in the next couple of days; it's a little busy right now. Thanks for the effort you've put in here!
