Oracle reflection for get_indexes returns an index representing the primary_key if the column name is lowercase in Oracle
Apparently, Oracle's columns are not exclusively stored in uppercase. I have a column named 'group'. In ALL_IND_COLUMNS (and, I imagine, all other Oracle views/tables) this is stored as lower-case while all other column names are upper-case. I imagine this is because 'group' is a keyword.
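The folding behavior can be sketched in plain Python. This is a hypothetical illustration (the `stored_name` helper and the tiny keyword set are mine, not Oracle's or SQLAlchemy's code): an unquoted identifier is folded to upper case in the data dictionary, while a keyword like 'group' must be quoted and is therefore stored with its exact case.

```python
# Hypothetical sketch of Oracle identifier folding, not dialect code.
# Unquoted names are stored upper-case in the data dictionary; quoted
# names (required for keywords such as "group") keep their exact case.
ORACLE_KEYWORDS = {"group", "order", "select"}  # tiny illustrative subset

def stored_name(name):
    if name.lower() in ORACLE_KEYWORDS:
        return name          # quoted: stored verbatim, e.g. 'group'
    return name.upper()      # unquoted: folded, e.g. 'DESCRIPTION'

print(stored_name("group"))        # group
print(stored_name("description"))  # DESCRIPTION
```

This is why a case-sensitive comparison of reflected column names against upper-cased primary key names silently misses the quoted ones.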
SQL> select column_name from all_tab_columns where table_name = 'SITEGROUPS';
COLUMN_NAME
------------------------------
group
DESCRIPTION
Here is the patch to fix the particular reflection problem I was having, which was returning a redundant index representing the primary_key constraint even though reflection reports the primary key separately, as a PrimaryKeyConstraint:
--- sqlalchemy/dialects/oracle/base.py	2010-08-03 06:37:52.000000000 -0400
+++ ../SQLAlchemy-0.6.1.4/sqlalchemy/dialects/oracle/base.py	2010-08-03 06:36:49.000000000 -0400
@@ -875,21 +875,21 @@
         last_index_name = None
         pkeys = self.get_primary_keys(connection, table_name, schema,
                                       resolve_synonyms=resolve_synonyms,
                                       dblink=dblink,
                                       info_cache=kw.get('info_cache'))
         uniqueness = dict(NONUNIQUE=False, UNIQUE=True)
         oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
         for rset in rp:
             # don't include the primary key columns
-            if rset.column_name in [s.upper() for s in pkeys]:
+            if rset.column_name.upper() in [s.upper() for s in pkeys]:
                 continue
             if rset.index_name != last_index_name:
                 index = dict(name=self.normalize_name(rset.index_name),
                              column_names=[])
                 indexes.append(index)
             index['unique'] = uniqueness.get(rset.uniqueness, False)
             # filter out Oracle SYS_NC names. could also do an outer join
             # to the all_tab_columns table and check for real col names there.
             if not oracle_sys_col.match(rset.column_name):
                 index['column_names'].append(self.normalize_name(rset.column_name))
The bigger issue
I imagine this is a can of worms since the assumption was probably universally made that oracle stores its column names in UPPER case... (so I will probably rename my columns to avoid problems).
- Kent
Comments (17)
-
Account Deleted Found another issue with oracle reflection (postgres reflection processes this correctly):
If there is more than one index on a column (for example, the primary key is two columns and there is another index on one of those primary key columns), then reflection ignores ''both'' indexes instead of just the primary key index.
Attached is a .py script demonstrating.
Script output before patch below:
type: PrimaryKeyConstraint name: sys_c0049060 columns: [u'id_a', u'id_b']
(u'SYS_C0049060', u'ID_A', u'UNIQUE')
(u'SYS_C0049060', u'ID_B', u'UNIQUE')
(u'SYS_C0049061', u'ID_B', u'UNIQUE')
Correct script output after patch below:
type: PrimaryKeyConstraint name: sys_c0049060 columns: [u'id_a', u'id_b']
type: Index name: sys_c0049061 columns: [u'id_b']
(u'SYS_C0049060', u'ID_A', u'UNIQUE')
(u'SYS_C0049060', u'ID_B', u'UNIQUE')
(u'SYS_C0049061', u'ID_B', u'UNIQUE')
This patch makes the patch in the description of this ticket ''(above)'' obsolete.
--- lib/sqlalchemy/dialects/oracle/base.py	2010-07-02 01:25:26.000000000 -0400
+++ ../SQLAlchemy-0.6.1.6kb/lib/sqlalchemy/dialects/oracle/base.py	2010-08-03 14:42:52.000000000 -0400
@@ -873,34 +873,46 @@
                                          schema=self.denormalize_name(schema))
         indexes = []
         last_index_name = None
         pkeys = self.get_primary_keys(connection, table_name, schema,
                                       resolve_synonyms=resolve_synonyms,
                                       dblink=dblink,
                                       info_cache=kw.get('info_cache'))
         uniqueness = dict(NONUNIQUE=False, UNIQUE=True)
         oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
+
+        def sort_and_upper(names):
+            return sorted([i.upper() for i in names])
+
+        pk_names = sort_and_upper(pkeys)
+
+        def remove_if_primary_key(index):
+            # don't include the primary key index
+            if index is not None and \
+               sort_and_upper(index['column_names']) == pk_names:
+                indexes.pop()
+
+        index = None
         for rset in rp:
-            # don't include the primary key columns
-            if rset.column_name in [s.upper() for s in pkeys]:
-                continue
             if rset.index_name != last_index_name:
+                remove_if_primary_key(index)
                 index = dict(name=self.normalize_name(rset.index_name),
                              column_names=[])
                 indexes.append(index)
             index['unique'] = uniqueness.get(rset.uniqueness, False)
             # filter out Oracle SYS_NC names. could also do an outer join
             # to the all_tab_columns table and check for real col names there.
             if not oracle_sys_col.match(rset.column_name):
                 index['column_names'].append(self.normalize_name(rset.column_name))
             last_index_name = rset.index_name
+        remove_if_primary_key(index)
         return indexes

     @reflection.cache
     def _get_constraint_data(self, connection, table_name, schema=None,
                              dblink='', **kw):
         rp = connection.execute(
             sql.text("""SELECT ac.constraint_name, ac.constraint_type,
(Could get fancier and stop checking for the primary key index once it is found...)
Thanks,
Kent
P.S. This ticket is closely related to #1868 (same file is patched).
-
Account Deleted - attached reflectunique.py
Oracle reflection of indexes should only skip the primary key index, not all indexes with a primary key column.
-
repo owner - changed milestone to 0.6.4
do you think you might try working these tests for here and #1868 into test/dialect/test_oracle.py? there's a lot of details here that you seem to have under your belt.
-
Account Deleted Sure, but I'd like your input:
I found out oracle won't allow this:
CREATE TABLE sometable (
    id_a NVARCHAR2(255) NOT NULL,
    id_b NVARCHAR2(255) NOT NULL,
    col_c NVARCHAR2(255),
    PRIMARY KEY (id_a, id_b),
    UNIQUE (id_a, id_b)
)

UNIQUE (id_a, id_b)
*
ERROR at line 6:
ORA-02261: such unique or primary key already exists in the table
However, it does allow it if the order is changed (I wonder if this is unintended by oracle):
CREATE TABLE sometable (
    id_a NVARCHAR2(255) NOT NULL,
    id_b NVARCHAR2(255) NOT NULL,
    col_c NVARCHAR2(255),
    PRIMARY KEY (id_a, id_b),
    UNIQUE (id_b, id_a)
)

Table created.
Upon inspection, oracle has created two constraints, a 'U' and a 'P', but both use the same index.
SQL> select CONSTRAINT_NAME, CONSTRAINT_TYPE, STATUS, INDEX_NAME from user_constraints;

CONSTRAINT_NAME                C STATUS   INDEX_NAME
------------------------------ - -------- ------------------------------
SYS_C0049092                   C ENABLED
SYS_C0049093                   C ENABLED
SYS_C0049094                   P ENABLED  SYS_C0049094
SYS_C0049095                   U ENABLED  SYS_C0049094

SQL> select INDEX_NAME, TABLE_NAME, UNIQUENESS from user_indexes;

INDEX_NAME                     TABLE_NAME                     UNIQUENES
------------------------------ ------------------------------ ---------
SYS_C0049094                   SOMETABLE                      UNIQUE
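The sharing shown above can be detected by grouping constraint rows by their backing index. A sketch of my own, using rows shaped like the user_constraints output (the row tuples and helper dict are illustrative, not dialect code):

```python
# Sketch: detecting that one index backs more than one constraint, from
# (constraint_name, constraint_type, index_name) rows like the
# user_constraints output above.
rows = [
    ("SYS_C0049094", "P", "SYS_C0049094"),
    ("SYS_C0049095", "U", "SYS_C0049094"),
]

by_index = {}
for cons, ctype, idx in rows:
    by_index.setdefault(idx, []).append((cons, ctype))

# indexes that enforce more than one constraint
shared = {i: c for i, c in by_index.items() if len(c) > 1}
print(shared)  # SYS_C0049094 backs both the 'P' and the 'U' constraint
```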
The issue is we only reflect constraints where "constraint_type IN ('R','P')" and we use indexes to determine our UniqueConstraints for reflection... After my patch, reflection finds the PrimaryKeyConstraint:

type: PrimaryKeyConstraint name: sys_c0049094 columns: [u'id_a', u'id_b']

but technically there should also be a UniqueConstraint listed with a different order of columns (but Oracle isn't really honoring the order of columns anyway, since it is using the same index). Do we even want to attempt "correct" reflection support for this scenario?
-Kent
-
Account Deleted Also, I'm only familiar with cvs, svn (familiar with hg conceptually), and my oracle base.py is from 0.6.1. If I submit patches like the ones here, is that fine, or should I download an hg client, etc?
-
Account Deleted Replying to guest:
but technically there should also be a UniqueConstraint listed with a different order of columns (but Oracle isn't really honoring the order of columns anyway, since it is using the same index). Do we even want to attempt "correct" reflection support for this scenario?

Wait a second, we don't even reflect UniqueConstraints, do we? Only Indexes, so maybe this ambiguity goes away. (Of course it brings up whether we ''should'' reflect unique constraints, but that would be a different topic.) Input?
-
repo owner Replying to guest:
Also, I'm only familiar with cvs, svn (familiar with hg conceptually) and my oracle base.py is from 0.6.1. If I submit patches like the ones here, is that fine, or should I download an hg client, etc?

you don't have to worry about mercurial; while I encourage you to learn a DVCS someday, I only need to get a unified diff file. It's best if it's against the latest tip, which you can download without a VC client: http://hg.sqlalchemy.org/sqlalchemy/archive/default.tar.gz (this is off the downloads page).
-
repo owner Replying to guest:
Wait a second, we don't even reflect UniqueConstraints, do we? Only Indexes, so maybe this ambiguity goes away.

at the moment, unique constraints aren't part of what the inspector interface pulls in. But certainly it's something the inspector would have eventually - the point of inspector is that it's a place for pretty much everything about a database to be introspected. We'd probably add a method called "get_constraints" to inspector that pulls them all in. But that's a new feature and wouldn't have to be part of this bug report.
-
Account Deleted Replying to guest:
Do we even want to attempt "correct" reflection support for this scenario?
There is the notion that the inspector should only reflect the pk constraint, not also its index. (I can agree with that.) However, the issue is that, as I've shown, one index can enforce several constraints. It would be nice for the inspector to report the index for a pk if it isn't strictly for the pk (i.e. it is ''also'' for another constraint). Also, it is possible for 2 indexes to have the same columns, and then we don't know which index to ignore (and oracle 8.0's all_constraints doesn't have index_name.)
Should we make it simple and ignore all indexes if they have the same set of columns as the pk? (even if it misses indexes it shouldn't?) (and ignore implications this has on the future plans for the inspector?) Any other ideas?
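The "same set of columns as the pk" test boils down to a case- and order-insensitive comparison. A small sketch of my own (mirroring the idea, not the dialect's code): uppercasing the names and putting them in a set makes UNIQUE (id_b, id_a) compare equal to PRIMARY KEY (id_a, id_b), which matches how Oracle treats them since it backs both with one index.

```python
# Sketch: comparing column-name sets case- and order-insensitively,
# the way a "skip the PK's index" check would need to.
def upper_name_set(names):
    return set(n.upper() for n in names)

pk = upper_name_set(["id_a", "id_b"])

print(upper_name_set(["ID_B", "ID_A"]) == pk)  # True  -> index dropped
print(upper_name_set(["id_b"]) == pk)          # False -> index kept
```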
-
repo owner so to let it take in all the indexes:
diff -r 9d6cb72791502e7a704a244093d6daca9697dd3c lib/sqlalchemy/dialects/oracle/base.py
--- a/lib/sqlalchemy/dialects/oracle/base.py	Fri Aug 06 11:03:05 2010 -0400
+++ b/lib/sqlalchemy/dialects/oracle/base.py	Sat Aug 07 13:05:09 2010 -0400
@@ -888,8 +888,8 @@
         oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
         for rset in rp:
             # don't include the primary key columns
-            if rset.column_name in [s.upper() for s in pkeys]:
-                continue
+#            if rset.column_name in [s.upper() for s in pkeys]:
+#                continue
             if rset.index_name != last_index_name:
                 index = dict(name=self.normalize_name(rset.index_name),
                              column_names=[])
                 indexes.append(index)
Pretty much just one test fails for real: reflecting the table and then trying to recreate it - "ORA-01408: such column list already indexed". Another fails because a warning is raised regarding the unexpected index, but that is harmless and the test can be made to ignore the warning.
So reflecting the additional index here is not a big deal, just slightly less compatible with how all the other databases do it, and it means you can't issue create_all() on a set of reflected metadata unless we figure something out (which....we probably should).
-
repo owner so let me just add then, that the original patch here which addresses this, I only care about the "round trip" capability. if your patch removes only indexes that have the identical columns then that is probably fine. We'd need a "round trip" test added to test/dialect/test_oracle.py.
class RoundTripTest(TestBase):
    __only_on__ = 'oracle'

    @testing.provide_metadata
    def test_basic(self):
        t1 = Table('t1', metadata,
            Column('id1', Integer, primary_key=True),
            Column('id2', Integer, primary_key=True),
            Column('c3', Integer),
        )
        t2 = Table('t2', metadata,
            Column('id1', Integer, primary_key=True),
            Column('id2', Integer, primary_key=True),
            Column('c3', Integer),
        )
        # index that covers a subset of PKs, or
        # whatever it is we need to test here
        Index('some_index', t2.c.id2, t2.c.c3)
        metadata.create_all()
        m2 = MetaData()
        m2.reflect()
        t1_reflected = m2.tables['t1']
        t2_reflected = m2.tables['t2']
        # do some assertions here, that
        # there's an index, etc.
        # assert xyz in t1_reflected.indexes
        m2.drop_all()
        m2.create_all()
-
Account Deleted Patch
Disclaimer
The inspector knowingly may not reflect all indexes if an index shares the exact same columns as the primary key constraint index. This should be very rare... I can only imagine a use-case with some optimization where having two indexes in a different order affects some binary tree search or something, but it is not a normal situation.
--- sqlalchemy-default/lib/sqlalchemy/dialects/oracle/base.py	2010-08-05 12:28:01.000000000 -0400
+++ SQLAlchemy-0.6.3.1kb/lib/sqlalchemy/dialects/oracle/base.py	2010-08-07 18:02:23.000000000 -0400
@@ -886,11 +886,22 @@
         uniqueness = dict(NONUNIQUE=False, UNIQUE=True)
         oracle_sys_col = re.compile(r'SYS_NC\d+\$', re.IGNORECASE)
+
+        def upper_name_set(names):
+            return set([i.upper() for i in names])
+
+        pk_names = upper_name_set(pkeys)
+
+        def remove_if_primary_key(index):
+            # don't include the primary key index
+            if index is not None and \
+               upper_name_set(index['column_names']) == pk_names:
+                indexes.pop()
+
+        index = None
         for rset in rp:
-            # don't include the primary key columns
-            if rset.column_name in [s.upper() for s in pkeys]:
-                continue
             if rset.index_name != last_index_name:
+                remove_if_primary_key(index)
                 index = dict(name=self.normalize_name(rset.index_name),
                              column_names=[])
                 indexes.append(index)
             index['unique'] = uniqueness.get(rset.uniqueness, False)
@@ -900,6 +911,7 @@
             if not oracle_sys_col.match(rset.column_name):
                 index['column_names'].append(self.normalize_name(rset.column_name))
             last_index_name = rset.index_name
+        remove_if_primary_key(index)
         return indexes

     @reflection.cache
@@ -932,7 +944,6 @@
         constraint_data = rp.fetchall()
         return constraint_data

-    @reflection.cache
     def get_primary_keys(self, connection, table_name, schema=None, **kw):
         """
@@ -943,7 +954,10 @@
         dblink
         """
+        return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]

+    @reflection.cache
+    def _get_primary_keys(self, connection, table_name, schema=None, **kw):
         resolve_synonyms = kw.get('oracle_resolve_synonyms', False)
         dblink = kw.get('dblink', '')
         info_cache = kw.get('info_cache')
@@ -953,6 +967,7 @@
                               resolve_synonyms, dblink,
                               info_cache=info_cache)
         pkeys = []
+        constraint_name = None
         constraint_data = self._get_constraint_data(connection, table_name,
                                                     schema, dblink,
                                                     info_cache=kw.get('info_cache'))
@@ -962,8 +977,18 @@
             (cons_name, cons_type, local_column, remote_table, remote_column, remote_owner) = \
                 row[0:2] + tuple([self.normalize_name(x) for x in row[2:6]])
             if cons_type == 'P':
+                if constraint_name is None:
+                    constraint_name = self.normalize_name(cons_name)
                 pkeys.append(local_column)
-        return pkeys
+        return pkeys, constraint_name
+
+    def get_pk_constraint(self, connection, table_name, schema=None, **kw):
+        cols, name = self._get_primary_keys(connection, table_name, schema=schema, **kw)
+
+        return {
+            'constrained_columns': cols,
+            'name': name
+        }

     @reflection.cache
     def get_foreign_keys(self, connection, table_name, schema=None, **kw):
diff -u -r sqlalchemy-default/test/dialect/test_oracle.py SQLAlchemy-0.6.3.1kb/test/dialect/test_oracle.py
--- sqlalchemy-default/test/dialect/test_oracle.py	2010-08-05 12:28:01.000000000 -0400
+++ SQLAlchemy-0.6.3.1kb/test/dialect/test_oracle.py	2010-08-07 18:58:47.000000000 -0400
@@ -1168,4 +1168,64 @@
             t.select(for_update=True).limit(2).offset(3).execute
         )
-
+
+class RoundTripIndexTest(TestBase):
+    __only_on__ = 'oracle'
+
+    def test_basic(self):
+        engine = testing.db
+        metadata = MetaData(engine)
+
+        table = Table("sometable", metadata,
+            Column("id_a", Unicode(255), primary_key=True),
+            Column("id_b", Unicode(255), primary_key=True, unique=True),
+            Column("group", Unicode(255), primary_key=True),
+            Column("col", Unicode(255)),
+            UniqueConstraint('col', 'group'),
+        )
+
+        # "group" is a keyword, so lower case
+        normalind = Index('tableind', table.c.id_b, table.c.group)
+
+        # create
+        metadata.create_all()
+        try:
+            # round trip, create from reflection
+            mirror = MetaData(engine)
+            mirror.reflect()
+            metadata.drop_all()
+            mirror.create_all()
+
+            # inspect the reflected creation
+            inspect = MetaData(engine)
+            inspect.reflect()
+
+            def obj_definition(obj):
+                return (obj.__class__,
+                        tuple([c.name for c in obj.columns]),
+                        getattr(obj, 'unique', None))
+
+            # find what the primary key constraint name should be
+            primaryconsname = engine.execute(
+                text("""SELECT constraint_name
+                        FROM all_constraints
+                        WHERE table_name = :table_name
+                          AND owner = :owner
+                          AND constraint_type = 'P' """),
+                table_name=table.name.upper(),
+                owner=engine.url.username.upper()).fetchall()[0][0]
+
+            reflectedtable = inspect.tables[table.name]
+
+            # make a dictionary of the reflected objects:
+            reflected = dict([(obj_definition(i), i) for i in
+                              reflectedtable.indexes | reflectedtable.constraints])
+
+            # assert we got primary key constraint and its name, Error if not in dict
+            assert reflected[(PrimaryKeyConstraint, ('id_a', 'id_b', 'group'), None)].name.upper() == primaryconsname.upper()
+            # Error if not in dict
+            assert reflected[(Index, ('id_b', 'group'), False)].name == normalind.name
+            assert (Index, ('id_b',), True) in reflected
+            assert (Index, ('col', 'group'), True) in reflected
+            assert len(reflectedtable.constraints) == 1
+            assert len(reflectedtable.indexes) == 3
+
+        finally:
+            metadata.drop_all()
You may need to change the test to get it to run correctly (for example, owner=engine.url.username.upper() might not work). In the future, I'd like to figure out how to kick off the tests instead of running the python manually.

Note that several of the asserts, as well as the create_all(), failed before the patch, which also fixes and tests #1868.
-
Account Deleted Important
I made a mistake when I pasted line 957:
return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]
should be:
--- SQLAlchemy-0.6.3.1kb/lib/sqlalchemy/dialects/oracle/base.py	2010-08-07 18:02:23.000000000 -0400
+++ /home/rarch/site-packages/SQLAlchemy-0.6.3.1dev-py2.6-linux-x86_64.egg/sqlalchemy/dialects/oracle/base.py	2010-08-08 10:15:06.000000000 -0400
@@ -954,7 +954,7 @@
         dblink
         """
-        return self._get_primary_keys(connection, table_name, schema=None, **kw)[0]
+        return self._get_primary_keys(connection, table_name, schema, **kw)[0]

     @reflection.cache
     def _get_primary_keys(self, connection, table_name, schema=None, **kw):
Oops. Sorry!
-
repo owner I'll try to have a look in the next couple of days; it's a little busy right now. Thanks for the effort you've put in here!
-
repo owner - changed status to resolved
Fantastic job, patch went right in with pertinent tests continuing to pass. 36fa24603f20ff6afc537cc97310d90efc667959 . Thanks for your ongoing involvement with the project !
-
repo owner - removed milestone
Removing milestone: 0.6.4 (automated comment)
I see now that SQLAlchemy has requested that this be in lower-case by surrounding it in quotes. Again, because it is a keyword, no doubt.