change oracle dialect to use char_length instead of length for reflection
Currently when determining the size of a CHAR, VARCHAR2 or NVARCHAR2 column, we are using the "data_length" information from ALL_TAB_COLUMNS, however, for unicode databases, or those with non-ascii character sets, this number does not actually represent the number of characters that fit in the column, but rather the number of bytes. The number of characters is stored in the "char_length" column.
Here are the changes I think solve the problem(all in dialects.oracle.base:
@reflection.cache
def get_columns(self, connection, table_name, schema=None, **kw):
"""
kw arguments can be:
oracle_resolve_synonyms
dblink
"""
.
.
.
c = connection.execute(sql.text(
"SELECT column_name, data_type, char_length, data_precision, data_scale, "
"nullable, data_default FROM ALL_TAB_COLUMNS%(dblink)s "
"WHERE table_name = :table_name AND owner = :owner "
"ORDER BY column_id" % {'dblink': dblink}),
table_name=table_name, owner=schema)
.
.
.
def visit_VARCHAR(self, type_):
return "VARCHAR(%(length)s CHAR)" % {'length' : type_.length}
def visit_NVARCHAR(self, type_):
return "NVARCHAR2(%(length)s CHAR)" % {'length' : type_.length}
If we could get this changed into the 0.6 release that would be great.
The CHAR column definition syntax has been around since oracle 9i, but is not supported in 8 and earlier, so I'm unsure as to how to best handle that difference for pre 9i databases.
Comments (10)
-
repo owner -
Account Deleted The impact, from what I've seen in testing, is that when storing unicode data into VARCHAR2 columns, that were created using sqlalchemy, you will now be able to store the full number of characters if the oracle character set and the python encoding match, instead of up to half the number of characters if all the characters in the string being inserted are multi byte.
Additionally you will be able to query against reflected column information to see if the data you are inserting has too many characters. This could fix potential bugs with FormAlchemy and other packages that use the length information.
The down side to this is that we would essentially be de-supporting oracle 8 and earlier users unless we add checking the version number to visits
-
repo owner OK its fine then, its true the length interpreted as CHAR will always mean more room is created and not less. Checking for <= 8 is not a problem at all in 0.6.
-
repo owner If i use the CHAR syntax with NVARCHAR2, I get "DatabaseError: ORA-00907: missing right parenthesis".
-
repo owner and...if I omit CHAR from VARCHAR, CHAR_LENGTH still comes back with the same number.
-
repo owner but that's likely because my test DB is in ASCII.
-
repo owner - changed status to resolved
so it applies just to VARCHAR. 03573c0517dc27f90f1a07ef8ad67a0692977a24
-
repo owner -
Account Deleted Thanks for Making these changes!!
-
repo owner - removed milestone
Removing milestone: 0.6.0 (automated comment)
- Log in to comment
the dialect's version number is available and we'd have to check that in the visit.
this change would appear to be significantly backwards incompatible, and while 0.5->0.6 gives us the opportunity to do things like this its a little late for it to get in. what do you think the impact will be ?