change oracle dialect to use char_length instead of length for reflection

Issue #1744 resolved
Former user created an issue

Currently, when determining the size of a CHAR, VARCHAR2, or NVARCHAR2 column, we use the "data_length" value from ALL_TAB_COLUMNS. However, for Unicode databases, or those with non-ASCII character sets, this number does not actually represent the number of characters that fit in the column, but rather the number of bytes. The number of characters is stored in the "char_length" column.
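As a rough illustration of the byte/character distinction (plain Python, not Oracle-specific), a multi-byte encoding makes the two counts diverge, which is exactly why data_length overstates nothing and understates capacity for non-ASCII data:

```python
# Why byte length != character length under a multi-byte character set.
text = "grüße"                              # 5 characters
char_count = len(text)                      # 5 characters
byte_count = len(text.encode("utf-8"))      # 7 bytes: 'ü' and 'ß' take 2 bytes each

print(char_count, byte_count)               # 5 7
```

A column whose reflected length came from data_length would report 7 for a column sized to hold this string, even though only 5 characters are involved; char_length reports the character count directly.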

Here are the changes I think solve the problem (all in dialects.oracle.base):

    @reflection.cache
    def get_columns(self, connection, table_name, schema=None, **kw):
        """

        kw arguments can be:

            oracle_resolve_synonyms

            dblink

        """
        # ...
        c = connection.execute(sql.text(
                "SELECT column_name, data_type, char_length, data_precision, data_scale, "
                "nullable, data_default FROM ALL_TAB_COLUMNS%(dblink)s "
                "WHERE table_name = :table_name AND owner = :owner " 
                "ORDER BY column_id" % {'dblink': dblink}),
                               table_name=table_name, owner=schema)
        # ...




    def visit_VARCHAR(self, type_):
        return "VARCHAR(%(length)s CHAR)" % {'length' : type_.length}

    def visit_NVARCHAR(self, type_):
        return "NVARCHAR2(%(length)s CHAR)" % {'length' : type_.length}

If we could get this change into the 0.6 release, that would be great.

The CHAR column definition syntax has been around since Oracle 9i, but is not supported in 8 and earlier, so I'm unsure how best to handle that difference for pre-9i databases.
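One way to handle the pre-9i case would be to branch on the server version inside the visit method. The sketch below is hypothetical (the helper name and the shape of the version tuple are assumptions, not the dialect's actual API); it only shows the gating logic:

```python
# Hypothetical sketch: emit the "n CHAR" qualifier only for Oracle 9i and later.
# `server_version` is assumed to be a tuple like (major, minor, ...).

def render_varchar(length, server_version):
    """Render a VARCHAR DDL fragment, using CHAR length semantics
    only when the server supports them (Oracle 9i+)."""
    if server_version[0] >= 9:
        return "VARCHAR(%d CHAR)" % length
    return "VARCHAR(%d)" % length

print(render_varchar(100, (10, 2)))   # VARCHAR(100 CHAR)
print(render_varchar(100, (8, 1)))    # VARCHAR(100)
```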

Comments (10)

  1. Mike Bayer repo owner
    • assigned issue to
    • changed component to oracle
    • changed milestone to 0.6.0

    The dialect's version number is available, and we'd have to check that in the visit.

    This change would appear to be significantly backwards-incompatible, and while 0.5 -> 0.6 gives us the opportunity to do things like this, it's a little late for it to get in. What do you think the impact will be?

  2. Former user Account Deleted

    The impact, from what I've seen in testing, is that when storing Unicode data into VARCHAR2 columns created by SQLAlchemy, you will now be able to store the full number of characters, provided the Oracle character set and the Python encoding match. Previously, if every character in the inserted string was multi-byte, you could store only up to half that many characters.

    Additionally, you will be able to query reflected column information to check whether the data you are inserting has too many characters. This could fix potential bugs in FormAlchemy and other packages that use the length information.
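    A minimal sketch of that kind of length validation (the helper names are hypothetical, not from any of these packages), comparing character-based and byte-based checks against the same limit:

```python
# Hypothetical validation helpers: char_length-based reflection lets a
# package check character counts; data_length-based reflection forces a
# byte-count check that wrongly rejects multi-byte input.

def fits_chars(value, length):
    """True if `value` fits when the reflected length counts characters."""
    return len(value) <= length

def fits_bytes(value, length, encoding="utf-8"):
    """True if `value` fits when the reflected length counts bytes."""
    return len(value.encode(encoding)) <= length

value = "grüße"                  # 5 characters, 7 bytes in UTF-8
print(fits_chars(value, 5))      # True  -> accurate for a VARCHAR2(5 CHAR) column
print(fits_bytes(value, 5))      # False -> byte semantics reject a 5-char string
```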

    The downside is that we would essentially be dropping support for Oracle 8 and earlier users, unless we add a version-number check to the visits.

  3. Mike Bayer repo owner

    OK, it's fine then; it's true that the length interpreted as CHAR will always mean more room is created, not less. Checking for <= 8 is not a problem at all in 0.6.

  4. Mike Bayer repo owner

    If i use the CHAR syntax with NVARCHAR2, I get "DatabaseError: ORA-00907: missing right parenthesis".

  5. Mike Bayer repo owner

    and...if I omit CHAR from VARCHAR, CHAR_LENGTH still comes back with the same number.
