oracle 8 unicode support

Issue #1808 resolved
Mike Bayer repo owner created an issue

waiting on kb for information if this patch works with oracle 8

Comments (25)

  1. Former user Account Deleted

    patch allows oracle connection to succeed.

    However, now I get problems during the metadata.create_all() statement. SQLA supplies unicode bind parameters to the select statement instead of converting them to plain ansi characters:

    Traceback:

    Traceback (most recent call last):
      File "/home/rarch/tg2env/bin/paster", line 8, in <module>
        load_entry_point('PasteScript==1.7.3', 'console_scripts', 'paster')()
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/command.py", line 84, in run
        invoke(command, command_name, options, args[1:](1:))
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/command.py", line 123, in invoke
        exit_code = runner.run(args)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/appinstall.py", line 68, in run
        return super(AbstractInstallCommand, self).run(new_args)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/command.py", line 218, in run
        result = self.command()
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/appinstall.py", line 456, in command
        self, config_file, section, self.sysconfig_install_vars(installer))
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/appinstall.py", line 598, in setup_config
        mod.setup_app, command, filename, section, vars)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/PasteScript-1.7.3-py2.6.egg/paste/script/appinstall.py", line 612, in _call_setup_app
        func(command, conf, vars)
      File "/home/rarch/trunk/src/appserver/pylotengine/websetup.py", line 32, in setup_app
        model.metadata.create_all(bind=config['pylons.app_globals']('pylons.app_globals').sa_engine)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/schema.py", line 1958, in create_all
        bind.create(self, checkfirst=checkfirst, tables=tables)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1504, in create
        self._run_visitor(ddl.SchemaGenerator, entity, connection=connection, **kwargs)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1535, in _run_visitor
        visitorcallable(self.dialect, conn, **kwargs).traverse_single(element)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/sql/visitors.py", line 77, in traverse_single
        return meth(obj, **kw)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/ddl.py", line 36, in visit_metadata
        collection = [for t in sql_util.sort_tables(tables) if self._can_create(t)](t)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/ddl.py", line 29, in _can_create
        return not self.checkfirst or not self.dialect.has_table(self.connection, table.name, schema=table.schema)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/dialects/oracle/base.py", line 632, in has_table
        name=self.denormalize_name(table_name), schema_name=self.denormalize_name(schema))
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1109, in execute
        return Connection.executors[c](c)(self, object, multiparams, params)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1186, in _execute_clauseelement
        return self.__execute_context(context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1215, in __execute_context
        context.parameters[0](0), context=context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1284, in _cursor_execute
        self._handle_dbapi_exception(e, statement, parameters, cursor, context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1282, in _cursor_execute
        self.dialect.do_execute(cursor, statement, parameters, context=context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/default.py", line 281, in do_execute
        cursor.execute(statement, parameters)
    sqlalchemy.exc.DatabaseError: (DatabaseError) ORA-12704: character set mismatch
     'SELECT table_name FROM all_tables WHERE table_name = :name AND owner = :schema_name' {'name': u'ORDERS_T', 'schema_name': u'ARC'}
    
  2. Former user Account Deleted

    Now metadata.create_all() appears to work nicely. I still have problems in my user code where I've specified a unicode string for a bind param. For Oracle 9 or Postgresql, these bind parameters are converted nicely for me. Can we make them auto-convert for Oracle 8 or do I need to avoid using unicode in sqla land?

    (We considered having the ability to support unicode for Postgres and still have the same code function in Oracle 8, though we will likely abandon this idea. This means I ''could'' and probably ''will'' go through all our code and change away from unicode, but it seems it would be nice for this to automatically convert unicode bind params on Oracle 8, agreed?)

    sqlalchemy.exc.DatabaseError: (DatabaseError) ORA-12704: character set mismatch
     'INSERT INTO systemparameters (name, type, stringval, numberval, dateval) VALUES (:name, :type, :stringval, :numberval, :dateval)' {'stringval': u'L', 'type': u's', 'name': u'invcost_fifo_lifo', 'dateval': None, 'numberval': None}
    
  3. Mike Bayer reporter

    if those are String-typed, and you're using at least text() or any SQL construct to define the statement, they will be encoded from unicode to string. Otherwise I'm not sure how a unicode object is getting sent through as a bind param with my most recent patch, you'd have to illustrate an example.

  4. Former user Account Deleted

    Would you expect this to automatically convert from unicode?

    from sqlalchemy import *
    from sqlalchemy.orm import *
    
    engine = create_engine('oracle://usr:pass@1.2.3.4:1521/live',echo=True)
    engine.connect()
    print "*****************************"
    print "server_version_info: ", engine.dialect.server_version_info
    print "_supports_char_length: ", engine.dialect._supports_char_length
    print "supports_unicode_binds: ", engine.dialect.supports_unicode_binds
    print "use_ansi:", engine.dialect.use_ansi
    print "*****************************"
    
    metadata = MetaData()
    Session = sessionmaker(bind=engine)
    DBSession = Session()
    
    table = Table("tabl", metadata,
        Column("col", String(255), primary_key=True)
    )
    
    class Record(object):
        pass
    
    mapper(Record, table)
    
    r=Record()
    r.col = u'Unicode String'
    
    metadata.create_all(engine)
    DBSession.add(r)
    DBSession.flush()
    

    Here is the output:

    2010-05-25 04:55:21,725 INFO sqlalchemy.engine.base.Engine.0x...ce90 SELECT USER FROM DUAL
    2010-05-25 04:55:21,729 INFO sqlalchemy.engine.base.Engine.0x...ce90 {}
    *****************************
    server_version_info:  (8, 1, 7, 4, 0)
    _supports_char_length:  False
    supports_unicode_binds:  False
    use_ansi: False
    *****************************
    2010-05-25 04:55:23,981 INFO sqlalchemy.engine.base.Engine.0x...ce90 SELECT table_name FROM all_tables WHERE table_name = :name AND owner = :schema_name
    2010-05-25 04:55:23,981 INFO sqlalchemy.engine.base.Engine.0x...ce90 {'name': 'TABL', 'schema_name': 'ARC'}
    2010-05-25 04:55:24,098 INFO sqlalchemy.engine.base.Engine.0x...ce90
    CREATE TABLE tabl (
            col VARCHAR(255) NOT NULL,
            PRIMARY KEY (col)
    )
    
    
    2010-05-25 04:55:24,099 INFO sqlalchemy.engine.base.Engine.0x...ce90 {}
    2010-05-25 04:55:24,237 INFO sqlalchemy.engine.base.Engine.0x...ce90 COMMIT
    2010-05-25 04:55:24,429 INFO sqlalchemy.engine.base.Engine.0x...ce90 BEGIN
    2010-05-25 04:55:24,429 INFO sqlalchemy.engine.base.Engine.0x...ce90 INSERT INTO tabl (col) VALUES (:col)
    2010-05-25 04:55:24,429 INFO sqlalchemy.engine.base.Engine.0x...ce90 {'col': u'Unicode String'}
    2010-05-25 04:55:24,629 INFO sqlalchemy.engine.base.Engine.0x...ce90 ROLLBACK
    Traceback (most recent call last):
      File "ora8.py", line 32, in <module>
        DBSession.flush()
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1348, in flush
        self._flush(objects)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1429, in _flush
        flush_context.execute()
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 293, in execute
        rec.execute(self)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 426, in execute
        uow
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/orm/mapper.py", line 1559, in _save_obj
        c = connection.execute(statement.values(value_params), params)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1109, in execute
        return Connection.executors[c](c)(self, object, multiparams, params)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1186, in _execute_clauseelement
        return self.__execute_context(context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1215, in __execute_context
        context.parameters[0](0), context=context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1284, in _cursor_execute
        self._handle_dbapi_exception(e, statement, parameters, cursor, context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1282, in _cursor_execute
        self.dialect.do_execute(cursor, statement, parameters, context=context)
      File "/home/rarch/tg2env/lib/python2.6/site-packages/SQLAlchemy-0.6.0.2-py2.6-linux-x86_64.egg/sqlalchemy/engine/default.py", line 282, in do_execute
        cursor.execute(statement, parameters)
    sqlalchemy.exc.DatabaseError: (DatabaseError) ORA-12704: character set mismatch
     'INSERT INTO tabl (col) VALUES (:col)' {'col': u'Unicode String'}
    
  5. Former user Account Deleted

    By the way, if I change String(255) to Unicode(255) then this works fine. It used to be that all the strings were Unicode, and I recently switch them to String when upgrading to 0.6.0 because I didn't want to use NVARCHAR where the existing columns were VARCHAR.

  6. Former user Account Deleted

    Another note: the exact same script that fails on Oracle 8 succeeds on higher versions of Oracle:

    *****************************
    server_version_info:  (10, 2, 0, 1, 0)
    _supports_char_length:  True
    supports_unicode_binds:  True
    use_ansi: True
    *****************************
    2010-05-16 12:23:15,701 INFO sqlalchemy.engine.base.Engine.0x...7410 SELECT table_name FROM all_tables WHERE table_name = :name AND owner = :schema_name
    2010-05-16 12:23:15,701 INFO sqlalchemy.engine.base.Engine.0x...7410 {'name': u'TABL', 'schema_name': u'ARC'}
    2010-05-16 12:23:15,706 INFO sqlalchemy.engine.base.Engine.0x...7410
    CREATE TABLE tabl (
            col VARCHAR(255 CHAR) NOT NULL,
            PRIMARY KEY (col)
    )
    
    
    2010-05-16 12:23:15,706 INFO sqlalchemy.engine.base.Engine.0x...7410 {}
    2010-05-16 12:23:15,901 INFO sqlalchemy.engine.base.Engine.0x...7410 COMMIT
    2010-05-16 12:23:15,902 INFO sqlalchemy.engine.base.Engine.0x...7410 BEGIN
    2010-05-16 12:23:15,903 INFO sqlalchemy.engine.base.Engine.0x...7410 INSERT INTO tabl (col) VALUES (:col)
    2010-05-16 12:23:15,903 INFO sqlalchemy.engine.base.Engine.0x...7410 {'col': u'Unicode String'}
    

    So, it seems inconsistent somewhere....

  7. Mike Bayer reporter

    Replying to guest:

    Would you expect this to automatically convert from unicode?

    OK, so here you need either the Unicode() type (or similarly convert_unicode=True with String), or as a shortcut you can add "convert_unicode=True" to create_engine() which will turn on the unicode->utf-8 coercion for all String types.

    If that solves the issue then we can commit this patch, which is the main thing holding up the 0.6.1 release at this point (shooting for this weekend).

  8. Former user Account Deleted

    Replying to zzzeek:

    Replying to guest:

    Would you expect this to automatically convert from unicode?

    OK, so here you need either the Unicode() type (or similarly convert_unicode=True with String), or as a shortcut you can add "convert_unicode=True" to create_engine() which will turn on the unicode->utf-8 coercion for all String types.

    If that solves the issue then we can commit this patch, which is the main thing holding up the 0.6.1 release at this point (shooting for this weekend).

    Yes, when I checked with Unicode(), it apparently works, so the patch is good.

    I am still curious, however: why do I need to use Unicode() for it to work with Oracle 8, but for Oracle 10, I don't need that - unicode strings are automatically converted even though I use String() ?

  9. Mike Bayer reporter

    the default behavior of String is to do nothing with bind parameters. the value is passed from your code straight to cx_oracle. (I had somewhat forgotten this detail in my response of yesterday). cx_oracle 5 allows Python unicodes as bind parameters directly, but apparently not when Oracle 8 is in use (even if the object is just ASCII, seems to be what you're observing).

    The usage of Unicode(), String(convert_unicode=True) or create_engine(..., convert_unicode=True) means the String type will intercept the u'xxx' string and encode it first to the dialect encoding (defaults to utf-8).

    I'm still a little suspicious that your Oracle 8 install might be potentially configurable to work here, but its not worth it if turning on the "encode to string" flag allows things to work anyway.

  10. Former user Account Deleted

    The usage of Unicode(), String(convert_unicode=True) or create_engine(..., convert_unicode=True) means the String type will intercept the u'xxx' string and encode it first to the dialect encoding (defaults to utf-8).

    Suppose I want the same table definition to work whether the connection string is Postgres, Oracle 8 or Oracle 10. Then, if I use Unicode() instead, that is great for Postgres. But if I am connecting to Oracle 8 (or 10) and the underlying data is still stored as ASCII, will that incur pointless string processing overhead to convert from ASCII to unicode for Python and then back to ASCII? It seems this would be the case for Unicode() or String(convert_unicode=True). What about setting convert_unicode=True. Does it only convert if it needs to or will it automatically always convert incoming data from ASCII to unicode for python and then back from unicode to ASCII?

    I'm still a little suspicious that your Oracle 8 install might be potentially configurable to work here, but its not worth it if turning on the "encode to string" flag allows things to work anyway.

    Unfortunately, we won't have control over this legacy database (in other words, I can't change the Oracle setup to see if it is configurable, even if I knew what to change as it may impact the legacy software using the database)...

  11. Mike Bayer reporter

    You really should use Unicode if you're sending in Python unicode objects (i.e. u'xxx'). The "pointless overhead" on the SQLA side is only when you use Oracle 8, which is the only scenario where SQLA is doing the encoding itself. In the postgresql and cx_oracle + oracle 10 scenarios, the driver is doing the unicode conversion since they accept Python unicode objects directly. SQLA 0.6 only does something when the underlying driver can't.

    As it turns out, in all of the above cases, native encoders (assuming you use the SQLA c extensions) are doing the work here. Overhead is minimal and should take a backseat to cross-compatibility. The "overhead" associated with bind parameters is typically minimal, its result set processing that is usually where most of the overhead lies.

    If you stick with String, the cx_oracle and PG drivers are by still doing the conversion to Python unicode anyway in result sets. The pg dialect has an option to turn this off but the cx_oracle one at the moment does not.

    When SQLalchemy does the unicode conversion itself, it only encodes Python unicode objects and only decodes Python string objects.

  12. Former user Account Deleted

    One more thought on this: As part of the patch, I'd consider adding

    self.convert_unicode = True
    

    since apparently cx_Oracle doesn't convert the unicode for oracle under version 8:

        def initialize(self, connection): 
        super(OracleDialect_cx_oracle, self).initialize(connection) 
        if self.server_version_info < (9, ): 
            self.supports_unicode_binds = False 
            self.convert_unicode = True
    

    Thoughts on that? (Or are you under the impression that there are ways of configuring Oracle 8 for some of these features, which I don't have set up...?)

  13. Mike Bayer reporter

    convert_unicode is an end user setting. if you don't have it set, it means you want SQLA to have no interaction with your strings.

  14. Mike Bayer reporter

    that is, no interaction with strings that aren't otherwise individually set to accept unicode. the setting should be able to handle being left on True safely.

  15. Former user Account Deleted

    I'm asking myself ''what changed since 0.5.8?'' There were none of these issues in 0.5.8, so it must not be a cx_Oracle difference between 8 and 9i ? Or did sqla change, now deferring unicode conversion to the driver?

  16. Former user Account Deleted

    I assume you would support this change to go along with the other changes in your patch?

    --- base.py.changed     2010-05-26 16:17:23.000000000 -0400
    +++ base.py     2010-05-26 16:16:06.000000000 -0400
    @@ -301,7 +301,10 @@
             return self.visit_CLOB(type_)
    
         def visit_unicode_text(self, type_):
    -        return self.visit_NCLOB(type_)
    +        if self.dialect._supports_nchar:
    +            return self.visit_NCLOB(type_)
    +        else:
    +            return self.visit_CLOB(type_)
    
         def visit_large_binary(self, type_):
             return self.visit_BLOB(type_)
    
  17. Former user Account Deleted

    Are you asking me to verify the codeset still works now on ORA 8 or that your tests work?

  18. Log in to comment