support cx_oracle 5.0.1's "no NLS_LANG required" unicode handling

Issue #1333 resolved
Former user created an issue

Hi, I am using sqlalchemy 0.5.2 with cx_Oracle 5.0 and Python 2.5. When I update an NVARCHAR column using only cx_Oracle (no sqlalchemy), the column is updated correctly.

But when I map an object to the relevant table and try to change the attribute connected with the column (and commit), the value is changed to something like u'\ufffd' (or '?' when using a different NLS_LANG).

I must mention that the column is recognized correctly as NVARCHAR when I do the mapping and the right "bind_processor" function is called.

Either way, I found out that the reason this is all happening is that the "bind_processor" function of "Unicode" returns (a process function that returns) value.encode(dialect.encoding) in the case of a unicode value, instead of just returning the value.

cx_Oracle is expecting to receive the bind_param as a unicode string.

So I created a TypeDecorator that decorates NVARCHAR, and in bind_processor it returns (a process function that returns) the unicode value itself rather than the encoded value.
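
Roughly, the decorator I mean looks like this (just a sketch; the class name is mine and the impl type is a stand-in for whatever NVARCHAR type the column maps to):

    import sqlalchemy.types as types

    class PassthroughUnicode(types.TypeDecorator):
        """Pass the bind param to cx_Oracle as a Python unicode,
        skipping the value.encode(dialect.encoding) step."""

        impl = types.Unicode

        def bind_processor(self, dialect):
            def process(value):
                # hand back the unicode value untouched instead of
                # an encoded bytestring
                return value
            return process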

Am I right?

BTW the result_processor is fine.

Comments (8)

  1. Mike Bayer repo owner

    It appears to be a major change in cx_oracle: http://cx-oracle.sourceforge.net/HISTORY.txt . SQLA's adapter is developed against the 4 series of cx_oracle. cx_oracle should in any case not be interpreting a non-unicode object as a unicode object - it should pass the bytestring SQLA sends it straight through. There seem to be changes to this behavior in 5.0.1 that you should try out.

    If you have time, see if you can get plain cx_oracle with a utf-8 encoded bytestring to work. If that doesn't work, I would report this as a bug in cx_oracle (the history mentions being built in certain modes, which seems like an awkward requirement).
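
    Something along these lines is the kind of standalone check I mean (a sketch only - connection details, table and column names are placeholders for your own):

        import cx_Oracle

        conn = cx_Oracle.connect('user', 'password', 'dsn')
        cursor = conn.cursor()

        data = u'\u6771\u4eac\u90fd'  # some multibyte data
        # bind a utf-8 encoded bytestring, the same thing SQLA sends
        cursor.execute("UPDATE some_table SET ndata = :val WHERE id = 1",
                       val=data.encode('utf-8'))
        conn.commit()

        # read it back and see whether it round-trips
        cursor.execute("SELECT ndata FROM some_table WHERE id = 1")
        print repr(cursor.fetchone()[0])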

    The solution within SQLA will be to detect the version of cx_oracle in use, which I'd like to push off to 0.6, since 0.6 has a framework in place for version detection upon dialect creation.

  2. Mike Bayer repo owner

    I can't reproduce this issue with cx_oracle 5.0.1. Did you set NLS_LANG appropriately? The attached test case exhaustively inserts and retrieves multibyte data using both cx_oracle and SQLAlchemy, from both NVARCHAR and VARCHAR columns.
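
    By "appropriately" I mean something along the lines of the following, set before cx_Oracle connects (the exact charset name may differ in your environment):

        import os
        # NLS_LANG has to be in place before the OCI environment is
        # created, i.e. before cx_Oracle.connect() is called
        os.environ['NLS_LANG'] = '.UTF8'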

  3. Mike Bayer repo owner

    Regarding the above test, the only way I can make it fail is by unsetting NLS_LANG. It appears that cx_oracle 5.0.1 can handle unicode data without any NLS_LANG setting, but you have to pass in a Python unicode object. Unfortunately I tested against cx_oracle 4.3.3 and this is not the case - it raises "unhandled data type unicode" - so I can't just turn off SQLA's unicode handling across the board.

    So it's back to 0.6 we go, where we can add logic to detect cx_oracle as of 5.0.1 and disable SQLA's encoding of unicode objects if detected.
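
    Roughly the kind of check I have in mind (a sketch only, not the actual 0.6 dialect code):

        import cx_Oracle

        # cx_Oracle.version is a string like "5.0.1"; turn it into a
        # tuple so it can be compared numerically
        cx_oracle_ver = tuple(int(x) for x in cx_Oracle.version.split('.'))

        # 5.0.1 and later accept Python unicode binds directly; the 4.x
        # series raises "unhandled data type unicode", so keep encoding there
        supports_unicode_binds = cx_oracle_ver >= (5, 0, 1)

        def process_bind(value, encoding):
            # stand-in for what the Unicode bind processor would do
            if isinstance(value, unicode) and not supports_unicode_binds:
                return value.encode(encoding)
            return value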

  4. Former user Account Deleted

    It's weird, because I do set NLS_LANG to utf-8 the same way you do.

    I think there might be something wrong in my db or in my environment... mainly because, after checking, I found out that unicode isn't working for me with 4.x.x either. I guess that for the time being I'll continue working with my special, not oracle9-compliant, NVARCHAR2 type.

    Oof, I'm upset. I work on a private network that is not connected to the internet, so I can't run your test there, and I can't tell whether the problem is in my env or in my db :(
