ORM mapping of reflected col names that are not ASCII

Issue #2856 resolved
Former user created an issue

(original reporter: teddy) There is a bug with reflexion on cyrillic column names in Oracle. Lets assume we have a table with such DDL:

CREATE TABLE cyr_test ( 
    "ИД" NUMBER CONSTRAINT id_demo NOT NULL,
    "ПОЛЕ1"    VARCHAR2,
    "ЕЩЕПОЛЕ"   VARCHAR2,
    CONSTRAINT    с_id_pk_demo PRIMARY KEY ("ИД") 
);

So we try to reflect on this table in sqlalchemy:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from sqlalchemy import *
from sqlalchemy.orm import sessionmaker

engine = create_engine('oracle+cx_oracle://<some_conn_string>, encoding="windows-1251", optimize_limits=True, use_binds_for_limits=False, echo='debug', coerce_to_decimal=False)
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base(bind=engine)

S = sessionmaker(bind=engine)
s = S()
class CyrTest(Base):
    __table__ = Table("cyr_test", Base.metadata,
                    Column(u'ИД', Integer, primary_key=True),
                    oracle_resolve_synonyms=True,
                    autoload=True
                )

And, getting this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test.py", line 14, in <module>
    class CyrTest(Base):
  File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\api.py", line 5
0, in __init__
    _as_declarative(cls, classname, cls.__dict__)
  File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\base.py", line
292, in _as_declarative
    mt.map()
  File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\base.py", line
376, in map
    **mapper_args
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\__init__.py", line 1229, in
 mapper
    return Mapper(class_, local_table, *args, **params)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 219, in __
init__
    self._configure_properties()
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 904, in _c
onfigure_properties
    setparent=True)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 1181, in _
configure_property
    prop.instrument_class(self)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\properties.py", line 128, i
n instrument_class
    doc=self.doc
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 1449,
in register_descriptor
    manager.instrument_attribute(key, descriptor)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\instrumentation.py", line 1
92, in instrument_attribute
    self.install_descriptor(key, inst)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\instrumentation.py", line 2
44, in install_descriptor
    setattr(self.class_, key, inst)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordin
al not in range(128)

With another table, without cyrillic column names, everything works fine.

Tested on sqlalchemy 0.8.1/0.8.3/0.9.0b1 in Python 2.7.2 (ActiveState ActivePython 2.7.2.5), Windows 7 x64, Oracle 11g.

Comments (8)

  1. Former user Account Deleted

    (original author: teddy) Little addition. This bug is present in a 0.7.x version too, just forgot to post it then.

  2. Former user Account Deleted

    (original author: teddy) Another little addition, sorry for my memory =)

    If we declare mapped class a bit more explicitly:

    class CyrTest(Base):
        __table__ = Table("cyr_test", Base.metadata,
                        Column(u'ИД', Integer, primary_key=True, key='id'),
                        Column(u'ПОЛЕ1', String, key='field1'),
                        Column(u'ЕЩЕПОЛЕ', String, key='field2'),
                        oracle_resolve_synonyms=True,
                        autoload=True
                    )
    

    It will reflect on table successfully. The problem in keys, of course. I dont't know how to resolve this situation, but i see 2 ways: At first, we can just use colimn names as is, so we can write something like {{{CyrTest.Поле1}}}. At second, we can trasliterate names to latin, and use something like {{{CyrTest.Pole1}}}. Transliteration libraries is present for Python, for example https://pypi.python.org/pypi/trans, which i use in my projects.

  3. Former user Account Deleted

    (original author: teddy) Hmmm, did some experiments. I've put some changes to instrumentation.py:

    line 244 setattr(self.class_, key, inst)
    

    to

    line 244 setattr(self.class_, key.encode('utf-8'), inst)
    

    and to

    line 1 import trans
    ...
    line 244 setattr(self.class_, key.encode('trans'), inst)
    

    If we use transliteration, there is a problems such a {{{SomeMappedClass.table.c.keys()}}} returns native field names. If we use native names, there is a big question with manipulation in interactive mode...

  4. Mike Bayer repo owner

    OK when i first saw this I assumed the oracle dialect couldn't reflect those names, but it seems like it is, so wow on that.

    Second issue, note that the ORM is an optional component here. The reflection is working fine as you're getting a Table with the names you want. The issue is that you're mapping that Table to a Python class, a process which by default links the columns to same-named attributes on a class - and Python 2.x does not accept non-ASCII identifiers - this is not a SQLAlchemy limitation, it's a Python 2.x one.

    So if you wanted to do #1, e.g. CyrTest.Поле1., you could use Python 3 (and you'd definitely want to use 0.9, which is released as beta1 so far, for the best Py3K compat), otherwise it's not possible. In Python 3 the reflection + mapping works as is.

    Now using a system like a transliteration library, great, that's not a built-in for SQLAlchemy, you do in fact need to go with the "key" approach. We provide the "column_reflect" event to most effectively redefine keys as needed.

    Below is an example program against SQLite illustrating both a Python 3 and a Python 2 approach. I don't have a transliteration library available so I am using "attr_%s" % hexlify(name.encode('utf-8') here, replace that with whatever you'd like.

    If the reflection is working then this is as far as we can go, so I'm closing this as "worksforme".

    #!coding: utf-8
    
    from sqlalchemy import create_engine, Table
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import Session
    import sys
    
    e = create_engine("sqlite://", echo=True)
    
    e.execute(u"""
        CREATE TABLE test (
    
            "ИД" INTEGER PRIMARY KEY,
            "ПОЛЕ1" VARCHAR(20),
            "ЕЩЕПОЛЕ" VARCHAR(20)
        )
    """)
    
    e.execute(u"""
        INSERT INTO test ("ИД", "ПОЛЕ1", "ЕЩЕПОЛЕ") VALUES (1, 'v1', 'v2')
    """)
    
    
    py3 = sys.version_info >= (3, )
    
    if not py3:
        from sqlalchemy import event
        import binascii
    
        @event.listens_for(Table, "column_reflect")
        def column_reflect(inspector, table, column_info):
            # do whatever encoding/transliteration you'd like here
            column_info['key']('key') = "attr_%s" % binascii.hexlify(column_info['name']('name').encode('utf-8'))
    
    
    Base = declarative_base()
    
    class Foo(Base):
        __table__ = Table('test', Base.metadata, autoload=True, autoload_with=e)
    
    s = Session(e)
    f1 = s.query(Foo).first()
    
    if py3:
        # using exec so this script runs in py2k as well,
        # you can use non-ascii identifiers in py3k.
        exec("print(f1.ПОЛЕ1)")
    else:
        # our hexlified (or transliteterated, or whatever)
        print(f1.attr_d09fd09ed09bd09531)
    
  5. Former user Account Deleted

    (original author: teddy) Thank's for your explanation, really great. When i'm started with my experiments, i decided that there is a tricky problem with naming =) I'm not a very experienced or skilled developer, especially in architecture and smart separation of components (decomposition), so this a very good demonstration for me =)

    Thanks!

  6. Log in to comment