ORM mapping of reflected col names that are not ASCII
(original reporter: teddy) There is a bug with reflexion on cyrillic column names in Oracle. Lets assume we have a table with such DDL:
CREATE TABLE cyr_test (
"ИД" NUMBER CONSTRAINT id_demo NOT NULL,
"ПОЛЕ1" VARCHAR2,
"ЕЩЕПОЛЕ" VARCHAR2,
CONSTRAINT с_id_pk_demo PRIMARY KEY ("ИД")
);
So we try to reflect on this table in sqlalchemy:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from sqlalchemy import *
from sqlalchemy.orm import sessionmaker
engine = create_engine('oracle+cx_oracle://<some_conn_string>, encoding="windows-1251", optimize_limits=True, use_binds_for_limits=False, echo='debug', coerce_to_decimal=False)
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base(bind=engine)
S = sessionmaker(bind=engine)
s = S()
class CyrTest(Base):
__table__ = Table("cyr_test", Base.metadata,
Column(u'ИД', Integer, primary_key=True),
oracle_resolve_synonyms=True,
autoload=True
)
And, getting this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "test.py", line 14, in <module>
class CyrTest(Base):
File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\api.py", line 5
0, in __init__
_as_declarative(cls, classname, cls.__dict__)
File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\base.py", line
292, in _as_declarative
mt.map()
File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative\base.py", line
376, in map
**mapper_args
File "C:\Python27\lib\site-packages\sqlalchemy\orm\__init__.py", line 1229, in
mapper
return Mapper(class_, local_table, *args, **params)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 219, in __
init__
self._configure_properties()
File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 904, in _c
onfigure_properties
setparent=True)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line 1181, in _
configure_property
prop.instrument_class(self)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\properties.py", line 128, i
n instrument_class
doc=self.doc
File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 1449,
in register_descriptor
manager.instrument_attribute(key, descriptor)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\instrumentation.py", line 1
92, in instrument_attribute
self.install_descriptor(key, inst)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\instrumentation.py", line 2
44, in install_descriptor
setattr(self.class_, key, inst)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordin
al not in range(128)
With another table, without cyrillic column names, everything works fine.
Tested on sqlalchemy 0.8.1/0.8.3/0.9.0b1 in Python 2.7.2 (ActiveState ActivePython 2.7.2.5), Windows 7 x64, Oracle 11g.
Comments (8)
-
Account Deleted -
Account Deleted (original author: teddy) Another little addition, sorry for my memory =)
If we declare mapped class a bit more explicitly:
class CyrTest(Base): __table__ = Table("cyr_test", Base.metadata, Column(u'ИД', Integer, primary_key=True, key='id'), Column(u'ПОЛЕ1', String, key='field1'), Column(u'ЕЩЕПОЛЕ', String, key='field2'), oracle_resolve_synonyms=True, autoload=True )
It will reflect on table successfully. The problem in keys, of course. I dont't know how to resolve this situation, but i see 2 ways: At first, we can just use colimn names as is, so we can write something like {{{CyrTest.Поле1}}}. At second, we can trasliterate names to latin, and use something like {{{CyrTest.Pole1}}}. Transliteration libraries is present for Python, for example https://pypi.python.org/pypi/trans, which i use in my projects.
-
Account Deleted (original author: teddy) Hmmm, did some experiments. I've put some changes to instrumentation.py:
line 244 setattr(self.class_, key, inst)
to
line 244 setattr(self.class_, key.encode('utf-8'), inst)
and to
line 1 import trans ... line 244 setattr(self.class_, key.encode('trans'), inst)
If we use transliteration, there is a problems such a {{{SomeMappedClass.table.c.keys()}}} returns native field names. If we use native names, there is a big question with manipulation in interactive mode...
-
repo owner - changed status to wontfix
- changed title to ORM mapping of reflected col names that are not ASCII
OK when i first saw this I assumed the oracle dialect couldn't reflect those names, but it seems like it is, so wow on that.
Second issue, note that the ORM is an optional component here. The reflection is working fine as you're getting a
Table
with the names you want. The issue is that you're mapping thatTable
to a Python class, a process which by default links the columns to same-named attributes on a class - and Python 2.x does not accept non-ASCII identifiers - this is not a SQLAlchemy limitation, it's a Python 2.x one.So if you wanted to do
#1, e.g.CyrTest.Поле1.
, you could use Python 3 (and you'd definitely want to use 0.9, which is released as beta1 so far, for the best Py3K compat), otherwise it's not possible. In Python 3 the reflection + mapping works as is.Now using a system like a transliteration library, great, that's not a built-in for SQLAlchemy, you do in fact need to go with the "key" approach. We provide the "column_reflect" event to most effectively redefine keys as needed.
Below is an example program against SQLite illustrating both a Python 3 and a Python 2 approach. I don't have a transliteration library available so I am using
"attr_%s" % hexlify(name.encode('utf-8')
here, replace that with whatever you'd like.If the reflection is working then this is as far as we can go, so I'm closing this as "worksforme".
#!coding: utf-8 from sqlalchemy import create_engine, Table from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import Session import sys e = create_engine("sqlite://", echo=True) e.execute(u""" CREATE TABLE test ( "ИД" INTEGER PRIMARY KEY, "ПОЛЕ1" VARCHAR(20), "ЕЩЕПОЛЕ" VARCHAR(20) ) """) e.execute(u""" INSERT INTO test ("ИД", "ПОЛЕ1", "ЕЩЕПОЛЕ") VALUES (1, 'v1', 'v2') """) py3 = sys.version_info >= (3, ) if not py3: from sqlalchemy import event import binascii @event.listens_for(Table, "column_reflect") def column_reflect(inspector, table, column_info): # do whatever encoding/transliteration you'd like here column_info['key']('key') = "attr_%s" % binascii.hexlify(column_info['name']('name').encode('utf-8')) Base = declarative_base() class Foo(Base): __table__ = Table('test', Base.metadata, autoload=True, autoload_with=e) s = Session(e) f1 = s.query(Foo).first() if py3: # using exec so this script runs in py2k as well, # you can use non-ascii identifiers in py3k. exec("print(f1.ПОЛЕ1)") else: # our hexlified (or transliteterated, or whatever) print(f1.attr_d09fd09ed09bd09531)
-
repo owner I'm changing the mapper documentation now to clarify this case.
-
Account Deleted (original author: teddy) Thank's for your explanation, really great. When i'm started with my experiments, i decided that there is a tricky problem with naming =) I'm not a very experienced or skilled developer, especially in architecture and smart separation of components (decomposition), so this a very good demonstration for me =)
Thanks!
-
repo owner here's a doc section on that: http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#automating-column-naming-schemes-from-reflected-tables
-
Account Deleted (original author: teddy) Thank you!
- Log in to comment
(original author: teddy) Little addition. This bug is present in a 0.7.x version too, just forgot to post it then.