databases/mysql.py cannot handle unicode COMMENTS of the schema
If a table schema includes a unicode COMMENT text then the mysql.py raise UnicodeEncodeError.
Traceback (most recent call last):
File "./list_service_machines.py", line 11, in ?
cstm_table = Table('SRVC', metadata, autoload=True)
File "/usr/lib/python2.4/site-packages/sqlalchemy/schema.py", line 143, in __call__
metadata.get_engine().reflecttable(table)
File "/usr/lib/python2.4/site-packages/sqlalchemy/engine/base.py", line 505, in reflecttable
self.dialect.reflecttable(conn, table)
File "/usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py", line 364, in reflecttable
tabletype = self.moretableinfo(connection, table=table)
File "/usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py", line 384, in moretableinfo
desc_fetched = str(desc_fetched)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 672-675: ordinal not in range(128)
Comments (8)
-
Account Deleted -
repo owner - marked as major
hi -
that patch you have will break if unicode data is returned from the server, since youve removed the conversion from unicode to string. The comment right above it illustrates why this conversion is necessary.
It would be helpful if you could attach the DDL for the table you're trying to reflect, and the fix would probably involve converting to string with encoding_errors set to IGNORE or something similar.
-
Account Deleted Okey,
follwing is a my sample DDL.
-
SRVCTEND field have a COMMENT which is written is Japanese.
CREATE TABLE
SRVC
(SRVCSRVC
int(8) unsigned zerofill NOT NULL default '00000000',SRVCTREG
datetime default NULL,SRVCTSIN
datetime default NULL,SRVCTEND
date default NULL COMMENT '利用期限',SRVCSTMP
timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, PRIMARY KEY (SRVCSRVC
) ) ENGINE=MyISAM DEFAULT CHARSET=ujis;
I've tested with following script.
import sys from sqlalchemy import * db = create_engine('mysql://user:pass@localhost/db?charset=utf8&use_unicode=1') db.echo = True metadata = BoundMetaData(db) session = create_session(bind_to=db) srvc = Table('SRVC', metadata, autoload=True) $ python sqlalchemy_test.py 2006-12-15 09:44:35,632 INFO sqlalchemy.engine.base.Engine.0x..d0 show variables like 'lower_case_table_names' 2006-12-15 09:44:35,633 INFO sqlalchemy.engine.base.Engine.0x..d0 None 2006-12-15 09:44:35,642 INFO sqlalchemy.engine.base.Engine.0x..d0 describe SRVC 2006-12-15 09:44:35,642 INFO sqlalchemy.engine.base.Engine.0x..d0 {} 2006-12-15 09:44:35,660 INFO sqlalchemy.engine.base.Engine.0x..d0 SHOW CREATE TABLE SRVC 2006-12-15 09:44:35,660 INFO sqlalchemy.engine.base.Engine.0x..d0 {} Traceback (most recent call last): File "sqlalchemy_test.py", line 11, in ? srvc = Table('SRVC', metadata, autoload=True) File "/usr/lib/python2.4/site-packages/sqlalchemy/schema.py", line 143, in __call__ metadata.get_engine().reflecttable(table) File "/usr/lib/python2.4/site-packages/sqlalchemy/engine/base.py", line 505, in reflecttable self.dialect.reflecttable(conn, table) File "/usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py", line 364, in reflecttable tabletype = self.moretableinfo(connection, table=table) File "/usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py", line 384, in moretableinfo desc_fetched = str(desc_fetched) UnicodeEncodeError: 'ascii' codec can't encode characters in position 672-675: ordinal not in range(128) $
I think, to avoid this problem, COMMENT attribute should be removed from
desc_fetched
before converting tostr
.--- /usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py.~1~ 2006-11-13 08:45:32.000000000 +0900 +++ /usr/lib/python2.4/site-packages/sqlalchemy/databases/mysql.py 2006-12-15 10:10:44.000000000 +0900 @@ -381,7 +381,7 @@ # this can come back as unicode if use_unicode=1 in the mysql connection if type(desc_fetched) is unicode: - desc_fetched = str(desc_fetched) + desc_fetched = str(re.sub(r"COMMENT '.*?'", "", desc_fetched)) elif type(desc_fetched) is not str: # may get array.array object here, depending on version (such as mysql 4.1.14 vs. 4.1.11) desc_fetched = desc_fetched.tostring()
-
-
- changed milestone to 0.4.0
- assigned issue to
-
81f42181f431eb8ea96c94ee8a2d798f11a296e7 addresses this, tested with the DDL above.
-
This has been fixed in 0.3.9 and for 0.4.
-
- changed status to resolved
-
repo owner - removed milestone
Removing milestone: 0.4.0 (automated comment)
- Log in to comment
I applied following fix.