translate utf8mb4 charset when detected from server side to client side, for _DecodingRowProxy operations
Issue #2771
resolved
This description has been edited by Mike to show an actual bug:
from sqlalchemy import *
e = create_engine("mysql://scott:tiger@localhost/test?charset=utf8mb4&use_unicode=0")
e.connect()
Traceback (most recent call last):
...
self._detect_ansiquotes(connection)
File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/dialects/mysql/base.py", line 2806, in _detect_ansiquotes
mode = row[1] or ''
File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/dialects/mysql/base.py", line 3309, in __getitem__
return item.decode(self.charset)
LookupError: unknown encoding: utf8mb4
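The failure can be reproduced outside SQLAlchemy entirely; a minimal sketch (the byte string is illustrative):

```python
# CPython's codec registry has no encoding named "utf8mb4", so decode()
# raises LookupError regardless of what SQLAlchemy does.
try:
    b"ANSI_QUOTES".decode("utf8mb4")
except LookupError as err:
    print(err)  # unknown encoding: utf8mb4

# The same bytes decode fine as "utf8": data MySQL stores as utf8mb4 is
# valid UTF-8 at the byte level, so "utf8" is a safe Python-side substitute.
print(b"ANSI_QUOTES".decode("utf8"))  # ANSI_QUOTES
```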
Comments (13)
-
repo owner -
repo owner - changed status to open
a user has finally produced a test
-
repo owner - changed milestone to 1.0
- changed title to translate utf8mb4 charset when detected from server side to client side, for _DecodingRowProxy operations
- edited description
-
repo owner - edited description
-
repo owner - edited description
-
repo owner start of a patch:
diff --git a/lib/sqlalchemy/dialects/mysql/base.py b/lib/sqlalchemy/dialects/mysql/base.py
index b8392bd..9bc8a48 100644
--- a/lib/sqlalchemy/dialects/mysql/base.py
+++ b/lib/sqlalchemy/dialects/mysql/base.py
@@ -3296,8 +3296,16 @@ class _DecodingRowProxy(object):
     # sets.Set(['value']) (seriously) but thankfully that doesn't
     # seem to come up in DDL queries.

+    _mysql_compat = {
+        "utf8mb4": "utf8"
+        # TODO: other MySQL charsets that need Python
+        # translation
+    }
+
     def __init__(self, rowproxy, charset):
         self.rowproxy = rowproxy
+        if charset in self._mysql_compat:
+            charset = self._mysql_compat[charset]
         self.charset = charset

     def __getitem__(self, index):
-
diff --git a/lib/sqlalchemy/dialects/mysql/base.py b/lib/sqlalchemy/dialects/mysql/base.py
index b8392bd..9bc8a48 100644
--- a/lib/sqlalchemy/dialects/mysql/base.py
+++ b/lib/sqlalchemy/dialects/mysql/base.py
@@ -3296,8 +3296,16 @@ class _DecodingRowProxy(object):
     # sets.Set(['value']) (seriously) but thankfully that doesn't
     # seem to come up in DDL queries.

+    _mysql_compat = {
+        "utf8mb4": "utf8"
+        # TODO: other MySQL charsets that need Python
+        # translation
+    }
+
     def __init__(self, rowproxy, charset):
         self.rowproxy = rowproxy
-        self.charset = charset
+        self.charset = self._mysql_compat.get(charset, charset)

     def __getitem__(self, index):
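The effect of that patch, condensed into a self-contained sketch (the class and attribute names here are illustrative, not SQLAlchemy's actual internals):

```python
class DecodingRowSketch:
    # MySQL charset names that Python's codec registry does not recognize,
    # mapped to byte-compatible Python codec names.
    _mysql_compat = {
        "utf8mb4": "utf8",
    }

    def __init__(self, row, charset):
        self.row = row
        # Translate the server-reported charset once, at construction time;
        # names with no translation fall through unchanged.
        self.charset = self._mysql_compat.get(charset, charset)

    def __getitem__(self, index):
        item = self.row[index]
        if isinstance(item, bytes):
            return item.decode(self.charset)
        return item


row = DecodingRowSketch([b"ANSI_QUOTES"], "utf8mb4")
print(row[0])  # ANSI_QUOTES, instead of LookupError
```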
-
The encodings supported by MySQL are:
["big5", "dec8", "cp850", "hp8", "koi8r", "latin1", "latin2", "swe7", "ascii", "ujis", "sjis", "hebrew", "tis620", "euckr", "koi8u", "gb2312", "greek", "cp1250", "gbk", "latin5", "armscii8", "utf8", "ucs2", "cp866", "keybcs2", "macce", "macroman", "cp852", "latin7", "utf8mb4", "cp1251", "utf16", "cp1256", "cp1257", "utf32", "binary", "geostd8", "cp932", "eucjpms"]
The encodings not supported by my Python are:
["dec8", "hp8", "koi8r", "swe7", "koi8u", "armscii8", "ucs2", "keybcs2", "macce", "utf8mb4", "binary", "geostd8", "eucjpms"]
-
I've googled each possibly missing encoding and its Python alias and found:
{
    'koi8r': 'koi8_r',
    'koi8u': 'koi8_u',
    'ucs2': 'utf-16-be',   # can't be used as client connection encoding
    'utf16': 'utf-16-be',  # MySQL's utf16 is always bigendian
    'utf8mb4': 'utf8',     # real utf8
    'eucjpms': 'ujis',
}
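A quick sanity check that each proposed Python-side name actually resolves in the codec registry; a sketch, with the mapping copied from the comment above:

```python
import codecs

MYSQL_TO_PYTHON = {
    "koi8r": "koi8_r",
    "koi8u": "koi8_u",
    "ucs2": "utf-16-be",   # can't be used as client connection encoding
    "utf16": "utf-16-be",  # MySQL's utf16 is always big-endian
    "utf8mb4": "utf8",     # real utf8
    "eucjpms": "ujis",     # Python alias for euc_jp
}

for mysql_name, python_name in MYSQL_TO_PYTHON.items():
    # codecs.lookup() raises LookupError if the name doesn't resolve,
    # so simply iterating verifies every alias in the mapping.
    info = codecs.lookup(python_name)
    print(mysql_name, "->", info.name)
```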
-
repo owner OK, so that, a couple of tests, and if we could put authoritative guidance in the mysql dialect docs for utf8mb4, we're in good shape.
-
repo owner - changed status to resolved
map MySQL encodings to python encodings, fixes #2771 → <<cset 5794b777cbb4>>
-
repo owner → <<cset 58ce4d53bf5e>>
-
repo owner very glad to have this fixed thanks!
Without further detail given, this issue would appear to have nothing to do with SQLAlchemy - the "charset" parameter is a straight pass-thru to the database API and the database itself. SQLAlchemy has no awareness of a "LookupError" and, in the case of MySQL, by default defers all unicode behaviors to the DBAPI and the database itself. A stack trace would help to reveal the source of this issue.
sample test passes without issue:
output: