Unicode/UnicodeText return str instead of unicode on MySQL-Python 1.2.3

Issue #2906 resolved
Former user created an issue

I recently tried to use a UnicodeText column and was surprised to find that it was giving me str values. Upgrading MySQL-Python to 1.2.5 (1.2.4 works too) resulted in getting the correct unicode values.

Here's a short test script:

import sqlalchemy
import MySQLdb
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.session import Session

Base = declarative_base()

class DemoTable(Base):
    __tablename__ = 'demo_table'

    id = sqlalchemy.Column(sqlalchemy.Integer(), primary_key=True)
    unicode_text = sqlalchemy.Column(sqlalchemy.UnicodeText())


engine = sqlalchemy.create_engine('mysql://something?charset=utf8')
Base.metadata.create_all(bind=engine)
session = Session(bind=engine)
session.add(DemoTable(unicode_text=u"What a lovely day"))
session.flush()
session.expire_all()

print 'MySQL', session.query(sqlalchemy.func.version()).one()[0](0)
print 'sqlalchemy', sqlalchemy.__version__
print 'MySQL-Python', MySQLdb.__version__
print type(session.query(DemoTable).first().unicode_text)

I tested on v0.8.2-v0.9.1. The output below is the same for all except for the sqlalchemy version number.

MySQL 5.6.13-2+debphp.org~precise+2
sqlalchemy 0.8.2
MySQL-Python 1.2.3
<type 'str'>


MySQL 5.6.13-2+debphp.org~precise+2
sqlalchemy 0.8.2
MySQL-Python 1.2.4
<type 'unicode'>


MySQL 5.6.13-2+debphp.org~precise+2
sqlalchemy 0.8.2
MySQL-Python 1.2.5
<type 'unicode'>

Comments (13)

  1. Mike Bayer repo owner

    ok here's the one way this can go wrong. try out this script:

    from sqlalchemy import create_engine
    from sqlalchemy.sql import expression
    from sqlalchemy import types as sqltypes
    
    engine = create_engine("mysql://scott:tiger@localhost/test?charset=utf8")
    dialect = engine.dialect
    connection = engine.connect()
    
    cursor = connection.connection.cursor()
    
    def ret_type(type_):
        cursor.execute(
            str(
                expression.select(
                    [                   expression.literal_column(
                            "'test plain returns'"),
                            type_)
                ](expression.cast(
    )).compile(dialect=dialect)
            )
        )
        row = cursor.fetchone()
    
        print type(row[0](0))
    
    ret_type(sqltypes.VARCHAR(60))
    ret_type(sqltypes.TEXT)
    

    if the two types printed out aren't consistent, that could cause this issue. MySQL-python 1.2.3 for me here reports "unicode" for both.

  2. Mike Bayer repo owner

    OK can you see why that's weird? thats the raw MySQL cursor, on 1.2.3, giving you "unicode". Can you track down what is specifically making it not return "unicode" for this one column in your test script? turning on echo='debug' in create_engine() will show you the data coming back.

  3. Former user Account Deleted

    I agree that it's weird. It may take me some time to get to the bottom of things but I'll do my best.

    For now, here's the debug output of my original demo:

    2014-01-10 11:41:17,010 INFO sqlalchemy.engine.base.Engine SELECT DATABASE()
    2014-01-10 11:41:17,012 INFO sqlalchemy.engine.base.Engine ()
    2014-01-10 11:41:17,012 DEBUG sqlalchemy.engine.base.Engine Col ('DATABASE()',)
    2014-01-10 11:41:17,013 DEBUG sqlalchemy.engine.base.Engine Row (u'graffiti_world_services',)
    2014-01-10 11:41:17,015 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'character_set%%'
    2014-01-10 11:41:17,018 INFO sqlalchemy.engine.base.Engine ()
    2014-01-10 11:41:17,019 DEBUG sqlalchemy.engine.base.Engine Col ('Variable_name', 'Value')
    2014-01-10 11:41:17,019 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_client', u'utf8')
    2014-01-10 11:41:17,019 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_connection', u'utf8')
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_database', u'utf8')
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_filesystem', u'binary')
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_results', u'utf8')
    ### could this be related? ###
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_server', u'latin1')
    ###
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_set_system', u'utf8')
    2014-01-10 11:41:17,020 DEBUG sqlalchemy.engine.base.Engine Row (u'character_sets_dir', u'/usr/share/mysql/charsets/')
    2014-01-10 11:41:17,021 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode'
    2014-01-10 11:41:17,021 INFO sqlalchemy.engine.base.Engine ()
    2014-01-10 11:41:17,021 DEBUG sqlalchemy.engine.base.Engine Col ('Variable_name', 'Value')
    2014-01-10 11:41:17,021 DEBUG sqlalchemy.engine.base.Engine Row (u'sql_mode', u'STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION')
    2014-01-10 11:41:17,022 INFO sqlalchemy.engine.base.Engine DESCRIBE `demo_table`
    2014-01-10 11:41:17,022 INFO sqlalchemy.engine.base.Engine ()
    2014-01-10 11:41:17,023 DEBUG sqlalchemy.engine.base.Engine Col ('Field', 'Type', 'Null', 'Key', 'Default', 'Extra')
    2014-01-10 11:41:17,024 DEBUG sqlalchemy.engine.base.Engine Row (u'id', u'int(11)', u'NO', u'PRI', None, u'auto_increment')
    2014-01-10 11:41:17,025 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
    2014-01-10 11:41:17,026 INFO sqlalchemy.engine.base.Engine INSERT INTO demo_table (unicode_text) VALUES (%s)
    2014-01-10 11:41:17,029 INFO sqlalchemy.engine.base.Engine ('What a lovely day',)
    MySQL2014-01-10 11:41:17,030 INFO sqlalchemy.engine.base.Engine SELECT version() AS version_1
    2014-01-10 11:41:17,030 INFO sqlalchemy.engine.base.Engine ()
    2014-01-10 11:41:17,030 DEBUG sqlalchemy.engine.base.Engine Col ('version_1',)
    2014-01-10 11:41:17,031 DEBUG sqlalchemy.engine.base.Engine Row (u'5.6.13-2+debphp.org~precise+2',)
    5.6.13-2+debphp.org~precise+2
    sqlalchemy 0.8.4
    MySQL-Python 1.2.3
    2014-01-10 11:41:17,032 INFO sqlalchemy.engine.base.Engine SELECT demo_table.id AS demo_table_id, demo_table.unicode_text AS demo_table_unicode_text
    FROM demo_table
     LIMIT %s
    2014-01-10 11:41:17,032 INFO sqlalchemy.engine.base.Engine (1,)
    2014-01-10 11:41:17,033 DEBUG sqlalchemy.engine.base.Engine Col ('demo_table_id', 'demo_table_unicode_text')
    2014-01-10 11:41:17,033 DEBUG sqlalchemy.engine.base.Engine Row (21L, 'What a lovely day')
    <type 'str'>
    
  4. Former user Account Deleted
    CREATE TABLE `demo_table` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `unicode_text` text COLLATE utf8_bin,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=22 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
    
  5. Mike Bayer repo owner

    see now that's a patch. I want to try seeing if i can backport that to 0.8 also. MySQL-python fixed the issue in this changeset: https://github.com/farcepest/MySQLdb1/commit/cd44524fef63bd3fcb71947392326e9742d520e8 . It's of course only in some of their repos and is hardly documented at all, but that's the code that fixes it. I was originally going to just do a simple version check for 1.2.3, but as that commit says "I have no idea how this works" in it and has no tests whatsoever, the patch I have here adds the specific failure condition to test for, so if/when MySQL-python regresses on this we won't be affected.

  6. Log in to comment