SQLite does not handle binary strings

Issue #1608 resolved
Former user created an issue

I need to store a (binary) secret to generate CSRF tokens, but get an error when I use SQLite.

Here is a testcase which demonstrates the problem:

import sqlalchemy
from sqlalchemy import orm
from sqlalchemy import schema
from sqlalchemy import types
from sqlalchemy.ext.declarative import declarative_base

metadata = schema.MetaData()
BaseObject = declarative_base(metadata=metadata)


class Account(BaseObject):
    __tablename__ = "owner"

    id = schema.Column(types.Integer(), primary_key=True, autoincrement=True)
    secret = schema.Column(types.String(32))

engine = sqlalchemy.create_engine('sqlite:///:memory:')
metadata.create_all(engine)
sm = orm.sessionmaker(bind=engine)

session = orm.scoped_session(sm)
a = Account(secret='\x1c\xb2j\xf5\xc8\xd0\x82"7\x17R\xa7\x94laH\x02\xa5<fG\x14\x85d*\xe0\xbe?\x08\xe0c\xb3')
session.add(a)
session.flush()

when I run this code I get the following traceback:

Traceback (most recent call last):
  File "bin/py", line 83, in <module>
    execfile(__file__)
  File "/tmp/binary.py", line 24, in <module>
    session.flush()
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/scoping.py", line 123, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/session.py", line 1356, in flush
    self._flush(objects)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/session.py", line 1434, in _flush
    flush_context.execute()
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/unitofwork.py", line 261, in execute
    UOWExecutor().execute(self, tasks)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/unitofwork.py", line 753, in execute
    self.execute_save_steps(trans, task)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/unitofwork.py", line 768, in execute_save_steps
    self.save_objects(trans, task)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/unitofwork.py", line 759, in save_objects
    task.mapper._save_obj(task.polymorphic_tosave_objects, trans)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/orm/mapper.py", line 1406, in _save_obj
    c = connection.execute(statement.values(value_params), params)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/engine/base.py", line 824, in execute
    return Connection.executors[c](c)(self, object, multiparams, params)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/engine/base.py", line 874, in _execute_clauseelement
    return self.__execute_context(context)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/engine/base.py", line 896, in __execute_context
    self._cursor_execute(context.cursor, context.statement, context.parameters[0](0), context=context)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/engine/base.py", line 950, in _cursor_execute
    self._handle_dbapi_exception(e, statement, parameters, cursor, context)
  File "/Users/wichert/Library/eggs/SQLAlchemy-0.5.6-py2.6.egg/sqlalchemy/engine/base.py", line 931, in _handle_dbapi_exception
    raise exc.DBAPIError.instance(statement, parameters, e, connection_invalidated=is_disconnect)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. u'INSERT INTO owner (secret) VALUES (?)' ['\x1c\xb2j\xf5\xc8\xd0\x82"7\x17R\xa7\x94laH\x02\xa5<fG\x14\x85d*\xe0\xbe?\x08\xe0c\xb3']('\x1c\xb2j\xf5\xc8\xd0\x82"7\x17R\xa7\x94laH\x02\xa5<fG\x14\x85d*\xe0\xbe?\x08\xe0c\xb3')

Comments (6)

  1. Mike Bayer repo owner

    This is DBAPI behavior, and the error you see is a recent addition to pysqlite to ensure proper usage. You need to use the Binary type to store binary strings, which uses the DBAPI Binary container to store the data:

        secret = schema.Column(types.Binary(32))
    

    Or less recommended you can use sqlite3's API directly:

    import sqlite3
    a = Account(secret=sqlite3.Binary('\x1c\xb2j\xf5\xc8\xd0\x82"7\x17R\xa7\x94laH\x02\xa5<fG\x14\x85d*\xe0\xbe?\x08\xe0c\xb3'))
    
  2. Former user Account Deleted

    The {{{Binary}}} type generates a BLOB though, not a {{{CHAR}}} column, which as far as I know means it takes up more space on the SQL server. Hence my reason for using {{{String}}} instead.

  3. Mike Bayer repo owner

    sqlite3 only has TEXT or BLOB in this regard, and to my knowledge neither have any "length" that has any real meaning internally.

    http://www.sqlite.org/datatype3.html

    TEXT is specified as intended for storing a charcater-encoded string, so is not appropriate for arbitrary binary data.

    also the BLOB type in SQLite has "TEXT" affinity so is pretty much the same thing underneath.

  4. Former user Account Deleted

    I tend to use SQLite for tests, but postgres for production environments. That would mean here that I would need to make my model conditional on the engine used: use String(32) when using postgres, but use Binary() when using SQLite so my tests will not fail. Can you think of a cleaner way to handle that?

  5. Mike Bayer repo owner

    sure build your own type using TypeEngine. Or build a TypeDecorator that implements load_dialect_impl, here's one i use for uuids:

    class GUIDType(TypeDecorator):
        impl = sa.CHAR
    
        def __init__(self):
            TypeDecorator.__init__(self, length=16)
    
        def load_dialect_impl(self, dialect):
            if dialect.name == 'sqlite':
                return dialect.type_descriptor(sa.CHAR(self.impl.length))
            else:
                return dialect.type_descriptor(PGUuid())
    
        def process_bind_param(self, value, dialect):
            if value is None:
                return value
            else:
                return str(value)
    
        def process_result_value(self, value, dialect):
            if value is None:
                return value
            else:
                return uuid.UUID(value)
    
  6. Log in to comment