numpy int64 garbled upon insert
numpy.int64
is garbled upon insert. This may apply to other numpy types as well. Workaround is to cast with int
before inserting.
Problem observed on latest commit (0847097c29ab48a5f111518e2c6ee324d5242057) and on v1.0.8.
Test case:
from sqlalchemy import create_engine, MetaData, Column, Integer, Table
import numpy as np
engine = create_engine('sqlite:///test.db', echo=True)
metadata = MetaData()
test_table = Table('test', metadata,
Column('id', Integer, primary_key=True),
Column('number', Integer)
)
metadata.drop_all(engine) # clean
metadata.create_all(engine)
engine.execute(test_table.insert(), {'number': np.int64(123)})
result = engine.execute(test_table.select()).first()
assert result[1] == 123, result
Comments (5)
-
repo owner -
Account Deleted reporter If only Python
int
is supported, why not add anassert isinstance(value, int)
?So, what you're suggesting is to implement a custom MyInteger (with e.g. an assert) by subclassing TypeDecorator and use that instead of Integer in my tables, correct?
-
repo owner Well the value is passed to the underlying DBAPI which interprets it anyway it wants, so we usually don't get too in the way with that with the base types, but more importantly an isinstance() check is yet another performance expense. If someone wants to insert 50K rows of 30 ints each that would be 150K^H^H^H^H 1.5M more Python function calls and people really get annoyed at that (see http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly).
So yes the TypeDecorator approach is most straightforward here, but also I'm surprised there aren't already tools for this job, I know that Pandas has a lot of SQLAlchemy integration maybe look into what they're doing.
-
Account Deleted reporter Thanks, I forgot pandas had those features. In fact, this problem originated from inserting data ~manually from a DataFrame (into 3 tables). So, I'll have a look.
The performance argument is a good one, it's not like we have luxuries like static type checking in Python.
-
Account Deleted reporter - changed status to resolved
- Log in to comment
there's no bug I can see here, numpy ints aren't Python ints and these wont be interpreted correctly by the DBAPI by default. You'd need to use type coercing as in http://docs.sqlalchemy.org/en/rel_1_0/core/custom_types.html#augmenting-existing-types