SQLite support for arbitrary precision numbers is broken
This test case passed in SQLite 0.6beta1 but fails in beta2 and beta3:
from sqlalchemy import *
import decimal
engine = create_engine('sqlite://', echo=True)
t = Table('t', MetaData(), Column('x', Numeric()))
t.create(engine)
n = decimal.Decimal("54.001")
engine.execute(t.insert(), [n}]({'x':))
ret = engine.execute(t.select()).fetchall()[0](0)[0](0)
print repr(n)
print repr(ret)
assert str(n) == str(ret)
The SQLite adapter currently adds zeroes to the number. (The PostgreSQL adapter does not.)
Comments (21)
-
repo owner -
Account Deleted For financial applications like mine, I really want to store the Decimal object as a string, not a float. By doing that, I might be giving up the ability to use aggregate functions, but those very functions would be error-prone anyway, so I am really not giving up anything at all.
It seems like I need another parameter for the Numeric constructor to tell SQLAlchemy that I want to store Numeric column values as a string, not a float.
-
Account Deleted To clarify: my database is really PostgreSQL, but I also have a fast test suite based on SQLAlchemy. I am using SQLite as an approximation of PostgreSQL. Therefore, I am hoping to create a schema that stores Decimal values untouched when the backend is PostgreSQL/psycopg, but that converts the Decimal values to strings (not floats) when the backend is SQLite.
-
Account Deleted Correction: my database is really PostgreSQL, but I also have a fast test suite based on SQLite.
-
repo owner Also your test fails with 0.6beta1 if you use this value:
n = decimal.Decimal('54.234246451650')
The usual recipe to store decimals on a system that doesn't support them natively is to store them as integers with a multiple. That way math calculations are still possible:
from sqlalchemy import * from sqlalchemy.types import TypeDecorator import decimal class ShiftedDecimal(TypeDecorator): impl = Integer def __init__(self, multiple): TypeDecorator.__init__(self) self.multiple = multiple def process_bind_param(self, value, dialect): if value is not None: value = value.shift(decimal.Decimal(self.multiple)) value = int(value) return value def process_result_value(self, value, dialect): if value is not None: value = decimal.Decimal(str(value)) value = value * decimal.Decimal("1E-%d" % self.multiple) return value def copy(self): return ShiftedDecimal(self.multiple) engine = create_engine('sqlite://', echo=True) t = Table('t', MetaData(), Column('x', ShiftedDecimal(9))) t.create(engine) n = decimal.Decimal('54.234246451') engine.execute(t.insert(), [n}]({'x':)) ret = engine.execute(t.select()).fetchall()[0](0)[0](0) print repr(n) print repr(ret) assert str(n) == str(ret)
a second
TypeDecorator
on top of that one can transparently select the native or non-native implementation based on database backend, usingload_dialect_impl()
. -
Account Deleted That would not work in my case because the precision varies (I intend to store multiple currencies). I don't need SQLite to support any math or even sorting. I just need the database to faithfully store the values.
-
Account Deleted I suppose I can create my own Numeric column type that stores strings and select my implementation using load_dialect_impl().
However, I am surprised that the current design of the Numeric type is considered reasonable. I expected that the purpose of the Numeric column type is to store decimal numbers faithfully at all costs. If the database does not support arbitrary precision, then the database should fall back to storing strings, not strong floats that introduce subtle errors. SQLite users who need to do math on Numeric columns should do all the math in Python (as I am doing). SQLite users who want to do math with numbers in the database should use a different column type like Float or Integer.
-
repo owner Replying to guest:
However, I am surprised that the current design of the Numeric type is considered reasonable.
I expected that the purpose of the Numeric column type is to store decimal numbers faithfully at all costs. If the database does not support arbitrary precision, then the database should fall back to storing strings, not strong floats that introduce subtle errors.SQLite documents its NUMERIC type as being lossless:
A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if such conversion is lossless and reversible. For conversions between TEXT and REAL storage classes, SQLite considers the conversion to be lossless and reversible if the first 15 significant decimal digits of the number are preserved. If the lossless conversion of TEXT to INTEGER or REAL is not possible then the value is stored using the TEXT storage class. No attempt is made to convert NULL or BLOB values. A string might look like a floating-point literal with a decimal point and/or exponent notation but as long as the value can be expressed as an integer, the NUMERIC affinity will convert it into an integer. Hence, the string '3.0e+5' is stored in a column with NUMERIC affinity as the integer 300000, not as the floating point value 300000.0.
yet pysqlite seems to return values from NUMERIC as lossy floats regardless:
CREATE TABLE t ( x NUMERIC ) INSERT INTO t (x) VALUES (?) ('54.234246451',) COMMIT SELECT t.x FROM t () ('x',) Row (54.234246450999997,)
Perhaps you should file a bug report with pysqlite ?
-
repo owner or, if there is some option we should be setting on our pysqlite connections such that NUMERIC is returned to us as strings unconditionally, that would solve the problem.
-
repo owner Also as far as backends that don't support arbitrary NUMERIC, the furthest we could go is to emit a warning. I'm sure you'll agree if someone specifies
Numeric
and then seesVARCHAR
in their create table statements, they'd be quite surprised. Most users of numeric values expect basic comparison and math operators to work too. -
Account Deleted I think those same users would expect the expression "decimal.Decimal(1.00)" to work. They are naive. However, I have better things to fight for, so I'll just go ahead and see what it takes to create my own FaithfulNumeric type that stores strings in SQLite. I will think of SQLAlchemy's Numeric type as UnfaithfulNumeric. :-)
-
Account Deleted Another possibility: the Numeric constructor could grow an optional parameter, "strict", that defaults to false. When strict is false, SQLAlchemy will retain its current forgiving behavior with respect to decimal numbers. When strict is true, backends that don't support arbitrary-precision decimal numbers will use a column type of VARCHAR, aggregate functions will not work, and comparisons will work as strings. I believe the strict parameter would be useful for financial applications and some scientific applications.
-
repo owner I really think your VARCHAR use case is quite unique, and what you are really storing is a "display" value. In number-heavy applications I have done, even when we use a native decimal type, we don't expect to pass the database value verbatim as output, there is always a formatting function that rounds to the desired number of decimals. Even dealing with monetary values, its not long before the values get multiplied by a rate somewhere and there goes your fixed number of decimal places. The current SQLA behavior is attempting to get as many significant digits represented as possible. A numeric type that isn't capable of math or comparison operations on the database side is not really of general use.
-
Account Deleted Ok, I chose to relax the tests and allow trailing zeros.
I'm not storing display values, but the unit tests I wrote tried to ensure that the database and all the storage layers do not round decimal values. The message I'm getting from you is that with SQLite, rounding is a necessary evil. Unfortunately, rounding could be a massive security risk in my application. Therefore, I'm going to have to ensure there's no rounding going on at the functional testing level instead of the unit testing level.
-
repo owner Replying to guest:
The message I'm getting from you is that with SQLite, rounding is a necessary evil.
let me reiterate that I'm not entirely sure, which is why I was encouraging you to check with the pysqlite folks. SQLite itself says it stores NUMERIC using a lossless scheme. It seems strange that pysqlite unconditionally converts to a floating point even if it isn't represented that way by SQLite, and there's various pysqlite settings that may affect this.
There's also no reason you can't store the decimals in a VARCHAR, just adapt the
ShiftedDecimal
recipe above to useString
instead ofInteger
. -
repo owner - changed status to resolved
OK the warning is in f432bc8fe15507653caff6f57e54d4bf6a78fbd7.
It's a little ridiculous that all of these DBAPIs (cx_oracle, pysqlite, pyodbc) put no effort at all into this issue, and nobody seems to care (especially cx_oracle surprises me - I looked throughout its source - it returns floating points unconditionally). In particular I'd really like users who find this to be an issue to please email the DBAPI authors, since in the absense of a documented option to have the DBAPI not return FP, this is really their problem, not mine.
-
repo owner I'll leave it open, but I obviously can't fix it any further.
-
repo owner take back what I said about cx_oracle. eefdbd3757a245ccb73eb191d37f23c3048ef99f makes usage of cx_oracle's output conversion feature to return NUMBER types as strings which are then passed to
Decimal
. So for cx_oracle we have full decimal accuracy. -
repo owner - removed status
- changed status to open
-
Account Deleted I have the same issue (i'm starting to develop a payroll app), so filed a bugreport on pysqlite tracker. http://code.google.com/p/pysqlite/issues/detail?id=14
gbtami
-
repo owner - changed status to wontfix
pysqlite can't fix it, so I think we all have to wait for the SQLite project to address this someday.
- Log in to comment
AFAIK, this is a sqlite limitation.
From http://sqlite.org/datatype3.html :
floating points are known to introduce errors.
If you can illustrate how to round-trip the following numbers using sqlite3/pysqlite directly, using a sqlite type that is using REAL affinity, we will commit your fix.