SQLite support for arbitrary precision numbers is broken

Issue #1759 resolved
Former user created an issue

This test case passed in SQLite 0.6beta1 but fails in beta2 and beta3:

from sqlalchemy import *
import decimal

engine = create_engine('sqlite://', echo=True)

t = Table('t', MetaData(), Column('x', Numeric()))
t.create(engine)

n = decimal.Decimal("54.001")
engine.execute(t.insert(), [n}]({'x':))

ret = engine.execute(t.select()).fetchall()[0](0)[0](0)

print repr(n)
print repr(ret)
assert str(n) == str(ret)

The SQLite adapter currently adds zeroes to the number. (The PostgreSQL adapter does not.)

Comments (21)

  1. Mike Bayer repo owner

    AFAIK, this is a sqlite limitation.

    From http://sqlite.org/datatype3.html :

    Each value stored in an SQLite database (or manipulated by the database engine) has one of the following storage classes:
    
        *
          NULL. The value is a NULL value.
        *
          INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
        *
          REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.
        *
          TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
        *
     BLOB. The value is a blob of data, stored exactly as it was input.
    

    floating points are known to introduce errors.

    If you can illustrate how to round-trip the following numbers using sqlite3/pysqlite directly, using a sqlite type that is using REAL affinity, we will commit your fix.

    31943874831932418390.01
    319438950232418390.273596
    87673.594069654243
    4E+8
    5748E+15
    1.521E+15
    00000000000000.1E+12
    1E-2
    1E-3
    1E-4
    1E-5
    1E-6
    1E-7
    1E-8
    0.01000005940696
    0.00000005940696
    0.00000000000696
    0.70000000000696
    696E-12
    54.234246451650
    0.004354
    900.0
    876734.594069654
    
  2. Former user Account Deleted

    For financial applications like mine, I really want to store the Decimal object as a string, not a float. By doing that, I might be giving up the ability to use aggregate functions, but those very functions would be error-prone anyway, so I am really not giving up anything at all.

    It seems like I need another parameter for the Numeric constructor to tell SQLAlchemy that I want to store Numeric column values as a string, not a float.

  3. Former user Account Deleted

    To clarify: my database is really PostgreSQL, but I also have a fast test suite based on SQLAlchemy. I am using SQLite as an approximation of PostgreSQL. Therefore, I am hoping to create a schema that stores Decimal values untouched when the backend is PostgreSQL/psycopg, but that converts the Decimal values to strings (not floats) when the backend is SQLite.

  4. Former user Account Deleted

    Correction: my database is really PostgreSQL, but I also have a fast test suite based on SQLite.

  5. Mike Bayer repo owner

    Also your test fails with 0.6beta1 if you use this value:

    n = decimal.Decimal('54.234246451650')
    

    The usual recipe to store decimals on a system that doesn't support them natively is to store them as integers with a multiple. That way math calculations are still possible:

    from sqlalchemy import *
    from sqlalchemy.types import TypeDecorator
    import decimal
    
    class ShiftedDecimal(TypeDecorator):
        impl = Integer
    
        def __init__(self, multiple):
            TypeDecorator.__init__(self)
            self.multiple = multiple
    
        def process_bind_param(self, value, dialect):
            if value is not None:
                value = value.shift(decimal.Decimal(self.multiple))
                value = int(value)
            return value
    
        def process_result_value(self, value, dialect):
            if value is not None:
                value = decimal.Decimal(str(value))
                value = value * decimal.Decimal("1E-%d" % self.multiple)
    
            return value
    
        def copy(self):
            return ShiftedDecimal(self.multiple)
    
    engine = create_engine('sqlite://', echo=True)
    
    t = Table('t', MetaData(), Column('x', ShiftedDecimal(9)))
    t.create(engine)
    
    n = decimal.Decimal('54.234246451')
    engine.execute(t.insert(), [n}]({'x':))
    
    ret = engine.execute(t.select()).fetchall()[0](0)[0](0)
    
    print repr(n)
    print repr(ret)
    assert str(n) == str(ret)
    

    a second TypeDecorator on top of that one can transparently select the native or non-native implementation based on database backend, using load_dialect_impl().

  6. Former user Account Deleted

    That would not work in my case because the precision varies (I intend to store multiple currencies). I don't need SQLite to support any math or even sorting. I just need the database to faithfully store the values.

  7. Former user Account Deleted

    I suppose I can create my own Numeric column type that stores strings and select my implementation using load_dialect_impl().

    However, I am surprised that the current design of the Numeric type is considered reasonable. I expected that the purpose of the Numeric column type is to store decimal numbers faithfully at all costs. If the database does not support arbitrary precision, then the database should fall back to storing strings, not strong floats that introduce subtle errors. SQLite users who need to do math on Numeric columns should do all the math in Python (as I am doing). SQLite users who want to do math with numbers in the database should use a different column type like Float or Integer.

  8. Mike Bayer repo owner

    Replying to guest:

    However, I am surprised that the current design of the Numeric type is considered reasonable.
    I expected that the purpose of the Numeric column type is to store decimal numbers faithfully at all costs. If the database does not support arbitrary precision, then the database should fall back to storing strings, not strong floats that introduce subtle errors.

    SQLite documents its NUMERIC type as being lossless:

    A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if such conversion is lossless and reversible. For conversions between TEXT and REAL storage classes, SQLite considers the conversion to be lossless and reversible if the first 15 significant decimal digits of the number are preserved. If the lossless conversion of TEXT to INTEGER or REAL is not possible then the value is stored using the TEXT storage class. No attempt is made to convert NULL or BLOB values.
    
    A string might look like a floating-point literal with a decimal point and/or exponent notation but as long as the value can be expressed as an integer, the NUMERIC affinity will convert it into an integer. Hence, the string '3.0e+5' is stored in a column with NUMERIC affinity as the integer 300000, not as the floating point value 300000.0.
    

    yet pysqlite seems to return values from NUMERIC as lossy floats regardless:

    CREATE TABLE t (
        x NUMERIC
    )
    
    
    INSERT INTO t (x) VALUES (?)
    ('54.234246451',)
    COMMIT
    SELECT t.x FROM t
    ()
    ('x',)
    Row (54.234246450999997,)
    

    Perhaps you should file a bug report with pysqlite ?

  9. Mike Bayer repo owner

    or, if there is some option we should be setting on our pysqlite connections such that NUMERIC is returned to us as strings unconditionally, that would solve the problem.

  10. Mike Bayer repo owner

    Also as far as backends that don't support arbitrary NUMERIC, the furthest we could go is to emit a warning. I'm sure you'll agree if someone specifies Numeric and then sees VARCHAR in their create table statements, they'd be quite surprised. Most users of numeric values expect basic comparison and math operators to work too.

  11. Former user Account Deleted

    I think those same users would expect the expression "decimal.Decimal(1.00)" to work. They are naive. However, I have better things to fight for, so I'll just go ahead and see what it takes to create my own FaithfulNumeric type that stores strings in SQLite. I will think of SQLAlchemy's Numeric type as UnfaithfulNumeric. :-)

  12. Former user Account Deleted

    Another possibility: the Numeric constructor could grow an optional parameter, "strict", that defaults to false. When strict is false, SQLAlchemy will retain its current forgiving behavior with respect to decimal numbers. When strict is true, backends that don't support arbitrary-precision decimal numbers will use a column type of VARCHAR, aggregate functions will not work, and comparisons will work as strings. I believe the strict parameter would be useful for financial applications and some scientific applications.

  13. Mike Bayer repo owner

    I really think your VARCHAR use case is quite unique, and what you are really storing is a "display" value. In number-heavy applications I have done, even when we use a native decimal type, we don't expect to pass the database value verbatim as output, there is always a formatting function that rounds to the desired number of decimals. Even dealing with monetary values, its not long before the values get multiplied by a rate somewhere and there goes your fixed number of decimal places. The current SQLA behavior is attempting to get as many significant digits represented as possible. A numeric type that isn't capable of math or comparison operations on the database side is not really of general use.

  14. Former user Account Deleted

    Ok, I chose to relax the tests and allow trailing zeros.

    I'm not storing display values, but the unit tests I wrote tried to ensure that the database and all the storage layers do not round decimal values. The message I'm getting from you is that with SQLite, rounding is a necessary evil. Unfortunately, rounding could be a massive security risk in my application. Therefore, I'm going to have to ensure there's no rounding going on at the functional testing level instead of the unit testing level.

  15. Mike Bayer repo owner

    Replying to guest:

    The message I'm getting from you is that with SQLite, rounding is a necessary evil.

    let me reiterate that I'm not entirely sure, which is why I was encouraging you to check with the pysqlite folks. SQLite itself says it stores NUMERIC using a lossless scheme. It seems strange that pysqlite unconditionally converts to a floating point even if it isn't represented that way by SQLite, and there's various pysqlite settings that may affect this.

    There's also no reason you can't store the decimals in a VARCHAR, just adapt the ShiftedDecimal recipe above to use String instead of Integer.

  16. Mike Bayer repo owner

    OK the warning is in f432bc8fe15507653caff6f57e54d4bf6a78fbd7.

    It's a little ridiculous that all of these DBAPIs (cx_oracle, pysqlite, pyodbc) put no effort at all into this issue, and nobody seems to care (especially cx_oracle surprises me - I looked throughout its source - it returns floating points unconditionally). In particular I'd really like users who find this to be an issue to please email the DBAPI authors, since in the absense of a documented option to have the DBAPI not return FP, this is really their problem, not mine.

  17. Mike Bayer repo owner

    pysqlite can't fix it, so I think we all have to wait for the SQLite project to address this someday.

  18. Log in to comment