Binary type returns deprecated "buffer" type; should return string/bytes for symmetric round trips

Issue #1524 resolved
Former user created an issue

If you set a Binary column to the same value, it will be marked as dirty, and an update for that column issued on flush. Shouldn't a byte-by-byte compare be done here instead? If I recall correctly the buffer type that sqlalchemy uses for binary columns does not directly support comparison, you have to turn it into a str first, so perhaps that's the problem.

I had to modify my code to first do the compare and only set the column if the values were not equal. This worked to avoid the pointless update.

Comments (6)

  1. Mike Bayer repo owner

    no test case was supplied so there's nothing we can do here.

    from sqlalchemy import *
    from sqlalchemy.orm import *
    
    m = MetaData()
    
    def inc():
        global counter
        counter += 1
        return counter
    
    t = Table('t1', m,
        Column('id', Integer, primary_key=True),
        Column('data', Binary),
        Column('version', Integer, onupdate=inc, default=inc)
    )
    
    for e in (
        create_engine('sqlite://', echo=True),
        create_engine('postgres://scott:tiger@localhost/test', echo=True)
    ):
        counter = 0
        m.drop_all(e)
        m.create_all(e)
    
        class C(object):
            def __init__(self, data):
                self.data = data
    
        mapper(C, t)
    
        s = create_session(autocommit=False, autoflush=True, bind=e)
    
        c1 = C(data="this is some data")
        s.add(c1)
        s.commit()
        assert c1.version == 1
    
        c1.data = "this is some data"
        s.commit()
        assert c1.version == 1
    
  2. Mike Bayer repo owner

    passes on MySQL 5 as well. we don't hold onto the "buffer" returned by DBAPI that is only an interim value for certain backends.

  3. Former user Account Deleted
    • changed status to open
    • removed status

    Replying to zzzeek:

    Sorry, I use elixir, so I wasn't easily able to create a repro without it. Thanks to your code, however, I did. Modify your code by adding this to the end of the file.

    id = c1.id
    del c1
    
    s = create_session(autocommit=False, autoflush=True, bind=e)
    
    c2 = s.query(C).get(id)
    c2.data = "this is some data"
    s.commit()
    assert c2.version == 1
    

    The assertion trips. Setting a breakpoint after fetching c2, we see:

    c2.data <read-only buffer for 0x0234A3D0, size 17, offset 0 at 0x02EAC900> c2.data == "this is some data" False str(c2.data) == "this is some data" True

    This is using psycopg2 against postgres 8.3 and sqlalchemy 0.5.4p1

    -Eloff

  4. Mike Bayer repo owner

    Thanks for the test case. My comment regarding "buffer" was incorrect as I mistakenly thought we were performing that conversion, but this was not the case. The fix is to allow Binary to return a string, which represents bytes in Python (or the "bytes" type in python 3), thus representing binary data in a consistent way on the bind and result sides. the "buffer" should never have been exposed here and that is a remnant of an ancient decision that nobody has ever revisited.

    This change is backwards incompatible with applications coded against 0.5's behavior so the change is targeted at 0.6. For your 0.5 applications use your own binary type:

    class MyBinary(TypeDecorator):
        impl = Binary
    
        def process_result_value(self, value, dialect):
            if value is not None:
                value = str(value)
            return value
    

    0b292e8842851dd0ebb127ab1cd6131eefa288fe

  5. Former user Account Deleted

    Replying to zzzeek:

    Good to hear about the fix, it makes sense. Thanks also for the workaround.

    Cheers, -Eloff

  6. Log in to comment