PostgreSQL access can be jacked up if other code/libraries (within the same Python runtime) are also using psycopg2

Issue #783 resolved
Former user created an issue

I encountered this problem when porting my database code from using Django's database layer, to using SQLAlchemy/Elixir. Django was still accessing my database to store session information, so both SQLAlchemy and Django were digging their hands into psycopg2 stuff.

I suspect there may be a way for SQLAlchemy to work around this problem, when it detects that "someone else" (e.g., another library) has modified the psycopg2 runtime configuration. I'm not certain that it is possible to detect, however.

Also, I believe this bug is actually more of a design flaw/bug in psycopg2 than it is a problem with Django ''or'' SQLAlchemy, but I thought I'd pass this through you guys since you're the database experts, and let you talk with the psycopg people if need be.

Sorry, but I'm just going to refer you to the Django report that I made for this issue, as I already explained the problem fairly well, there: http://code.djangoproject.com/ticket/5171 . I know this issue is kinda questionable (i.e., whether or not it should be considered a bug), but it was pretty hellacious to track down, so it'd be great to save others the same grief I had.

Comments (6)

  1. Former user Account Deleted

    Darn -- forgot to add myself to the CC list when creating this ticket. My email address is cw264701@ohiou.edu, if you need to contact me. If you could add me to the CC list, that'd be great. Thanks.

  2. Mike Bayer repo owner

    whats the bug here ? django is monkeypatching psycopg2, then all other apps that rely upon psycopg2's documented behavior break ? If so, I'm calling "not our problem" on this one....

  3. Former user Account Deleted

    No, Django is not doing any monkeypatching. I can't remember, now, exactly what caused the problem, but I think it went something like this...

    When calling this:

    psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
    

    pyscopg2's configuration changes ''globally''. That is, such a call seems to affect all components -- within the given Python runtime -- that access a database via psycopg2.

    And (again, I can't remember exactly, but) I believe that, when that particular configuration change is made, psycopg2 tries to convert result strings (strings returned from queries) into {{{unicode}}} objects. And then, (I think) since SQLAlchemy does not set the "client encoding" ({{{self.connection.set_client_encoding('UTF8')}}}), {{{UnicodeDecodeError}}}'s occur when psycopg2 tries to pull non-ASCII strings from the database.

    Sorry -- I just can't remember specifically what the problem was; it was rather deep and complicated (as character encoding problems tend to be).

  4. Mike Bayer repo owner

    so, psycopg2 allows you to set one half of a certain unicode configuration globally, and the other half of it only on individual connections, which you have to do manually otherwise it breaks.....nice ! I think the psycopg2 extension should also be instrumenting psycopg2 connections to be properly configured if thats whats needed here. though i would guess that the client encoding setting could be made at the postgres configurational level as well.

  5. Former user Account Deleted

    Replying to zzzeek:

    so, psycopg2 allows you to set one half of a certain unicode configuration globally, and the other half of it only on individual connections, which you have to do manually otherwise it breaks.....nice !

    Yeah, I think that sums it up. :o)

    I think the psycopg2 extension should also be instrumenting psycopg2 connections to be properly configured if thats whats needed here.

    That sounds like a good solution. I'm not certain of whether that will work, as I can't remember the exact details, but I suspect it will.

    I believe this is actually a flaw in psycopg2, itself, but I don't know if it's likely the psycopg2 folks would agree (?).

  6. Mike Bayer repo owner

    Work around this by setting the client encoding on all new connections:

    # coding: utf-8
    
    from sqlalchemy import *
    
    from sqlalchemy.interfaces import PoolListener
    from sqlalchemy.pool import QueuePool
    
    from psycopg2 import extensions
    extensions.register_type(extensions.UNICODE)
    
    class SetEncoding(PoolListener):
        def connect(self, dbapi_con, con_record):
            dbapi_con.set_client_encoding('UTF8')
    
    engine = create_engine('postgres://scott:tiger@127.0.0.1/test', strategy="threadlocal")
    engine.pool.add_listener(SetEncoding())
    
    engine.begin()
    engine.execute("create table foo (data varchar(30))")
    engine.execute("insert into foo values(%(foo)s)", foo=u'réveillé')
    x = engine.execute(u"select * from foo").scalar()
    print repr(x)
    
  7. Log in to comment