PostgreSQL access can be jacked up if other code/libraries (within the same Python runtime) are also using psycopg2
I encountered this problem when porting my database code from using Django's database layer, to using SQLAlchemy/Elixir. Django was still accessing my database to store session information, so both SQLAlchemy and Django were digging their hands into psycopg2 stuff.
I suspect there may be a way for SQLAlchemy to work around this problem, when it detects that "someone else" (e.g., another library) has modified the psycopg2 runtime configuration. I'm not certain that it is possible to detect, however.
Also, I believe this bug is actually more of a design flaw/bug in psycopg2 than it is a problem with Django ''or'' SQLAlchemy, but I thought I'd pass this through you guys since you're the database experts, and let you talk with the psycopg people if need be.
Sorry, but I'm just going to refer you to the Django report that I made for this issue, as I already explained the problem fairly well, there: http://code.djangoproject.com/ticket/5171 . I know this issue is kinda questionable (i.e., whether or not it should be considered a bug), but it was pretty hellacious to track down, so it'd be great to save others the same grief I had.
Comments (6)
-
Account Deleted -
repo owner whats the bug here ? django is monkeypatching psycopg2, then all other apps that rely upon psycopg2's documented behavior break ? If so, I'm calling "not our problem" on this one....
-
Account Deleted No, Django is not doing any monkeypatching. I can't remember, now, exactly what caused the problem, but I think it went something like this...
When calling this:
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
pyscopg2's configuration changes ''globally''. That is, such a call seems to affect all components -- within the given Python runtime -- that access a database via psycopg2.
And (again, I can't remember exactly, but) I believe that, when that particular configuration change is made, psycopg2 tries to convert result strings (strings returned from queries) into {{{unicode}}} objects. And then, (I think) since SQLAlchemy does not set the "client encoding" ({{{self.connection.set_client_encoding('UTF8')}}}), {{{UnicodeDecodeError}}}'s occur when psycopg2 tries to pull non-ASCII strings from the database.
Sorry -- I just can't remember specifically what the problem was; it was rather deep and complicated (as character encoding problems tend to be).
-
repo owner so, psycopg2 allows you to set one half of a certain unicode configuration globally, and the other half of it only on individual connections, which you have to do manually otherwise it breaks.....nice ! I think the psycopg2 extension should also be instrumenting psycopg2 connections to be properly configured if thats whats needed here. though i would guess that the client encoding setting could be made at the postgres configurational level as well.
-
Account Deleted Replying to zzzeek:
so, psycopg2 allows you to set one half of a certain unicode configuration globally, and the other half of it only on individual connections, which you have to do manually otherwise it breaks.....nice !
Yeah, I think that sums it up. :o)
I think the psycopg2 extension should also be instrumenting psycopg2 connections to be properly configured if thats whats needed here.
That sounds like a good solution. I'm not certain of whether that will work, as I can't remember the exact details, but I suspect it will.
I believe this is actually a flaw in psycopg2, itself, but I don't know if it's likely the psycopg2 folks would agree (?).
-
repo owner - changed status to wontfix
Work around this by setting the client encoding on all new connections:
# coding: utf-8 from sqlalchemy import * from sqlalchemy.interfaces import PoolListener from sqlalchemy.pool import QueuePool from psycopg2 import extensions extensions.register_type(extensions.UNICODE) class SetEncoding(PoolListener): def connect(self, dbapi_con, con_record): dbapi_con.set_client_encoding('UTF8') engine = create_engine('postgres://scott:tiger@127.0.0.1/test', strategy="threadlocal") engine.pool.add_listener(SetEncoding()) engine.begin() engine.execute("create table foo (data varchar(30))") engine.execute("insert into foo values(%(foo)s)", foo=u'réveillé') x = engine.execute(u"select * from foo").scalar() print repr(x)
- Log in to comment
Darn -- forgot to add myself to the CC list when creating this ticket. My email address is cw264701@ohiou.edu, if you need to contact me. If you could add me to the CC list, that'd be great. Thanks.