StatementError Exceptions un-pickable.
All StatementError exceptions (and exceptions derived) are not pickable. This means that they cannot be used, for example, in the callback of multiprocessing.map_async function (which uses cPickle to communicate between master and workers).
See discussion here: http://stackoverflow.com/a/8786557/1063605
This is probably a Python bug, and it's been already issued: http://bugs.python.org/issue13751 but it would be nice to workaround it in SQLAlchemy, too.
Comments (19)
-
repo owner -
Hi Mike,
Thanks for the quick fix. However, wouldn't it be more straightforward to just pass the argments down to the
.args
attribute ofException
(which is what pickle looks at)?In any case, don't you need to define
__reduce__
on other exception in "exc.py"? For example. you haveIn 1: n = exc.NoReferencedColumnError("foo", "tname", "cname")
In 2: from cPickle import loads, dumps
In 3: loads(dumps(n))
TypeError Traceback (most recent call last)
/usr/local/src/sqlalchemy/sqlalchemy/lib/sqlalchemy/<ipython console> in <module>()
TypeError: ('init() takes exactly 4 arguments (2 given)', <class 'exc.NoReferencedColumnError'>, ('foo',))
This corresponds to
class NoReferencedColumnError(NoReferenceError): """Raised by
ForeignKey
when the referredColumn
cannot be located."""def __init__(self, message, tname, cname): NoReferenceError.__init__(self, message) self.table_name = tname self.column_name = cname
but you have not added a
__reduce__
function for that class. There are a couple of other instances, likeNoReferencedTableError
andCircularDependencyError
.Additionally, I'm unclear about what to do about psycopg2 as you suggest. I think that writing on their site "make your exceptions pickleable" would be short of detail. Can you provide a usage example or more information about the issue?
BTW, on the subject of how to temporarily work around it, do you recommend adding
__reduce__
functions usingcopy_reg
or otherwise, or reraising the exception as lbolla does? I prefer the former, but apparently there is the issue of psycopg2 exceptions that you raised... Thanks. -
I came up with the following simple example
import psycopg2 conn = psycopg2.connect("dbname='illumina_faheem'") cur = conn.cursor() try: cur.execute("SELECT * FROM barf") except Exception, e: print e pass
import cPickle a = cPickle.dumps(e) l = cPickle.loads(a)
Which gives:
relation "barf" does not exist LINE 1: SELECT * FROM barf ^ Traceback (most recent call last): File "<stdin>", line 11, in <module> File "/usr/lib/python2.6/copy_reg.py", line 70, in _reduce_ex raise TypeError, "can't pickle %s objects" % base.name TypeError: can't pickle cursor objects
Is this convincing, or should I give an example using multiprocessing?
-
repo owner - removed status
- changed status to open
Replying to faheem:
Hi Mike,
Thanks for the quick fix. However, wouldn't it be more straightforward to just pass the argments down to the
.args
attribute ofException
(which is what pickle looks at)?not sure what you mean here. I basically didn't want to make the arguments in
__init__
optional, so__reduce__
is a way to control exactly how__init__
is called when unpickling.In any case, don't you need to define
__reduce__
on other exception in "exc.py"? For example. you haveIn 1: n = exc.NoReferencedColumnError("foo", "tname", "cname")
In 2: from cPickle import loads, dumps
In 3: loads(dumps(n))
TypeError Traceback (most recent call last)
oh yes, missed those. reopening. though those are typically raised at configuration time.
Additionally, I'm unclear about what to do about psycopg2 as you suggest. I think that writing on their site "make your exceptions pickleable" would be short of detail. Can you provide a usage example or more information about the issue?
BTW, on the subject of how to temporarily work around it, do you recommend adding
__reduce__
functions usingcopy_reg
or otherwise, or reraising the exception as lbolla does? I prefer the former, but apparently there is the issue of psycopg2 exceptions that you raised... Thanks.as a workaround I'd not be allowing exceptions to propagate outside of multiprocessing subprocesses. I'd reraise them or handle them.
-
repo owner Replying to faheem:
I came up with the following simple example
import psycopg2 conn = psycopg2.connect("dbname='illumina_faheem'") cur = conn.cursor() try: cur.execute("SELECT * FROM barf") except Exception, e: print e pass
import cPickle a = cPickle.dumps(e) l = cPickle.loads(a)
Which gives:
relation "barf" does not exist LINE 1: SELECT * FROM barf ^ Traceback (most recent call last): File "<stdin>", line 11, in <module> File "/usr/lib/python2.6/copy_reg.py", line 70, in _reduce_ex raise TypeError, "can't pickle %s objects" % base.name TypeError: can't pickle cursor objects
Is this convincing, or should I give an example using multiprocessing?
It's convincing, but you do need to mention that multiprocessing pickles exception objects. Otherwise they might wonder why you need this feature.
-
Replying to zzzeek:
Replying to faheem:
Hi Mike,
Thanks for the quick fix. However, wouldn't it be more straightforward to just pass the argments down to the
.args
attribute ofException
(which is what pickle looks at)?not sure what you mean here. I basically didn't want to make the arguments in
__init__
optional, so__reduce__
is a way to control exactly how__init__
is called when unpickling.I didn't mean to cause confusion. I just meant there are other ways one can do this. Per
sbt
inclass GoodExc1(Exception): def __init__(self, a): **Non-optional param in the constructor.** Exception.__init__(self, a) self.a = a
or
class GoodExc2(Exception): def __init__(self, a): **Non-optional param in the constructor.** self.args = (a,) self.a = a
But maybe these are no better than the way you are doing it.
as a workaround I'd not be allowing exceptions to propagate outside of multiprocessing subprocesses. I'd reraise them or handle them.
So something like what
lbolla
is doing in that question looks reasonable to you? I.e.def do(kwargs): try: i = kwargs['i']('i') print i raise BadExc('a') return i except Exception as e: raise Exception(repr(e))
I don't understand how this avoids the problem, but it seems to work.
Thanks, Faheem
-
Replying to zzzeek:
It's convincing, but you do need to mention that multiprocessing pickles exception objects. Otherwise they might wonder why you need this feature.
I've reported it here. The developers seem to be considering it, but I'm not really following the discussion. I did try to produce an example that would hang like the SQLA one, but was not able to. I did the following, but it doesn't hang as I would have expected. Any idea why?
import multiprocessing, psycopg2 def do(kwargs): i = kwargs['i']('i') conn = psycopg2.connect("dbname='illumina_faheem'") cur = conn.cursor() cur.execute("COMMIT; BEGIN; TRUNCATE foo%s; COMMIT;") pool = multiprocessing.Pool(processes=5) # start 5 worker processes results = [= [](] arglist) for i in range(10): arglist.append({'i':i}) r = pool.map_async(do, arglist, callback=results.append) r.get() r.wait() pool.close() pool.join()
-
repo owner Seems like they've accepted it. See how quick they were to close it at first ? Yeah, we all have itchy trigger fingers in this business :).
-
BTW, I didn't receive any of the updates to this ticket by email, as I would have expected.
-
repo owner I've updated the trac configuration let's see what happens.
-
repo owner - changed status to resolved
OK some more exceptions fixed in 26625d897f9becd7bccc6ece037e9ad39a6a77e9 and ORM + more tests in rc8b4b7c8cb7b9. If jenkins makes it through we should be good.
-
I just checked your changes. How about
CircularDependencyError
inlib/sqlalchemy/exc.py
? This doesn't have a__reduce__
defined. Otherwise looks good.I got an email notification of the last comment.
What is jenkins?
Faheem.
-
Thought about enabling openid support? See https://bitbucket.org/Dalius/authopenid-plugin/wiki/Home. This is in Debian, so is probably reasonably stable.
-
repo owner - removed status
- changed status to open
-
repo owner - changed status to resolved
-
repo owner Replying to faheem:
Thought about enabling openid support? See https://bitbucket.org/Dalius/authopenid-plugin/wiki/Home. This is in Debian, so is probably reasonably stable.
I'd prefer an OAuth plugin that I can hook into github, couldn't find one. openID seems kind of hopeless to me.
-
repo owner also we're on 0.12 so this wouldn't work anyway.
-
I've never heard of OAuth before. One learns something new every day. By hopeless do you mean insecure, or do you have other objections? I've used openID for awhile. I don't know about security, but I like not having to keep track of 10 million passwords.
-
repo owner - removed milestone
Removing milestone: 0.7.5 (automated comment)
- Log in to comment
this is tested/fixed on the SQLAlchemy side in d78d2d60aa30b0b6c3c230ddf3cafda2529e6409. However, at least psycopg2 doesn't support pickling of their exceptions, since they package the cursor within it, and probably other DBAPIs - so the usage of multiprocessing still needs you to re-raise exceptions as something harmless. You'll need to post on their tracker as well http://initd.org/psycopg/tracker/ to get it fixed on their end.