- attached base-sa05.diff
Patch adding support for multiple result sets in ResultProxy
This patch adds support for multiple result sets in the ResultProxy. It basically just adds a "nextset" method and replaces all the calls to close in the fetch* methods with a call to nextset, which closes the proxy if there are no more results available or no nextset method.
Comments (26)
-
Account Deleted -
Account Deleted - attached base-sa06.diff
patch to sqlalchemy/engine/base.py for sa 06
-
repo owner - changed milestone to 0.6.xx
the ResultProxy can't work without parsing cursor.description and having an accurate sense of what columns are present in each row. What happens to cursor.description when nextset() is called? What happens to ResultRow.__getitem__() when called with a non-integer value?

Also, if you send an email address I can add it to the CC.
-
Account Deleted Whoops, yes, that slipped my mind. I'm guessing just a call to self._init_metadata() after self.cursor.nextset() is called?
Email is damoxc@gmail.com
-
Account Deleted After adding the call to ResultProxy._init_metadata(), accessing a column's value via RowProxy.__getitem__() works fine when using a string key, and ResultProxy.keys contains the correct column names. Is there anything else I should be checking?
-
repo owner - changed watchers to damoxc@gmail.com
I'll try to get out every point:

- I think this should focus on 0.6 for now, since we've made major changes to ResultProxy in 0.6, both in its construction and the degree to which it's used - due to RETURNING support the interaction of ResultProxy is more complicated now. An 0.5 backport is possible, but 0.6 is really where the major new features are going.
- The hasattr(cursor, 'nextset') check is needlessly expensive and isn't consistent with the usual method of feature discovery. Dialects that support nextset() should supply a class-level supports_nextset flag, and that's what would be checked here.
- The additional overhead of calling nextset() should be minimized as much as possible. Need to see if call counts go over the mark in test/aaa_profiling/test_zoomark and test_zoomark_orm.
- Does MySQL-python support nextset() when plain execute() is called, or is callproc() required? Hopefully not, since SQLA doesn't use callproc() as of yet.
- Lots of tests are needed. I'm not sure which DBAPIs support nextset(), but assuming it's MySQL and MSSQL, tests need to be present which illustrate multiple result sets being fetched for both of those backends. The result sets should have radically different column sets so that the cursor.description re-init is well exercised. This is enough of a feature to support a new module, test/sql/test_nextset.py, if it's possible to build a test that is generic between multiple backends (not sure if the MySQL case requires building stored procedures to work).
- Of course, the full suite of unit tests needs to pass 100% for MySQL, MSSQL and other nextset()-supporting dialects (and everything else too). Note that the current MSSQL-pyodbc dialect is actually calling cursor.nextset() on an INSERT statement in some cases, as it appends "; select scope_identity()" to work around a known pyodbc quirk regarding access to the scope_identity() function - so this case must be covered when testing.
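The supports_nextset flag pattern described in the checklist above can be sketched roughly as follows. All class names here are simplified, invented stand-ins for the SQLAlchemy internals under discussion, not the actual implementation; the point is only that a class-level flag makes the feature check a cheap attribute lookup instead of a hasattr() on every fetch.

```python
class DefaultDialect:
    # Dialects that implement DBAPI nextset() override this at the class
    # level, so the check is a plain attribute lookup per fetch.
    supports_nextset = False

class MySQLDialect(DefaultDialect):
    supports_nextset = True

class ResultProxy:
    """Toy stand-in for the real ResultProxy, showing the proposed flow."""
    def __init__(self, dialect, cursor):
        self.dialect = dialect
        self.cursor = cursor
        self.closed = False

    def _init_metadata(self):
        # re-read cursor.description for the current result set
        self.description = self.cursor.description

    def _nextset_or_close(self):
        # Called when the current result set is exhausted: advance to the
        # next set if the dialect supports it, otherwise autoclose.
        if self.dialect.supports_nextset and self.cursor.nextset():
            self._init_metadata()
        else:
            self.close()

    def close(self):
        self.closed = True

class FakeCursor:
    """Pretends to hold two result sets with different columns."""
    def __init__(self):
        self.description = [('a',)]
        self._sets_left = 1

    def nextset(self):
        if self._sets_left:
            self._sets_left -= 1
            self.description = [('b',)]
            return True
        return None  # DBAPI: None when no more sets

rp = ResultProxy(MySQLDialect(), FakeCursor())
rp._init_metadata()
rp._nextset_or_close()        # advances to the second set, re-inits metadata
second_open = not rp.closed   # still open after advancing
rp._nextset_or_close()        # no more sets: autocloses
```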
-
Account Deleted Replying to zzzeek:

> I think this should focus on 0.6 for now since we've made major changes to ResultProxy in 0.6, both in its construction and the degree to which it's used - due to RETURNING support the interaction of ResultProxy is more complicated now. It's possible for an 0.5 backport but 0.6 is really where the major new features are going.

Okay, sounds sensible; I'll install 0.6 in that case for testing.

> The hasattr(cursor, 'nextset') is needlessly expensive and isn't consistent with the usual method of feature discovery. Dialects that support nextset() should supply a class-level supports_nextset flag, and that's what would be checked here.

Agreed.

> The additional overhead of calling nextset() should be minimized as much as possible. Need to see if call counts go over the mark in test/aaa_profiling/test_zoomark and test_zoomark_orm.

Limiting this to only calling it on dialects that support it would definitely be an improvement.

> Does MySQL-python support nextset() when plain execute() is called? or is callproc() required? Hopefully not since SQLA doesn't use callproc() as of yet.

> Lots of tests are needed. I'm not sure which DBAPIs support nextset() but assuming it's MySQL and MSSQL, tests need to be present which illustrate multiple result sets being fetched for both of those backends. The result sets should have radically different column sets so that the cursor.description re-init is well exercised. This is enough of a feature to support a new module test/sql/test_nextset.py, if it's possible to build a test that is generic between multiple backends (not sure if the MySQL case requires building stored procedures to work).

Just did a simple test case of:

>>> cursor.execute('SELECT 1, 2, 3; SELECT 4, 5, 6')
1L
>>> cursor.fetchone()
(1L, 2L, 3L)
>>> cursor.nextset()
1
>>> cursor.fetchone()
(4L, 5L, 6L)

Although only a basic test, it seems to show that sprocs are not needed.

> of course, the full suite of unit tests needs to pass 100% for MySQL, MSSQL and other nextset()-supporting dialects (and everything else too). Note that the current MSSQL-pyodbc dialect is actually calling cursor.nextset() on an INSERT statement in some cases, as it appends "; select scope_identity()" to work around a known pyodbc quirk regarding access to the scope_identity() function - so this case must be covered when testing.

I'll go through and build a more extensive patch against svn trunk and implement the tests. I'll attach it when I'm done!
-
repo owner thanks, since the tests are the majority of the work.
-
Account Deleted (original author: ged) Replying to zzzeek:

> The additional overhead of calling nextset() should be minimized as much as possible. Need to see if call counts go over the mark in test/aaa_profiling/test_zoomark and test_zoomark_orm.

I'm pretty sure this would not cause any significant overhead (both time and call count) for dialects which don't support it (after the supports_nextset change is done). The overhead of one call shouldn't be noticeable at that point. However, I'm a bit more concerned about the time overhead of the actual call to cursor.nextset() for dialects which do support it, when the user does not use the functionality. My concern comes from the fact that I have no idea how fast it is. I suppose it's near-instantaneous when there is no more data, but that would need to be tested on all dialects which support it, and if this is not the case, I'd strongly -1 the current implementation and would favor some kind of optional code path instead.
-
Account Deleted (original author: ged) Also I think the current patch is buggy regarding exception handling: it would happen twice: once in nextset, then exception is reraised, and recatched in fetch* and rehandled there, then reraised...
-
Account Deleted Replying to ged:

> Also I think the current patch is buggy regarding exception handling: it would happen twice: once in nextset, then exception is reraised, and recatched in fetch* and rehandled there, then reraised...

Should there just not be exception handling in nextset(), or perhaps change it to nextset(raise_exc=True)?
-
Account Deleted (original author: ged) Replying to guest:

> Should there just not be exception handling in nextset(), or perhaps change it to nextset(raise_exc=True)?

The latter seems fine to me, but wait for zzzeek's take on this, as my opinion is purely informative.

One more point: I am personally not sure that calling nextset automatically is a good idea after all. It sure looks convenient in the common case, but it is both a bit surprising (normally, calling fetchall several times on a cursor returns the same data) and prevents fetching the results of the same set several times (unless the user manually copies the result of each set, which seems like a sign of bad design to me). I think that letting the user call nextset manually would be better. We of course need a way to tell the proxy to not close the cursor after the first fetchall, but I'll leave that problem to you guys :)
-
repo owner the problem with not calling nextset() is that the resultproxy autocloses when all results are fetched. Doing away with autoclose would be a huge inconvenience to a tremendous amount of code.
-
Account Deleted (original author: ged) Replying to zzzeek:

> the problem with not calling nextset() is that the resultproxy autocloses when all results are fetched. Doing away with autoclose would be a huge inconvenience to a tremendous amount of code.

I know, hence my last sentence... IMO, the best option is to detect that there are several result sets (e.g. by looking for ';' in the query) and not close the cursor in that case, but that might be tricky to implement and/or not 100% reliable. The only other option I see is to use a monkeypatch strategy:

res = xxx.execute()
res.autoclose = False
set1 = res.fetchall()
res.nextset()
set2 = res.fetchall()
res.close()

If you want to implement an "automatic nextset" behavior in addition to the above, that's fine with me, but the manual nextset behavior needs to be in, IMO. Again, a monkeypatch would do nicely:

res = xxx.execute()
res.automatic_multiset = True
# or yet another option:
#res.autoclose = "nextset"
set1 = res.fetchall()
set2 = res.fetchall()
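The manual-nextset proposal above can be sketched as a toy proxy. The attribute names autoclose and nextset follow the proposal in this comment; everything else (SimpleResultProxy, the TwoSets stub cursor) is invented for illustration and is not the actual SQLAlchemy code.

```python
class SimpleResultProxy:
    """Toy proxy: autoclose is a flag the fetch methods consult."""
    def __init__(self, cursor):
        self.cursor = cursor
        self.autoclose = True
        self.closed = False

    def fetchall(self):
        rows = self.cursor.fetchall()
        if self.autoclose:
            self.close()
        return rows

    def nextset(self):
        # advance to the next result set; close when there are no more
        if self.cursor.nextset():
            return True
        self.close()
        return False

    def close(self):
        self.closed = True

class TwoSets:
    """Stub cursor holding two result sets."""
    def __init__(self):
        self._sets = [[('a',)], [('b',)]]

    def fetchall(self):
        return self._sets[0]

    def nextset(self):
        if len(self._sets) > 1:
            self._sets.pop(0)
            return True
        return None

# usage mirrors the monkeypatch example above
res = SimpleResultProxy(TwoSets())
res.autoclose = False
set1 = res.fetchall()
res.nextset()
set2 = res.fetchall()
res.close()
```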
-
repo owner Replying to ged:

> I know, hence my last sentence... IMO, the best option is to detect that there are several result sets (e.g. by looking for ';' in the query) and not close the cursor in that case, but that might be tricky to implement and/or not 100% reliable.

oh, no, that's not reliable nor very performant. The usual case for multiple result sets is a stored procedure call.

I think we may have to go with execution options here, similarly to the way we are looking into doing "stream_results" in #1619. It would be an option for the text() and select() constructs. I think I should take a look at the various options we're considering and ensure there's a consistent system for all of them (including autocommit, which should move to select().options() too).

Additionally, I'd even say that, as a self-documenting measure, calling nextset() on the ResultProxy would raise an error saying that "enable_nextset" wasn't enabled for the execution. I prefer that to silent failure.
-
repo owner So we now have execution_options() ready to go. This patch can be resubmitted, using multiple_results=True as the statement option to indicate that autoclose should be disabled and replaced with nextset() (or that autoclose just shuts off and nextset() must be called; that might be better to start with). The patch also requires that cursor.description be re-initialized within _init_metadata(). And we really would need to make RowProxy and friends aware of this, as each row maintains a reference to its parent. With _init_metadata() being called repeatedly, we likely need to rearrange the internal structure such that the "parent" of a row remains referencing an object that corresponds to its result set. So a lot of work here.
-
Account Deleted - changed watchers to damoxc@gmail.com, ged@bugfactory.org
(original author: ged) About the last part (RowProxy changes & related), I would like to review any change before it gets committed, as to not have something which is incompatible with a decent C implementation, and if possible, which is consistent with my current C implementation.
-
repo owner Replying to ged:

> About the last part (RowProxy changes & related), I would like to review any change before it gets committed, as to not have something which is incompatible with a decent C implementation, and if possible, which is consistent with my current C implementation.

I've committed in c3702fa5169fef29ca13427abea2f6d41449c658 the basic idea of detaching the metadata from the ResultProxy, and while I was going to present it as a patch, halfway through it became apparent that this is a much better way to go overall, as we lose the redundancy of PickledResultProxy and we also cleanly detach RowProxy objects from the parent. This should make the C implementation easier, since ResultProxyMetadata is where the intensive stuff occurs - you should be able to leave ResultProxy alone.
-
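The "detached metadata" idea described in the commit above can be sketched like this: each result set gets its own metadata object, and rows reference that object rather than the ResultProxy, so re-running _init_metadata() for a new result set cannot invalidate rows already fetched from a prior set. The class and attribute names here are illustrative only, not the actual SQLAlchemy internals.

```python
class ResultMetaData:
    """Per-result-set metadata: column names and a name->index map."""
    def __init__(self, description):
        self.keys = [col[0] for col in description]
        self._index = {name: i for i, name in enumerate(self.keys)}

class RowProxy:
    """A row's parent is the metadata object, not the result proxy."""
    def __init__(self, metadata, row):
        self._metadata = metadata
        self._row = row

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._row[key]
        return self._row[self._metadata._index[key]]

# first result set
md1 = ResultMetaData([('id',), ('name',)])
row1 = RowProxy(md1, (1, 'alice'))

# a second result set with a different column layout gets fresh metadata;
# row1 keeps working because it still references md1
md2 = ResultMetaData([('total',)])
row2 = RowProxy(md2, (42,))
```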
repo owner Also, it removes the close() method from RowProxy. I'm trying to get 0.6beta1 out the door, so I wanted this change to be present.
-
repo owner - marked as critical
-
Account Deleted Guys, is there any way to get several result sets until this issue is fixed?
-
repo owner pull out the DBAPI cursor from the result proxy and work with it directly:

result = connection.execute("whatever")
cursor = result.cursor
for row in cursor.fetchall():
    # etc.
cursor.nextset()
for row in cursor.fetchall():
    # etc.
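The workaround above relies on the DBAPI cursor's own nextset(), which not every driver provides (sqlite3, for example, does not). This sketch simulates a two-result-set cursor with an invented stub class to show the fetch/nextset/fetch flow end to end without needing a nextset-capable database.

```python
class MultiSetCursor:
    """Stub DBAPI-style cursor preloaded with several result sets."""
    def __init__(self, *sets):
        self._sets = list(sets)
        self._current = self._sets.pop(0)

    def fetchall(self):
        # return the current set's remaining rows and exhaust it
        rows, self._current = self._current, []
        return rows

    def nextset(self):
        # DBAPI convention: truthy if another set exists, else None
        if self._sets:
            self._current = self._sets.pop(0)
            return True
        return None

cursor = MultiSetCursor([(1, 2, 3)], [(4, 5, 6)])
first = cursor.fetchall()    # rows from the first SELECT
cursor.nextset()             # advance, as in the workaround above
second = cursor.fetchall()   # rows from the second SELECT
```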
-
Account Deleted Great, thanks. I saved a lot of time.
-
repo owner extra notes on this feature from https://github.com/zzzeek/sqlalchemy/pull/71:
with all of that, the feature is still fundamentally broken until we add support to the compiler system for multiple result sets. The ability to associate a series of columns/types with result columns (even with string SQL, see http://docs.sqlalchemy.org/en/rel_0_9/core/sqlelement.html#sqlalchemy.sql.expression.TextClause.columns) is critical, and particularly we have users running stored procedures and such using this feature.

Some new kind of API has to be devised for how this might work. I think perhaps we would assume that the linkage between compiled SQL elements and multiple result sets would still involve a single SQL element that then returns multiple results (like a stored proc, or a custom SQL element). So we might need a generic method on any Selectable called "additional_columns()" or something like that. This part is outside of your realm so I'd have to look into it. But adding this feature means I have to jump on this part of it b.c. it will be asked for very quickly.
-
repo owner - changed milestone to 1.x.xx
-
repo owner - marked as major
patch to sqlalchemy/engine/base.py for sa 05