Implementing next on ResultProxy
Since ResultProxy references a DB-API cursor, is there any reason why the next protocol is not implemented on ResultProxy? Given that it can iterate over query results in a loop, it seems to me that it would make sense from a Python language standpoint if it implemented next. I am not aware of how other DB-API cursors operate, but using psycopg2 one could:
```python
import psycopg2

select_statement = 'select * from generate_series(1, 5)'

conn = psycopg2.connect()
cursor = conn.cursor()
cursor.execute(select_statement)
next(cursor)
```
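The same behavior can be seen with the stdlib sqlite3 driver (a sketch using an assumed in-memory database): its cursor is itself an iterator, so the next() builtin works on it directly.

```python
import sqlite3

# DB-API cursors are typically iterators themselves: __iter__() returns the
# cursor and __next__() fetches the next row.
conn = sqlite3.connect(":memory:")
cursor = conn.execute("select 1 union all select 2")
print(next(cursor))  # (1,)
print(next(cursor))  # (2,)
```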
Whereas in SQLAlchemy, calling next() on the result raises a TypeError:
```python
from sqlalchemy import create_engine

engine = create_engine()
conn = engine.connect()
results = conn.execute(select_statement)
next(results)
```
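A minimal, library-free sketch of why this happens (the Rows class below is a hypothetical stand-in, not SQLAlchemy code): a generator-based __iter__() makes an object iterable, so for-loops and list() work, but the next() builtin requires a __next__() method on the object itself.

```python
# Hypothetical Rows class: iterable via a generator __iter__(), but not an
# iterator, because it has no __next__() method of its own.
class Rows(object):
    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        # each call returns a fresh generator over the rows
        for row in self._rows:
            yield row

rows = Rows([(1,), (2,)])
print(list(rows))   # iteration works: [(1,), (2,)]
try:
    next(rows)      # but rows is iterable, not an iterator
except TypeError as exc:
    print("TypeError:", exc)
```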
Comments (9)
-
reporter -
repo owner - changed milestone to 1.2
- marked as enhancement
OK, per PEP 249 they are doing it, and also having __iter__() return self, so for us that looks like:

```diff
diff --git a/lib/sqlalchemy/engine/result.py b/lib/sqlalchemy/engine/result.py
index 3aae932f2..f7a2fdcfa 100644
--- a/lib/sqlalchemy/engine/result.py
+++ b/lib/sqlalchemy/engine/result.py
@@ -862,12 +862,15 @@ class ResultProxy(object):
         self.closed = True

     def __iter__(self):
-        while True:
-            row = self.fetchone()
-            if row is None:
-                return
-            else:
-                yield row
+        return self
+
+    def __next__(self):
+        row = self.fetchone()
+        if row is None:
+            raise StopIteration()
+        return row
+
+    next = __next__

     @util.memoized_property
     def inserted_primary_key(self):
```
I'm a little nervous about StopIteration being slow so I want to run a performance test on that, to see if Python's iteration internals take away the usual overhead of an exception throw in this case.
-
repo owner - relying upon __next__ with StopIteration is unfortunately 20% slower in the general case for both Python 2.7 and 3.6:

```python
from __future__ import print_function

import timeit

# short result set since we want to see
# about when the iteration stops
collection = list(range(3))
len_collection = len(collection)


class Result(object):
    def __init__(self):
        self.index = -1

    def get_next(self):
        self.index += 1
        if self.index >= len_collection:
            return None
        else:
            return collection[self.index]


class Go(object):
    def __init__(self):
        self.result = Result()

    def get_next(self):
        return self.result.get_next()


class Go1(Go):
    def __iter__(self):
        while True:
            elem = self.get_next()
            if elem is None:
                return
            else:
                yield elem


class Go2(Go):
    def __iter__(self):
        return self

    def __next__(self):
        elem = self.get_next()
        if elem is None:
            raise StopIteration()
        else:
            return elem

    next = __next__


def run(result):
    list(result)


print(timeit.timeit("run(Go1())", "from __main__ import run, Go1"))
print(timeit.timeit("run(Go2())", "from __main__ import run, Go2"))
```
```
[classic@photon2 sqlalchemy]$ /opt/python3.6/bin/python3 test.py
5.241073623998091
5.97535327001242
[classic@photon2 sqlalchemy]$ /opt/python2.7/bin/python test.py
5.01953792572
5.92616796494
```
which implies we'd need to build __next__() totally separately from __iter__(). Not a big deal, but uglier.
-
repo owner -
reporter Mike, again, thank you for your responsiveness and your time. A 20% performance decrease is a lot for the general case even if StopIteration is considered more Pythonic.
For reference, in the standard library the next() implementation on DictReader does not rely on StopIteration. It seems like this is a pretty similar circumstance regarding the performance needs of both ResultProxy and DictReader.
-
reporter Semantically speaking though it probably would be more Pythonic to have ResultProxy.fetchone() wrap next versus the other way around and ResultProxy.first() call next before closing the transaction. I think this makes more sense in that someone would expect the iter and next methods to be the direct method for iterating over a record stream.
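A rough sketch of that suggested inversion (hypothetical code, not SQLAlchemy's actual implementation): __next__() becomes the primitive, and fetchone() wraps it, translating StopIteration back into the classic None-at-exhaustion convention, while first() fetches one row via the iterator protocol before closing.

```python
# Hypothetical inversion: the iterator protocol is the "low level" interface
# and the DB-API-style methods are built on top of it.
class InvertedResult(object):
    def __init__(self, rows):
        self._rows = iter(rows)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration propagates at exhaustion, per the iterator contract
        return next(self._rows)

    def fetchone(self):
        try:
            return next(self)
        except StopIteration:
            return None

    def first(self):
        # fetch one row via the iterator protocol, then close
        row = self.fetchone()
        self.closed = True
        return row
```

Under this sketch, list(result) and result.fetchone() agree on when the stream is exhausted.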
-
repo owner
> For reference in the standard library, the next implementation on DictReader does not rely on StopIteration. It seems like this is a pretty similar circumstance regarding the performance needs of both ResultProxy and DictReader.
you have to raise StopIteration, or in the example you posted it is likely relying on the next() called within to raise StopIteration. That's the behavioral contract of next() / __next__().

> ResultProxy.fetchone() wrap next versus the other way around and ResultProxy.first() call next before closing the transaction.

fetchone() is the classic interface for the cursor, so that is the "low level" method here. Iterators are slow: not just the StopIteration, but the "yield" approach in __iter__() produces overhead as well. Reversing these methods may also create confusion or bugs for third-party libraries, subclasses, etc. (e.g. a class that tries to override fetchone() would now fail).
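The point about the behavioral contract can be seen with the stdlib csv.DictReader mentioned above: its __next__() delegates to the underlying reader, so StopIteration still surfaces once the input is exhausted, even though DictReader does not raise it explicitly itself.

```python
import csv
import io

# csv.DictReader.__next__() calls next() on the underlying reader, so
# StopIteration still propagates at exhaustion.
reader = csv.DictReader(io.StringIO("a,b\n1,2\n"))
row = next(reader)
print(dict(row))  # {'a': '1', 'b': '2'}
try:
    next(reader)
except StopIteration:
    print("StopIteration raised at exhaustion")
```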
repo owner - changed status to resolved
Add __next__(), next() to ResultProxy
Added __next__() and next() methods to :class:`.ResultProxy`, so that the next() builtin function works on the object directly. :class:`.ResultProxy` has long had an __iter__() method which already allows it to respond to the iter() builtin. The implementation for __iter__() is unchanged, as performance testing has indicated that iteration using a __next__() method with StopIteration is about 20% slower in both Python 2.7 and 3.6.

Change-Id: I70569a4c48ad85a3c21a7ad422f270a559926cfb
Fixes: #4077

→ <<cset 339416b821ed>>
-
repo owner if you take a look at all four of fetchone(), first(), fetchmany(), and fetchall() on ResultProxy, you can see that it would be pretty jarring to arbitrarily have just fetchone() and first() use the iterator protocol directly while not fetchmany() and fetchall(), and you can also see the implementation of first() is not a complete superset of fetchone() in any case.
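To illustrate (a toy stand-in over an assumed in-memory row list, not the real ResultProxy), the four fetch methods share a family of exhaustion conventions that the iterator protocol alone does not cover, and first() does more than fetchone():

```python
class ToyResult(object):
    """Toy stand-in sketching the four fetch methods' contracts."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.closed = False

    def fetchone(self):
        # one row at a time; None signals exhaustion
        return self._rows.pop(0) if self._rows else None

    def fetchmany(self, size=2):
        # a batch at a time; an empty list signals exhaustion
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

    def fetchall(self):
        # everything remaining; an empty list signals exhaustion
        remaining, self._rows = self._rows, []
        return remaining

    def first(self):
        # first row (or None), after which the result is closed
        row = self.fetchone()
        self.closed = True
        return row
```

Making just fetchone() and first() delegate to __next__() would leave fetchmany() and fetchall() on the old convention, which is the inconsistency described above.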