- attached oursqlTest.py
sqlalchemy does not correctly use executemany in oursql library
The oursql library supports executemany, but sqlalchemy does not use it as the API requires. Oursql will only run an execute statement as an executemany ''if'' the input parameters are oursql wrapper types:
http://packages.python.org/oursql/tutorial.html#inserting-large-data-sets
Since sqlalchemy does not wrap up a list in an IterWrapper before calling execute, oursql will never run bulk inserts.
Comments (2)
Account Deleted -
repo owner - changed status to wontfix
I think you're misreading the documentation. "Oursql will only run an execute statement as an executemany if the input parameters are oursql wrapper types" is not at all what it says. It says exactly:
oursql also provides a number of ways to insert large amounts of data without buffering it all in memory.
...
Additionally, executemany() is lazy; if passed a generator or any other iterator which produces values lazily, values will only be taken from the iterator immediately before they are sent to the database.
that is, this area of the documentation only refers to the fact that executemany() can optionally be passed an iterator, in which case it will not buffer the Python iterator into memory, thus allowing a dataset larger than memory to be streamed through; nothing more. In fact, this phrase regarding executemany() is not even connected to the previous mentions of the IterWrapper, FileWrapper, etc. classes - those classes are specifically meant to stream ''values'', not rows. I think OurSQL's docs should probably be clarified here.
There is no mention whatsoever of the notion that executemany "doesn't work" if not passed an iterator:
http://packages.python.org/oursql/api.html#oursql.Cursor.executemany
"Execute the same query with different sets of parameters. This is more efficient than calling execute many times with the same query."
It says plainly, "this is more efficient". There is no mention of the requirement to use an iterator.
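As a sanity check, here is a quick sketch using sqlite3 (the DBAPI that ships with Python, standing in for oursql since no MySQL server is assumed here): executemany() accepts a plain list of parameter tuples as well as a lazy generator, with no wrapper types involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")

# A plain list of parameter tuples - no wrapper types needed.
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)",
                 [(1, "a"), (2, "b")])

# A generator also works; rows are pulled lazily as they are sent.
rows = ((i, "name-%d" % i) for i in range(3, 6))
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 5
```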
The SQLAlchemy OurSQL dialect was contributed by the author of OurSQL itself, so I'm pretty confident that there's no bypassing of batch behavior here.
A separate enhancement might be for SQLAlchemy's execute() method to actually accept an iterator, which would be passed through to DBAPIs that support such a feature. This would be doable in theory, though most DBAPIs don't actually support this usage and it would be a fairly major change for SQLAlchemy itself. If someone really wants to insert more rows than will fit into memory, it would be almost equivalent to just chunk 10000 rows at a time, so there's not much rationale for such an enhancement.
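A minimal sketch of that chunking approach, again using sqlite3 as a stand-in DBAPI; the `chunked` helper and the table name are illustrative, not part of SQLAlchemy or oursql.

```python
import sqlite3
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterator."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")

# Stream an arbitrarily large iterator through in fixed-size batches,
# so only one chunk is ever buffered in memory at a time.
big_source = ((i,) for i in range(25000))
for batch in chunked(big_source, 10000):
    conn.executemany("INSERT INTO t (id) VALUES (?)", batch)

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 25000
```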
Example of the insert problem; needs a MySQL DB to connect to.