sqlalchemy does not correctly use executemany in oursql library

Issue #2513 resolved
Former user created an issue

The oursql library supports executemany, but sqlalchemy does not use it as the API requires. Oursql will only run an execute statement as an executemany *if* the input parameters are oursql wrapper types:

http://packages.python.org/oursql/tutorial.html#inserting-large-data-sets

Since sqlalchemy does not wrap up a list in an IterWrapper before calling execute, oursql will never run bulk inserts.
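
For illustration, a minimal sketch of the lazy pattern the linked tutorial describes; the connection parameters, table, and column names here are hypothetical:

    import oursql

    # Hypothetical connection; host, credentials, and database are placeholders.
    conn = oursql.connect(host='localhost', user='scott', passwd='tiger', db='test')
    curs = conn.cursor()

    def produce_rows():
        # Rows are generated lazily instead of being buffered in a list first.
        for i in range(1000000):
            yield (i, 'row %d' % i)

    # oursql uses the qmark ('?') paramstyle; per the tutorial, executemany()
    # can consume the generator lazily, pulling each row just before it is
    # sent to the database.
    curs.executemany('INSERT INTO some_table (id, name) VALUES (?, ?)',
                     produce_rows())
    conn.commit()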

Comments (2)

  1. Mike Bayer repo owner

    I think you're misreading the documentation. "Oursql will only run an execute statement as an executemany if the input parameters are oursql wrapper types" is not at all what it says. It says exactly:

    oursql also provides a number of ways to insert large amounts of data without buffering it all in memory.

    ...

    Additionally, executemany() is lazy; if passed a generator or any other iterator which produces values lazily, values will only be taken from the iterator immediately before they are sent to the database.

    That is, this area of the documentation only refers to the fact that executemany can optionally use an iterator, in which case it will not buffer the Python iterator into memory, thus allowing a dataset larger than memory to be streamed through, nothing more. In fact, this phrase regarding executemany() is not even connected to the previous mentions of the IterWrapper, FileWrapper, etc. classes - those classes are specifically meant to stream *values*, not rows. I think OurSQL's docs should probably be clarified here.
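
    To make the distinction concrete, here is a minimal sketch assuming the oursql cursor and the hypothetical some_table(id, name) from the sketch in the issue description above:

        # A plain list of parameter rows is perfectly valid for executemany();
        # no generator, iterator, or wrapper type is needed to get the batched
        # behavior.
        rows = [(i, 'row %d' % i) for i in range(1000)]
        curs.executemany('INSERT INTO some_table (id, name) VALUES (?, ?)', rows)

        # The wrapper classes mentioned in the tutorial (FileWrapper, IterWrapper,
        # etc.) stream a single value within one row, such as a large BLOB column;
        # they are unrelated to whether rows are batched.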

    There is no mention whatsoever of the notion that executemany "doesn't work" if not passed an iterator:

    http://packages.python.org/oursql/api.html#oursql.Cursor.executemany
    
    Execute the same query with different sets of parameters. This is more efficient than calling execute many times with the same query.
    

    It says plainly, "this is more efficient". There is no mention of the requirement to use an iterator.

    The SQLAlchemy OurSQL dialect was contributed by the author of OurSQL itself, so I'm pretty confident that there's no bypassing of batch behavior here.
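
    As a sanity check, here is a minimal sketch of the usage that exercises that path, assuming a mysql+oursql URL with hypothetical credentials and a hypothetical table:

        from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

        engine = create_engine('mysql+oursql://scott:tiger@localhost/test')
        metadata = MetaData()
        some_table = Table('some_table', metadata,
                           Column('id', Integer, primary_key=True),
                           Column('name', String(50)))
        metadata.create_all(engine)

        conn = engine.connect()
        # Passing a list of parameter dicts makes SQLAlchemy invoke the DBAPI's
        # cursor.executemany() for the statement; no wrapper types are involved.
        conn.execute(some_table.insert(),
                     [{'id': i, 'name': 'row %d' % i} for i in range(1000)])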

    A separate enhancement might be for SQLAlchemy's execute() method to actually accept an iterator, which would be passed through to DBAPIs that support such a feature. This would be doable in theory, though most DBAPIs don't actually support this usage and it would be a fairly major change for SQLAlchemy itself. If someone really wants to insert more rows than fit into memory, it would be almost the same to just chunk 10000 rows at a time, so there's not much rationale for such an enhancement.
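
    For completeness, a rough sketch of that chunking approach, reusing the hypothetical some_table and connection from the previous snippet; the row source here is made up:

        def rows():
            # Hypothetical source of parameter dicts, potentially larger than memory.
            for i in range(10000000):
                yield {'id': i, 'name': 'row %d' % i}

        def chunked(iterator, size=10000):
            # Collect items from an arbitrary iterator into lists of at most `size`.
            batch = []
            for item in iterator:
                batch.append(item)
                if len(batch) >= size:
                    yield batch
                    batch = []
            if batch:
                yield batch

        # Each chunk is a plain list, so each conn.execute() call becomes one
        # cursor.executemany() on the DBAPI, while only `size` rows are held
        # in memory at a time.
        for batch in chunked(rows()):
            conn.execute(some_table.insert(), batch)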
