Server side cursors for single Selects

Issue #1619 resolved
Former user created an issue

It is useful to be able to have server side cursors disabled for most queries, and then enable them only for the queries you need them on.

The attached patch provides a server_side_cursor-argument to select().

Comments (15)

  1. Former user Account Deleted

    With server_side_cursors enabled at the engine level, you get at least three round-trips to the server for every select you're doing --- one for declaring the cursor, one to fetch the (possibly one-row) result and a third to close the cursor. That's a lot of overhead if there's just a few batch-load queries you really need to use server side cursors for.

  2. Mike Bayer repo owner

    OK, two things:

    1. the API to select() would be:
      select(...).options(postgresql_psycopg2_server_side_cursors=True)
      

    The key/value goes into an "options" dict on select(), which ordinarily is an empty dict attached to the _SelectBaseMixin class - options() replaces it with a local dict containing the options (i.e. generative style).

    1. the check needs to look like:

      self.compiled and self.compiled.statement.options.get('postgresql_psycopg2_server_side_cursors', False)

    which unfortunately still adds five attribute + one dictionary lookup to every single statement execution.

  3. Former user Account Deleted

    I get that you don't want to start a trend where dialect-specifics can clutter up selects arguments. :-)

    The concept of server side cursors are not really Postgres-specific, though, and can possibly be reused by other dialects, when they get support for it. "postgresql_psycopg2_"-prefix or not, if we let Select have a "_use_server_side_cursor"-attribute, whether to use it can be determined during compilation of the statement. Removing/deprecating the engine attribute server_side_cursors, the execution time check in create_cursor would then be self.compiled and self.compiled.statement._use_server_side_cursor. This would also relieve us from having to do all the other checks --- i.e. whether we're in a transaction, if it's FOR UPDATE, and so on. (I guess those are there mostly because it's currently an all or nothing-option. If the user specifies use_server_side_cursors for a query where it's impossible, that's a user error, IMHO. :)

    Some simple testing suggests the few per-query nanoseconds contributed by those attribute lookups are well within the noise range.

  4. Former user Account Deleted

    I've attached a new patch.

    Questions:

    • What about _TextClause? Should we give it options too? (That's the reason for the getattr( ..., dict())

    • Where should a test that ensures that a Query's select_options() get passed on to the compiled statement go? I skimmed through test/orm/test_query et al, but it didn't seem to fit there. :-)

    – AlexB

  5. Mike Bayer repo owner

    Ok in #1089 there is discussion of trying to get SS cursors to work with MySQL, though there is doubt that it really works. This is leading me however to consider a "higher level" option such as "stream_results", indicating SQLA do whatever it knows how to get results to stream. If the dialect doesn't support streaming, a warning should be emitted.

    text() should get a keyword argument for this particular feature, since text() isn't as useful "generatively".

    re: query.select_options(), the test is create a query with select_options(), call up query.statement and ensure the options are set on the underlying select() object.

  6. Former user Account Deleted

    Updated patch, per discussions on IRC, but to recap:

    • Renamed parameter to "stream_results". Note that with the OurSQL-dialect, this can eventually also be used with MySQL as well.
    • Fixed proper copying of the options dictionary when generating derived statements/queries.
    • Gave _TextClause and select() an options-kwarg.

    If it's starting to look right, I'll do the docs as well.

    – AlexB

  7. Mike Bayer repo owner

    changes TODO:

    • the select.options() in Query need to occur inside of _compile_context().
    • the ORM test for the options should probably be in test.orm.test_query, test that the options are present when query.statement, query.subquery(), query._compile_context() are called.
    • lets call query.select_options() query.statement_options(), to leave open room for UPDATE and DELETE later on.
    • the implicit creation of dict() in postgresql/psycopg2.py is a little wasteful, let's just build out that conditional
  8. Former user Account Deleted

    Attached patch ss-cursors-and-statement-options-1619.diff should cover those TODOs.

    • Changed selectable.options() to selectable.statement_option() as well, for the same reasons as you mentioned re. Query.
    • Possibly controversial: Alias now overrides _compiler to compile the original-statement. It was necessary to get query.subquery().execute() to respect the query's options.
    • I left the ORM-test in test_postgresql around, even though I added new ones to test_query. It exercises said _compiler-change, which the other ones don't --- since PostgreSQL+psycopg2 is the only thing with statement options right now. :-)

    -- AlexB

  9. Mike Bayer repo owner

    Replying to guest:

    • Possibly controversial: Alias now overrides _compiler to compile the original-statement. It was necessary to get query.subquery().execute() to respect the query's options.

    ehh id rather get alias to propagate those options when generating new selects, but even more i think the query.subquery().execute() case is not important right now (or even correct). its an execution option and a subquery is very specifically a component of a larger statement. so for now we just take that out.

  10. Former user Account Deleted

    The original-statement that is aliased retains the options. It's not as much query.subquery() specifically as it is having a selectable, with options, that is executed. I agree it's hackish and not important, which is why I pointed it out. :) (The Postgres- and SQLite-suite passed, so it didn't seem to break anything, so I left it in.)

    With propagate, do you mean having the options propagate up in the select-tree? I.e. that with select(..., from_obj=query.statement_options(whatever='value').subquery()), the outer select would get the subqueries options? I'd say that's counter-intuitive. Anyway, these are edge-cases that should see a real-world need before putting effort into.

  11. Log in to comment