Issues

Issue #3034 resolved

Use my own bindparam for Query.limit()

Dobes Vandermeer
created an issue

I am trying to use the "baked query" pattern to reduce the time spent generating SQL, which currently is quite significant for our app.

One thing I can't seem to parameterize using a bindparam, however, is the limit on the query.

Although in #805 the limit was changed into a bindparam, this doesn't allow me to provide my own bindparam but rather always creates its own bindparam, assuming the limit I procide is a number already.

I think that if the places where currently it wraps a numeric limit into a bindparam using sql.literal() it would also check if limit was already a bindparam then it would be good.

Here's an example that can be pasted into the test suite:

    def test_select_with_bindparam_limit(self):
        """Does a query allow bindparam for the limit?"""
        sess = create_session()
        users = []

        q1 = sess.query(self.classes.User).order_by(self.classes.User.id).limit(sa.bindparam('n'))
        for n in xrange(1,4):
            users[:] = q1.params(n=n).all()
            assert len(users) == n

Comments (32)

  1. Dobes Vandermeer reporter

    I have a proof-of-concept patch that does support this functionality that you review if you think this capability is of interest.

    https://bitbucket.org/dobesv/sqlalchemy/branch/issue_3034

    I'm not sure if this change is actually desirable so I'll leave it at that. To "finish" the patch I might want to:

    • Make a more specific function for support offset as either bindparam or int instead of using _literal_as_binds
    • Add a unit test for offset as a bindparam
  2. Mike Bayer repo owner
    • changed milestone to 1.0
    • marked as major

    this involves changing the type of object for select._limit and select._offset attributes to a SQL expression rather than an integer, or otherwise adding tracking of ints and expressions separately. im not sure if existing user recipes are looking at _limit / _offset, even though these are underscored they have never changed before.

  3. Mike Bayer repo owner

    OK actually, looking at that patch I remember that we have no choice but to keep _limit and _offset as ints, as we have many third party dialects for which breaking backwards compatibility is not an option. We will have to add new attributes _limit_clause and _offset_clause to select and query; then _limit and _offset will probably be descriptors which extract an integer value for the benefit of those unmodified dialects, and raise an exception if the expression is not a simple integer-holding expression.

    if you want to work up a patch like that it can be a candidate for the 0.9 series as it wont break backwards compat.

  4. Dobes Vandermeer reporter

    OK I added a test to that branch for offset working as a bindparam.

    Do you think _literal_as_binds is inappropriate for the purpose I am using it for there? Basically I just wanted a function that does "If it's already a bindparam, return it. Otherwise, create a literal bindparam for it". Based on the thread you linked, however, it might be appropriate at times to allow people to specify another clause like Null() or some other thing, which _literal_as_binds would do.

  5. Dobes Vandermeer reporter

    I think if you don't use the new feature it won't bother you - the old ways will still work. However, it does mean that using your own bindparam for offset/limit depends on support from the dialect. Which it has to ...

  6. Dobes Vandermeer reporter

    BTW my workaround (caching separately for each limit) works but in the long term I don't like it because I want to know in advance how many variations of the query I will be keeping, right now limit is the only part of the cache key that isn't a boolean.

  7. Mike Bayer repo owner

    OK we do use literal_as_binds for this kind of thing but we usually do it at SQL construct creation time, not in the compiler. this is to limit how many decisions compilers have to make, both for speed and for maximum compatibility.

    When the limit/offset is actually a literal value, we must run it through asint(); some dialects do not support these values being placed as bound parameters and assume these values are integers safe to copy directly into SQL (see Sybase) so that function must be present.

    I think to maintain maximum compatibility and zero chance of security issues on 3rd party dialects, as well as to allow for clear error reporting for non-upgraded dialects, ._limit and ._offset should always be guaranteed integers or else it raises an exception on access (e.g. "this select has a SQL-composed limit/offset expression"). these accessors can actually look at self._limit_clause.effective_value, assuming the value is a BindParamClause.

    New attributes _limit_clause and _offset_clause will store the actual SQL clause that is to be compiled into the statement. When these are set, the incoming argument can be produced from a _literal_as_binds-like function but it also has to run a literal value through asint() before producing a BindParameter.

  8. Dobes Vandermeer reporter

    A lot of code seems to assume that select._limit and select._offset are rapid access. I'm thinking maybe they should be kept as regular fields set to the actual offset, if we have one. I think we can keep those up to date in _copy_internals since that is where bind params are given their values.

  9. Mike Bayer repo owner

    the idea is that all code within SQLAlchemy itself will no longer look at select._limit and select._offset - it will all use the new fields. ._limit and ._offset can be less performant Property objects going forward, and are there just for the benefit of external dialects such as http://code.google.com/p/ibm-db/source/browse/ibm_db_sa/ibm_db_sa/base.py?repo=ibm-db-sa#335 , which interestingly enough also would be a total security breach if we didn't have the asint() check ;).

  10. Mike Bayer repo owner

    I guess the slice() logic in Query needs to read _offset a bit. not sure yet how Query's _limit/_offset should be interacting here, those might stay "as is" where they can be literal or SQL element directly, does that work?

  11. Dobes Vandermeer reporter

    I want planning to change the query class, just the select. But I could.

    It seems to work fine just having the query offset and limit be either a number or a clause.

    It may be less ideal this way - having a field just hold one kind of value does reduce confusion.

  12. Mike Bayer repo owner

    I think the _limit() and _offset() properties should raise an exception when there is a non-None limit/offset clause that is not a simple integer. This so that if someone tries to run a query against the Sybase or a similar dialect, and that query uses a non-int limit/offset, an exception will be raised. As it is now, the limit/offset is silently ignored on a backend like Sybase.

    otherwise looks pretty good.

  13. Dobes Vandermeer reporter

    OK I added code to throw an exception if the clause is set but not a bind parameter. asint() should already throw an exception if the value isn't an int. I'm not sure if the exception is the right type and message, though - do you think there's something more specific I could use for that?

  14. Mike Bayer repo owner

    we pretty much do InvalidRequestError for all of those, but then if something goes wrong within compilation, we like to raise CompilerError. poking around it seems like maybe UnsupportedCompilationError might be our best bet...lets do it like this:

    def _offset_or_limit_clause_asint(clause, attrname):
        if clause is None:
            return None
        try:
            value = clause.effective_value
        except AttributeError:
            raise exc.UnsupportedCompilationError(
                            "This SELECT structure does not use a simple "
                            "integer value for %s" % attrname)
        else:
            return util.asint(value)
    

    if you want to write more tests, we need a few more variants in test/sql/test_compiler.py. "test_int_limit_offset_coercion" tests the case where we call select.limit() or select.offset(), we'd need the test where we pass expressions to those and later try to access select._limit or select._offset, asserting the compilationerror is raised.

  15. Mike Bayer repo owner

    Then, I'd love if you can look into #3054 and tell me what you think of that idea, both from a feasability standpoint as well as usability. This might be it! thanks.

  16. Log in to comment