- changed milestone to 0.6.7
LIMIT for Oracle should render as ROWNUM <= [text] instead of ROWNUM <= :bindparam
Our limit queries were taking far longer than they should have been for a production database using Oracle 9i.
After much troubleshooting, I determined that the source of the problem is the "ROWNUM <= :parameter" clause that is rendered for Oracle's implementation of "LIMIT".
I took the sqlalchemy rendered query and plugged it into SQLPLUS, using bind parameters.
My query took consistently over 40 seconds.
There were several bind parameters. Replacing the single ROWNUM limit parameter with a literal of the same value consistently cut my query to about 4 seconds. ''I did reset oracle's cache between runs.''
This also changed Oracle's execution plan itself, along with these numbers:
With bind parameter: Cost over 5 million.
With literal text: Cost of 57,000
(I recognize that the numbers are relative only between different execution strategies for the same query, but I've given them anyway to give an idea of the order of magnitude difference.)
Oracle uses a STOPKEY mechanism to optimize ordered rownum limited queries. My hypothesis is that if this rownum is limited by a bind parameter, oracle's engine must query all matching records regardless of the presence of the limiting ordered stopkey, while given an actual value, it can only then limit the records it fetches.
It is feasible that oracle 10 or higher has fixed this problem, so you may decide this ticket is ultimately a recipe, but the evidence I'm seeing would convince me to seriously consider this as an improvement for sqlalchemy.
If this is not something for the project, can you point me in the rough direction of fixing this for my project?
Thanks
Comments (17)
-
repo owner -
repo owner so here's a patch, if you want to write tests for me, mainly one or two in test_oracle:CompileTest
diff -r eb4a843318b4fa76d238c43a293d64af64ca1148 lib/sqlalchemy/dialects/oracle/base.py --- a/lib/sqlalchemy/dialects/oracle/base.py Wed Mar 30 14:22:41 2011 -0400 +++ b/lib/sqlalchemy/dialects/oracle/base.py Thu Mar 31 11:56:35 2011 -0400 @@ -524,6 +524,8 @@ max_row = select._limit if select._offset is not None: max_row += select._offset + if not self.dialect.use_binds_for_limits: + max_row = literal_column("%d" % max_row) limitselect.append_whereclause( sql.literal_column("ROWNUM")<=max_row) @@ -542,8 +544,11 @@ offsetselect._oracle_visit = True offsetselect._is_wrapper = True + offset_value = select._offset + if not self.dialect.use_binds_for_limits: + offset_value = literal_column("%d" % offset_value) offsetselect.append_whereclause( - sql.literal_column("ora_rn")>select._offset) + sql.literal_column("ora_rn")>offset_value) offsetselect.for_update = select.for_update select = offsetselect @@ -635,10 +640,12 @@ def __init__(self, use_ansi=True, optimize_limits=False, + use_binds_for_limits=True, **kwargs): default.DefaultDialect.__init__(self, **kwargs) self.use_ansi = use_ansi self.optimize_limits = optimize_limits + self.use_binds_for_limits = use_binds_for_limits def initialize(self, connection): super(OracleDialect, self).initialize(connection)
-
Account Deleted Replying to zzzeek:
so here's a patch, if you want to write tests for me, mainly one or two in test_oracle:CompileTest
Each time I write tests I wish I could actually run them. Do you have instructions for how to run these tests?
-
repo owner there are detailed instructions in the file README.unittests.
-
Account Deleted Few items: * we needed
sql.literal_column
instead ofliteral_column
* at least for my case,max_row
was sometimes unicode, depending on the source of the limit, so I needed "%s" instead of "%d" * am I using limit incorrectly? It apparently works with unicode numbers, but I could see problems with offset, so what is best choice? * Whatever the answer is may affect the oracle hint several lines earlier("/*+ FIRST_ROWS(%d) */" % select._limit)
--- SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/dialects/oracle/base.py 2010-10-20 19:47:35.000000000 -0400 +++ SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/dialects/oracle/base.py 2011-03-31 13:06:25.000000000 -0400 @@ -515,40 +515,45 @@ limitselect = limitselect.prefix_with("/*+ FIRST_ROWS(%d) */" % select._limit) limitselect._oracle_visit = True limitselect._is_wrapper = True # If needed, add the limiting clause if select._limit is not None: max_row = select._limit if select._offset is not None: max_row += select._offset + if not self.dialect.use_binds_for_limits: + max_row = sql.literal_column("%s" % max_row) limitselect.append_whereclause( sql.literal_column("ROWNUM")<=max_row) # If needed, add the ora_rn, and wrap again with offset. if select._offset is None: limitselect.for_update = select.for_update select = limitselect else: limitselect = limitselect.column( sql.literal_column("ROWNUM").label("ora_rn")) limitselect._oracle_visit = True limitselect._is_wrapper = True offsetselect = sql.select( [for c in limitselect.c if c.key!='ora_rn'](c)) offsetselect._oracle_visit = True offsetselect._is_wrapper = True + offset_value = select._offset + if not self.dialect.use_binds_for_limits: + offset_value = sql.literal_column("%s" % offset_value) offsetselect.append_whereclause( - sql.literal_column("ora_rn")>select._offset) + sql.literal_column("ora_rn")>offset_value) offsetselect.for_update = select.for_update select = offsetselect kwargs['iswrapper']('iswrapper') = getattr(select, '_is_wrapper', False) return compiler.SQLCompiler.visit_select(self, select, **kwargs) def limit_clause(self, select): return "" @@ -626,24 +631,26 @@ ddl_compiler = OracleDDLCompiler type_compiler = OracleTypeCompiler preparer = OracleIdentifierPreparer execution_ctx_cls = OracleExecutionContext reflection_options = ('oracle_resolve_synonyms', ) def __init__(self, use_ansi=True, optimize_limits=False, + use_binds_for_limits=True, **kwargs): default.DefaultDialect.__init__(self, **kwargs) self.use_ansi = use_ansi self.optimize_limits = optimize_limits + self.use_binds_for_limits = use_binds_for_limits def initialize(self, connection): super(OracleDialect, self).initialize(connection) self.implicit_returning = self.__dict__.get( 'implicit_returning', self.server_version_info > (10, ) ) if self._is_oracle_8: self.colspecs = self.colspecs.copy()
-
Account Deleted So my real question is: * Do we cater to people who might put unicode/string values for limit/offset and make sqlalchemy convert them or do we say "sorry, you need to use some form of an integer" ?
-
repo owner the use case is, someone says
limit("5")
? how is that happening ? is that on your end ? ( no its not valid) -
Account Deleted Replying to zzzeek:
the use case is, someone says
limit("5")
? how is that happening ? is that on your end ? ( no its not valid)Yes, that is effectively what is happening, but more subtle, since it is ultimately being passed as a http parameter so it isn't obvious it isn't a number. I'll fix my code.
I'll put it back to "%d".
DBSession.query(Order).limit('10').offset('15').all()
Happily does max_row += select._offset:
WHERE ROWNUM <= 1015) WHERE ora_rn > 15) anon_1
Is it something that warrants an exception being thrown?
I'll also get a couple test cases.
-
Account Deleted This is a tangent, but related: The comment for this code is:
# See http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html # # Generalized form of an Oracle pagination query: # select ... from ( # select /*+ FIRST_ROWS(N) */ ...., rownum as ora_rn from ( # select distinct ... where ... order by ... # ) where ROWNUM <= :limit+:offset # ) where ora_rn > :offset # Outer select and "ROWNUM as ora_rn" can be dropped if limit=0
Notice the "distinct" in the comment. This is something I've programmatically added when the limit also contains a join. If you don't then your limit is not a limit of entities retrieved but rather a limit of joined rows fetched (which is obviously not what is intended).
I had felt sqlalchemy's engine should be adding the distinct, does this warrant another trac ticket?(I feel the answer forthcoming: "Sure, Kent, do you want to implement it...")
-
repo owner I don't know that the distinct is part of the general contract to simulate what LIMIT/OFFSET does. LIMIT OFFSET should be applying a window to the full rowset that the query would return in any case. If we are getting dupe ROWNUM values (didn't think it worked that way), then maybe.
-
repo owner also yes limit()/offset(), on the select() construct (as well as in its constructor) in theory should raise for a non-integer.
-
Account Deleted Here is my inclusive patch. (Sorry the diff is from 0.6.4, but I imagine no conflicts)
I've also included functionality for
offset()
andlimit()
to raise for non-integer(like) parameters (theutil.integer_or_raise
will hopefully be helpful elsewhere):diff -U10 -r SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/dialects/oracle/base.py SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/dialects/oracle/base.py --- SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/dialects/oracle/base.py 2010-10-20 19:47:35.000000000 -0400 +++ SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/dialects/oracle/base.py 2011-03-31 14:12:13.000000000 -0400 @@ -515,40 +515,45 @@ limitselect = limitselect.prefix_with("/*+ FIRST_ROWS(%d) */" % select._limit) limitselect._oracle_visit = True limitselect._is_wrapper = True # If needed, add the limiting clause if select._limit is not None: max_row = select._limit if select._offset is not None: max_row += select._offset + if not self.dialect.use_binds_for_limits: + max_row = sql.literal_column("%d" % max_row) limitselect.append_whereclause( sql.literal_column("ROWNUM")<=max_row) # If needed, add the ora_rn, and wrap again with offset. if select._offset is None: limitselect.for_update = select.for_update select = limitselect else: limitselect = limitselect.column( sql.literal_column("ROWNUM").label("ora_rn")) limitselect._oracle_visit = True limitselect._is_wrapper = True offsetselect = sql.select( [for c in limitselect.c if c.key!='ora_rn'](c)) offsetselect._oracle_visit = True offsetselect._is_wrapper = True + offset_value = select._offset + if not self.dialect.use_binds_for_limits: + offset_value = sql.literal_column("%d" % offset_value) offsetselect.append_whereclause( - sql.literal_column("ora_rn")>select._offset) + sql.literal_column("ora_rn")>offset_value) offsetselect.for_update = select.for_update select = offsetselect kwargs['iswrapper']('iswrapper') = getattr(select, '_is_wrapper', False) return compiler.SQLCompiler.visit_select(self, select, **kwargs) def limit_clause(self, select): return "" @@ -626,24 +631,26 @@ ddl_compiler = OracleDDLCompiler type_compiler = OracleTypeCompiler preparer = OracleIdentifierPreparer execution_ctx_cls = OracleExecutionContext reflection_options = ('oracle_resolve_synonyms', ) def __init__(self, use_ansi=True, optimize_limits=False, + use_binds_for_limits=True, **kwargs): default.DefaultDialect.__init__(self, **kwargs) self.use_ansi = use_ansi self.optimize_limits = optimize_limits + self.use_binds_for_limits = use_binds_for_limits def initialize(self, connection): super(OracleDialect, self).initialize(connection) self.implicit_returning = self.__dict__.get( 'implicit_returning', self.server_version_info > (10, ) ) if self._is_oracle_8: self.colspecs = self.colspecs.copy() diff -U10 -r SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/sql/expression.py SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/sql/expression.py --- SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/sql/expression.py 2010-09-13 02:39:39.000000000 -0400 +++ SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/sql/expression.py 2011-03-31 15:33:29.000000000 -0400 @@ -3539,22 +3539,22 @@ autocommit=None): self.use_labels = use_labels self.for_update = for_update if autocommit is not None: util.warn_deprecated('autocommit on select() is ' 'deprecated. Use .execution_options(a' 'utocommit=True)') self._execution_options = \ self._execution_options.union({'autocommit' : autocommit}) - self._limit = limit - self._offset = offset + self._limit = util.integer_or_raise(limit, 'limit', False) + self._offset = util.integer_or_raise(offset, 'offset', False) self._bind = bind self._order_by_clause = ClauseList(*util.to_list(order_by) or [ self._group_by_clause = ClauseList(*util.to_list(group_by) or [](]) )) def as_scalar(self): """return a 'scalar' representation of this selectable, which can be used as a column expression. Typically, a select statement which has only one column in its columns @@ -3607,28 +3607,28 @@ s = self.__class__.__new__(self.__class__) s.__dict__ = self.__dict__.copy() s._reset_exported() return s @_generative def limit(self, limit): """return a new selectable with the given LIMIT criterion applied.""" - self._limit = limit + self._limit = util.integer_or_raise(limit, 'limit', False) @_generative def offset(self, offset): """return a new selectable with the given OFFSET criterion applied.""" - self._offset = offset + self._offset = util.integer_or_raise(offset, 'offset', False) @_generative def order_by(self, *clauses): """return a new selectable with the given list of ORDER BY criterion applied. The criterion will be appended to any pre-existing ORDER BY criterion. """ diff -U10 -r SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/util.py SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/util.py --- SQLAlchemy-0.6.4.3kb/lib/sqlalchemy/util.py 2010-09-13 02:39:39.000000000 -0400 +++ SQLAlchemy-0.6.4.4kb/lib/sqlalchemy/util.py 2011-03-31 15:29:37.000000000 -0400 @@ -589,20 +589,36 @@ necessary. If 'flexi_bool' is True, the string '0' is considered false when coercing to boolean. """ if key in kw and type(kw[key](key)) is not type_ and kw[key](key) is not None: if type_ is bool and flexi_bool: kw[key](key) = asbool(kw[key](key)) else: kw[key](key) = type_(kw[key](key)) +def integer_or_raise(value, attribute, allow_negative=True): + """ensure the value is integer like or raise TypeError with attribute as name""" + if value is None: + return value + try: + value + 0 + except TypeError: + pass + else: + if int(value) == value: + if allow_negative or value >= 0: + return value + else: + raise ValueError("'%s' may not be negative" % attribute) + raise TypeError("'%s' must be %s integer or None" % (attribute, allow_negative and 'an' or 'a non negative')) + def duck_type_collection(specimen, default=None): """Given an instance or class, guess if it is or is acting as one of the basic collection types: list, set and dict. If the __emulates__ property is present, return that preferentially. """ if hasattr(specimen, '__emulates__'): # canonicalize set vs sets.Set to a standard: the builtin set if (specimen.__emulates__ is not None and issubclass(specimen.__emulates__, set_types)): Only in SQLAlchemy-0.6.4.4kb/lib: SQLAlchemy.egg-info diff -U10 -r SQLAlchemy-0.6.4.3kb/test/dialect/test_oracle.py SQLAlchemy-0.6.4.4kb/test/dialect/test_oracle.py --- SQLAlchemy-0.6.4.3kb/test/dialect/test_oracle.py 2010-09-13 02:39:39.000000000 -0400 +++ SQLAlchemy-0.6.4.4kb/test/dialect/test_oracle.py 2011-04-05 19:07:44.000000000 -0400 @@ -124,21 +124,47 @@ s = select([t](t), for_update=True).limit(10).offset(20).order_by(t.c.col2) self.assert_compile(s, 'SELECT col1, col2 FROM (SELECT col1, ' 'col2, ROWNUM AS ora_rn FROM (SELECT ' 'sometable.col1 AS col1, sometable.col2 AS ' 'col2 FROM sometable ORDER BY ' 'sometable.col2) WHERE ROWNUM <= ' ':ROWNUM_1) WHERE ora_rn > :ora_rn_1 FOR ' 'UPDATE') - + + def test_use_binds_for_limits(self): + t = table('sometable', column('col1'), column('col2')) + dialect = oracle.OracleDialect() + dialect.use_binds_for_limits = False + s = select([t](t)).limit(10) + self.assert_compile(s, + 'SELECT col1, col2 ' + 'FROM (SELECT sometable.col1 AS col1, ' + 'sometable.col2 AS ' + 'col2 FROM sometable) WHERE ROWNUM <= 10', + dialect=dialect) + for s in (select([t](t)).limit(Decimal('10')).offset(20), + select([t](t)).limit(10.00).offset(20.0)): + self.assert_compile(s, + 'SELECT col1, col2 FROM (SELECT col1, ' + 'col2, ROWNUM AS ora_rn FROM (SELECT ' + 'sometable.col1 AS col1, sometable.col2 AS ' + 'col2 FROM sometable) WHERE ROWNUM <= ' + '30) WHERE ora_rn > 20', + dialect=dialect) + assert_raises(TypeError,select([t](t)).limit, u'55') + assert_raises(TypeError,select([t](t)).offset, '55') + assert_raises(TypeError,select([t](t)).limit, 55.01) + assert_raises(TypeError,select([t](t)).offset, -55.01) + assert_raises(ValueError,select([t](t)).limit, -55) + assert_raises(ValueError,select([t](t)).offset, -5) def test_long_labels(self): dialect = default.DefaultDialect() dialect.max_identifier_length = 30 ora_dialect = oracle.dialect() m = MetaData() a_table = Table( 'thirty_characters_table_xxxxxx',
-
repo owner OK the trick here is that we can't put the integer_or_raise thing across the board in 0.6 since it will break existing apps. ill try to split this out among 0.6/0.7.
-
repo owner - changed status to resolved
-
Account Deleted Replying to zzzeek:
51fea2e159ca93daa0bc8066a5c35d8436d99418 f5893458a1781b84edff390ba5220ed1c6393f32
BTW, I left
test_use_binds_for_limits_enabled()
out because it is already being tested bytest_limit()
... just so you don't think I was a slacker... ;) -
repo owner figured I'd test the flag in both directions explicitly.
-
repo owner - removed milestone
Removing milestone: 0.6.7 (automated comment)
- Log in to comment
I hate oracle !