Oracle has a hard limit of 1000 entries in the WHERE IN clause

Issue #1540 resolved
Former user created an issue

If you try to select something using the in_() operator with a list of more than 1000 entries, you're bound to get an ORA-01795 error ("maximum number of expressions in a list is 1000"). We used to have the following code for the 0.4 series to monkey-patch around this:

import sqlalchemy
import sqlalchemy.sql.compiler
import sqlalchemy.databases.oracle

def new_oracle_visit_binary(self, expr):
    """ Oracle does not support more than 1000 items in an IN expression.
        To overcome this limitation we change the visitor for binary expressions
        for the Oracle compiler in such a way that IN expressions with more
        than 1000 items are automatically broken up into several IN
        expressions that are ORed together.
    """

    if expr.operator != sqlalchemy.sql.operators.in_op:
        # *jedi mind trick* This is not the operator you are looking for.
        return sqlalchemy.sql.compiler.DefaultCompiler.visit_binary(self, expr)

    # This is an in_ clause
    chunk = 1000
    if len(expr.right.elem.clauses) <= chunk:
        # With no more than "chunk" items, keep going.
        return sqlalchemy.sql.compiler.DefaultCompiler.visit_binary(self, expr)

    # We need to split this expression up. Build sub-expressions.
    subexpr = []
    for i in range(0, len(expr.right.elem.clauses), chunk):
        subexpr.append(sqlalchemy.sql.expression._BinaryExpression(
                expr.left,
                sqlalchemy.sql.expression.ClauseList(
                        *expr.right.elem.clauses[i:i+chunk]),
                sqlalchemy.sql.operators.in_op))

    # We have at least two subexpressions. Start OR-ing from the left.
    combexpr = subexpr[0]
    for s in subexpr[1:]:
        combexpr = sqlalchemy.sql.expression._BinaryExpression(
                combexpr, s, sqlalchemy.sql.operators.or_)

    return sqlalchemy.sql.compiler.DefaultCompiler.visit_binary(self, combexpr)

# Install the new oracle binary expression visitor.
sqlalchemy.databases.oracle.dialect.statement_compiler.visit_binary = new_oracle_visit_binary
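
For illustration, with the patch installed, a statement like the following (dataset_table is a hypothetical table here, echoing the name used in the comments below) compiles into chunked, ORed IN clauses instead of raising ORA-01795:

ids = range(2500)  # more than Oracle's 1000-item limit
stmt = dataset_table.select(dataset_table.c.id.in_(ids))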

One of these days I'll probably come up with something for 0.5.

Comments (4)

  1. Mike Bayer repo owner

    In my own work, regardless of backend, I issue individual select() constructs for each batch of N records - and it is my work with Oracle that first made me aware of this kind of limitation. Sending over a statement that is unbounded in size seems like a bad idea to me. At the very least, turning an IN into any number of OR statements without the user asking, instead of letting Oracle say "hey, we really aren't built to handle a statement this large", seems like a huge surprise to me. So at the moment the approach above seems better to me as a new operator you'd build yourself, like expanding_or() or something like that.
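
    For concreteness, a minimal sketch of what such an expanding_or() helper could look like (the name comes from the suggestion above; the signature and implementation here are assumptions, not an existing SQLAlchemy API):

        from sqlalchemy import or_

        def expanding_or(column, values, chunk=1000):
            # Chunk the values and OR the resulting IN clauses together,
            # keeping each IN under Oracle's 1000-item limit.
            return or_(*(column.in_(values[i:i + chunk])
                         for i in range(0, len(values), chunk)))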

  2. Former user Account Deleted

    It's true in general that unbounded statements are a bad idea, but on the other hand, I kind of expect SQLAlchemy to do the right thing if the code ever comes across one, intended or not. Introducing another operator for this special case isn't much of an issue for me personally; to be honest, we wrote the given code based on just such an operator before we switched to SA. But on the other hand, and of course you are the one to decide on this, it seems to me that SA should not carry over database-driver-specific limitations such as this one to its API. SA does such a good job abstracting away all the other differences, so why keep this one?

  3. Mike Bayer repo owner

    I think the "right thing" for in_() to do is to render an "IN". Anything else is a surprise. Especially on Oracle, I think it's important that a user be aware of that limitation and make an explicit decision about how they'd like to handle it.

  4. Former user Account Deleted

    Yeah, I do agree with what you're saying. Looking at the code again, it seems we could just as easily get by with something like this:

        from sqlalchemy import or_

        chunk = 1000
        subexprs = [dataset_table.c.id.in_(cases[i:i+chunk])
                    for i in range(0, len(cases), chunk)]
        query = query.where(or_(*subexprs))
    

    Feel free to close this ticket.
