- changed status to resolved
Endless loop with `foo in arraycolumn`
This example never terminates:
from sqlalchemy import *
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import *
Base = declarative_base()
class Foo(Base):
__tablename__ = 'foo'
id = Column(Integer, primary_key=True)
chain = Column(ARRAY(Integer))
e = create_engine('postgresql:///test', echo=True)
Base.metadata.create_all(e)
s = Session(e)
print s.query(Foo).filter(123 in Foo.chain).all()
Traceback on ^C:
^CTraceback (most recent call last):
File "satest.py", line 22, in <module>
print s.query(Foo).filter(123 in Foo.chain).all()
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/operators.py", line 301, in __eq__
return self.operate(eq, other)
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 739, in operate
return op(self.comparator, *other, **kwargs)
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/operators.py", line 301, in __eq__
return self.operate(eq, other)
File "<string>", line 1, in <lambda>
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/type_api.py", line 60, in operate
return o[0](self.expr, op, *(other + o[1:]), **kwargs)
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/default_comparator.py", line 69, in _boolean_compare
negate=negate, modifiers=kwargs)
File "/home/adrian/dev/indico/env/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 2734, in __init__
self.type = type_api.to_instance(type_)
KeyboardInterrupt
From looking at the code __contains__
is not supported, but it should result in an exception and not in a hang.
Also, I think it would be useful to implement __contains__
to call self.contains([value])
. Using .any(123)
is not an option since it does not make use of a GIN index on the table, while .contains([123])
does even though both do the same thing.
Comments (12)
-
repo owner -
repo owner - Fixed issue where inadvertent use of the Python
__contains__
override with a column expression (e.g. by using'x' in col
) would cause an endless loop in the case of an ARRAY type, as Python defers this to__getitem__
access which never raises for this type. Overall, all use of__contains__
now raises NotImplementedError. fixes#3642
(cherry picked from commit e0a580b3d055a600afae61840058a5a30ef5fe74)
→ <<cset 69f6a8d714c0>>
- Fixed issue where inadvertent use of the Python
-
repo owner thanks for reporting! not sure I agree with "contains -> contains()", since the keyword used is "in", which people are reallly going to think is SQL IN. The in/contains mismatch across SQL / Python is a good reason to leave this area explicitly not supported.
-
reporter Hm, assuming there are no cases where an actual SQL
foo = ANY(..)
is needed instead of.. @> {foo}
and the functionality is exactly the same, would it be possible to emit the latter in case ofany()
? -
repo owner im sorry emit what for any() ?
-
reporter the same SQL that's emitted by
.contains([foo])
, i.e... @> '{foo}'
-
repo owner As I don't really use ARRAY types myself, I'm sure I'm not understanding. any() right now emits the operator "ANY" (and in 1.1 has been enhanced to also work with MySQL's ANY operator). If someone wants "@>", they use "contains". Why would we remove support entirely for one of these operators and make the two functions emit the same thing?
-
reporter 123 = ANY(arraycol)
checks if123
is in the array stored inarraycol
(like python'sin
operator).
arraycol @> '{123}'
does the same thing, but it can make use of a GIN-type index, making it more efficient.That's why I'm suggesting to always use the more efficient operator for this (assuming there are really no differences besides the index usage)
-
repo owner also, ANY (http://www.postgresql.org/docs/9.4/static/functions-comparisons.html) and @> (http://www.postgresql.org/docs/9.4/static/functions-array.html) don't appear to be equivalent at all. The left-hand datatype is completely different.
-
repo owner Then why does Postgresql have ANY ? Why can't they run this optimization on their end? Isn't this a bug in Postgresql ?
-
reporter ANY
by itself is more standardized (it's not just for arrays) - I think you can dosomecolumn = ANY (SELECT ...)
etc.I guess that's a usecase where SQL ANY is actually needed, but for "simple" argument types (i.e. not queries) you could still go for the more efficient
@>
operator -
repo owner yeah we dont really make decisions like that because it's just a surprise for the user who is doing somethign where this suddenly doesn't work. The SQL Core remains as 1-1 mapped to SQL as possible.
- Log in to comment
__contains__
override with a column expression (e.g. by using'x' in col
) would cause an endless loop in the case of an ARRAY type, as Python defers this to__getitem__
access which never raises for this type. Overall, all use of__contains__
now raises NotImplementedError. fixes#3642→ <<cset e0a580b3d055>>