parenthesization within unions

Issue #2528 resolved
Mike Bayer repo owner created an issue

you need parenthesis around the subqueries if you do this:

from sqlalchemy import *

q = select([column('x')]).limit(5).order_by(column('y'))
union = union(q, q).limit(3)
print union

oddly, no parenthesis generate if you do this (even with the second select() on the union()):

from sqlalchemy import *

q = select([column('x')]).limit(5).order_by(column('y')).alias()
union = union(q, q).limit(3).select()
print union

usually "alias" means, "I'm a subquery". so let's think about that.

also here, the limit/order by implies parenthesization in any case. it of course fails on sqlite. but if theres no way to order by inside a union on sqlite anyway, we may not care - or we may make an exception in sqlite's compiler. should test it out on other backends to check compatibility.

the workaround for now is self_group():

from sqlalchemy import *

q = select([column('x')]).limit(5).order_by(column('y')).self_group()
union = union(q, q).limit(3)
print union

Comments (8)

  1. Sok Ann Yap

    With ORM, I found that I need to add from_self() as a workaround, e.g.

    q1 = <some ORM query with ORDER BY, perhaps a DISTINCT ON>
    q2 = <some ORM query without ORDER BY>
    for instance in q1.from_self().union(q2):
        ...
    
  2. Mike Bayer reporter
    • edited description

    the syntax here got munged in the transfer from trac, also updated for latest text() behavior

  3. Roman Sakal

    I'm tring use some like:

         _order_by = query._order_by
         query_1 = query.filter(cls.value == 1).limit(150).subquery()
         query_2 = query.filter(db.or_(cls.value == 0, cls.value == None)).limit(75).subquery()
         items = cls.query.with_entities(Offer.id).select_entity_from(union_all((query_1), (query_2))).order_by(*_order_by).all()
    

    I got some like: LINE 6: LIMIT 150 AS anon_2 UNION ALL SELECT offers.id AS id

    I just need a parenthenes and I don't know how except manual query compile and call execute()

  4. Mike Bayer reporter
    • changed milestone to 1.1

    patch:

    diff --git a/lib/sqlalchemy/sql/selectable.py b/lib/sqlalchemy/sql/selectable.py
    index bfba35d..5387234 100644
    --- a/lib/sqlalchemy/sql/selectable.py
    +++ b/lib/sqlalchemy/sql/selectable.py
    @@ -1101,6 +1101,14 @@ class Alias(FromClause):
                                                         or 'anon'))
             self.name = name
    
    +    def self_group(self, target=None):
    +        if isinstance(target, CompoundSelect) and \
    +            isinstance(self.original, Select) and \
    +                self.original._needs_parens_for_grouping():
    +            return FromGrouping(self)
    +
    +        return super(Alias, self).self_group(target)
    +
         @property
         def description(self):
             if util.py3k:
    @@ -3208,6 +3216,13 @@ class Select(HasPrefixes, HasSuffixes, GenerativeSelect):
                     return None
             return None
    
    +    def _needs_parens_for_grouping(self):
    +        return (
    +            self._limit_clause is not None or
    +            self._offset_clause is not None or
    +            self._distinct
    +        )
    +
         def self_group(self, against=None):
             """return a 'grouping' construct as per the ClauseElement
             specification.
    @@ -3217,7 +3232,8 @@ class Select(HasPrefixes, HasSuffixes, GenerativeSelect):
             expressions and should not require explicit use.
    
             """
    -        if isinstance(against, CompoundSelect):
    +        if isinstance(against, CompoundSelect) and \
    +                not self._needs_parens_for_grouping():
                 return self
             return FromGrouping(self)
    
    diff --git a/test/sql/test_selectable.py b/test/sql/test_selectable.py
    index 3390f4a..4a332a4 100644
    --- a/test/sql/test_selectable.py
    +++ b/test/sql/test_selectable.py
    @@ -458,6 +458,26 @@ class SelectableTest(
             assert u1.corresponding_column(table2.c.col1) is u1.c._all_columns[0]
             assert u1.corresponding_column(table2.c.col3) is u1.c._all_columns[2]
    
    +    @testing.emits_warning("Column 'col1'")
    +    def test_union_alias_dupe_keys_grouped(self):
    +        s1 = select([table1.c.col1, table1.c.col2, table2.c.col1]).\
    +            limit(1).alias()
    +        s2 = select([table2.c.col1, table2.c.col2, table2.c.col3]).limit(1)
    +        u1 = union(s1, s2)
    +
    +        assert u1.corresponding_column(
    +            s1.c._all_columns[0]) is u1.c._all_columns[0]
    +        assert u1.corresponding_column(s2.c.col1) is u1.c._all_columns[0]
    +        assert u1.corresponding_column(s1.c.col2) is u1.c.col2
    +        assert u1.corresponding_column(s2.c.col2) is u1.c.col2
    +
    +        assert u1.corresponding_column(s2.c.col3) is u1.c._all_columns[2]
    +
    +        # this differs from the non-alias test because table2.c.col1 is
    +        # more directly at s2.c.col1 than it is s1.c.col1.
    +        assert u1.corresponding_column(table2.c.col1) is u1.c._all_columns[0]
    +        assert u1.corresponding_column(table2.c.col3) is u1.c._all_columns[2]
    +
         def test_select_union(self):
    
             # like testaliasunion, but off a Select off the union.
    
  5. Mike Bayer reporter
    • The behavior of the :func:.union construct and related constructs such as :meth:.Query.union now handle the case where the embedded SELECT statements need to be parenthesized due to the fact that they include LIMIT, OFFSET and/or ORDER BY. These queries do not work on SQLite, and will fail on that backend as they did before, but should now work on all other backends. fixes #2528

    → <<cset 88749550f6b9>>

  6. Log in to comment