Union queries fail if there are literal unlabeled columns

Issue #2974 resolved
Jairo Llopis created an issue

See this union query:

queries = list()

queries.append(sql.select(
    columns=[
        sql.expression.literal("column_1").label("label_1"),
        sql.expression.literal("column_2").label("label_2"),
    ],
    bind=conn.engine
))

queries.append(sql.select(
    columns=[
        sql.expression.literal("column_1"),
        sql.expression.literal("column_2"),
    ],
    bind=conn.engine
))

one = sql.expression.union(*queries)

It fails with this exception:

sqlalchemy.exc.ArgumentError: All selectables passed to CompoundSelect must have identical numbers of columns; select #1 has 2 columns, select #2 has 1

But both have 2 columns, it's just that the 2nd one is not labeled (because it is not needed in union queries, at least with MySQL).

Workaround: labeling all columns in 2nd query.

Notice that real columns, wich automatically seem to get a name, don't expose this weird behavior.

Comments (3)

  1. Mike Bayer repo owner

    you'll notice the warning indicating the cause of this:

    SAWarning: Column ':param_1' on table <sqlalchemy.sql.selectable.Select at 0x101a49310; Select object> being replaced by <sqlalchemy.sql.elements.ColumnClause at 0x101a49b90; %(4322530000 anon)s>, which has the same key. Consider use_labels for select() statements.

    I'm going to change the mechanics going into union() to not care about the key names of columns, and ill also change bindparam() (e.g. literal()) to use their anonymous key name in this collection, rather than the name "param_1".

  2. Mike Bayer repo owner
    • Some changes to how the :attr:.FromClause.c collection behaves when presented with duplicate columns. The behavior of emitting a warning and replacing the old column with the same name still remains to some degree; the replacement in particular is to maintain backwards compatibility. However, the replaced column still remains associated with the c collection now in a collection ._all_columns, which is used by constructs such as aliases and unions, to deal with the set of columns in c more towards what is actually in the list of columns rather than the unique set of key names. This helps with situations where SELECT statements with same-named columns are used in unions and such, so that the union can match the columns up positionally and also there's some chance of :meth:.FromClause.corresponding_column still being usable here (it can now return a column that is only in selectable.c._all_columns and not otherwise named). The new collection is underscored as we still need to decide where this list might end up. Theoretically it would become the result of iter(selectable.c), however this would mean that the length of the iteration would no longer match the length of keys(), and that behavior needs to be checked out. fixes #2974
    • add a bunch more tests for ColumnCollection

    → <<cset 302ad6228a12>>

  3. Mike Bayer repo owner
    • Adjusted the logic which applies names to the .c collection when a no-name :class:.BindParameter is received, e.g. via :func:.sql.literal or similar; the "key" of the bind param is used as the key within .c. rather than the rendered name. Since these binds have "anonymous" names in any case, this allows individual bound parameters to have their own name within a selectable if they are otherwise unlabeled. fixes #2974

    → <<cset 6aeec027a0fb>>

  4. Log in to comment