DISTINCT is used in select() but not in count()
Issue #285
resolved
When you use distinct=True in a select() statement on a table, a SELECT DISTINCT statemet is generated. But if you use the same clause and distinct=True in a count() statement, the distinct=True parameter is ignored and the count is wrong.
Attached is a small test progam which exercises the bug.
Raul Garcia.
Comments (1)
-
repo owner - Log in to comment
OK, this is a little subtle. the statement you have produces this (I converted
list(statement.execute())[0](0)[0](0)
into the equivalentstatement.scalar()
):you can see the aggregate count() function result is what will be run into the DISTINCT call (so DISTINCT is being propigated from SQLAlchemy). While this may seem pointless since count() returns only one row, that is not necessarily the case, such as if you had a GROUP BY in there which returned multiple rows containing a count; you may in some case want to get a list of counts against some criterion and determine the set of all distinct count values.
What you really want to apply DISTINCT to is the rowset from the table, and then you want to "count" that. So you do it this way:
So the
SELECT DISTINCT
creates the matrix of results that you care about; then the "count" query wraps that result.