This gem provides a nice API to create complicated SQL queries for associated
and aggregated tables.


Take a classic example:

class Post
	has_many :comments

Now let's say you want to display a table with some posts' attributes including
number of comments.  Normally you would use post.comments.count, but that
generates separate query for each post. Alternatively you could use counter
cache, but it can get out of sync plus it only works on this simple example,
while you could also need the time of latest comment for each post.

Rails 3

AggregateColumns solution would look like this:

	posts = Post.aggregate_columns( :association => :comments )

	SELECT posts.*, (SELECT count(*) AS comment_count FROM comments 
	WHERE ( = post_id)) AS comment_count FROM "posts" 
	ORDER BY comment_count DESC

	posts.first.comment_count # => "1"

This returns a full blown ActiveRecord::Relation object encapsulating a query
for posts with additional column called "comment_count" containing (big surprise
here) number of comments for each post. I'm taking advantage of the fact columns
specified in :select option are kept inside model objects (but not
type-casted, so comment_count would be a String).

 * :association - association to aggregate
 * :function - SQL aggregate function to use, defaults to count
 * :column - associated table column to aggregate, defaults to '*' (useful for
 * :result_column - name of resulting column, defaults to
   association_field_function with exception of '*' column, which results in 
 * :join_type - JOIN type for in-join mode, explained later

Ordering defaults to aggregate column descending, you may always change it using
reorder. The reason for using this default is that in my experience this is the
only order I needed :)

What if you did not want such a simple aggregation, but the one I already
mentioned: time of latest comment for each post:

	Post.aggregate_columns( :association => :comments, :function => :max, :column => :created_at )

	SELECT posts.*, (SELECT max(created_at) AS comments_created_at_max FROM
	comments WHERE ( = post_id)) AS comments_created_at_max FROM "posts"
	ORDER BY comments_created_at_max DESC

The name of aggregated column would be comments_created_at_max in this case, but
it can be changed using :result_column option.

You may add some additional relations to the subquery using a block, eg.

	Post.aggregate_columns( :association => :comments ) { |rel| rel.where( :user_id => 1 ) }

	SELECT posts.*, (SELECT count(*) AS comment_count FROM comments WHERE
	"comments"."user_id" = 1 AND ( = post_id)) AS comment_count FROM
	"posts" ORDER BY comment_count DESC

This may you can limit which comments are considered for aggregation.

The only option not mentioned yet is :join_type, which can only be explained
after digging deeper into how the gem works. Aggregate column is added in one of
two ways:

* if :join_type option is not present - it's appended as subquery to SELECT
  clause, eg:

	SELECT posts.*, (SELECT count(*) AS comment_count FROM comments WHERE
	( = post_id)) AS comment_count FROM "posts" ORDER BY comment_count

* if :join_type option is present - it's added as a subquery connected using
  JOIN clause with join type specified, eg (:join_type => :right):

	SELECT posts.*, comment_count FROM "posts" RIGHT JOIN (SELECT post_id,
	count(*) AS comment_count FROM comments GROUP BY post_id) comment_count_join
	ON = comment_count_join.post_id ORDER BY comment_count DESC
Now, each way has it's place.

In-select subquery:
* foreign key index is used (PostgreSQL at least)
* subquery executed only for rows fetched from main table

In-join subquery:
* with RIGHT join only fetches records from main table having results in
  aggregate column (eg. only posts with any comments)

You can only decide between those two on a case by case basis. But generally: if
you only care about records having some meaningful values in aggregate column,
in-join might be better; if main table has some large record count, but you only
want to select some subset of those, in-select might save you some computation

Rails 2

NOTE: this is deprecated and I'm not sure if it even still works. But the code
is there, so it might be useful for someone.

Instead of aggregate_columns you have aggregate_column_options:

	Post.aggregate_columns_options( :association => :comments )

which returns a Hash of options suitable to pass to .find call. You have
additional options: :joins and :conditions to be passed to the subquery, because
you don't have a Relation object to work on in a block.

Now something really complicated - sum of votes for comments whose authors are
active. Here's where :conditions and :joins options come in handy:

	Post.all( Post.aggregate_columns_options( 
			:association => :comments, 
			:function => :sum, 
			:column => :votes, 
			:joins => "INNER JOIN authors ON comments.author_id =",
			:conditions => [" = ?", true],
			:result_column => :comment_vote_count

:joins and :conditions are passed to the subquery, so they will determine which
comments are considered when calculating vote sum

Important things to note here:
* aggregate_column_options return :select, :joins and :order options, so those
	cannot be used for other purposes
* all options are used internally in aggregate subqueries, so they do not clash
	with normal find options (other than aforementioned ones). This means you can
	merge resulting options with eg. custom :conditions

You may also define multiple aggregate columns in one call:

		{ :association => :comments, :function => :max, :column => :created_at },
		{ :association => :tags, :result_column => :number_of_tags }

Yet another method to combine aggregate columns with other find options is to use scopes:

	Post.aggregate_columns_scope( :association => :comments ).scoped( ...


Many thanks go to Stefan Nothegger and Sharewise project
(, where the idea originates from.

2010 Marek Janukowicz/Starware. Released under MIT license.