PostgreSQL GROUPING SETS - Multiple grouping in one query

This article describes how to use GROUPING SETS in a GROUP BY clause to generate a report on multiple dimensions in one query.

In PostgreSQL, the GROUPING SETS clause is a parameter to the GROUP BY clause that allows you to generate a report on multiple dimensions in one query.

For example, you can use GROUPING SETS in the GROUP BY clause to to get a sales report with rows for sales per year and rows for sales per department.

PostgreSQL GROUPING SETS syntax

The following syntax shows how to use GROUPING SETS in the GROUP BY clause:

SELECT ...
FROM table_name
[WHERE ...]
GROUP BY
  GROUPING SETS (
    (group_expr_list_1)
    [, (group_expr_list_2), ...]
  )
;

here,

  • The GROUPING SETS is an argument of the GROUP BY clause, it must follow the GROUP BY keyword.
  • The (group_expr_list_N) is a list of expressions used for grouping, which can contain zero or more columns or expressions, like those used directly in the GROUP BY clause.
  • You can specify one or more grouping expressions for GROUPING SETS, and all result sets produced by all grouping expressions will be merged into the final result set.
  • If there are no columns or expressions in (group_expr_list_N), that is (), all eligible rows are aggregated into one group.

PostgreSQL GROUP BY Examples

We will use the tables in the Sakila sample database for demonstration, please install the Sakila sample database in PostgreSQL first.

In the Sakila sample database, the film table stores all the films in the DVD store.

To find the number of films for each film rating from the film table, use a statement like this:

SELECT rating, count(*)
FROM film
GROUP BY rating
ORDER BY rating;
 rating | count
--------+-------
 G      |   178
 PG     |   194
 PG-13  |   223
 R      |   195
 NC-17  |   210
(5 rows)

To find the number of films for each rental from the film table, use a statement like this:

SELECT rental_rate, count(*)
FROM film
GROUP BY rental_rate
ORDER BY rental_rate;
 rental_rate | count
-------------+-------
        0.99 |   341
        2.99 |   323
        4.99 |   336
(3 rows)

To merge the above two reports in one report, you can UNION ALLcombine the as follows:

SELECT rating, NULL rental_rate, count(*)
FROM film
GROUP BY rating

UNION ALL

SELECT NULL rating, rental_rate, count(*)
FROM film
GROUP BY rental_rate

ORDER BY rating, rental_rate;
 rating | rental_rate | count
--------+-------------+-------
 G      |             |   178  ┐
 PG     |             |   194  │
 PG-13  |             |   223  │ > ranting
 R      |             |   195  │
 NC-17  |             |   210  ┘
        |        0.99 |   341  ┐
        |        2.99 |   323  │ > rental_rate
        |        4.99 |   336  ┘
(8 rows)

However, the above statement with UNION ALL looks complicated. It would be great to be able to do all this in one simple statement. PostgreSQL GROUPING SETS can do all this for us:

SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate))
ORDER BY rating, rental_rate;
 rating | rental_rate | count
--------+-------------+-------
 G      |             |   178  ┐
 PG     |             |   194  │
 PG-13  |             |   223  │ > ranting
 R      |             |   195  │
 NC-17  |             |   210  ┘
        |        0.99 |   341  ┐
        |        2.99 |   323  │ > rental_rate
        |        4.99 |   336  ┘
(8 rows)

So, you can simply understand that GROUPING SETS is a UNION ALL operation on multiple grouped result sets.

If you wanted to add a row to the above report to show the total number of films, you could be an empty grouping expression () in GROUPING SETS, as following:

SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate), ())
ORDER BY rating, rental_rate;
 rating | rental_rate | count
--------+-------------+-------
 G      |             |   178
 PG     |             |   194
 PG-13  |             |   223
 R      |             |   195
 NC-17  |             |   210
        |        0.99 |   341
        |        2.99 |   323
        |        4.99 |   336
        |             |  1000

Conclusion

In the GROUP BY clause, you can use GROUPING SETS to generate a collection of result sets for multiple grouped queries in one query.

For more complex needs, PostgreSQL also provides ROLLUP and CUBE to simplify GROUPING SETS.