PostgreSQL GROUPING SETS - Multiple grouping in one query
This article describes how to use GROUPING SETS in a GROUP BY clause to generate a report on multiple dimensions in one query.
In PostgreSQL, the GROUPING SETS clause is a parameter to the GROUP BY clause that allows you to generate a report on multiple dimensions in one query.
For example, you can use GROUPING SETS in the GROUP BY clause to to get a sales report with rows for sales per year and rows for sales per department.
PostgreSQL GROUPING SETS syntax
The following syntax shows how to use GROUPING SETS in the GROUP BY clause:
SELECT ...
FROM table_name
[WHERE ...]
GROUP BY
GROUPING SETS (
(group_expr_list_1)
[, (group_expr_list_2), ...]
)
;
here,
- The
GROUPING SETSis an argument of theGROUP BYclause, it must follow theGROUP BYkeyword. - The
(group_expr_list_N)is a list of expressions used for grouping, which can contain zero or more columns or expressions, like those used directly in theGROUP BYclause. - You can specify one or more grouping expressions for
GROUPING SETS, and all result sets produced by all grouping expressions will be merged into the final result set. - If there are no columns or expressions in
(group_expr_list_N), that is(), all eligible rows are aggregated into one group.
PostgreSQL GROUP BY Examples
We will use the tables in the Sakila sample database for demonstration, please install the Sakila sample database in PostgreSQL first.
In the Sakila sample database, the film table stores all the films in the DVD store.
To find the number of films for each film rating from the film table, use a statement like this:
SELECT rating, count(*)
FROM film
GROUP BY rating
ORDER BY rating;
rating | count
--------+-------
G | 178
PG | 194
PG-13 | 223
R | 195
NC-17 | 210
(5 rows)To find the number of films for each rental from the film table, use a statement like this:
SELECT rental_rate, count(*)
FROM film
GROUP BY rental_rate
ORDER BY rental_rate;
rental_rate | count
-------------+-------
0.99 | 341
2.99 | 323
4.99 | 336
(3 rows)To merge the above two reports in one report, you can UNION ALLcombine the as follows:
SELECT rating, NULL rental_rate, count(*)
FROM film
GROUP BY rating
UNION ALL
SELECT NULL rating, rental_rate, count(*)
FROM film
GROUP BY rental_rate
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178 ┐
PG | | 194 │
PG-13 | | 223 │ > ranting
R | | 195 │
NC-17 | | 210 ┘
| 0.99 | 341 ┐
| 2.99 | 323 │ > rental_rate
| 4.99 | 336 ┘
(8 rows)However, the above statement with UNION ALL looks complicated. It would be great to be able to do all this in one simple statement. PostgreSQL GROUPING SETS can do all this for us:
SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate))
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178 ┐
PG | | 194 │
PG-13 | | 223 │ > ranting
R | | 195 │
NC-17 | | 210 ┘
| 0.99 | 341 ┐
| 2.99 | 323 │ > rental_rate
| 4.99 | 336 ┘
(8 rows)So, you can simply understand that GROUPING SETS is a UNION ALL operation on multiple grouped result sets.
If you wanted to add a row to the above report to show the total number of films, you could be an empty grouping expression () in GROUPING SETS, as following:
SELECT rating, rental_rate, count(*)
FROM film
GROUP BY GROUPING SETS ((rating), (rental_rate), ())
ORDER BY rating, rental_rate;
rating | rental_rate | count
--------+-------------+-------
G | | 178
PG | | 194
PG-13 | | 223
R | | 195
NC-17 | | 210
| 0.99 | 341
| 2.99 | 323
| 4.99 | 336
| | 1000Conclusion
In the GROUP BY clause, you can use GROUPING SETS to generate a collection of result sets for multiple grouped queries in one query.
For more complex needs, PostgreSQL also provides ROLLUP and CUBE to simplify GROUPING SETS.