PostgreSQL DISTINCT Usages
This article describes how to use in PostgreSQL DISTINCT to return a result set with no duplicate rows.
In PostgreSQL, the DISTINCT keyword is used in the SELECT statement so that it returns a result set with no duplicate rows.
PostgreSQL DISTINCT syntax
To return a result set with no duplicate rows, use the SELECT statement with the DISTINCT keyword:
Here is the syntax for DISTINCT:
SELECT
   DISTINCT column1 [, column2, ...]
FROM
   table_name;
Explanation:
- The keyword DISTINCTshould be specified afterSELECT.
- Specify the columns to evaluate for duplicates after the DISTINCTkeyword .
- Multiple column names need to be separated by ,. If multiple column names are specified, MySQL will evaluate for duplicates based on the combined value of these columns.
- You can use DISTINCT *to evaluate duplicates for all columns.
PostgreSQL also provides DISTINCT ON (expression) to keep the first row of each set of duplicates using the following syntax:
SELECT
   DISTINCT ON (column1) column_alias,
   column2
FROM
   table_name
ORDER BY
   column1,
   column2;
It is a good practice to always use the ORDER BY clause with the DISTINCT ON(expression) to make the result set predictable.
Note that the DISTINCT ON expression must match the leftmost expression in the ORDER BY clause .
PostgreSQL DISTINCT Examples
We will use the tables in the Sakila sample database for demonstration, please install the Sakila sample database in PostgreSQL first.
To retrieve all ratings of films from the film table, use the following statement:
SELECT
    DISTINCT rating
FROM
    film;
 rating
--------
 R
 PG-13
 G
 PG
 NC-17
(5 rows)Here, in order to find all the films ratings, we use DISTINCT rating, so that each film rating appears only once in the result set.
To retrieve all rent amounts from the film table, use the following statement:
SELECT
    DISTINCT rental_rate
FROM
    film;
 rental_rate
-------------
        2.99
        4.99
        0.99
(3 rows)Here, in order to find all the film rental amounts, we use DISTINCT rental_rate, so that each film rental amount appears only once in the result set.
To retrieve all combinations of ratings and rental amounts from the film table, use the following statement:
SELECT
    DISTINCT rating, rental_rate
FROM
    film
ORDER BY rating;
 rating | rental_rate
--------+-------------
 G      |        0.99
 G      |        4.99
 G      |        2.99
 PG     |        2.99
 PG     |        0.99
 PG     |        4.99
 PG-13  |        4.99
 PG-13  |        0.99
 PG-13  |        2.99
 R      |        0.99
 R      |        2.99
 R      |        4.99
 NC-17  |        0.99
 NC-17  |        2.99
 NC-17  |        4.99
(15 rows)Here, we used DISTINCT rating, rental_rate to find all combinations of film ratings and rental amounts. To make the output more readable, we use ORDER BY to sort the result set in positive order by film ranking.
If you want to return the first row for each set of films ratings, use the DISTINCT ON:
SELECT
    DISTINCT ON (rating) rating,
    film_id,
    title
FROM
    film
ORDER BY rating, film_id DESC;
 rating | film_id |      title
--------+---------+------------------
 G      |       2 | ACE GOLDFINGER
 PG     |       1 | ACADEMY DINOSAUR
 PG-13  |       7 | AIRPLANE SIERRA
 R      |       8 | AIRPORT POLLOCK
 NC-17  |       3 | ADAPTATION HOLESDISTINCT and NULL
DISTINCT treats all null values ββare the same, regardless of the field’s type, so only one null value is left from DISTINCT.
For example the following SQL statement returns multiple rows of NULL records:
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col
UNION ALL
SELECT NULL nullable_col;
 nullable_col
--------------
 <null>
 <null>
 <null>
(3 rows)Here, we have 3 rows, each of which has a nullable_col column value of NULL.
After using DISTINCT:
SELECT
    DISTINCT nullable_col
FROM
    (
    SELECT NULL nullable_col
    UNION ALL
    SELECT NULL nullable_col
    UNION ALL
    SELECT NULL nullable_col
    ) t;
 nullable_col
--------------
 <null>
(1 row)This example uses to UNION ALL simulate a recordset containing multiple NULL values.
Conclusion
This article describes how to use the DISTINCT to clear duplicate rows in a result set. The key points of the usage of the DISTINCT are as follows:
- The SELECT DISTINCTstatement returns a result set with no duplicate rows.
- You can specify one or more columns after DISTINCT, or you can use it*.
- The DISTINCTtreat allNULLas equal and keep only one.
- The DISTINCT ONis used to return the first row of each group of repeated values.