PostgreSQL ntile() Function

The PostgreSQL ntile() function divides all rows in the partition into the specified number of buckets as evenly as possible, and returns the rank of the bucket where the current row is located.

ntile() Syntax

Here is the syntax of the PostgreSQL ntile() function:

ntile(buckets)
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

Parameters

buckets

Required. The number of buckets. The number of buckets should not be greater than the number of rows in this partition.

partition_column_list

List of columns for partitioning.

partition_column_list

List of columns for sorting.

Return value

The PostgreSQL ntile() function divides all rows in the partition where the current row is located into a specified number of ranking buckets as evenly as possible, and returns the ranking of the bucket where the current row is located.

Suppose, you have 9 numbers from 1 to 9, you use ntile(3) to divide them into 3 buckets in ascending order as evenly as possible, so the rank of the bucket with 1-3 is 1, the rank of the bucket with 4-6 is 2, and the rank of the bucket with 7-9 is 3. The following statement shows it using generate_series() generate a result set consisting of 1 to 9:

SELECT
  x,
  ntile(3) over (
    ORDER BY x
  )
FROM
  generate_series(1, 9) x;
x | ntile
---+-------
 1 |     1
 2 |     1
 3 |     1
 4 |     2
 5 |     2
 6 |     2
 7 |     3
 8 |     3
 9 |     3
(9 rows)

ntile() Examples

Preparing Data

Use the following CREATE TABLE statement to create a table named tax_revenue to store tax returns for every quarter:

CREATE TABLE tax_revenue (
  id SERIAL PRIMARY KEY,
  year CHAR(4) NOT NULL,
  quarter CHAR(1) NOT NULL,
  revenue INT NOT NULL
);

This tax_revenue table has 4 columns as following:

  • id - The row ID, primary key.
  • year- The year.
  • quarter- The quarter of the year, from 1 to 4.
  • revenue- Tax revenue.

Insert some rows into the tax_revenue table using the following INSERT statement:

INSERT INTO tax_revenue
  (year, quarter, revenue)
VALUES
  ('2020', '1', 3515),
  ('2020', '2', 3678),
  ('2020', '3', 4203),
  ('2020', '4', 3924),
  ('2021', '1', 3102),
  ('2021', '2', 3293),
  ('2021', '3', 3602),
  ('2021', '4', 2901);

Use the following SELECT statement to show all rows of this table:

SELECT * FROM tax_revenue;
 id | year | quarter | revenue
----+------+---------+---------
  1 | 2020 | 1       |    3515
  2 | 2020 | 2       |    3678
  3 | 2020 | 3       |    4203
  4 | 2020 | 4       |    3924
  5 | 2021 | 1       |    3102
  6 | 2021 | 2       |    3293
  7 | 2021 | 3       |    3602
  8 | 2021 | 4       |    2901
(8 rows)

Example

The following statement uses a PostgreSQL ntile() function to divide the revenues into 2 buckets in ascending order:

SELECT *,
  ntile(2) OVER (
    PARTITION BY year
    ORDER BY revenue
  )
FROM tax_revenue;
 id | year | quarter | revenue | ntile
----+------+---------+---------+-------
  1 | 2020 | 1       |    3515 |     1
  2 | 2020 | 2       |    3678 |     1
  4 | 2020 | 4       |    3924 |     2
  3 | 2020 | 3       |    4203 |     2
  8 | 2021 | 4       |    2901 |     1
  5 | 2021 | 1       |    3102 |     1
  6 | 2021 | 2       |    3293 |     2
  7 | 2021 | 3       |    3602 |     2
(8 rows)

Note that the window function in the SQL statement above:

ntile(2) OVER (
  PARTITION BY year
  ORDER BY revenue
)

In the OVER clause,

  • The PARTITION BY year partitions all rows by year.
  • The ORDER BY revenue sorts all rows in each partition in ascending order of revenue.
  • The ntile(2) divides all revenues of each partition into 2 buckets as evenly as possible. Since there are 4 rows per partition, and there are 2 rows per bucket. So the first two rows of each year in the bucket with rank 1, and the last two rows in the bucket with rank 2.