How the ROW_NUMBER() function works in Mariadb?

The ROW_NUMBER() function is a window function that assigns a sequential integer to each row within a partition of a result set.

Posted on

The ROW_NUMBER() function is a window function that assigns a sequential integer to each row within a partition of a result set. The first row in each partition has a row number of 1, and the row number increases by 1 for each subsequent row in the partition. The ROW_NUMBER() function can be useful for pagination, ranking, and other scenarios that require a unique identifier for each row.

Syntax

The syntax of the ROW_NUMBER() function is as follows:

ROW_NUMBER() OVER (
    [PARTITION BY partition_expression, ...]
    [ORDER BY sort_expression [ASC | DESC], ...]
)

The ROW_NUMBER() function uses the OVER clause to define the window specification. The window specification consists of two optional parts: the PARTITION BY clause and the ORDER BY clause.

  • The PARTITION BY clause divides the result set into partitions, or groups of rows that share the same values of the partition expressions. The ROW_NUMBER() function is applied to each partition separately and restarts from 1 for each partition. If the PARTITION BY clause is omitted, the whole result set is treated as a single partition.
  • The ORDER BY clause specifies the order of the rows within each partition. The ROW_NUMBER() function assigns the row numbers based on this order. If the ORDER BY clause is omitted, the order of the rows is undefined and the row numbers may not be consistent across executions.

Examples

In this section, we will show some examples of using the ROW_NUMBER() function in Mariadb. We will use the following employees table for the examples.

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50),
    salary INT
);

INSERT INTO employees VALUES
(1, 'Alice', 'Sales', 5000),
(2, 'Bob', 'Marketing', 6000),
(3, 'Charlie', 'IT', 7000),
(4, 'David', 'Sales', 4000),
(5, 'Eve', 'Marketing', 3000),
(6, 'Frank', 'IT', 8000),
(7, 'Grace', 'Sales', 9000),
(8, 'Harry', 'Marketing', 2000),
(9, 'Ivy', 'IT', 10000);

Example 1: Assign row numbers to the whole result set

The following query uses the ROW_NUMBER() function without any PARTITION BY or ORDER BY clause. It assigns a row number to each row in the employees table.

SELECT
    id,
    name,
    department,
    salary,
    ROW_NUMBER() OVER () AS "row_number"
FROM
    employees;

The output is:

+------+---------+------------+--------+------------+
| id   | name    | department | salary | row_number |
+------+---------+------------+--------+------------+
|    1 | Alice   | Sales      |   5000 |          1 |
|    2 | Bob     | Marketing  |   6000 |          2 |
|    3 | Charlie | IT         |   7000 |          3 |
|    4 | David   | Sales      |   4000 |          4 |
|    5 | Eve     | Marketing  |   3000 |          5 |
|    6 | Frank   | IT         |   8000 |          6 |
|    7 | Grace   | Sales      |   9000 |          7 |
|    8 | Harry   | Marketing  |   2000 |          8 |
|    9 | Ivy     | IT         |  10000 |          9 |
+------+---------+------------+--------+------------+

Note that the order of the rows is not guaranteed and may change in different executions. Therefore, this query is not very useful in practice.

Example 2: Assign row numbers within each department

The following query uses the ROW_NUMBER() function with the PARTITION BY clause. It assigns a row number to each row within each department, ordered by the salary in descending order.

SELECT
    id,
    name,
    department,
    salary,
    ROW_NUMBER() OVER (
        PARTITION BY department
        ORDER BY salary DESC
    ) AS "row_number"
FROM
    employees;

The output is:

+------+---------+------------+--------+------------+
| id   | name    | department | salary | row_number |
+------+---------+------------+--------+------------+
|    9 | Ivy     | IT         |  10000 |          1 |
|    6 | Frank   | IT         |   8000 |          2 |
|    3 | Charlie | IT         |   7000 |          3 |
|    2 | Bob     | Marketing  |   6000 |          1 |
|    5 | Eve     | Marketing  |   3000 |          2 |
|    8 | Harry   | Marketing  |   2000 |          3 |
|    7 | Grace   | Sales      |   9000 |          1 |
|    1 | Alice   | Sales      |   5000 |          2 |
|    4 | David   | Sales      |   4000 |          3 |
+------+---------+------------+--------+------------+

Note that the row numbers are reset for each department and reflect the rank of the employees by their salary within each department.

Example 3: Assign row numbers with ties

The following query uses the ROW_NUMBER() function with the ORDER BY clause. It assigns a row number to each row ordered by the salary in descending order.

SELECT
    id,
    name,
    department,
    salary,
    ROW_NUMBER() OVER (
        ORDER BY salary DESC
    ) AS "row_number"
FROM
    employees;

The output is:

+------+---------+------------+--------+------------+
| id   | name    | department | salary | row_number |
+------+---------+------------+--------+------------+
|    9 | Ivy     | IT         |  10000 |          1 |
|    7 | Grace   | Sales      |   9000 |          2 |
|    6 | Frank   | IT         |   8000 |          3 |
|    3 | Charlie | IT         |   7000 |          4 |
|    2 | Bob     | Marketing  |   6000 |          5 |
|    1 | Alice   | Sales      |   5000 |          6 |
|    4 | David   | Sales      |   4000 |          7 |
|    5 | Eve     | Marketing  |   3000 |          8 |
|    8 | Harry   | Marketing  |   2000 |          9 |
+------+---------+------------+--------+------------+

Note that the ROW_NUMBER() function does not handle ties, meaning that if two or more rows have the same salary, they will have different row numbers. If you want to assign the same row number to rows with the same salary, you can use the RANK() or DENSE_RANK() functions instead.

The ROW_NUMBER() function is one of the window functions that Mariadb supports. Window functions allow you to perform calculations over a set of rows, called a window, that are related to the current row. Window functions can be used to perform tasks such as ranking, aggregation, and analytics.

Some of the related window functions are:

  • RANK(): Assigns a rank to each row within a partition, with gaps in the rank values in case of ties.
  • DENSE_RANK(): Assigns a rank to each row within a partition, without gaps in the rank values in case of ties.
  • NTILE(): Divides the rows within a partition into a specified number of groups, and assigns a group number to each row.

You can find more information about these and other window functions in the Mariadb documentation.

Conclusion

In this article, we have learned how the ROW_NUMBER() function works in Mariadb. We have seen the syntax of the function, and some examples of using it with different window specifications.

The ROW_NUMBER() function can be very useful for various purposes, such as pagination, ranking, and numbering. However, there are some things that you should be aware of when using this function. Here are some tips and suggestions:

  • The ROW_NUMBER() function is not a standard SQL function, but a window function. This means that it can only be used in the SELECT clause of a query, and not in other clauses such as WHERE, GROUP BY, or HAVING. If you want to filter, group, or aggregate the results based on the row numbers, you need to use a subquery or a common table expression (CTE).
  • The ROW_NUMBER() function does not guarantee a consistent order of the rows across different executions, unless you specify an ORDER BY clause in the window specification. If the order of the rows is important for your application, you should always use an ORDER BY clause to define the order. Otherwise, the row numbers may change unpredictably due to factors such as data changes, index usage, or query optimization.
  • The ROW_NUMBER() function does not handle ties, meaning that if two or more rows have the same values of the order expressions, they will have different row numbers. This may not be desirable in some scenarios, such as ranking or grouping. If you want to assign the same row number to rows with the same values, you can use the RANK() or DENSE_RANK() functions instead. These functions will assign the same rank to rows with the same values, and skip or fill the gaps in the rank values accordingly.
  • The ROW_NUMBER() function can be combined with other window functions or aggregate functions to perform more complex calculations over the window. For example, you can use the SUM() function with the ROW_NUMBER() function to calculate the cumulative sum of a column over the window.