How the PERCENTILE_CONT() Function Works in MariaDB

PERCENTILE_CONT() is a built-in window function in MariaDB that returns the value at a given percentile within an ordered group of values. When the requested percentile falls between two values, it interpolates linearly between them. The function is useful for finding the median, quartiles, or any other percentile of a distribution.

Syntax

The syntax of the PERCENTILE_CONT() function is as follows:

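PERCENTILE_CONT(percentile) WITHIN GROUP (ORDER BY sort_expr) OVER (
  [PARTITION BY expr1, expr2, ...]
)

Where percentile is a decimal value between 0 and 1, inclusive, that specifies the percentile to compute. sort_expr is a single numeric expression whose values are sorted to locate the percentile; it belongs in the WITHIN GROUP clause, not in the OVER clause. expr1, expr2, … are optional expressions that split the rows into partitions; without PARTITION BY, all rows form one group. In MariaDB the function is used as a window function, so the OVER clause is required and the result is repeated on every row of the partition. The function returns a numeric value: an existing value when the percentile lands exactly on a row, otherwise a value interpolated between the two neighbouring rows.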

Examples

Example 1: Calculating the median of students’ scores

The following example shows how to use the PERCENTILE_CONT() function to calculate the median of students’ scores in a table called students:

SELECT name, score,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY score) OVER () AS Median
FROM students;

The output is:

+-------+-------+--------+
| name  | score | Median |
+-------+-------+--------+
| Bob   | 50    | 70     |
| Alice | 60    | 70     |
| Eve   | 70    | 70     |
| Dave  | 80    | 70     |
| Carol | 90    | 70     |
+-------+-------+--------+

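The function returns the median of the students’ scores, 70, on every row. The median is the 50th percentile: the value that divides the sorted distribution into two equal halves. With five rows, the 50th percentile lands exactly on the third value in sort order, so no interpolation is needed. With an even number of rows, the position falls between the two middle values, and the function interpolates linearly between them, which for the median means returning their average.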
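The interpolation can be made concrete with a small Python sketch. This models the standard continuous-percentile arithmetic rather than MariaDB's internal code, and the helper name percentile_cont is a hypothetical choice for illustration. It assumes the fractional position is percentile × (N − 1), counted from zero over the sorted values.

```python
import math

def percentile_cont(values, fraction):
    """Continuous percentile by linear interpolation (illustrative helper)."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        return None  # an empty group has no percentile
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    # Move from the lower neighbour toward the upper one by the fractional part.
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight

scores = [50, 60, 70, 80, 90]
print(percentile_cont(scores, 0.5))          # 70.0, lands exactly on the third value
print(percentile_cont(scores + [100], 0.5))  # 75.0, halfway between 70 and 80
```

Adding a sixth score of 100 is an assumed variation on the sample data, included only to show the even-count case where interpolation actually happens.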

Example 2: Calculating the quartiles of products’ sales

The following example shows how to use the PERCENTILE_CONT() function to calculate the quartiles of products’ sales in a table called products:

SELECT product, category, sales,
       PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY sales) OVER (PARTITION BY category) AS Q1,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sales) OVER (PARTITION BY category) AS Q2,
       PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY sales) OVER (PARTITION BY category) AS Q3
FROM products;

The output is:

+---------+----------+-------+------+------+-------+
| product | category | sales | Q1   | Q2   | Q3    |
+---------+----------+-------+------+------+-------+
| A       | Books    | 100   | 175  | 250  | 325   |
| B       | Books    | 200   | 175  | 250  | 325   |
| C       | Books    | 300   | 175  | 250  | 325   |
| D       | Books    | 400   | 175  | 250  | 325   |
| E       | Toys     | 50    | 75   | 100  | 125   |
| F       | Toys     | 100   | 75   | 100  | 125   |
| G       | Toys     | 150   | 75   | 100  | 125   |
+---------+----------+-------+------+------+-------+

The query returns the quartiles of the products’ sales: the 25th, 50th, and 75th percentiles, which divide the distribution into four equal parts. The PARTITION BY clause makes the function compute the quartiles for each category separately. When a quartile falls between two sales values, the function interpolates linearly between them.
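To check the per-category quartiles by hand, the partitioning and interpolation can be sketched in Python. This is an illustration of the arithmetic under the standard position formula percentile × (N − 1), not MariaDB's implementation; percentile_cont and the variable names are hypothetical, and the rows are the sample data from the products table above.

```python
from collections import defaultdict

def percentile_cont(values, fraction):
    """Continuous percentile by linear interpolation (illustrative helper)."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)                      # row at or below the position
    upper = min(lower + 1, len(ordered) - 1)   # next row, clamped at the end
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)

rows = [
    ("A", "Books", 100), ("B", "Books", 200), ("C", "Books", 300), ("D", "Books", 400),
    ("E", "Toys", 50), ("F", "Toys", 100), ("G", "Toys", 150),
]

# Emulate PARTITION BY category: collect the sales values of each category.
by_category = defaultdict(list)
for _product, category, sales in rows:
    by_category[category].append(sales)

# Compute Q1, Q2, Q3 once per partition.
quartiles = {
    category: tuple(percentile_cont(sales, f) for f in (0.25, 0.5, 0.75))
    for category, sales in by_category.items()
}
print(quartiles)
# {'Books': (175.0, 250.0, 325.0), 'Toys': (75.0, 100.0, 125.0)}
```

For Books, the 25th percentile sits at position 0.75 between 100 and 200, giving 175; for Toys it sits halfway between 50 and 100, giving 75.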

Related functions

Several other MariaDB functions are related to PERCENTILE_CONT():

  • PERCENTILE_DISC(): This function takes the same arguments as PERCENTILE_CONT() but never interpolates. It returns an actual value from the set: the first value in sort order whose cumulative distribution is greater than or equal to the given percentile.
  • PERCENT_RANK(): This function returns the relative rank of a row within its partition, computed as (rank - 1) / (number of rows - 1). It answers the inverse question: the percentile position of a given row, rather than the value at a given percentile.
  • MEDIAN(): This function returns the median of a set of values. MEDIAN(expr) OVER (...) is equivalent to PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY expr) OVER (...). Like PERCENTILE_CONT(), it is a window function in MariaDB and requires the OVER clause.
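The difference between these functions is easiest to see side by side. The following Python sketch models their textbook definitions, not MariaDB's code; all three helper names are hypothetical, and the sample values are the Books sales figures from Example 2.

```python
def percentile_cont(values, fraction):
    """Interpolated value at the given fraction of the sorted values."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)

def percentile_disc(values, fraction):
    """First sorted value whose cumulative distribution reaches the fraction."""
    ordered = sorted(values)
    for count, value in enumerate(ordered, start=1):
        if count / len(ordered) >= fraction:
            return value

def percent_rank(values, value):
    """Relative rank of a value: (rank - 1) / (number of rows - 1)."""
    if len(values) == 1:
        return 0.0
    rank = 1 + sum(1 for v in values if v < value)
    return (rank - 1) / (len(values) - 1)

sales = [100, 200, 300, 400]
print(percentile_cont(sales, 0.5))          # 250.0, interpolated between 200 and 300
print(percentile_disc(sales, 0.5))          # 200, an actual value from the set
print(round(percent_rank(sales, 300), 3))   # 0.667, third of four rows
```

With an even number of rows, the continuous median (250) is not one of the stored values, while the discrete version always returns a value that exists in the data.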

Conclusion

PERCENTILE_CONT() lets you calculate the value at a given percentile within a group of values in MariaDB, which makes it handy for finding the median, quartiles, or any other percentile of a distribution. When the requested percentile falls between two rows, the function interpolates linearly between them. Related functions such as PERCENTILE_DISC(), PERCENT_RANK(), and MEDIAN() cover neighbouring needs: picking an actual value from the set, ranking a row, or computing the median directly. I hope this article helped you understand how the PERCENTILE_CONT() function works in MariaDB.