SQL Server PERCENTILE_DISC() Function
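PERCENTILE_DISC() is an analytic function in SQL Server that calculates a specified percentile of a set of values. It returns a discrete value: the smallest value in the set whose cumulative distribution (CUME_DIST) is greater than or equal to the specified percentile. The result is therefore always a value that actually occurs in the data, never an interpolated one.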
Syntax
The syntax for the PERCENTILE_DISC() function is as follows:
PERCENTILE_DISC(percentile) WITHIN GROUP (ORDER BY expression [ASC|DESC])
OVER ([PARTITION BY partition_expression1, partition_expression2, ...])
where:
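- percentile: Required. A numeric literal between 0.0 and 1.0 that specifies the percentile to calculate.
- expression: Required. The single column or expression whose values are sorted and used to compute the percentile. Only one ordering expression is allowed.
- ASC|DESC: Optional. Specifies ascending or descending order. The default is ascending.
- OVER: Required. The clause may be empty, as in OVER (), in which case the whole result set is treated as one partition.
- PARTITION BY: Optional. Divides the result set into partitions; the percentile is calculated separately for each partition.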
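As a minimal sketch of how PARTITION BY changes the result, the query below computes one discrete median per group. The Department column and the inline rows are hypothetical and exist only for illustration; they are not part of the tables used later in this article.

```sql
-- Hypothetical data: Department and these rows are assumptions for illustration.
SELECT DISTINCT
    Department,
    PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Salary)
        OVER (PARTITION BY Department) AS MedianSalary
FROM (VALUES
    ('Sales', 3000), ('Sales', 2500), ('Sales', 4000),
    ('IT',    5000), ('IT',    6000), ('IT',    3500)
) AS e(Department, Salary);
```

Each partition is sorted and evaluated on its own, so with this data the query should return 3000 for Sales and 5000 for IT. DISTINCT is needed because the function is evaluated for every input row.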
Usage
The PERCENTILE_DISC() function is typically used in the following scenarios:
- Finding a specific percentile of a set of data, such as the 25th or 90th percentile.
- Finding a median that must be an actual value from the dataset rather than an interpolated one. Note that PERCENTILE_DISC() does not calculate the mode (the most frequent value).
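The first scenario can be sketched as follows: several percentiles are computed in one pass over the same values. The inline salary values and the column aliases are illustrative assumptions.

```sql
-- Illustrative values; the aliases P25, P50, and P75 are arbitrary names.
SELECT DISTINCT
    PERCENTILE_DISC(0.25) WITHIN GROUP (ORDER BY Salary) OVER () AS P25,
    PERCENTILE_DISC(0.50) WITHIN GROUP (ORDER BY Salary) OVER () AS P50,
    PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY Salary) OVER () AS P75
FROM (VALUES (3000), (2500), (4000), (5000), (6000), (3500)) AS v(Salary);
```

With six sorted values (2500, 3000, 3500, 4000, 5000, 6000), the cumulative distribution rises in steps of 1/6, so the query should return 3000, 3500, and 5000: in each case the first value whose cumulative distribution reaches the requested percentile.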
Examples
Example 1
Assume we have the following employees table:
| ID | Name | Salary |
|---|---|---|
| 1 | John | 3000 |
| 2 | Mike | 2500 |
| 3 | Alice | 4000 |
| 4 | Tom | 5000 |
| 5 | Jane | 6000 |
| 6 | Bob | 3500 |
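We can use the following query to calculate the discrete median of the Salary column. Because PERCENTILE_DISC() is evaluated for every row, DISTINCT is used to collapse the repeated value into a single row:
SELECT DISTINCT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Salary) OVER () AS MedianSalary
FROM employees;
Executing the above SQL statement will yield the following result:
| MedianSalary |
|---|
| 3500 |
With six rows, 3500 is the third-lowest salary and the smallest value whose cumulative distribution reaches 0.5. The interpolated median, 3750, is what PERCENTILE_CONT(0.5) would return instead.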
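For an even number of rows, the discrete and the interpolated median can differ. The sketch below compares PERCENTILE_DISC() with PERCENTILE_CONT() on an inline copy of the Salary values; the column aliases are illustrative.

```sql
-- Inline copy of the six Salary values from the employees table.
SELECT DISTINCT
    PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Salary) OVER () AS DiscreteMedian,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Salary) OVER () AS ContinuousMedian
FROM (VALUES (3000), (2500), (4000), (5000), (6000), (3500)) AS v(Salary);
```

DiscreteMedian should be 3500, an actual salary from the table, while ContinuousMedian should be 3750, the midpoint between 3500 and 4000.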
Example 2
Now suppose we have the following scores table:
| ID | Name | Score |
|---|---|---|
| 1 | John | 80 |
| 2 | Mike | 70 |
| 3 | Alice | 90 |
| 4 | Tom | 85 |
| 5 | Jane | 95 |
| 6 | Bob | 85 |
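PERCENTILE_DISC() does not calculate the mode, but it can return the discrete median of the Score column:
SELECT DISTINCT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Score) OVER () AS MedianScore
FROM scores;
Executing the above SQL statement will yield the following result:
| MedianScore |
|---|
| 85 |
Here the discrete median happens to equal the mode (85 occurs twice), but only because the repeated value sits in the middle of the sorted scores. In general the two are different statistics.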
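When the most frequent value is what is actually needed, one common approach is to count occurrences per value and keep the highest count. This is a minimal sketch using an inline copy of the Score values; the alias Occurrences is illustrative.

```sql
-- Inline copy of the six Score values from the scores table.
SELECT TOP (1) WITH TIES
    Score,
    COUNT(*) AS Occurrences
FROM (VALUES (80), (70), (90), (85), (95), (85)) AS s(Score)
GROUP BY Score
ORDER BY COUNT(*) DESC;
```

WITH TIES keeps every value that shares the highest count, so a dataset with several modes returns several rows. With this data the query should return a single row: Score 85 with 2 occurrences.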
Conclusion
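The PERCENTILE_DISC() function calculates a specific percentile of a set of data and always returns a value that exists in that data. This makes it suitable for finding a discrete median or any other percentile, either over the whole result set or per partition. It does not compute the mode, and when an interpolated percentile is needed, PERCENTILE_CONT() is the appropriate function.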