How the VAR_POP() function works in Mariadb?

The VAR_POP() function in MariaDB is used to calculate the population variance of a set of values.

Posted on

The VAR_POP() function in MariaDB is used to calculate the population variance of a set of values. The population variance is a measure of how spread out the data values are from their mean or average value, considering the entire population of data points.

Syntax

The syntax for the MariaDB VAR_POP() function is as follows:

VAR_POP(expr)
  • expr: The expression representing the set of values for which you want to calculate the population variance. This can be a column name, a literal value, or an expression that evaluates to a set of numeric values.

The function returns the population variance of the input values as a floating-point number.

Examples

Example 1: Calculate the population variance of a column

This example demonstrates how to use the VAR_POP() function to calculate the population variance of values in a column.

DROP TABLE IF EXISTS scores;
CREATE TABLE scores (
  student_id INT,
  score FLOAT
);

INSERT INTO scores (student_id, score)
VALUES
  (1, 85.5),
  (2, 92.0),
  (3, 78.0),
  (4, 88.5),
  (5, 91.0);

SELECT VAR_POP(score) AS population_variance
FROM scores;

The following is the output:

+---------------------+
| population_variance |
+---------------------+
|  25.300000000000004 |
+---------------------+

This example creates a scores table with student_id and score columns, inserts some sample data, and then calculates the population variance of the score column using the VAR_POP() function.

Example 2: Calculate the population variance of an expression

This example shows how to use the VAR_POP() function to calculate the population variance of an expression.

DROP TABLE IF EXISTS products;
CREATE TABLE products (
  product_id INT,
  price DECIMAL(10, 2),
  quantity INT
);

INSERT INTO products (product_id, price, quantity)
VALUES
  (1, 19.99, 100),
  (2, 29.99, 150),
  (3, 39.99, 75),
  (4, 49.99, 200);

SELECT VAR_POP(price * quantity) AS variance_of_total_value
FROM products;

The following is the output:

+-------------------------+
| variance_of_total_value |
+-------------------------+
|          9544172.105469 |
+-------------------------+

This example creates a products table with product_id, price, and quantity columns, inserts some sample data, and then calculates the population variance of the expression price * quantity using the VAR_POP() function. This expression represents the total value of each product (price multiplied by quantity).

Example 3: Calculate the population variance across multiple groups

This example illustrates how to use the VAR_POP() function to calculate the population variance across multiple groups.

DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
  employee_id INT,
  department VARCHAR(50),
  salary FLOAT
);

INSERT INTO employees (employee_id, department, salary)
VALUES
  (1, 'Sales', 50000.0),
  (2, 'Sales', 55000.0),
  (3, 'Marketing', 60000.0),
  (4, 'Marketing', 62000.0),
  (5, 'IT', 70000.0),
  (6, 'IT', 75000.0);

SELECT
  department,
  VAR_POP(salary) AS population_variance
FROM
  employees
GROUP BY
  department;

The following is the output:

+------------+---------------------+
| department | population_variance |
+------------+---------------------+
| IT         |             6250000 |
| Marketing  |             1000000 |
| Sales      |             6250000 |
+------------+---------------------+

This example creates an employees table with employee_id, department, and salary columns, inserts some sample data, and then calculates the population variance of the salary column grouped by department using the VAR_POP() function and the GROUP BY clause.

Example 4: Calculate the population variance with NULL values

This example shows how the VAR_POP() function handles NULL values.

DROP TABLE IF EXISTS measurements;
CREATE TABLE measurements (
  id INT,
  value FLOAT
);

INSERT INTO measurements (id, value)
VALUES
  (1, 10.5),
  (2, 12.0),
  (3, NULL),
  (4, 15.0),
  (5, 11.0);

SELECT VAR_POP(value) AS population_variance
FROM measurements;

The following is the output:

+---------------------+
| population_variance |
+---------------------+
|            3.046875 |
+---------------------+

This example creates a measurements table with id and value columns, inserts some sample data including a NULL value, and then calculates the population variance of the value column using the VAR_POP() function. The function ignores the NULL value when calculating the population variance.

Example 5: Calculate the population variance of a subset of values

This example demonstrates how to use the VAR_POP() function to calculate the population variance of a subset of values based on a condition.

DROP TABLE IF EXISTS temperature_readings;
CREATE TABLE temperature_readings (
  location VARCHAR(50),
  temperature FLOAT
);

INSERT INTO temperature_readings (location, temperature)
VALUES
  ('New York', 25.5),
  ('Los Angeles', 30.0),
  ('Chicago', 20.0),
  ('New York', 27.0),
  ('Los Angeles', 32.0),
  ('Chicago', 18.0);

SELECT
  VAR_POP(temperature) AS population_variance
FROM
  temperature_readings
WHERE
  location = 'New York';

The following is the output:

+---------------------+
| population_variance |
+---------------------+
|              0.5625 |
+---------------------+

This example creates a temperature_readings table with location and temperature columns, inserts some sample data, and then calculates the population variance of the temperature column for the location ‘New York’ using the VAR_POP() function and a WHERE clause.

The following are some functions related to the MariaDB VAR_POP() function:

  • MariaDB VAR_SAMP() function is used to calculate the sample variance of a set of values.
  • MariaDB STDDEV_POP() function is used to calculate the population standard deviation of a set of values.
  • MariaDB STDDEV_SAMP() function is used to calculate the sample standard deviation of a set of values.
  • MariaDB VARIANCE() function is an alias for the VAR_SAMP() function and calculates the sample variance.

Conclusion

The VAR_POP() function in MariaDB is a valuable tool for calculating the population variance of a set of values, which is a crucial statistical measure in various data analysis tasks. By understanding its syntax, usage, and the examples provided, you can effectively incorporate this function into your SQL queries and database operations, enabling you to perform advanced statistical analysis on your data.