How the VAR_SAMP() function works in MariaDB

The VAR_SAMP() function in MariaDB calculates the sample variance of a set of values. Sample variance measures how far the values in a sample spread around their mean: it is the sum of the squared deviations from the mean, divided by n - 1, where n is the number of non-NULL values. Use it in statistical analysis and data exploration when the data is a sample rather than the entire population.

Syntax

The syntax for the MariaDB VAR_SAMP() function is as follows:

VAR_SAMP(expr)
  • expr: The expression whose values you want the sample variance of. It is usually a column name, but it can be any expression that is evaluated once per row; the function aggregates the results across the rows in the group. NULL values are ignored.

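The function returns the sample variance of the specified expression as a numeric value. It returns NULL when there are no matching rows, and also when fewer than two non-NULL values are available, because the n - 1 denominator would then be zero.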
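As a minimal sketch of two edge cases, the queries below use small inline derived tables (the alias t and the column x are made up for illustration). Like other aggregate functions, VAR_SAMP() skips NULL inputs, and with a single non-NULL value the n - 1 denominator is zero, so NULL is the expected result rather than 0.

```sql
-- NULL inputs are skipped: only 2, 4 and 6 contribute.
-- Mean = 4, squared deviations sum to 8, and 8 / (3 - 1) = 4.
SELECT VAR_SAMP(x) AS var_ignoring_null
FROM (
  SELECT 2 AS x
  UNION ALL SELECT 4
  UNION ALL SELECT 6
  UNION ALL SELECT NULL
) AS t;

-- One value only: n - 1 = 0, so the result is expected to be NULL.
SELECT VAR_SAMP(x) AS var_single_value
FROM (SELECT 5 AS x) AS t;
```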

Examples

Example 1: Calculate the sample variance of values in a column

This example shows how to use the VAR_SAMP() function to calculate the sample variance of values in a column of a table.

DROP TABLE IF EXISTS scores;
CREATE TABLE scores (
  student_id INT,
  score INT
);

INSERT INTO scores (student_id, score) VALUES
  (1, 85),
  (2, 92),
  (3, 78),
  (4, 88),
  (5, 91);

SELECT VAR_SAMP(score) FROM scores;

The following is the output:

+-----------------+
| VAR_SAMP(score) |
+-----------------+
|         31.7000 |
+-----------------+

In this example, a table scores is created with columns for student_id and score. After inserting some sample data, the VAR_SAMP(score) function is used to calculate the sample variance of the score values across all rows in the table.
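To see where 31.7 comes from, the same figure can be rebuilt from the definition. The mean of the five scores is 86.8, the squared deviations (3.24, 27.04, 77.44, 1.44, 17.64) add up to 126.8, and 126.8 / (5 - 1) = 31.7. The query below is a sketch of that calculation against the scores table above; the aliases s, m, mean_score, var_by_hand and var_builtin are illustrative.

```sql
-- Rebuild the sample variance from its definition and compare it
-- with the built-in aggregate.
SELECT
  -- sum of squared deviations from the mean, divided by n - 1
  SUM(POW(s.score - m.mean_score, 2)) / (COUNT(s.score) - 1) AS var_by_hand,
  VAR_SAMP(s.score) AS var_builtin
FROM scores AS s
-- single-row derived table that carries the mean to every row
CROSS JOIN (SELECT AVG(score) AS mean_score FROM scores) AS m;
```

Both columns should agree at about 31.7. The hand-built value goes through floating-point arithmetic, so its formatting may differ from the built-in result.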

Example 2: Calculate the sample variance of a subset of values

This example illustrates how to use the VAR_SAMP() function to calculate the sample variance of a subset of values based on a condition.

DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
  product_id INT,
  quantity INT,
  price DECIMAL(10, 2)
);

INSERT INTO sales (product_id, quantity, price) VALUES
  (1, 10, 9.99),
  (2, 15, 14.99),
  (1, 20, 9.99),
  (3, 25, 19.99);

SELECT VAR_SAMP(price) FROM sales WHERE product_id = 1;

The following is the output:

+-----------------+
| VAR_SAMP(price) |
+-----------------+
|        0.000000 |
+-----------------+

In this example, a table sales is created with columns for product_id, quantity, and price. After inserting some sample data, the VAR_SAMP(price) function is used to calculate the sample variance of the price values for the subset of rows where product_id is 1. Since all prices for product_id 1 are the same (9.99), the sample variance is 0.
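Because VAR_SAMP() is an aggregate function, it can also be combined with GROUP BY to get one variance per group instead of filtering for a single product. The following sketch runs against the sales table as defined in this example; the aliases n and price_variance are illustrative.

```sql
-- One sample variance per product.
SELECT
  product_id,
  COUNT(price)    AS n,               -- number of non-NULL prices in the group
  VAR_SAMP(price) AS price_variance
FROM sales
GROUP BY product_id
ORDER BY product_id;
```

Product 1 has two identical prices, so its variance is 0. Products 2 and 3 have only one row each, so their variance is expected to be NULL rather than 0, since a single value gives no n - 1 to divide by.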

Example 3: Calculate the sample variance of values from a subquery

This example shows how to use the VAR_SAMP() function to calculate the sample variance of values retrieved from a subquery.

DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
  order_id INT,
  customer_id INT,
  total_amount DECIMAL(10, 2)
);

INSERT INTO orders (order_id, customer_id, total_amount) VALUES
  (1, 1, 100.00),
  (2, 2, 150.00),
  (3, 1, 200.00),
  (4, 2, 175.00);

SELECT VAR_SAMP(total_amount) FROM
  (SELECT total_amount FROM orders WHERE customer_id = 1) AS customer_orders;

The following is the output:

+------------------------+
| VAR_SAMP(total_amount) |
+------------------------+
|            5000.000000 |
+------------------------+

In this example, a table orders is created with columns for order_id, customer_id, and total_amount. After inserting some sample data, a subquery is used to retrieve the total_amount values for orders where customer_id is 1. The VAR_SAMP() function is then applied to this subquery to calculate the sample variance of the total_amount values for the specified customer.
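For a simple filter like this one, the subquery is optional: the first query in the sketch below applies the same condition in a WHERE clause and should return the same variance (the two amounts 100 and 200 have a mean of 150, so the variance is (2500 + 2500) / 1 = 5000). A derived table becomes necessary when the values themselves are aggregates, as in the second query, which takes the variance of per-customer totals. The aliases customer_total and per_customer are illustrative.

```sql
-- Same result as the subquery form, filtering directly.
SELECT VAR_SAMP(total_amount)
FROM orders
WHERE customer_id = 1;

-- Variance of an aggregate: total each customer's orders first,
-- then take the sample variance of those totals.
-- Totals are 300.00 and 325.00, so the variance is 312.5.
SELECT VAR_SAMP(customer_total) AS variance_of_totals
FROM (
  SELECT customer_id, SUM(total_amount) AS customer_total
  FROM orders
  GROUP BY customer_id
) AS per_customer;
```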

Example 4: Calculate the sample variance of values from a JOIN

This example demonstrates how to use the VAR_SAMP() function to calculate the sample variance of values retrieved from a JOIN operation.

DROP TABLE IF EXISTS products;
CREATE TABLE products (
  product_id INT,
  product_name VARCHAR(255),
  price DECIMAL(10, 2)
);

DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
  sale_id INT,
  product_id INT,
  quantity INT
);

INSERT INTO products (product_id, product_name, price) VALUES
  (1, 'Product A', 9.99),
  (2, 'Product B', 14.99),
  (3, 'Product C', 19.99);

INSERT INTO sales (sale_id, product_id, quantity) VALUES
  (1, 1, 10),
  (2, 2, 15),
  (3, 1, 20),
  (4, 3, 25);

SELECT VAR_SAMP(price) FROM products p
  JOIN sales s ON p.product_id = s.product_id;

The following is the output:

+-----------------+
| VAR_SAMP(price) |
+-----------------+
|       22.916667 |
+-----------------+

In this example, two tables products and sales are created with columns for product information and sales details, respectively. After inserting some sample data, an inner join is performed between the two tables, and VAR_SAMP(price) is calculated over the joined rows. The join produces one row per sale, so the price of Product A (9.99) is counted twice: the result is the sample variance of 9.99, 14.99, 9.99, and 19.99, not the variance of the three distinct product prices.
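For comparison, here is a sketch of the same aggregate over the products table alone, where each product contributes exactly one row. The three prices 9.99, 14.99 and 19.99 have a mean of 14.99 and squared deviations of 25, 0 and 25, so the sample variance is 50 / 2 = 25, compared with 22.916667 over the joined rows. The alias variance_per_product is illustrative.

```sql
-- Each product counted once, regardless of how many sales it has.
SELECT VAR_SAMP(price) AS variance_per_product
FROM products;
```

Which of the two figures is appropriate depends on the question: the joined query weights each price by its number of sales, while this one treats every product equally.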

Related Functions

The following functions are related to the MariaDB VAR_SAMP() function:

  • MariaDB VAR_POP() function is used to calculate the population variance of a set of values.
  • MariaDB STDDEV_POP() function is used to calculate the population standard deviation of a set of values.
  • MariaDB STDDEV_SAMP() function is used to calculate the sample standard deviation of a set of values.
  • MariaDB VARIANCE() function calculates the population variance, not the sample variance. It is a non-standard equivalent of VAR_POP(), not of VAR_SAMP().
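As a rough illustration of how the sample and population functions relate, the query below runs them side by side on the scores table from Example 1 (the column aliases are illustrative). With the five scores used there, the population variance is 126.8 / 5 = 25.36 and the sample variance is 126.8 / 4 = 31.7; the standard deviations are the square roots of those figures, roughly 5.04 and 5.63.

```sql
SELECT
  VAR_SAMP(score)    AS sample_variance,      -- divides by n - 1
  VAR_POP(score)     AS population_variance,  -- divides by n
  STDDEV_SAMP(score) AS sample_stddev,        -- square root of VAR_SAMP
  STDDEV_POP(score)  AS population_stddev     -- square root of VAR_POP
FROM scores;
```

The sample variance is always the population variance multiplied by n / (n - 1), so the gap between the two shrinks as the number of rows grows.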

Conclusion

The VAR_SAMP() function in MariaDB calculates the sample variance of a set of values, dividing the sum of squared deviations from the mean by n - 1. As the examples show, it can be applied to a whole column, a filtered subset, a subquery, or the result of a join. In a join, every joined row contributes once, so repeated values are weighted by how often they appear. Used with these points in mind, it gives a quick measure of the spread of your sample data directly in SQL.