How the COUNT(DISTINCT) Function Works in MariaDB

The MariaDB COUNT(DISTINCT) function is used to return the count of distinct non-NULL values in a set. It’s commonly utilized in scenarios where one needs to count the number of unique entries in a database column, such as counting the number of different products sold or the number of unique visitors to a website.

Syntax

The syntax for the MariaDB COUNT(DISTINCT) function is as follows:

SELECT COUNT(DISTINCT column_name) FROM table_name;

This function takes one or more expressions, most commonly a single column name, and returns an integer (BIGINT) giving the number of distinct non-NULL values. Rows in which the expression evaluates to NULL are ignored.
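
The NULL handling described above can be checked with a small sketch. The contacts table and its email column are hypothetical names introduced only for this illustration; the column is deliberately nullable.

```sql
-- Hypothetical table for illustration only (not part of the examples below).
DROP TABLE IF EXISTS contacts;
CREATE TABLE contacts (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NULL  -- nullable on purpose
);

-- Five rows: two NULLs and one duplicated address.
INSERT INTO contacts (email) VALUES
('a@example.com'),
(NULL),
('b@example.com'),
('a@example.com'),
(NULL);

-- NULLs are skipped and the duplicate is counted once.
SELECT COUNT(DISTINCT email) AS unique_emails FROM contacts;
```

With this data the query should return 2: the two NULL rows are ignored, and 'a@example.com' is counted only once.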

Examples

Counting Unique Usernames

The following example uses COUNT(DISTINCT) to count the unique usernames in a users table:

DROP TABLE IF EXISTS users;
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(255) NOT NULL
);

INSERT INTO users (username) VALUES ('alice'), ('bob'), ('alice'), ('carol');

SELECT COUNT(DISTINCT username) FROM users;

The output for this statement will be:

+--------------------------+
| COUNT(DISTINCT username) |
+--------------------------+
|                        3 |
+--------------------------+

This indicates that there are three unique usernames in the users table.
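
As a possible follow-up, the same users table can be queried for the total row count alongside the distinct count, which shows how many rows are repeats. The column aliases here are illustrative choices, not part of the original example.

```sql
-- Compare the total number of rows with the number of distinct usernames.
SELECT
    COUNT(*)                            AS total_rows,
    COUNT(DISTINCT username)            AS unique_usernames,
    COUNT(*) - COUNT(DISTINCT username) AS repeated_rows
FROM users;
```

With the four rows inserted above, this should give 4 total rows, 3 unique usernames, and 1 repeated row (the second 'alice').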

Unique Status Codes

Here’s how to count unique HTTP status codes from a log table:

DROP TABLE IF EXISTS logs;
CREATE TABLE logs (
    id INT AUTO_INCREMENT PRIMARY KEY,
    status_code INT NOT NULL
);

INSERT INTO logs (status_code) VALUES (200), (200), (404), (500), (404);

SELECT COUNT(DISTINCT status_code) FROM logs;

The output will be:

+-----------------------------+
| COUNT(DISTINCT status_code) |
+-----------------------------+
|                           3 |
+-----------------------------+

This shows that there are three unique HTTP status codes in the logs table.
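
COUNT(DISTINCT) also works together with a WHERE clause, because the filter is applied before the aggregation. As a sketch, assuming that codes of 400 and above are treated as errors, the distinct error codes in the same logs table can be counted like this:

```sql
-- Only rows with status_code >= 400 reach the aggregate.
SELECT COUNT(DISTINCT status_code) AS unique_error_codes
FROM logs
WHERE status_code >= 400;
```

With the data above, the remaining rows are 404, 500, and 404, so the result should be 2.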

Counting Unique Products Sold by Each Store

Suppose we have a table named sales that records each product sold by different stores. We want to count the number of unique products sold by each store.

DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
    store_id INT NOT NULL,
    product_id INT NOT NULL
);

INSERT INTO sales (store_id, product_id) VALUES
(1, 100),
(1, 101),
(1, 100),
(2, 101),
(2, 102),
(2, 103),
(2, 102),
(3, 104);

SELECT store_id, COUNT(DISTINCT product_id)
FROM sales
GROUP BY store_id;

The output for this statement will be:

+----------+----------------------------+
| store_id | COUNT(DISTINCT product_id) |
+----------+----------------------------+
|        1 |                          2 |
|        2 |                          3 |
|        3 |                          1 |
+----------+----------------------------+

This result indicates that:

  • Store 1 sold 2 unique products.
  • Store 2 sold 3 unique products.
  • Store 3 sold 1 unique product.

The GROUP BY clause groups the rows that have the same values in the specified column(s), and then the COUNT(DISTINCT) function is applied to each group, counting the unique entries in the product_id column for each store_id.
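
One way to picture what happens inside each group is to remove the duplicate rows first and count afterwards. The query below is a sketch of that idea against the same sales table; it illustrates the semantics and is not a description of how MariaDB executes COUNT(DISTINCT) internally. The deduped alias is an arbitrary name.

```sql
-- Step 1 (inner query): keep one row per (store_id, product_id) pair.
-- Step 2 (outer query): count the remaining non-NULL products per store.
SELECT store_id, COUNT(product_id) AS unique_products
FROM (
    SELECT DISTINCT store_id, product_id
    FROM sales
) AS deduped
GROUP BY store_id;
```

For the data above, this should return the same counts as the COUNT(DISTINCT product_id) query: 2 for store 1, 3 for store 2, and 1 for store 3. The outer query uses COUNT(product_id) rather than COUNT(*) so that a NULL product would still be ignored if the column allowed NULLs.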

Related Functions

Here are a few functions related to the MariaDB COUNT(DISTINCT) function:

  • MariaDB COUNT() function counts rows: COUNT(*) counts every row returned by a query, including duplicates and NULLs, while COUNT(expr) counts only the rows where expr is not NULL.
  • MariaDB SUM() function calculates the sum of a set of values, ignoring NULLs.
  • MariaDB AVG() function computes the average of the non-NULL values of a numeric expression.
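
To see how these aggregates differ on the same data, here is a small side-by-side sketch. The payments table and its amount column are hypothetical and exist only for this comparison.

```sql
-- Hypothetical table for illustration only.
DROP TABLE IF EXISTS payments;
CREATE TABLE payments (
    id INT AUTO_INCREMENT PRIMARY KEY,
    amount DECIMAL(10, 2) NULL
);

-- Four rows: one duplicate value and one NULL.
INSERT INTO payments (amount) VALUES (10.00), (10.00), (20.00), (NULL);

SELECT
    COUNT(*)               AS all_rows,         -- every row, NULL included
    COUNT(amount)          AS non_null_amounts, -- NULL row skipped
    COUNT(DISTINCT amount) AS distinct_amounts, -- NULL skipped, duplicate collapsed
    SUM(amount)            AS total,            -- NULL ignored
    AVG(amount)            AS average           -- NULL ignored: total / non-NULL count
FROM payments;
```

With this data the counts should be 4, 3, and 2 respectively, the total 40.00, and the average roughly 13.33 (40 divided by the 3 non-NULL rows).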

Conclusion

The COUNT(DISTINCT) function in MariaDB gives a direct way to count the distinct non-NULL values in a column, either across a whole table or, combined with GROUP BY, within each group. Whether you’re working with large datasets or small tables, it is a practical tool for data analysis and reporting tasks.