How the REGEXP() Function Works in MariaDB

The REGEXP() function, which MariaDB implements as the infix operator REGEXP, returns 1 if a string matches a given regular expression pattern, or 0 otherwise. It is useful for performing complex pattern matching on strings. In this article, we introduce the syntax and usage of REGEXP in MariaDB, walk through some examples, and list some related functions that can be used alongside it.

Syntax

The syntax of the REGEXP() function is as follows:

string REGEXP pattern

The function takes two arguments:

  • string: The string to be searched.
  • pattern: The regular expression pattern to be matched.

The function returns 1 if the string matches the pattern, or 0 if it does not. A match does not have to cover the whole string: the pattern only needs to match somewhere in it, unless it is anchored with ^ or $. If either string or pattern is NULL, the result is NULL. RLIKE is a synonym for REGEXP. The function follows the case sensitivity rules of the effective collation: matching is case-insensitive for case-insensitive collations, and case-sensitive for case-sensitive collations and for binary data. The collation's case sensitivity can be overridden with the (?i) and (?-i) PCRE flags.
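As a minimal sketch of these rules, the following standalone query needs no table. The values in the comments assume the connection uses a case-insensitive default collation, which is the usual setup but not guaranteed; the column aliases are purely illustrative.

```sql
SELECT
  'MariaDB' REGEXP 'maria'        AS default_match,     -- 1 under a case-insensitive collation
  'MariaDB' REGEXP '(?-i)maria'   AS forced_sensitive,  -- 0: (?-i) switches case-insensitivity off
  BINARY 'MariaDB' REGEXP 'maria' AS binary_match,      -- 0: binary data is matched case-sensitively
  'MariaDB' RLIKE 'DB$'           AS rlike_synonym,     -- 1: RLIKE behaves like REGEXP
  NULL REGEXP 'a'                 AS null_subject,      -- NULL
  'a' REGEXP NULL                 AS null_pattern;      -- NULL
```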

Examples

In this section, we show some examples of using the REGEXP() function in MariaDB. We use the following sample table for illustration:

CREATE TABLE employees (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  email VARCHAR(50),
  phone VARCHAR(20)
);

INSERT INTO employees VALUES
(1, 'Alice', 'alice@example.com', '+1-234-567-8901'),
(2, 'Bob', 'bob@example.com', '+1-345-678-9012'),
(3, 'Charlie', 'charlie@example.com', '+1-456-789-0123'),
(4, 'David', 'david@example.com', '+44-123-456-7890'),
(5, 'Eve', 'eve@example.com', '+86-123-4567-8901');

Example 1: Checking if an email address contains a specific domain name

We can use the REGEXP() function to check whether an email address belongs to a specific domain, that is, whether it ends with that domain name. For example, we can check for the domain name example.com with the following regular expression pattern:

'@example\\.com$'

This pattern matches a literal @ character, followed by the literal string example.com, followed by the end of the string. The dot (.) must be escaped with a backslash (\) because an unescaped dot matches any character in a regular expression. The backslash itself must be doubled (\\) because MariaDB uses C-style escape syntax in string literals, so the regular expression engine receives a single backslash. The following query applies the pattern to the email column of the employees table and keeps the rows that match:

SELECT name, email
FROM employees
WHERE email REGEXP '@example\\.com$';

The output is:

+---------+---------------------+
| name    | email               |
+---------+---------------------+
| Alice   | alice@example.com   |
| Bob     | bob@example.com     |
| Charlie | charlie@example.com |
| David   | david@example.com   |
| Eve     | eve@example.com     |
+---------+---------------------+
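To see why the dot is escaped, here is a small sketch that compares the escaped and unescaped patterns against literal strings. The address bob@examplexcom is a made-up value chosen only to expose the difference, and the [.] form is an alternative way to match a literal dot without any backslashes.

```sql
SELECT
  'bob@examplexcom' REGEXP '@example.com$'   AS unescaped_dot,  -- 1: "." matches the "x"
  'bob@examplexcom' REGEXP '@example\\.com$' AS escaped_dot,    -- 0: a literal dot is required
  'bob@example.com' REGEXP '@example\\.com$' AS real_domain,    -- 1
  'bob@example.com' REGEXP '@example[.]com$' AS bracket_form;   -- 1: same meaning, no backslash
```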

Example 2: Checking if a phone number starts with a specific country code

We can also use the REGEXP() function to check if a phone number starts with a specific country code. For example, we can check if the phone number starts with the country code +1. We can use the following regular expression pattern to match the country code:

'^\\+1-'

This pattern matches the beginning of the string, followed by a literal + character, followed by the literal string 1-. The plus (+) must be escaped with a backslash (\) because an unescaped plus is a quantifier in regular expressions. As before, the backslash is doubled (\\) because MariaDB uses C-style escape syntax in string literals. The following query applies the pattern to the phone column of the employees table and keeps the rows that match:

SELECT name, phone
FROM employees
WHERE phone REGEXP '^\\+1-';

The output is:

+---------+-----------------+
| name    | phone           |
+---------+-----------------+
| Alice   | +1-234-567-8901 |
| Bob     | +1-345-678-9012 |
| Charlie | +1-456-789-0123 |
+---------+-----------------+
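Because REGEXP returns 1 or 0, it can also appear in the select list rather than in the WHERE clause. The sketch below reuses the same pattern to flag every row of the sample table; the alias is_plus_one is an illustrative name.

```sql
-- Flag each employee instead of filtering:
-- Alice, Bob and Charlie get 1; David and Eve get 0.
SELECT name,
       phone,
       phone REGEXP '^\\+1-' AS is_plus_one
FROM employees
ORDER BY id;
```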

Example 3: Checking if a name contains a specific letter

We can also use the REGEXP() function to check if a name contains a specific letter. For example, we can check if the name contains the letter e. We can use the following regular expression pattern to match the letter:

'e'

This pattern means to match a literal e character. We can use the following query to apply the function to the name column of the employees table, and filter the rows that match the pattern:

SELECT name
FROM employees
WHERE name REGEXP 'e';

The output is:

+---------+
| name    |
+---------+
| Alice   |
| Charlie |
| Eve     |
+---------+
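The match can be inverted with NOT REGEXP. As a short sketch against the same sample table, and assuming the name column uses a case-insensitive collation, the following query lists the names that contain no e at all:

```sql
-- Names without the letter e (in either case): Bob and David.
SELECT name
FROM employees
WHERE name NOT REGEXP 'e';
```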

Example 4: Checking if a name starts with a vowel

We can also use the REGEXP() function to check if a name starts with a vowel. A vowel is one of the letters a, e, i, o, or u. We can use the following regular expression pattern to match a vowel at the beginning of the string:

'^[aeiou]'

This pattern matches the beginning of the string, followed by one of the characters in the square brackets ([...]). The square brackets define a character class, which matches any single character listed inside them. The sample names are capitalized, so the lowercase class matches them only when the name column uses a case-insensitive collation, which is the usual default. The following query applies the pattern to the name column of the employees table and keeps the rows that match:

SELECT name
FROM employees
WHERE name REGEXP '^[aeiou]';

The output is:

+-------+
| name  |
+-------+
| Alice |
| Eve   |
+-------+
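The role of the collation can be checked with the (?-i) flag, which forces case-sensitive matching regardless of the collation. This is a sketch against the same sample table:

```sql
-- Case-sensitive, lowercase class: no rows, because every name starts with a capital.
SELECT name
FROM employees
WHERE name REGEXP '(?-i)^[aeiou]';

-- Case-sensitive, uppercase class: Alice and Eve.
SELECT name
FROM employees
WHERE name REGEXP '(?-i)^[AEIOU]';
```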

Example 5: Checking if a phone number ends with a specific digit

We can also use the REGEXP() function to check if a phone number ends with a specific digit. For example, we can check if the phone number ends with the digit 1. We can use the following regular expression pattern to match the digit at the end of the string:

'1$'

This pattern means to match a literal 1 character, followed by the end of the string. We can use the following query to apply the function to the phone column of the employees table, and filter the rows that match the pattern:

SELECT name, phone
FROM employees
WHERE phone REGEXP '1$';

The output is:

+-------+-------------------+
| name  | phone             |
+-------+-------------------+
| Alice | +1-234-567-8901   |
| Eve   | +86-123-4567-8901 |
+-------+-------------------+
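The $ anchor is what restricts the match to the last character. As a rough sketch, summing the 1/0 results over the sample table shows the difference between the anchored and unanchored patterns; the aliases are illustrative.

```sql
-- Every sample phone number contains a 1 somewhere, but only two end with it.
SELECT SUM(phone REGEXP '1')  AS contains_1,   -- 5
       SUM(phone REGEXP '1$') AS ends_with_1   -- 2
FROM employees;
```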

Related Functions

Several other functions in MariaDB are related to the REGEXP() function:

  • REGEXP_INSTR(subject, pattern): Returns the position of the first occurrence of the pattern in the subject string, or 0 if no match is found. Positions start at 1.
  • REGEXP_SUBSTR(subject, pattern): Returns the part of the subject string that matches the pattern, or an empty string if no match is found.
  • REGEXP_REPLACE(subject, pattern, replace): Returns the subject string with every occurrence of the pattern replaced by the replacement string. If no match is found, the subject is returned unchanged.
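As a brief sketch of these three functions, the following query applies them to one of the sample phone numbers as a literal string. The column aliases are illustrative.

```sql
SELECT
  REGEXP_INSTR('+44-123-456-7890', '-')            AS first_dash_pos,  -- 4
  REGEXP_SUBSTR('+44-123-456-7890', '^\\+[0-9]+')  AS country_code,    -- +44
  REGEXP_REPLACE('+44-123-456-7890', '[^0-9]', '') AS digits_only;     -- 441234567890
```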

Conclusion

In this article, we have learned how to use the REGEXP() function in MariaDB to perform complex pattern matching on strings, from simple literal matches to anchored patterns, escaped metacharacters, and character classes.