MySQL Prefix Indexes

This article discusses how to create prefix indexes for string columns in MySQL.

In MySQL, you can create a prefix index for a prefix of the specified length of a string column.

Compared with creating an index for the entire string column, a prefix index can reduce disk usage and improve the write speed of the index.

MySQL Prefix Index syntax

To add a prefix index to a table, use the CREATE INDEX statement:

CREATE INDEX index_name
ON table_name (column_name(length));

To define a prefix index when creating a table, use the CREATE TABLE statement:

CREATE TABLE table_name(
  column_list,
  INDEX(column_name(length))
);

Explanation:

  • If the column is a non-binary character type such as CHAR, VARCHAR and TEXT, length is the number of characters to use for prefix index.
  • If the column is a binary character type such as BINARY, BLOB and VARBINARY, length is the number of bytes to use for prefix index.

MySQL Prefix Index Examples

We’ll use the actor table from the Sakila sample database for demonstration.

Take a look at the actor table definition:

DESC actor;
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
| Field       | Type              | Null | Key | Default           | Extra                                         |
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
| actor_id    | smallint unsigned | NO   | PRI | NULL              | auto_increment                                |
| first_name  | varchar(45)       | NO   |     | NULL              |                                               |
| last_name   | varchar(45)       | NO   | MUL | NULL              |                                               |
| last_update | timestamp         | NO   |     | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP |
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
4 rows in set (0.00 sec)

Take a look at the indexes in the actor table:

SHOW INDEXES FROM actor;
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| actor |          0 | PRIMARY             |            1 | actor_id    | A         |         200 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| actor |          1 | idx_actor_last_name |            1 | last_name   | A         |         122 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.00 sec)

The following query finds actors whose first name begins GE:

SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|       96 | GENE       | WILLIS    | 2006-02-15 04:34:33 |
|      134 | GENE       | HOPKINS   | 2006-02-15 04:34:33 |
|      151 | GEOFFREY   | HESTON    | 2006-02-15 04:34:33 |
|      177 | GENE       | MCKELLEN  | 2006-02-15 04:34:33 |
+----------+------------+-----------+---------------------+
4 rows in set (0.00 sec)

Since the first_name column is not indexed, the query optimizer must scan all rows to return results, as shown in the output of the following EXPLAIN statement:

EXPLAIN
SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | actor | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  200 |    11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

If you often need to search by prefix, consider creating an prefix index:

CREATE INDEX idx_first_name_prefix_2
ON actor(first_name(2));

Then, view the execution plan through the EXPLAIN statement

EXPLAIN
SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys           | key                     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | actor | NULL       | range | idx_first_name_prefix_2 | idx_first_name_prefix_2 | 10      | NULL |    4 |   100.00 | Using where |
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

Indexes are now used by the MySQL query optimizer.

Principles of creating a prefix index

When creating a prefix index, determining the prefix length requires some skill.

When we determine the prefix length, we should try to keep the prefix of string columns unique. The stronger the uniqueness, the more efficient the index.

Conclusion

The MySQL prefix index is beneficial to improve the efficiency of the indexed string column.