PostgreSQL Expression Indexes

This article describes how to create expression indexes in PostgreSQL.

In addition to creating indexes on columns, PostgreSQL allows you to create indexes on expressions based on one or more columns in a table, which are called expression indexes.

Note that indexes on expressions are expensive to maintain because PostgreSQL must evaluate the expression on each row as it is inserted or updated, and use the result for the index. Therefore, you should only use indexes on expressions when retrieval speed is more important than insertion and update speed.

PostgreSQL Expression Index Syntax

To create an expression index, follow this syntax:

CREATE INDEX index_name
ON table_name ( (expression) );

Explanation:

  • The syntax for creating an expression index is basically the same as creating a normal index, except that the column names are replaced by expressions.
  • Expressions are generally enclosed in parentheses. If the expression is just a function call, parentheses can be omitted.
  • The index_name is the index name. You can skip the index name and PostgreSQL will automatically generate one.

Once an index expression is defined, PostgreSQL will consider the index when the expression defining the index appears in a WHERE clause or ORDER BY clause of an SQL statement.

PostgreSQL Expression Index Examples

The following demonstrates PostgreSQL expression indexes with the customer table from PostgreSQL Sakila sample database.

The following is a partial definition of the customer table:

                                             Table "public.customer"
   Column    |            Type             | Collation | Nullable |                    Default
-------------+-----------------------------+-----------+----------+-----------------------------------------------
 customer_id | integer                     |           | not null | nextval('customer_customer_id_seq'::regclass)
 store_id    | smallint                    |           | not null |
 first_name  | character varying(45)       |           | not null |
 last_name   | character varying(45)       |           | not null |
 email       | character varying(50)       |           |          |
 address_id  | smallint                    |           | not null |
 activebool  | boolean                     |           | not null | true
 create_date | date                        |           | not null | 'now'::text::date
 last_update | timestamp without time zone |           |          | now()
 active      | integer                     |           |          |
Indexes:
    "customer_pkey" PRIMARY KEY, btree (customer_id)
    "idx_fk_address_id" btree (address_id)
    "idx_fk_store_id" btree (store_id)
    "idx_last_name" btree (last_name)

...

There is a B-tree index idx_last_name defined on the last_name column. The following EXPLAIN statement shows that PostgreSQL uses the idx_last_name index when quering rows with last name Adam from the customer table:

EXPLAIN
SELECT * FROM customer
WHERE last_name = 'Adam';
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using idx_last_name on customer  (cost=0.28..8.29 rows=1 width=70)
   Index Cond: ((last_name)::text = 'Adam'::text)
(2 rows)

However, when looking for customers with adam lowercase, PostgreSQL cannot make use of the index for the lookup:

EXPLAIN
SELECT * FROM customer
WHERE LOWER(last_name) = 'adam';
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on customer  (cost=0.00..16.98 rows=3 width=70)
   Filter: (lower((last_name)::text) = 'adam'::text)
(2 rows)

In order to improve the query efficiency of the LOWER() function, you can define an index expression like this:

CREATE INDEX ON customer(LOWER(last_name));

Now, look at the execution plan of the above statement again,

EXPLAIN
SELECT * FROM customer
WHERE LOWER(last_name) = 'adam';
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on customer  (cost=4.30..10.83 rows=3 width=70)
   Recheck Cond: (lower((last_name)::text) = 'adam'::text)
   ->  Bitmap Index Scan on customer_lower_idx  (cost=0.00..4.30 rows=3 width=0)
         Index Cond: (lower((last_name)::text) = 'adam'::text)
(4 rows)

You will notice that the PostgreSQL optimizer uses indexes.

Conclusion

PostgreSQL allows you to create indexes on expressions based on one or more columns in a table, which are called expression indexes.