PostgreSQL Partial Indexes

This article describes how to create partial indexes in PostgreSQL.

PostgreSQL allows you to create partial indexes. A partial index helps reduce index size because it does not index all the values in the column, it only indexes the data in the eligible rows.

Partial indexes are useful if you have common conditions that use constant values in WHERE clause, like this:

SELECT *
FROM table_name
WHERE column_name = constant_value;

Let’s take a look at the customer table from Sakila sample database:

\d customer
                                             Table "public.customer"
   Column    |            Type             | Collation | Nullable |                    Default
-------------+-----------------------------+-----------+----------+-----------------------------------------------
 customer_id | integer                     |           | not null | nextval('customer_customer_id_seq'::regclass)
 store_id    | smallint                    |           | not null |
 first_name  | character varying(45)       |           | not null |
 last_name   | character varying(45)       |           | not null |
 email       | character varying(50)       |           |          |
 address_id  | smallint                    |           | not null |
 activebool  | boolean                     |           | not null | true
 create_date | date                        |           | not null | 'now'::text::date
 last_update | timestamp without time zone |           |          | now()
 active      | integer                     |           |          |
Indexes:
    "customer_pkey" PRIMARY KEY, btree (customer_id)
    "customer_lower_idx" btree (lower(last_name::text))
    "idx_fk_address_id" btree (address_id)
    "idx_fk_store_id" btree (store_id)
    "idx_last_name" btree (last_name)

For example, you are generally interested in inactive customers and often do some follow-up to get them to come back and buy more.

The following query finds all inactive customers:

SELECT *
FROM customer
WHERE active = 0;

To execute this query, the query planner needs to scan the customer table, as EXPLAIN shown in the following statement:

EXPLAIN
SELECT *
FROM customer
WHERE active = 0;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on customer  (cost=0.00..15.49 rows=15 width=70)
   Filter: (active = 0)
(2 rows)

You can optimize this query by creating an index on the active column as follows:

CREATE INDEX idx_customer_active
ON customer(active);

The index serves its purpose, however, it contains many rows that have never been searched, ie all active customers.

To define an index that includes only inactive customers, use the following statement:

CREATE INDEX idx_customer_inactive
ON customer(active)
WHERE active = 0;

This statement defines a partial index idx_customer_inactive​, and only values of the active column ​that match the WHERE condition active = 0 will be indexed. The size of this index will be small compared to the idx_customer_active index built on all values of active.

From now on, PostgreSQL will consider partial indexes whenever this WHERE clause appears in a query:

EXPLAIN
SELECT *
FROM customer
WHERE active = 0;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Index Scan using idx_customer_active on customer  (cost=0.15..11.69 rows=15 width=70)
   Index Cond: (active = 0)
(2 rows)

Conclusion

This article describes the PostgreSQL partial index and how to use it to specify which rows should be added to the index.