Speed up 180+ million rows returned by a simple SELECT with PostgreSQL

by Dr Mouse, last updated September 11, 2019 09:06 AM

I have performance issues with a SELECT statement that returns 180+ million of the 220+ million rows in a table on my managed PostgreSQL instance. The table structure is the following:

CREATE TABLE members (
    id bigserial NOT NULL,
    customer varchar(64) NOT NULL,
    "group" varchar(64) NOT NULL,
    member varchar(64) NOT NULL,
    has_connected bool NULL DEFAULT false,
    CONSTRAINT members_customer_group_members_key UNIQUE (customer, "group", member),
    CONSTRAINT members_pkey PRIMARY KEY (id)
);

The "guilty" SELECT query is:

SELECT
    "group", member, has_connected
FROM
    members
WHERE
    customer = :customer;

I have already indexed the table:

CREATE INDEX members_idx ON members USING btree (customer, "group", has_connected, member);

and the query behaves well for most customer values. However, I have one customer, let's call it 1234, which represents 80% of the table, so the query planner prefers to scan the whole table, according to the following `explain analyze` result:

Seq Scan on public.members  (cost=0.00..5674710.80 rows=202271234 width=55) (actual time=0.018..165612.655 rows=202279274 loops=1)
  Output: community, member, has_connected
  Filter: ((members.customer)::text = '1234'::text)
  Rows Removed by Filter: 5676072
Planning time: 0.106 ms
Execution time: 175174.714 ms
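
The row counts in this plan show how much of the table the filter actually keeps, which is the reason a sequential scan looks cheaper to the planner than an index scan. A quick back-of-the-envelope check, using only the figures printed in the plan (the variable names are my own):

```python
# Figures copied from the EXPLAIN ANALYZE output above.
rows_returned = 202_279_274   # "actual ... rows=202279274"
rows_removed = 5_676_072      # "Rows Removed by Filter: 5676072"

# A sequential scan reads every row: the ones kept plus the ones filtered out.
rows_scanned = rows_returned + rows_removed
fraction_kept = rows_returned / rows_scanned

print(f"rows scanned:  {rows_scanned:,}")
print(f"fraction kept: {fraction_kept:.1%}")
```

In this particular run the filter keeps roughly 97% of the scanned rows, so an index on customer cannot skip a meaningful part of the table for this value.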

As I said earlier, my PostgreSQL is a managed instance on Google Cloud Platform with 10 vCPUs and 30 GB RAM. I am rather limited in the flags I can set, so the only PostgreSQL options tuned on this instance are:

max_connections: 1000
work_mem: 131072 KB
maintenance_work_mem: 2000000 KB
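
For scale, these values can be converted into more familiar units. The sketch below assumes PostgreSQL's documented behaviour that work_mem is a per-operation limit (each sort or hash in each session may use up to that amount), so multiplying it by max_connections gives only a theoretical worst case, not a measurement from this instance:

```python
# Settings as listed above; PostgreSQL's kB unit is 1024 bytes.
max_connections = 1000
work_mem_kb = 131_072
maintenance_work_mem_kb = 2_000_000

work_mem_mib = work_mem_kb / 1024
maintenance_work_mem_gib = maintenance_work_mem_kb / 1024**2

# Theoretical ceiling if every connection ran one work_mem-sized operation at once.
worst_case_gib = max_connections * work_mem_kb / 1024**2

print(f"work_mem:             {work_mem_mib:.0f} MiB")
print(f"maintenance_work_mem: {maintenance_work_mem_gib:.2f} GiB")
print(f"worst case work_mem:  {worst_case_gib:.0f} GiB")
```

That ceiling is well above the 30 GB of RAM on the instance, although it would only be reached if all connections ran memory-hungry operations at the same time.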

What are my options to solve this issue and substantially reduce the query time, preferably to below 30 seconds if possible?
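
To put the 30-second target in perspective, the sketch below compares the throughput it implies with what the plan measured. It uses the row count, row width and execution time from the EXPLAIN ANALYZE output; note that width=55 is the planner's estimate of the average row size rather than a measured value, and the calculation ignores the cost of sending the rows to the client:

```python
# Figures copied from the EXPLAIN ANALYZE output in the question.
rows_returned = 202_279_274      # actual rows returned
row_width_bytes = 55             # planner's estimated average row width
observed_seconds = 175_174.714 / 1000  # "Execution time" in ms, converted

target_seconds = 30

# Throughput the target would require, in rows and in bytes of row data.
required_rows_per_s = rows_returned / target_seconds
required_mb_per_s = required_rows_per_s * row_width_bytes / 1e6
speedup_needed = observed_seconds / target_seconds

print(f"required rows/s: {required_rows_per_s:,.0f}")
print(f"required MB/s:   {required_mb_per_s:,.0f}")
print(f"speed-up needed: {speedup_needed:.1f}x")
```

Under these assumptions the target means producing roughly 6.7 million rows per second, or about 370 MB/s of row data, which is close to six times faster than the observed run.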
