In last blog series we learned basics of Indexing. In this blog series we will understand on how to understand why query is not using the index. Basically sometimes what happens is that we assume when we run a query it should use a particular index but it actually either use diff. index or do not use index at all and do sequential scan. So in this blogs series we will work on understanding this case by case.
When we ask Postgres to execute a query it is executed in three steps:
- Parse – to validate the query sytax
- Planner – to plan how the query is executed like whether to do index scan or sequential scan or bitmap scan
- Executor – it will finally execute the planned tasks
Here we are more concerned about the Planner part, the entire purpose why the query needs planning is to reduce the total I/O (reading from disks/memory).
In first part we will work on example where Postgres use diff. index and primarily on the order by column and why it does that.
Lets understand it by example.
First create a table orders with indexes:
create table orders as select s as orderno, md5(random()::text) as orderitem , now() as order_created from generate_Series(1,1000000) s;
create index orderno_idx on orders(orderno);
create index order_created_idx on orders(order_created);
Now you see we have three columns orderno integer , orderitem text , order_created as date and two index one on orderno and other on order_created.
Now lets query :
we will do two query one which contains sorting on order_created and one without it , lets see which index which query use:
explain analyze select * from orders where orderno > 80000 order by order_created desc limit 100 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..4.54 rows=100 width=45) (actual time=0.016..0.038 rows=100 loops=1)
-> Index Scan Backward using order_created_idx on orders (cost=0.42..37829.43 rows=920121 width=45) (actual time=0.015..0.033 rows=100 loops=1)
Filter: (orderno > 80000)
Planning time: 0.170 ms
Execution time: 0.051 ms
(5 rows)
explain analyze select * from orders where orderno < 80000 limit 100 ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..4.22 rows=100 width=45) (actual time=3.121..3.137 rows=100 loops=1)
-> Index Scan using orderno_idx on orders (cost=0.42..3028.31 rows=79879 width=45) (actual time=3.120..3.130 rows=100 loops=1)
Index Cond: (orderno < 80000)
Planning time: 0.052 ms
Execution time: 3.150 ms
(5 rows)
Lets see the first query – we have put a where clause order > 80000 and order by order created .
- In this query normal assumption is it should use Btree index created on orderno as it will help on filtering the data fast
- But actually it is using the Btree index created on order created
Lets understand the reasons for doing so:
- Query with order clause – Very Important to understand this
- As we have put the order by clause , in our query there will be around 9 million records matching for order > 80000 (as we have created 10 million record from 1 – 10 million)
- And postgres thought rather than getting all 9 million record in memory and sorting its better we use the sorted index and check for values > 80000
- there is a higher chance in this of getting values > 80000 in sorted order
- Query without order clause –
- as we just needs to check for orderno > 80000 and it is already sorted in orderno index so order no index is used.
In this we understood that when we use order by clause the tendency of postgres is to use index which is there on order by field (as in this case order created).
In next blog we will figure out why postgres in doing bitmap scan rather than index scan.
[…] it clears the usage if hypothetical index in PostgreSQL . In one of our blogs we learned about why index is not working and also how to check on which tables index needed […]
LikeLike
[…] last blog series we learned basics of Indexing. In this blog series part 1 we will understand on how to understand why query is not using the index. Basically sometimes what […]
LikeLike
[…] you discovered the query try to create indexes – we have a blogs on understanding indexes and why system not using your index , you can check out these if you are unable to figure […]
LikeLike