In last blog we learned about BRIN indexes , in this we will go in little detail on optimizing BRIN indexes and parameters for its optimization.
As we learned that BRIN index basically store the summary information (min and max ) for a particular page range(preferrably adjacent) such that it becomes easy to find which pages to cross check for a particular filter.
In this we will understand the BRIN index update mechanism and optimize the page range parameter.
When BRIN INDEX is updated
- Index Creation
- When the index is created it will create summary for all the pages of the table.
- Table Update
- For all page which are already summarized if any entry added / updated / removed , the index is updated in the transaction only
- For new pages which are not summarized
- Manual Update
- By default BRIN does not auto update for this one needs to run the function –
select brin_summarize_new_values('<index name>'::regclass);
- By default BRIN does not auto update for this one needs to run the function –
- Auto Update
- On autovacuum system will add the summary info for those blocks as well, we need to create index with auto vacuum on like this
create index c_t_brin_idx on calllog using brin(call_time) with (pages_per_range = 32,autosummarize = on);
- On autovacuum system will add the summary info for those blocks as well, we need to create index with auto vacuum on like this
- Manual Update
How to optimize BRIN Indexes
Primary parameter to optimize BRIN indexes is the pages_per_range , this balance between no. of pages which needs to be scanned (because of lossy nature) vs no. of index entry created and index maintenance needs.
This parameter needs to be optimized based on individual needs , we will try to provide some guiding rules here, but before that we should do some math here and calculate the number of pages and rows per page:
Calculate no. of pages :
select relpages from pg_class where relname = 'student';;
relpages
----------
5406
Now the number of pages is 5406
Now lets calculate total records :
select count(*) from student;
count
---------
1000000
Now we have total rows as 1 million , lets count no. of record per page:
select 1000000/5406 as row_per_page;
row_per_page
--------------
184
Now we see that row per page is 184 in our case, this will be diff. in you case so before proceeding check in your table.
and default value is 128 in page per range — so minimum values our index need to filter for every query is : 184*128 = 23552
Now lets try to see the rules, also rules needs to be balanced for best performance as in databases everything depends on your needs:
- Table should be Large – something > 5 millions
- if less data then benefit of small size of index would be less and Btree would also be small.
- Data should primarily be INSERT Only
- If we are constantly updating the data and also the data is updated in such a way that single value occur in multiple page range (eg: page 1-10 raneg is 100 -550 and page 11-20 range is 250-800), in this case the system needs to query both range pages which will slow down the filter as pages increase
- Data should be inserted in IN ORDER
- If we are inserting data in such a way that the column in which we are creating index that single value occur in multiple page range (eg: page 1-10 raneg is 100 -550 and page 11-20 range is 250-800), in this case the system needs to query both range pages which will slow down the filter as pages increase
- Search record are generally in large numbers > 1000
- if we are selecting single value Btree will generally out perform the BRIN , but if we are selecting the no. of records in some 1000’s , BRIN will start to perform better at large count.
Al in all one needs to be very clear in the above points before deciding to use BRIN.
In next blog we will move to Gin Indexes.
Subscribe our blog for more indepth tutorials.