25
Short Queries and Indexes Local Adaptation from: Henrietta Dombrovskaya, Boris Novikov “System Tuning” Saint Petersburg, Russia 2006

Short Queries and Indexes

Embed Size (px)

DESCRIPTION

Short Queries and Indexes. Local Adaptation from: Henrietta Dombrovskaya, Boris Novikov “System Tuning” Saint Petersburg, Russia 2006. What will be covered:. Which queries are considered short. Short queries and indexes Indexes and mass data update Choosing selection criteria - PowerPoint PPT Presentation

Citation preview

Page 1: Short Queries and Indexes

Short Queries and Indexes

Local Adaptation from: Henrietta Dombrovskaya, Boris Novikov

“System Tuning” Saint Petersburg, Russia 2006

Page 2: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

What will be covered:

• Which queries are considered short.• Short queries and indexes• Indexes and mass data update• Choosing selection criteria• Excessive selection conditions• How to avoid using indexes• Joins order• Impact of indexes on nested loops• Other index types

2012

Page 3: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Which Queries are Considered Short?

• The query is considered short, when result can be obtained processing the small number of records, even if the tables are large.

• For short queries sorting, grouping, and even joins are not time consuming.

• Optimization goal for short queries: to avoid full scan of large tables (for small tables full scan may be still OK)

• Typically for short queries tuning is necessary to improve throughput, not response time – the user does not care whether it takes 50 ms or 150 ms.

2012

Page 4: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Short Queries and Indexes

• If we want to avoid full scan, some sort of index for the table should exist

• Any index is a part of the database schema; we can create new indexes only if schema changes are allowed. Sometimes they can’t be implemented right away, and sometimes delayed index creation may be almost impossible

• Be aware of potential implication of index creation on other queries and data modifications

2012

Page 5: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Indexes and Bulk Data Update

drop index index_1;drop index index_2;…Bulk INSERT …

create index index_1(….);create index index_2(….);….

2012

Page 6: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Choosing Selection Criteria

• Index selectivity• Unique indexes• Selection criteria and indexes• Compound indexes• Using index for data retrieval

2012

Page 7: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Index Selectivity

• Do not use an index with small number of distinct values (exception - bitmap indexes, where applicable)

• Index usage order: make sure that the index with the highest selectivity level will be used first

2012

Page 8: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Index Selectivity - Exampleselect customer_id from customer_sources cs where incoming_brand_id

=11 AND type_cd IN ('lead_reject_import', 'import',

'pass_active_customer') AND cs.received_time > current_date - interval '24 hours' AND cs.received_time < current_timestamp - interval ' 15

minutes' AND cs.source_type_cd not in

('yesloans1stgbi','yesloansgbi','noworries1stgbi','noworriesgbi','aspiregbi','aspire-cpfgbi')

We have indexes for type_cd, source_type_cd and received_time, the optimizer may choose condition with =, while in this case received_time has higher selectivity level.

2012

Page 9: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Unique Indexes

• If a column is described as a primary key, the unique index will be created automatically

• You may need extra unique constraints for data integrity purposes (each UNIQUE constraint will generate unique index, too)

• Unique indexes make nested loops efficient.

2012

Page 10: Short Queries and Indexes

Nested LoopsFOR row1 in table1 loop For row2 in table2 loop If match(row1,row2) insert output row end loopend loop

• When to use: joins and products

• Cost: proportional (Size_of_T1)X(Size_of_T2)

5863413

312

1

Table 1 Table 2

Page 11: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Selection Criteria and Indexes – Columns Transformation

• Any column transformations will prevent from using an index:

where lower(last_name)=‘smith’• If we need to search by transformed value, we need

to create additional index:CREATE INDEX people_m13 ON private.people USING btree (lower(last_name::text));

2012

Page 12: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Selection Criteria and Indexes- Using like Operator

• Using like operator:– WHERE (lower(people.first_name) like E'chaman%‘ will not use the indexPossible rewriting, if for some reason we can’t create an

index:– WHERE (lower(people.first_name) >=E'chaman' and

lower(people.first_name) <E'chamam') will use the index

2012

Page 13: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Pattern Indexes

CREATE INDEX people__last_name_pattern ON private.people (lower(last_name::text) text_pattern_ops);

2012

Page 14: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Compound Indexes• Compound index is build for several columns of one table.Note: if an index was built for columns (X,Y,Z), you can use it

to search X, XY and XYZ, not Y and not YZ. Postgres : equality on leading columns, inequality on other columns, which will still be scanned, but may save additional trip to the table.

• Why to create compound index?– Additional selectivity– Additional data storage

• Index-organized tables – not currently available in Postgres

2012

Page 15: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Using Indexes for Data RetrievalWhen all the columns from select statement are included

into compound index, they may be retrieved without accessing the table.

Example:

CREATE INDEX loans_m2 ON cnu.loans (customer_id, funding_date, status_cd);

SELECT funding_date, status_cd FROM loans WHERE customer_id =1111111

2012

Page 16: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Using Multiple Indexes in Postgres

Postgres can use the search results from multiple indexes by creating a bitmap of matching rows in main memory and then OR-ing or AND-ing themIn this case the records will be scanned in the physical order, so the index-based ordering will be lost. Usage of compound indexes vs. sets of single-column indexes should be justified on case-by-case basis.

2012

Page 17: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Excessive Selection Criteria

We can add redundant selection criteria to pre-select small records subset from the big table:– prompt to use specific indexes – reduce the sizes of join arguments.

2012

Page 18: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Excessive Selection Criteria – Example

SELECT <…> FROM reports r, expense_items iWHERE r.rep_id=i.rep_id AND (r.proc_date=’1-jul-2003’ AND i.charge<0 OR r.proc_date=’15-jun-2003 AND i.charge=0 );

In this case the complex selection criteria can’t be applied before the tables are joined. (expense_items may contain over several million records)

2012

Page 19: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Excessive Selection Criteria – Example (2)

We will modify this select statement the following way:

SELECT <…>FROM reports r,

expense_items IWHERE r.rep_id=i.rep_idAND (r.proc_date IN (‘1-jul-2003’, ’15-jun-2003’) )AND (r.proc_date=’1-jul-2003’ AND i.charge<0 OR r.proc_date=’15-jun-2003’ AND i.charge=0 );

This will allow to restrict reports to the ones processed on Jul 1 and Jul 15, and then join only those reports with expense_items.

2012

Page 20: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

How to Avoid Using Indexes

• Use system-specific mechanisms, like optimizer hints

• When optimizer hints can’t be used - modify selection criteria.

Examples:– attr1+0=p_value– COALESCE (t1.attr2, 0)=t2.attr2

2012

Page 21: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Order of Joins

In short queries the size of join result may be small because of:– Restrictions on the joined tables (reduce the

number of records in join arguments)– Semi-join (one argument significantly restricts the

result size)If the optimizer generates an execution plan

with large intermediate results size, the order of joins should be changed

2012

Page 22: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Order of Joins - ExampleSELECT DISTINCT c.* FROM customers c LEFT OUTER JOIN loans l on l.customer_id = c.id INNER JOIN customer_sources cs ON cs.id = (SELECT MAX(id) FROM customer_sources cs1 WHERE cs1.customer_id = c.id AND incoming_brand_id = c.brand_id AND type_cd IN ('lead_reject_import', 'import', 'pass_active_customer')) LEFT OUTER JOIN work_items w ON w.customer_id = c.id and w.created_by in ('generate_ast_workitems','generate_ast_workitems_instl') and w.created_on > current_date - interval '24 hours‘WHERE

c.brand_id = 11AND c.status_cd = 'active'AND c.fraud_flg = FALSEAND cs.received_time > current_date - interval '24 AND cs.received_time < current_timestamp - interval ' 15 minutes' AND cs.source_type_cd not in

('yesloans1stgbi','yesloansgbi','noworries1stgbi','noworriesgbi','aspiregbi','aspire-cpfgbi')

2012

Page 23: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Impact of Indexes on Nested Loops• For small queries nested loops will be efficient, when

the inner table is indexed by join attribute• No transformation should be applied to the indexed

attribute.• Note: this may not work for long queries

2012

Page 24: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

Other Index Types• Function-based indexes - use when the queries WHERE clause often

contains the same attribute(s) transformation(like end_date - start_date)

• Bitmap indexes - use for combination on several low-cardinality attributes, mostly for Data Warehouse type applications.

• Partial indexes:CREATE INDEX payment_transactions_m16c ON cnu.payment_transactions_committed (credit_account_cd, status_cd, eff_date, payment_method_cd) WHERE status_cd::text = 'created'::text AND payment_method_cd::text = 'bank_account_ach'::text AND (credit_account_cd::text = 'disbursement_account'::text OR credit_account_cd::text = 'cso_disbursement_account'::text);

2012

Page 25: Short Queries and Indexes

Henrietta Dombrovskaya – Enova Financial

More Complex Index ExampleCREATE INDEX customers__email_alt_pattern_idx ON cnu.customers (lower(email_alt::text) text_pattern_ops) WHERE email_alt IS NOT NULL;

Selection criteria example:

lower(email) like ‘johns%' OR (email_alt IS NOT NULL AND lower(email_alt) like ‘johns%')

This condition will use bitmap OR’ing.

2012