1 / 25

Short Queries and Indexes

Short Queries and Indexes. Local Adaptation from: Henrietta Dombrovskaya, Boris Novikov “System Tuning” Saint Petersburg, Russia 2006. What will be covered:. Which queries are considered short. Short queries and indexes Indexes and mass data update Choosing selection criteria

maitland
Télécharger la présentation

Short Queries and Indexes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Short Queries and Indexes Local Adaptation from: Henrietta Dombrovskaya, Boris Novikov “System Tuning” Saint Petersburg, Russia 2006

  2. What will be covered: • Which queries are considered short. • Short queries and indexes • Indexes and mass data update • Choosing selection criteria • Excessive selection conditions • How to avoid using indexes • Joins order • Impact of indexes on nested loops • Other index types Henrietta Dombrovskaya – Enova Financial

  3. Which Queries are Considered Short? • The query is considered short, when result can be obtained processing the small number of records, even if the tables are large. • For short queries sorting, grouping, and even joins are not time consuming. • Optimization goal for short queries: to avoid full scan of large tables (for small tables full scan may be still OK) • Typically for short queries tuning is necessary to improve throughput, not response time – the user does not care whether it takes 50 ms or 150 ms. Henrietta Dombrovskaya – Enova Financial

  4. Short Queries and Indexes • If we want to avoid full scan, some sort of index for the table should exist • Any index is a part of the database schema; we can create new indexes only if schema changes are allowed. Sometimes they can’t be implemented right away, and sometimes delayed index creation may be almost impossible • Be aware of potential implication of index creation on other queries and data modifications Henrietta Dombrovskaya – Enova Financial

  5. Indexes and Bulk Data Update drop index index_1; drop index index_2; … Bulk INSERT … create index index_1(….); create index index_2(….); …. Henrietta Dombrovskaya – Enova Financial

  6. Choosing Selection Criteria • Index selectivity • Unique indexes • Selection criteria and indexes • Compound indexes • Using index for data retrieval Henrietta Dombrovskaya – Enova Financial

  7. Index Selectivity • Do not use an index with small number of distinct values (exception - bitmap indexes, where applicable) • Index usage order: make sure that the index with the highest selectivity level will be used first Henrietta Dombrovskaya – Enova Financial

  8. Index Selectivity - Example select customer_id from customer_sourcescs where incoming_brand_id =11 AND type_cd IN ('lead_reject_import', 'import', 'pass_active_customer') AND cs.received_time > current_date - interval '24 hours' AND cs.received_time < current_timestamp - interval ' 15 minutes' AND cs.source_type_cd not in ('yesloans1stgbi','yesloansgbi','noworries1stgbi','noworriesgbi','aspiregbi','aspire-cpfgbi') We have indexes for type_cd, source_type_cd and received_time, the optimizer may choose condition with =, while in this case received_timehas higher selectivity level. Henrietta Dombrovskaya – Enova Financial

  9. Unique Indexes • If a column is described as a primary key, the unique index will be created automatically • You may need extra unique constraints for data integrity purposes (each UNIQUE constraint will generate unique index, too) • Unique indexes make nested loops efficient. Henrietta Dombrovskaya – Enova Financial

  10. Nested Loops FOR row1 in table1 loop For row2 in table2 loop If match(row1,row2) insert output row end loop end loop • When to use: joins and products • Cost: proportional (Size_of_T1)X(Size_of_T2) Table 1 Table 2 5 3 8 1 6 2 3 4 1 3 1

  11. Selection Criteria and Indexes – Columns Transformation • Any column transformations will prevent from using an index: where lower(last_name)=‘smith’ • If we need to search by transformed value, we need to create additional index: CREATE INDEX people_m13 ON private.people USING btree (lower(last_name::text)); Henrietta Dombrovskaya – Enova Financial

  12. Selection Criteria and Indexes- Using like Operator • Using like operator: • WHERE (lower(people.first_name) like E'chaman%‘ will not use the index Possible rewriting, if for some reason we can’t create an index: • WHERE (lower(people.first_name) >=E'chaman' and lower(people.first_name) <E'chamam') will use the index Henrietta Dombrovskaya – Enova Financial

  13. Pattern Indexes CREATE INDEX people__last_name_pattern ON private.people (lower(last_name::text) text_pattern_ops); Henrietta Dombrovskaya – Enova Financial

  14. Compound Indexes • Compound index is build for several columns of one table. Note: if an index was built for columns (X,Y,Z), you can use it to search X, XY and XYZ, not Y and not YZ. Postgres : equality on leading columns, inequality on other columns, which will still be scanned, but may save additional trip to the table. • Why to create compound index? • Additional selectivity • Additional data storage • Index-organized tables – not currently available in Postgres Henrietta Dombrovskaya – Enova Financial

  15. Using Indexes for Data Retrieval When all the columns from select statement are included into compound index, they may be retrieved without accessing the table. Example: CREATE INDEX loans_m2 ON cnu.loans (customer_id, funding_date, status_cd); SELECT funding_date, status_cdFROM loans WHERE customer_id=1111111 Henrietta Dombrovskaya – Enova Financial

  16. Using Multiple Indexes in Postgres Postgres can use the search results from multiple indexes by creating a bitmap of matching rows in main memory and then OR-ing or AND-ing them In this case the records will be scanned in the physical order, so the index-based ordering will be lost. Usage of compound indexes vs. sets of single-column indexes should be justified on case-by-case basis. Henrietta Dombrovskaya – Enova Financial

  17. Excessive Selection Criteria We can add redundant selection criteria to pre-select small records subset from the big table: • prompt to use specific indexes • reduce the sizes of join arguments. Henrietta Dombrovskaya – Enova Financial

  18. Excessive Selection Criteria – Example SELECT <…> FROM reports r, expense_itemsi WHERE r.rep_id=i.rep_id AND (r.proc_date=’1-jul-2003’ AND i.charge<0 OR r.proc_date=’15-jun-2003 AND i.charge=0 ); In this case the complex selection criteria can’t be applied before the tables are joined. (expense_itemsmay contain over several million records) Henrietta Dombrovskaya – Enova Financial

  19. Excessive Selection Criteria – Example (2) We will modify this select statement the following way: SELECT <…>FROM reports r, expense_items I WHERE r.rep_id=i.rep_id AND (r.proc_dateIN (‘1-jul-2003’, ’15-jun-2003’) ) AND (r.proc_date=’1-jul-2003’ AND i.charge<0 OR r.proc_date=’15-jun-2003’ AND i.charge=0 ); This will allow to restrict reports to the ones processed on Jul 1 and Jul 15, and then join only those reports with expense_items. Henrietta Dombrovskaya – Enova Financial

  20. How to Avoid Using Indexes • Use system-specific mechanisms, like optimizer hints • When optimizer hints can’t be used - modify selection criteria. Examples: • attr1+0=p_value • COALESCE (t1.attr2, 0)=t2.attr2 Henrietta Dombrovskaya – Enova Financial

  21. Order of Joins In short queries the size of join result may be small because of: • Restrictions on the joined tables (reduce the number of records in join arguments) • Semi-join (one argument significantly restricts the result size) If the optimizer generates an execution plan with large intermediate results size, the order of joins should be changed Henrietta Dombrovskaya – Enova Financial

  22. Order of Joins - Example SELECT DISTINCT c.* FROM customers c LEFT OUTER JOIN loans l on l.customer_id = c.id INNER JOIN customer_sourcescs ON cs.id = (SELECT MAX(id) FROM customer_sources cs1 WHERE cs1.customer_id = c.id AND incoming_brand_id = c.brand_id AND type_cd IN ('lead_reject_import', 'import', 'pass_active_customer')) LEFT OUTER JOIN work_items w ON w.customer_id = c.id and w.created_by in ('generate_ast_workitems','generate_ast_workitems_instl') and w.created_on > current_date - interval '24 hours‘ WHERE c.brand_id= 11 AND c.status_cd = 'active' AND c.fraud_flg = FALSE AND cs.received_time > current_date - interval '24 AND cs.received_time < current_timestamp - interval ' 15 minutes' AND cs.source_type_cdnot in ('yesloans1stgbi','yesloansgbi','noworries1stgbi','noworriesgbi','aspiregbi','aspire-cpfgbi') Henrietta Dombrovskaya – Enova Financial

  23. Impact of Indexes on Nested Loops • For small queries nested loops will be efficient, when the inner table is indexed by join attribute • No transformation should be applied to the indexed attribute. • Note: this may not work for long queries Henrietta Dombrovskaya – Enova Financial

  24. Other Index Types • Function-based indexes - use when the queries WHERE clause often contains the same attribute(s) transformation(like end_date - start_date) • Bitmap indexes - use for combination on several low-cardinality attributes, mostly for Data Warehouse type applications. • Partial indexes: CREATE INDEX payment_transactions_m16c ON cnu.payment_transactions_committed (credit_account_cd, status_cd, eff_date, payment_method_cd) WHERE status_cd::text = 'created'::text AND payment_method_cd::text = 'bank_account_ach'::text AND (credit_account_cd::text = 'disbursement_account'::text OR credit_account_cd::text = 'cso_disbursement_account'::text); Henrietta Dombrovskaya – Enova Financial

  25. More Complex Index Example CREATE INDEX customers__email_alt_pattern_idx ON cnu.customers (lower(email_alt::text) text_pattern_ops) WHERE email_alt IS NOT NULL; Selection criteria example: lower(email) like ‘johns%' OR (email_alt IS NOT NULL AND lower(email_alt) like ‘johns%') This condition will use bitmap OR’ing. Henrietta Dombrovskaya – Enova Financial

More Related