Andreas Weininger IBM Software Group

H07 Understanding Query Plans Andreas Weininger IBM Software Group 3.10.2006 • 3:15 PM - 4:15 PM Platform: Informix

Objectives • Understand what a query plan is. • Understand how query plans are executed. • Understand how queries are mapped to query plans. • Understand what the performance implications of different query plans are. • Understand how query plans can be monitored/analyzed.

Query Plans • Structure which determines the execution of a query (which operations, communication, etc.) • Iterator tree • Iterators • Data transfer between iterators • Mapping to Segments (threads, iterator instances)

Iterators • Interface: • Create • Open • Next • Close • Implementation of database operations • Each (non-leaf) iterator is calling operations on children

Iterator (cont) Data flow parent Call Iterator e.g. hash join Iterator instance Children e.g. scans

Iterator Types • Scan • Join (e.g. nested-loop, sort-merge, hash, anti-join) • Group • Other (sort, insert, update, upsert, etc.) • Stop-and-Go iterators (e.g. sort) • Whole input must be consumed before output is produced

Example Query INSERT INTO t3 SELECT t1.c3, t2.c3 FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 = 0

Example Iterator Tree Insert t3 hjoin scan t1 scan t2

Insert t3 Insert t3 Insert t3 hjoin hjoin hjoin hash table Scan t1 Scan t2 Scan t2 Scan t2 scan Scan t2 Scan t1 Scan t2 Execution of Query Plan probe phase t1.c1=t2.c1 build phase t1.c2=0

How are Queries mapped to Query Plans? INSERT INTO t3 SELECT t1.c3, t2.c3 FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 = 0 Insert t3 Proj: t1.c3,t2.c3 hjoin Join Cond: t1.c1=t2.c1 Proj: c1,c3 Proj: c1,c3 scan t1 scan t2 Filter: c2=0

General Rules for Mapping Queries to Query Plans • Cost based decision: • Join order • Join type • Access method • Early Projection • Early Restriction • Late Group by

Iterator Types and Filters • Scan Iterators • The actual operation of reading data. • Join Iterators • Used when a query involves more than one table • Rows from one table are matched to rows of the second table • Filters • The search criteria of a query used to determine which rows are to be selected by the query.

Scan Types • Sequential Table Scan • Data rows are read in sequential order as they exit in the table. • Index Scan • Data in a table is accessed through an index. • Skip Duplicate Index Scan • Special type of index scan in which duplicate rows are skipped • ROWID Scan • The data rows physical location is given and read directly

Scan Types (con’t) • First Row Scan • Data is scanned until the first row satisfies the search condition is found, then the scan is terminated • Auto Index Path • Add an index on the fly if one does not exist on a required column • Key-Only Scan • If all columns required by the query are available in the index, then the data row is not retrieved.

Join Types • Nested Loop Join • Scans the inner table once for every row in the outer table. • The inner table can use either an index or table scan. • Dynamic Hash Join • Scans the smaller table and builds a hash table by applying a hash function on the join key. • The larger table is passed by the hash function to see if a match occurs. • Cartesian Product Join • All rows of the inner table with all rows of the outer table.

Filters • Lower-index filter • A single filter or set of filters which provide a lower bound or starting position for an index scan. • Example: column >= 10 Start scanning the ascending index from value 10 • Upper-Index Filter • A single filter or set of filters which provide an upper bound or stopping position for an index scan

Filters (continued) • Key-First Filter • Index key filters are applied first to see if the corresponding data row needs to be retrieved. • Example: SELECT * FROM tab where c1>=1 and c2=2 If an index exists on c1 and c2 then the filter c2=2 will be applied to each index key first. Only if this is satisfied will the corresponding data row be fetched • C1>=1 is used as a lower index filter to start the index scan • Key-Start

How to Enable the Optimizer Explain • Enabling and Disabling the Optimizer Explain file • Turning on in application • Dynamically enabling • Enabling statistics

Enable Explain in the Application • Enable in the application • SET EXPLAIN ON • SET EXPLAIN STATISTICS IDS only; XPS includes Query Statistics by default in the explain file • SET EXPLAIN OFF • Location of the file • UNIX • File name sqexplain.out • The current directory if you have a local client • Home directory on remote host if the client is remote • XPS: SET EXPLAIN FILE TO ‘filename’

Example of Query Plan select count(*) from customer c, orders o where c.customer_num = o.customer_num Estimated Cost: 7 Estimated # of Rows Returned: 1 1) informix.o: INDEX PATH (1) Index Keys: customer_num (Key-Only) (Serial, fragments: ALL) 2) informix.c: INDEX PATH (1) Index Keys: customer_num (Key-Only) (Serial, fragments: ALL) Lower Index Filter: informix.c.customer_num = informix.o.customer_num NESTED LOOP JOIN Query text Estimates Access Order Detailed Access Plan OptimizerStatistics

Example Query Statistics IDS Table map : ---------------------------- Internal name Table name ---------------------------- t1 o t2 c type table rows_prod est_rows time rows_scan est_cost ------------------------------------------------------------------------------- scan t1 23 23 0:0:0 23 2 type table rows_prod est_rows time rows_scan est_cost ------------------------------------------------------------------------------- scan t2 23 28 0:0:0 1 0 type rows_prod est_rows time est_cost ------------------------------------------------------- nljoin 23 22 0:0:0 7

Optimizer Statistics Actual Numbers Optimizer predictions Table map : ---------------------------- Internal name Table name ---------------------------- t1 tab1 t2 tab2 type table rows_prod est_rows time rows_scan est_cost ------------------------------------------------------------------------------- scan t1 92 92 0:0:0 5244 282 type table rows_prod est_rows time rows_scan est_cost ------------------------------------------------------------------------------- scan t2 92 92 0:0:0 92 5 type time rows_bld rows_prb novrflo est_cost ------------------------------------------------------------------------------- hjoin 8464 148 0:0:0 92 92 0 321 rows_prod est_rows

Dynamically Enabling Explain • IDS: Enable using onmode -Y • onmode –Y <sid> [0|1] • Can not turn on SQL statistics dynamically • Only queries optimized after explain has been enabled will be displayed in the explain file. • XPS: Enable using onmode –q explain • onmode -q explain <Sid> {off|avoid_execute| {{on | immediate} [ file <fname> | fileappend <fname>]}} on: explain plans for following queries in session Immediate: explain plan for currently running query File: put output in this file Fileappend: append output to this file

View Real Time Query Statistics (IDS) onstat –g pqs {sid} • Viewing current progress of query • Real time data • Always enabled • Only while query is executing Query Operators : Opaddr opname phase rows time in1 in2 ----------------------------------------------------------------- b444d60 scan done 52 0:0:0 0 0 b45cd60 scan done 4 0:0:0 0 0 b45e2f0 join done 92 0:0:0 b444d60 b45cd60 b460d60 scan done 13 0:0:0 0 0 b45e208 join done 795 0:0:0 b45e2f0 b460d60

View Real Time Query Statistics Query Operators : Opaddr opname phase rows time in1 in2 ----------------------------------------------------------------- b444d60 scan done 52 0:0:0 0 0 b45cd60 scan done 4 0:0:0 0 0 b45e2f0 join done 92 0:0:0 b444d60 b45cd60 b460d60 scan done 13 0:0:0 0 0 b45e208 join done 795 0:0:0 b45e2f0 b460d60 Join Scan Join Scan Scan

View Real Time Query Statistics (XPS) • onstat –g xqp <planid> • Query plan during run time XMP Query Plan oper segid brid width misc info ----------------------------------------- scan 3 0 2 s scan 4 0 2 t hjoin 2 0 2 group 2 0 2 group 1 0 1

View Real Time Query Statistics (XPS) • onstat –g xqs <planid> • Query statistics for already executed segments XMP Query Statistics Cosvr_ID: 1 Plan_ID: 286 type segid brid information ---- ----- ---- ----------- scan 3 0 inst cosvr time rows_prod rows_scan ---- ----- ---- --------- --------- 0 1 3 498437 498437 1 2 4 501563 501563 -------------------------------------- 2 1000000 1000000

View Real Time Query Statistics (XPS) • onstat –g xmp • Query statistics for the currently running segments XMP Query Segments for Plan-id : 288 segid width numbr qryid sessid flags seqno 0 1 1 288 1.150 0x11a 1 2 2 1 288 1.150 0x118 3 4 2 1 288 1.150 0x118 4 XMP Query Operators for Plan-id : 288 opaddr qry segid branch brtid opname phase rows in1 in2 0x1393a160 288 2 0-0 685 xchg create 0 0x1393a2b0 0x0 0x1393a2b0 288 2 0-0 685 group create 0 0x1393a6a4 0x0 0x1393a6a4 288 2 0-0 685 hjoin probe 25266 0x1393a9a4 0x1393aa9c 0x1393a9a4 288 2 0-0 685 xchg done 498437 0x0 0x0 0x1393aa9c 288 2 0-0 685 xchg next 597125 0x0 0x0 0x13732e80 288 4 0-0 686 xchg open 664284 0x13739dcc 0x0 0x13739dcc 288 4 0-0 686 scan next 664656 0x0 0x0

Overview Section • The SQL statement for the query • An estimate of the query cost in units the optimizer uses to compare plans • An estimate for the number of rows that the query is expected to return • Maximum number of threads used to execute the query if PDQPRIORITY is set. • Any temporary files required for executing order by and/or group by • Directives followed or ignored

Access Order • The tables are listed in the order in which they are accessed. • Top of the list is the first table accessed • For each table the following information is listed: • The accessed table’s name • The access plan by which the server reads the table • Sequential scan, Index path,… • The active fragments to be scanned. • For each pair of tables the join plan is listed • Nested loop join, dynamic hash join

Detailed Access Plan • This section is repeated for each table in the Access Section. • Depending on the type of access different information is displayed • Sequential Scans • Filters to be applied • Index and auto-index Scans • The index keys on which filters are applied • Key-Only, Aggregate and/or Key-First • Lower index filter • Upper index filter • Key-First Filters

A list of the tables and their internal names For each operator Operator type Internal table name Rows produced by the operator Optimizer predicted rows produce by the operator Rows scanned Estimated cost of this operator Query Statistics

Example of Optimizer Statistics QUERY SELECT tab1.c2 FROM tab1,tab2 WHERE tab1.c2 = tab2.c2 AND tab1.c2 MATCHES "systable*" Estimated Cost: 321 Estimated # of Rows Returned: 148 1) miller3.tab1: SEQUENTIAL SCAN Filters: miller3.tab1.c2 MATCHES 'systable*' 2) miller3.tab2: INDEX PATH (1) Index Keys: c2 (Key-Only) (Serial, fragments: ALL) Lower Index Filter: miller3.tab2.c2 MATCHES 'systable*' DYNAMIC HASH JOIN Dynamic Hash Filters: miller3.tab1.c2 = miller3.tab2.c2

Controlling the Optimizer • External Directives • Directives not required to be in the application • Allows DBA to modify the query plan w/o changing the application • To create external directives use the SAVE EXTERNAL DIRECTIVE statement • SET OPTIMIZATION • LOW, HIGH, • FIRST_ROWS, ALL_ROWS • Update Statistics • Statistics and distributions • Directives • Positive and negative directives

Other Information • An Overview of the IBM Informix Dynamic Server Optimizer • http://www-128.ibm.com/developerworks/db2/zones/informix/library/techarticle/0211desai/0211desai.html • Monitoring Query Execution in XPS • http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0404weininger/ • IBM Informix Performance Manual • IBM Informix SQL Reference Manual

Questions

H07 Understanding Query Plans Andreas Weininger IBM Software Group Andreas.Weininger@de.ibm.com

Andreas Weininger IBM Software Group

Andreas Weininger IBM Software Group

Presentation Transcript

IBM Software Group

IBM Software Industry Solutions

IBM Software Overview

IBM Software Group Joanna Tame Technical Sales Specialist

IBM Software Group

IBM Open Source Software Competition Mark Wallace, Software Architect, IBM Dublin Software Lab

IBM User Group

IBM Software Group

IBM Open Source Software Competition Mark Wallace, Software Architect, IBM Dublin Software Lab

IBM Software Support Process - Yogesh Talekar, IBM

IBM Software

IBM Software Group

IBM Software Group

IBM Software

Dr. Donald Ferguson IBM Fellow, SWG Chief Architect Chair, IBM Software Group Architecture Board

Major Reasons for IBM Software Group Acquisitions

IBM Software Group

IBM Software Group

Omer Tripp IBM Software Group omert@il.ibm