290 likes | 681 Vues
AWR: Going beyond the scripts supplied by Oracle. Jerry Brenner, 5/17/2007. Who am I?. 13 years experience in database internals, primarily query processing and optimization Technical lead for rewriting subquery processing at Sybase
E N D
AWR: Going beyond the scripts supplied by Oracle Jerry Brenner, 5/17/2007
Who am I? • 13 years experience in database internals, primarily query processing and optimization • Technical lead for rewriting subquery processing at Sybase • Original member of query processing team at Cloudscape (Now open source as Derby) • Team lead for database performance and functionality at Guidewire
What is Guidewire? • Leading supplier of solutions for Property and Casualty insurance companies • Applications for claims processing, policy administration and billing • Global company, with operations in London, Sydney, Paris and Germany • Over 35 customers
What is Guidewire’s application platform? • All applications built on the same platform • All java, running in a servlet container • Proprietary persistence layer • Most queries built by query generator • All applications are highly configurable and include the ability to add custom tables and queries • All applications run against both Oracle and SQL Server • Oracle is our lead platform for performance testing, because of Statspack and AWR
Performance testing and support at Guidewire • Performance testing for 3.5 years • Try to improve the level of testing with each release • Know that there will always be the risk of performance issues occurring in the field • Built a number of supportability tools into the product • Output of tools persisted with every performance test • Tools available in the field and designed to require minimal interaction with minimal interaction • Previous releases on 9i, current releases on 10g • Statspack supported on 9i and 10g, AWR on 10g
Performance testing and support at Guidewire • Perf harness originally built in perl and shell scripts • Hard to maintain • Couldn’t make tools available to the field • Perf harness rewritten as a Guidewire application • Tools rewritten in java • Both harness and tools easier to maintain • Tools automatically available in all products • All tests run with STATISTICS_LEVEL=ALL, so we can capture detailed query plans
Quick overview of our AWR tool • Snapshots taken outside of our products • Tool requires Dictionary privilege • Tool takes snapshot ids as input, generates a zip file, which includes: • .sql files for most of the queries run against the AWR tables (for reference) • Html pages with (empty) links to query plans, where applicable • Shell scripts for calling Oracle scripts (awrrpt, awrddrpi, awrsqrpt and ashrpti) for the AWR reports • User expands zip files and executes shell scripts to get AWR reports and query plans
Some limitations of Oracle supplied AWR reports • Limited information on: • Top queries by various criteria across all executions • Hot objects by various criteria • Which plans are using potentially expensive access methods • Which indexes are being used • AWR doesn’t know about our applications
Some limitations of Oracle supplied AWR reports • No information on: • Infrequently run, but expensive queries • Why hot objects are hot (AWR does not capture row source information in query plans, even when STATISTICS_LEVEL=ALL) • Queries not using bind parameters • Queries that could be expensive, but test data is missing • Which indexes are not used • Min/max values for resource usage of query plans • Index key columns
Information that we get from the AWR tables • Top queries according to various criteria, both across all executions and per execution • Statistics on gets and physical reads and writes on all objects (table or index) in our schema, with information rolled up by (table, tablespace) and (table) • All query plans that access all objects in our schema • All queries that returned 0 rows across all executions • All queries that returned an average of between 0 and 1 row • All plans that include potentially expensive access methods: • Merge join • Hash join • Index skip scan • … • Query plans for all queries of interest • ASH reports for all queries of interest
Information that we get from other sources • Query plans with row source information for all queries of interest (optional) • Graphs of CPU and i/o usage for all machines across all tiers in the tests
Top queries by resource usage Dimensions • Resource (CPU time, elapsed time, buffer gets, physical reads, rows returned) • Execution (Across all executions, per execution) • SQL (All SQL, various types of query generator queries, inserts, updates, LIKE, …)
Examples of top queries by resource usage across all executions -- Get the top 100 queries by total number of disk_reads_delta SELECT * FROM (SELECT SQL_ID, SUM(disk_reads_delta) AS "Disk Reads" FROM DBA_HIST_SQLSTAT S WHERE SNAP_ID > 209 AND SNAP_ID <= 210 AND S.MODULE = 'ClaimCenter' GROUP BY SQL_ID HAVING SUM(disk_reads_delta) >= 0 ORDER BY 2 DESC) DERIVED_TABLE WHERE ROWNUM <= 100;
Examples of top queries by resource usage per execution • -- Get the top 100 queries by total number of disk_reads_delta by execution SELECT * FROM (SELECT SQL_ID, CASE SUM(EXECUTIONS_DELTA) WHEN 0 THEN 0 ELSE SUM(disk_reads_delta) / SUM(EXECUTIONS_DELTA) END AS "Disk Reads/Execution" FROM DBA_HIST_SQLSTAT S WHERE SNAP_ID > 209 AND SNAP_ID <= 210 AND S.MODULE = 'ClaimCenter' GROUP BY SQL_ID HAVING CASE SUM(EXECUTIONS_DELTA) WHEN 0 THEN 0 ELSE SUM(disk_reads_delta) / SUM(EXECUTIONS_DELTA) END >= 0 ORDER BY 2 DESC) DERIVED_TABLE WHERE ROWNUM <= 100;
Top queries by access or join method Dimensions • Resource (CPU time, elapsed time, buffer gets, physical reads, rows returned) • Execution (Across all executions, per execution) • Access or join method (Index fast full scan, index skip scan, hash join, merge join, …)
Examples of top queries by access or join method across all executions -- Get the top 100 queries (mergeJoinQueries) by total number of disk_reads_delta SELECT * FROM (SELECT SQL_ID, SUM(disk_reads_delta) AS "Disk Reads" FROM DBA_HIST_SQLSTAT S WHERE SNAP_ID > 209 AND SNAP_ID <= 210 AND S.MODULE = 'ClaimCenter' AND EXISTS (SELECT * FROM DBA_HIST_SQL_PLAN P WHERE S.SQL_ID = P.SQL_ID AND S.PLAN_HASH_VALUE = P.PLAN_HASH_VALUE AND P.OPERATION = 'MERGE JOIN') GROUP BY SQL_ID HAVING SUM(disk_reads_delta) >= 0 ORDER BY 2 DESC) DERIVED_TABLE WHERE ROWNUM <= 100;
Examples of top queries by access or join method per execution -- Get the top 100 queries (mergeJoinQueries) by total number of disk_reads_delta by execution SELECT * FROM (SELECT SQL_ID, CASE SUM(EXECUTIONS_DELTA) WHEN 0 THEN 0 ELSE SUM(disk_reads_delta) / SUM(EXECUTIONS_DELTA) END AS "Buffer Gets/Execution" FROM DBA_HIST_SQLSTAT S WHERE SNAP_ID > 209 AND SNAP_ID <= 210 AND S.MODULE = 'ClaimCenter' AND EXISTS (SELECT * FROM DBA_HIST_SQL_PLAN P WHERE S.SQL_ID = P.SQL_ID AND S.PLAN_HASH_VALUE = P.PLAN_HASH_VALUE AND P.OPERATION = 'MERGE JOIN') GROUP BY SQL_ID HAVING CASE SUM(EXECUTIONS_DELTA) WHEN 0 THEN 0 ELSE SUM(disk_reads_delta) / SUM(EXECUTIONS_DELTA) END >= 0 ORDER BY 2 DESC) DERIVED_TABLE WHERE ROWNUM <= 100;
Queries that return too few rows in a test environment • Is the test valid or is it skipping an expensive code path due to missing data or test problems? Example: -- Get the queries that return 0 rows across all executions SELECT * FROM (SELECT SQL_ID, SUM(rows_processed_delta) AS "Rows Processed", SUM(executions_delta) FROM DBA_HIST_SQLSTAT S WHERE SNAP_ID > 209 AND SNAP_ID <= 210 AND S.MODULE = 'ClaimCenter' GROUP BY SQL_ID HAVING SUM(rows_processed_delta) = 0 ORDER BY 3 DESC) DERIVED_TABLE WHERE ROWNUM <= 1000000;
Hot objects Dimensions: • I/O type (Logical reads, physical reads, physical writes) • Object ((Object), (table, tablespace), (table)) Missing info: • What makes an object hot (according to each i/o type)
Examples of top objects -- logical_reads, physical_reads, physical_writes by segment (tablespace, table, object_type). select n.owner , n.tablespace_name, case when n.object_type = 'TABLE' then n.object_name when n.object_type = 'INDEX' then (SELECT table_name from dba_indexes WHERE index_name = n.object_name AND owner = n.owner) when n.object_type = 'LOB' then (SELECT table_name from dba_lobs WHERE segment_name = n.object_name AND owner = n.owner) else 'N/A' end as Table_name, n.object_name, n.object_type, r.logical_reads, round(r.logical_reads_ratio * 100, 2) logical_reads_ratio, r.physical_reads, round(r.physical_reads_ratio * 100, 2) physical_reads_ratio, r.physical_writes, round(r.physical_writes_ratio * 100, 2) physical_writes_ratio from dba_hist_seg_stat_obj n, (select * from (select e.dataobj#, e.obj#, e.dbid, sum(e.logical_reads_delta) logical_reads, ratio_to_report(sum(e.logical_reads_delta)) over () logical_reads_ratio, sum(e.physical_reads_delta) physical_reads, ratio_to_report(sum(e.physical_reads_delta)) over () physical_reads_ratio, sum(e.physical_writes_delta) physical_writes, ratio_to_report(sum(e.physical_writes_delta)) over () physical_writes_ratio from dba_hist_seg_stat e where e.snap_id > 209 and e.snap_id <= 210 group by e.dataobj#, e.obj#, e.dbid having sum(e.logical_reads_delta) > 0 or sum(e.physical_reads_delta) > 0 or sum(e.physical_writes_delta) > 0 order by logical_reads desc) d ) r where n.dataobj# = r.dataobj# and n.obj# = r.obj# and n.dbid = r.dbid and n.owner = 'MKTG' order by logical_reads desc;
Examples of top (table, tablespace)s -- logical_reads, physical_reads, physical_writes by tablespace and table. SELECT owner, tablespace_name, table_name, sum(logical_reads) as "LOGICAL READS", sum(logical_reads_ratio) as logical_reads_ratio, sum(physical_reads) as "PHYSICAL READS", sum(physical_reads_ratio) as physical_reads_ratio, sum(physical_writes) as "PHYSICAL WRITES", sum(physical_writes_ratio) as physical_writes_ratio FROM ( select n.owner, n.tablespace_name, case when n.object_type = 'TABLE' then n.object_name when n.object_type = 'INDEX' then (SELECT table_name from dba_indexes WHERE index_name = n.object_name) when n.object_type = 'LOB' then (SELECT table_name from dba_lobs WHERE segment_name = n.object_name) else 'N/A' end as Table_name, n.object_name, n.object_type, r.logical_reads, round(r.logical_reads_ratio * 100, 2) logical_reads_ratio, r.physical_reads, round(r.physical_reads_ratio * 100, 2) physical_reads_ratio, r.physical_writes, round(r.physical_writes_ratio * 100, 2) physical_writes_ratio from dba_hist_seg_stat_obj n, (select * from (select e.dataobj#, e.obj#, e.dbid, sum(e.logical_reads_delta) logical_reads, ratio_to_report(sum(e.logical_reads_delta)) over () logical_reads_ratio, sum(e.physical_reads_delta) physical_reads, ratio_to_report(sum(e.physical_reads_delta)) over () physical_reads_ratio, sum(e.physical_writes_delta) physical_writes, ratio_to_report(sum(e.physical_writes_delta)) over () physical_writes_ratio from dba_hist_seg_stat e where e.snap_id > 209 and e.snap_id <= 210 group by e.dataobj#, e.obj#, e.dbid having sum(e.logical_reads_delta) > 0 or sum(e.physical_reads_delta) > 0 or sum(e.physical_writes_delta) > 0 order by logical_reads desc) d ) r where n.dataobj# = r.dataobj# and n.obj# = r.obj# and n.dbid = r.dbid AND n.owner = 'MKTG' ) c group by owner, tablespace_name, table_name order by 4 desc;
Examples of top tables -- logical_reads, physical_reads, physical_writes by table. SELECT owner, table_name, sum(logical_reads) as "Logical Reads", sum(logical_reads_ratio) as "Logical Reads Ratio", sum(physical_reads) as "Physical Reads", sum(physical_reads_ratio) as "Physical Reads Ratio", sum(physical_writes) as "Physical Writes", sum(physical_writes_ratio) as "Physical Writes Ratio" FROM ( select n.owner, case when n.object_type = 'TABLE' then n.object_name when n.object_type = 'INDEX' then (SELECT table_name from dba_indexes WHERE index_name = n.object_name) when n.object_type = 'LOB' then (SELECT table_name from dba_lobs WHERE segment_name = n.object_name) else 'N/A' end as Table_name, n.object_name, n.object_type, r.logical_reads, round(r.logical_reads_ratio * 100, 2) logical_reads_ratio, r.physical_reads, round(r.physical_reads_ratio * 100, 2) physical_reads_ratio, r.physical_writes, round(r.physical_writes_ratio * 100, 2) physical_writes_ratio from dba_hist_seg_stat_obj n, (select * from (select e.dataobj#, e.obj#, e.dbid, sum(e.logical_reads_delta) logical_reads, ratio_to_report(sum(e.logical_reads_delta)) over () logical_reads_ratio, sum(e.physical_reads_delta) physical_reads, ratio_to_report(sum(e.physical_reads_delta)) over () physical_reads_ratio, sum(e.physical_writes_delta) physical_writes, ratio_to_report(sum(e.physical_writes_delta)) over () physical_writes_ratio from dba_hist_seg_stat e where e.snap_id > 209 and e.snap_id <= 210 group by e.dataobj#, e.obj#, e.dbid having sum(e.logical_reads_delta) > 0 or sum(e.physical_reads_delta) > 0 or sum(e.physical_writes_delta) > 0 order by logical_reads desc) d ) r where n.dataobj# = r.dataobj# and n.obj# = r.obj# and n.dbid = r.dbid and n.owner = 'MKTG' ) c group by owner, table_name order by 3 desc;
Capturing query plans with row source info SELECT s.sql_id, t.* FROM dba_hist_sqlstat s, TABLE(dbms_xplan.display_cursor(s.sql_id, null, 'ALLSTATS')) t WHERE module = ‘ClaimCenter' AND s.sql_id IN (SELECT sql_id FROM v$sql_plan_statistics_all) AND snap_id > <begin_snap> AND snap_id <= <end_snap>;
Finding potential incorrect use of constants • We use constants, instead of bind variables, when there is a good chance that the data is skewed • Valid constants will almost always be integers Algorithm: • Identify queries with string constants • Count constants and keep copy of query with and without the constants • Group queries by number of constants and remaining text • Display grouped queries when the group size is greater than 1
Displaying all query plans that access an object For tables, plans organized by: • Table only access • Index and table access • Index only access Code for indexes is old, plans currently organized by: • Any access to the index Would like to have plans organized by: • Access type (similar to how plans are organized for tables)
What’s next for our tools • Find queries which would be affected, across all tests, if we disabled an expensive access or join method • Find indexes which are not used by any query in any test • Find indexes which are used infrequently • Add ability to compare results from different instances • Aggregate results across clusters of related queries • Identify potentially redundant indexes
Information that we’d like to be able to get • More detailed information on what makes an object hot: • Row source information with STATISTICS_LEVEL=ALL • Logical writes • Which indexes are being updated during DML • Source of gets during an insert (indexes, RI) • Query plans for inserts • Which plans could benefit from a covering index or addition of columns to the index • Which plans could benefit from a sort avert index