Wait-Time Based Oracle Performance Management

Prepared for the Ohio Oracle User Group. Presented by Dean Richards, Senior Engineer, Confio Software.


Presentation Transcript


  1. Wait-Time Based Oracle Performance Management • Prepared for Ohio Oracle User Group • Presented by Dean Richards, Senior Engineer, Confio Software

  2. Who is Dean Richards? • Senior DBA/Engineer for Confio Software • Former DBA performance consultant with Oracle for five years • Specialize in performance tuning using Oracle Wait Event interface • Review performance of Oracle databases at hundreds of customers each year

  3. Agenda • Confio Performance Intelligence • Case Study One: Hot Block Issue • Case Study Two: Full Table Scans • Case Study Three: Inefficient Indexes • Q&A

  4. Working the Right Problems? • After spending an agonizing week tuning Oracle to minimize I/O operations, management typically rewards you with: • A. An all expense paid vacation • B. A free lunch • C. Crumbs from the kitchen • D. Reward? Nobody even noticed!

  5. Why Does This Happen? • Many tools measure system health • Assumption: If I make the database healthy, users benefit • Symptoms • DBA finds “big” problem and fixes it, users report no impact • Lots of data to review and things to fix, not sure which to do first • Unclear view of performance leads to finger-pointing between developer or vendor and IT staff: “It’s your code!” “It’s your database!”

  6. Confio Performance Intelligence • Three Key Principles 1. SQL View: All statistics at SQL statement level 2. Full View: Separately measure every resource (Oracle wait events) to isolate source of problems 3. Time View: Measure time, not number of times a resource is utilized

  7. SQL View Principle • Example: ‘CEO’ measuring ‘employee’ output • Averaging over entire company gives no useful data • Must measure each job separately • DBA must manage database similarly • Measure and identify bottlenecks for each SQL independently

  8. Time View Principle • Example: ‘CEO’ counting ‘tasks’ vs. ‘time to complete’ • Counting system statistics not meaningful • Must measure Time to complete • System stats (buffer size, hit ratios, I/O counts) do not identify where database customers are waiting • Identify and optimize Wait Time for each SQL as best indicator of performance

  9. Full View Principle • Example: ‘CEO’ measuring results with blind spot hiding key processes • Without direct visibility, valuable info is lost • Must have visibility to every process step • Distinctly identify and measure each Oracle resource for each distinct SQL

  10. Compliant Performance Tool Types Two Primary Types of Tools • Session Specific Tools • Tools that focus on one session at a time often by tracing the process • Examples: OraSRP Profiler (open source), Hotsos Profiler, tkprof • Continuous DB Wide Monitoring Tools • Tools that focus on all sessions by sampling Oracle • Examples: Confio Ignite, Symantec i3 • Both tools have a place in the organization

  11. Tracing • Use cautiously due to session statistics skew • 95 out of 100 users are running well • 5 out of 100 have spent 99% of time waiting for locked rows • If you trace one of the “95” sessions • No locking problems at all • May spend time trying to tune other items that may not be important • If you trace one of the “5” sessions • Severe locking problems • Appears that you could fix the locking problems and reduce your wait time by 99%

  12. Tracing (cont) • Advantages • Very precise - may be only way to get some statistics • Bind variable information is available • Can provide detailed analysis even deeper than wait events • Disadvantages • Only works if a known problem is going to occur in the future (and known session) • Difficult to see trends over time
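
The tracing approach described above can be sketched as follows. This is a minimal example, assuming Oracle 10g or later (where `DBMS_MONITOR` is available); the SID and serial number are hypothetical values that you would look up in `V$SESSION` first.

```sql
-- Find the session to trace (the username here is a hypothetical example).
SELECT sid, serial#, username, program
FROM   v$session
WHERE  username = 'APP_USER';

-- Enable extended tracing with wait events and bind variable values.
-- 123 and 4567 are placeholder values for SID and SERIAL#.
BEGIN
  DBMS_MONITOR.SESSION_TRACE_ENABLE(
    session_id => 123,
    serial_num => 4567,
    waits      => TRUE,
    binds      => TRUE);
END;
/

-- Turn tracing off again once the problem has been captured.
BEGIN
  DBMS_MONITOR.SESSION_TRACE_DISABLE(
    session_id => 123,
    serial_num => 4567);
END;
/
```

The resulting trace file can then be fed to one of the profilers named earlier, such as tkprof. Because only one session is traced, the skew caveat from the previous slide still applies.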

  13. Continuous DB Wide Monitoring Tools • 24/7 sampling provides real-time and historical perspective • Allows DBA to go back in time • I had a problem at 3:00 pm yesterday • Use Performance Intelligence - trend reports, graphs, etc to communicate with other groups • What is starting to perform poorly? • What progress have we made while tuning?

  14. Oracle Wait Interface • V$SESSION_WAIT (X$KSUSECST) • SID (join to v$session) • EVENT • P1, P1RAW, P2, P2RAW, P3, P3RAW • STATE = ‘WAITING’ – currently waiting on event • STATE = ‘WAITED…’ – currently on CPU (or in queue) • Oracle 10g added this info to V$SESSION

  15. Oracle Sessions • V$SESSION (X$KSUSE) • SID • USERNAME • SQL_HASH_VALUE • Join to V$SQL • PROGRAM • MODULE / ACTION • DBMS_APPLICATION_INFO • PLAN_HASH_VALUE • Join to V$SQL_PLAN
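
As an illustration of how MODULE and ACTION get populated, an application can call `DBMS_APPLICATION_INFO` before it starts a unit of work. The sketch below uses hypothetical module and action names; it is not taken from the presentation.

```sql
-- In the application session: label the work about to be done.
-- 'email_tracker' and 'load_staging' are made-up names for illustration.
BEGIN
  DBMS_APPLICATION_INFO.SET_MODULE(
    module_name => 'email_tracker',
    action_name => 'load_staging');
END;
/

-- From a monitoring session: see what each active session reports.
SELECT sid, username, program, module, action, sql_hash_value
FROM   v$session
WHERE  status = 'ACTIVE';
```

With these labels in place, wait time can be grouped by module or action as well as by SQL statement.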

  16. Base Query
      SELECT sid, username, program, module, action, machine, osuser, sql_hash_value, …
             decode(state, 'WAITING', event, 'CPU') event, p1, p1raw, p2, …, SYSDATE
      FROM   V$SESSION s
      WHERE  s.status = 'ACTIVE'
      AND    event NOT IN (<idle wait events>);
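
One way to turn the base query into wait time is to run it on a fixed interval and store each result. The sketch below is an assumption about how that could be done, not the presenter's implementation: the table name `wait_samples` is hypothetical, a one-second sampling interval is assumed, and the `WAIT_CLASS` column requires Oracle 10g or later.

```sql
-- One-time setup: an empty table shaped like a sample (hypothetical name).
CREATE TABLE wait_samples AS
SELECT SYSDATE AS sample_time, sid, username, program, module, action,
       sql_hash_value,
       DECODE(state, 'WAITING', event, 'CPU') AS event,
       p1, p2
FROM   v$session
WHERE  1 = 0;

-- Run once per second, for example from a scheduled job.
-- A session that is not currently waiting still shows its last event,
-- so the idle filter is applied only to sessions in the WAITING state.
INSERT INTO wait_samples
SELECT SYSDATE, sid, username, program, module, action,
       sql_hash_value,
       DECODE(state, 'WAITING', event, 'CPU'),
       p1, p2
FROM   v$session
WHERE  status = 'ACTIVE'
AND    type = 'USER'
AND    (state <> 'WAITING' OR wait_class <> 'Idle');

-- With one-second samples, each row approximates one second of time
-- spent by one session, so a row count approximates seconds.
SELECT sql_hash_value, event, COUNT(*) AS approx_seconds
FROM   wait_samples
WHERE  sample_time >= SYSDATE - 1
GROUP  BY sql_hash_value, event
ORDER  BY approx_seconds DESC;
```

The final query answers the three questions the case studies keep returning to: which SQL, which resource, and how much time.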

  17. Additional Information • V$SESSION • service_name, machine, client_info • row_wait_obj#, blocking_session • Go back later to get • Sql_text from v$sql • SQL stats from v$sqlarea • Execution plan from v$sql_plan • Object info from dba_objects
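
The lookups listed above might be written as follows. This is a sketch assuming Oracle 10g or later (for `SQL_CHILD_NUMBER` and `BLOCKING_SESSION` in `V$SESSION`); `:hash_value` is a bind variable you would supply yourself.

```sql
-- SQL text and waited-on object for each active user session.
SELECT s.sid,
       s.event,
       s.blocking_session,
       q.sql_text,
       o.owner,
       o.object_name,
       o.object_type
FROM   v$session s
LEFT   JOIN v$sql q
       ON  q.hash_value   = s.sql_hash_value
       AND q.address      = s.sql_address
       AND q.child_number = s.sql_child_number
LEFT   JOIN dba_objects o
       ON  o.object_id = s.row_wait_obj#
WHERE  s.status = 'ACTIVE'
AND    s.type   = 'USER';

-- Execution plan steps for one statement, identified by its hash value.
SELECT child_number, id, operation, options, object_name
FROM   v$sql_plan
WHERE  hash_value = :hash_value
ORDER  BY child_number, id;
```

Both queries read only from the shared pool and dictionary views, so they can be run after the fact as long as the statement has not aged out of the cache.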

  18. Case Study One: Hot Block Issue

  19. Problem Observed • Critical situation: application performance unsatisfactory • All email coming into and going out of the company was tracked in order to find: • Viruses • Espionage • Legal reasons • However, email was getting behind • Email not getting to end-users for several hours • Declared top priority in company

  20. Wait Events During Problem [Chart labels: log file waits; buffer busy waits; query that is doing “real” work]

  21. What do we know? • Which SQL: DDL or commits (SQL hash_value=0) • Which Resource: buffer busy waits, log file waits • How much time: 163 hours of wait time per day

  22. “buffer busy waits” Description • Buffer is being read into cache by another session and this session is waiting for that process to complete. • In Oracle 10g buffer busy waits are further refined and this becomes “read by other session” • Buffer is already in the cache but in an incompatible mode, i.e. another session is changing it.

  23. “buffer busy waits” Description (cont) • P1 – file number information • P2 – block number information
      SELECT owner, segment_name, segment_type
      FROM   dba_extents
      WHERE  file_id = &P1
      AND    &P2 BETWEEN block_id AND block_id + blocks - 1
      • Gives information about the object being waited for

  24. “buffer busy waits” Analysis

  25. Results • Found hot block problem • “buffer busy waits” was waiting for Block #2 in the file “…staging01.dbf” • The email processing code was creating a series of staging tables every time it executed • Solutions • Started using temporary tables instead of creating and dropping distinct tables each time the process ran
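
The fix described above could look like the following sketch. The table and column names are hypothetical, and the slides do not say which kind of temporary table was used; an Oracle global temporary table is assumed here.

```sql
-- Created once, not on every run. Table and column names are made up.
CREATE GLOBAL TEMPORARY TABLE email_staging (
  message_id  NUMBER,
  received_at DATE,
  payload     CLOB
) ON COMMIT DELETE ROWS;

-- Each run of the process then only inserts and reads its own rows.
-- Rows are private to the session and are removed at commit.
INSERT INTO email_staging (message_id, received_at, payload)
VALUES (1, SYSDATE, 'example message body');

SELECT COUNT(*) FROM email_staging;

COMMIT;
```

Because the table definition is permanent, the process no longer issues CREATE and DROP statements on every execution, which is the DDL activity the wait data pointed to.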

  26. Case Study Two: DB File Scattered Reads

  27. Problem Observed • Problem: Login taking 12 minutes for each user when they started their day • High wait accumulation from 6:30 – 8:30 am • 240 users x 12 minutes = 48 hours every day • Equivalent to the working time of 6 employees wasted per day • $400,000+ wasted per year • Applied Confio methods to problem identification • Identify Wait Time, offending SQL, offending Resource

  28. Wait Events During Problem

  29. Investigation

  30. What do we know? • Which SQL: LoginLookup, UpdateInventory • Which Resource: scattered read, buffer busy waits • How much time: 48+ hours every day

  31. Hypotheses: Oracle Interpretations Key Questions: • Is full table scan necessary? • What causes a full table scan for this SQL Statement? Two Alternative paths for optimization: • Eliminate Full Table Scan • Add Index / Collect Histograms • Update Statistics • Utilize Query Hints • Full Table Scan Required - Improve response time • Parallelized Reads • Optimize I/O Subsystem • Optimize Application
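
The two optimization paths above can be illustrated with a short sketch. The table `app_users`, its columns, and the index name are hypothetical stand-ins; the presentation does not show the actual LoginLookup statement.

```sql
-- Path 1: eliminate the full table scan.

-- Add an index on the lookup column.
CREATE INDEX app_users_name_ix ON app_users (user_name);

-- Refresh optimizer statistics and let Oracle decide where
-- histograms are useful (SIZE AUTO).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'APP_USERS',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO',
    cascade    => TRUE);
END;
/

-- As a last resort, steer the optimizer with a hint.
SELECT /*+ INDEX(u app_users_name_ix) */ u.user_id
FROM   app_users u
WHERE  u.user_name = :user_name;

-- Path 2: the full scan is required, so make it faster,
-- here by reading the table in parallel.
SELECT /*+ FULL(u) PARALLEL(u, 4) */ COUNT(*)
FROM   app_users u;
```

Which path applies depends on the answer to the first key question: whether the statement genuinely needs most of the table's rows.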

  32. Results • Added indexes to underlying tables • Added materialized view • Full table scan fixed

  33. Case Study Three: DB File Sequential Reads

  34. Problem Observed • Data Warehouse loads were taking too long • Noticed high wait times on “db file sequential read” wait event • DBAs were confused – why are data loads “reading” data • Applied Confio Method to problem identification • Identify Wait Time, offending SQL, offending Resource

  35. Investigation

  36. Investigation for an INSERT Statement Sequential read time by object for SQL

  37. What do we know? • Which SQL: Load Process • Which Resource: DB File Sequential Read • How much time: 5+ hours, 90% of wait time

  38. Investigating db file sequential reads • Often considered a “good” read • DB file sequential reads normally occur during index lookups • Often a single-block read although it may retrieve more than one block. • P1 – file id • P2 – block id • Join to DBA_EXTENTS (see buffer busy waits)
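
The P1/P2 lookup can be applied directly to sessions that are waiting right now. The query below is a sketch assuming Oracle 10g or later, where the wait columns are available in `V$SESSION`.

```sql
-- Map each session currently waiting on a single-block read
-- to the segment that owns the block being read.
SELECT s.sid,
       s.sql_hash_value,
       s.p1 AS file_id,
       s.p2 AS block_id,
       e.owner,
       e.segment_name,
       e.segment_type
FROM   v$session s
JOIN   dba_extents e
       ON  e.file_id = s.p1
       AND s.p2 BETWEEN e.block_id AND e.block_id + e.blocks - 1
WHERE  s.state = 'WAITING'
AND    s.event = 'db file sequential read';
```

Queries against `DBA_EXTENTS` can be slow on databases with many extents, so it may be preferable to record P1 and P2 during sampling and resolve them to segments afterwards. If the segment type comes back as an index during a data load, that points toward index maintenance as the source of the reads.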

  39. Hypotheses: Oracle Interpretations of Sequential Reads Causes of excessive wait times: • Reading too many index leaf blocks • Low cardinality first column index • Not finding block in buffer cache forces disk read • Slow disk reads • Contention for certain blocks

  40. Results • Many sessions were loading data and all were updating low cardinality indexes • Modified index and noticed a 50% performance improvement in an INSERT • Customer is also analyzing global vs. local indexes • Reviewing usage of bitmap indexes • Removed unused indexes • Enhanced the disk subsystem

  41. Conclusion • Conventional tuning focuses on “system health” and leads to finger-pointing and confusion • Wait event tuning implemented according to Confio Performance Intelligence is the best way to tune • Two compliant tool types • Tracing tools • Continuous DB-wide monitoring tools

  42. About Confio Software • Developer of Performance Tools • Igniter Suite • Ignite for Oracle, DB2, SQL Server, Sybase • Ignite for Java • Packaged, easy-to-use implementation of Wait Time Performance Tuning • Based in Colorado, 100’s of worldwide customers • Free trial at www.confio.com

  43. Thank you for coming Dean Richards Stop by the Confio Booth Contact Information • deanrichards@confio.com • 303-938-8282 ext. 116 • Company website http://www.confio.com
