1 / 48

Understanding Common Oracle Wait Events

Understanding Common Oracle Wait Events. Kirtikumar Deshpande Dallas Oracle User Group September 15, 2005. About Me. Senior Oracle DBA Verizon Information Services Phone Directories Publication. About OWI Book:.

amiel
Télécharger la présentation

Understanding Common Oracle Wait Events

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Understanding Common Oracle Wait Events Kirtikumar Deshpande Dallas Oracle User Group September 15, 2005

  2. About Me • Senior Oracle DBA • Verizon Information Services • Phone Directories Publication

  3. About OWI Book: “…Where this book excels is bridging the gap between the perfect measurement and implementable solutions by explaining why's and how's of the problem. I was amazed how the authors put together rational explanations of common wait events like latch free, bolstered by the elaboration of internals like hash buckets, cache buffer chains and how to rectify those - it all seems so simple when it comes out in the book. Whether you are a veteran DBA who have seen all the battles since the Civil War or a rookie just starting out, this book is for you, a vital weapon in your arsenal, especially the scripts for identifying trouble spots. If I'm allowed to keep only one book on Oracle - this will be it.” - Arup Nanda

  4. About OWI Book: “I received this book on Tuesday and I literally could not put it down. I consumed it like a great thriller. It contains a great deal of information that can be found no where else in print. Performance monitoring and tuning with the Oracle Wait Interface is still new to many Oracle DBAs, even seasoned ones, precisely because the details of how to gather and interpret the information have been difficult to come by until now...” – John Smiley

  5. About OWI Book: “The book is simply spectacular, both for the quality of its writing as well as the depth of the material. Practical? Indeed, indispensible! The three authors, Richmond Shee, Kirti Deshpande, and K Gopalakrishnan, have done a wonderful job in organizing a enormous subject area into manageable chunks. They have also managed to render potentially bone-dry source material into very readable text, interspersed heavily with code examples and output, sidebars, and analogies. This is a good read as well as an authoritative reference. …. The combined efforts of the three authors and the four technical editors blows my mind. …. All I can say is - get it! ” - Tim Gorman

  6. Acknowledgement • Special thanks to Richmond Shee for allowing me to use the contents of his presentation at IOUG Live! 2005

  7. Agenda • OWI Monitoring and Data capture • Handling Common Oracle Wait Events –Going Beyond P1, P2, and P3 Intermediate level: OWI Monitoring and Data Collection Novice level: Separating Symptoms from Problems Beginner level: Event Attributes (P1, P2, P3), Event Classification OWI Views New Convert: Paradigm Shift

  8. OBJECTIVE and SCOPE • Take Home • Information you can use to discover the root cause of performance problems and answer the 64-thousand dollar questions: • Why did the job run so slowly? • Why did the job run so quickly? • Scope • OWI Monitoring: Oracle7 to Oracle9i Database • Handling OWI: Oracle7 to Oracle Database 10g

  9. First agenda item… OWI Monitoring and Data Capture

  10. OWI Monitoring and Data Capture • Q1) Why is historical performance data important? • Q2) What is the best source of performance data? • V$SYSTEM_EVENT? • V$SESSION_EVENT? • V$SESSION_WAIT? • Q3) What is a good data capture method and sampling frequency? • Trace event 10046? • Statspack?

  11. OWI Monitoring and Data Capture The importance of historical performance data… • Users expect their DBAs to be omniscient • DBAs are expected to be aware of performance issues 24x7 • You need a history of all foreground processes ran in the instance

  12. OWI Monitoring and Data Capture Determine the best source of data… V$SYSTEM_EVENT Pros: Cons: system-level data V$SESSION_EVENT Pros: session-level granularity Cons: session-level granularity • V$SESSION_WAIT (X$KSUSECST) • Pros: Fine-grain data • Cons: • Changes quickly, High volume of data • Data requires translation

  13. OWI Monitoring and Data Capture • Determine the best data capture method and sampling frequency… • Requirement: • A performance data collector that is capable of monitoring all foreground processes on a 24x7 basis. • Desired features: • Wait-based philosophy • Low overhead • Always-on • Repositories (wait events, runtime statistics, SQL statements, and SQL plans)

  14. OWI Monitoring and Data Capture • Consider the trace event 10046… • Oracle’s most comprehensive trace facility. • It captures wait events, SQL statements, bind variables. • Fine-grain data is best for troubleshooting, but requires a lot of disk space. • Disk space and overhead limitations prevent instance-wide monitoring • Trace file • …is not user friendly: WAIT #12: nam='db file scattered read' ela= 0 p1=106 p2=60227 p3=8 • …does not have cross referencing: WAIT #1: nam='enqueue' ela= 3007483 p1=1415053318 p2=393259 p3=149 • …can have bugs: WAIT #0: nam='db file parallel write' ela= 2 p1=-144 p2=1 p3=0 • It may add significant overhead to the RDBMS and further degrade the performance of an already slow running process. • Documentation for interpreting trace file is seldom available

  15. OWI Monitoring and Data Capture Summary - Trace event 10046

  16. OWI Monitoring and Data Capture • Consider the database logoff trigger… • Excellent for session-level summary. • Great for benchmarking. • Instance-wide monitoring capability. • Trigger overhead depends mainly on the code. • Some PL/SQL coding is necessary. • Disk space requirement is generally low – depends on the logoff rate. • Only available in Oracle8i and later versions. • Not suitable for root cause analysis which requires fine-grain data.

  17. OWI Monitoring and Data Capture An application of the database logoff trigger… CATEGORY SUBCATEGORY WAIT_EVENT VALUE PERCENT ---------- ----------- ----------------------------- ---------- ---------- CPU OTHER Fetch, Execute, Lookups, etc 41089 34.37 PARSE parse time cpu 14139 11.83 RECURSIVE recursive cpu usage 552 .46 DISK I/O DIRECT I/O direct path read 0 0 Direct path write 0 0 FULL SCANS db file scattered read 854 .71 NORMAL I/O db file sequential read 3645 3.05 LATENCY COMMITS log file sync 1469 1.23 FILE OPS file open 1 0 LATCH latch free 57470 48.07 LOG FILE log file switch completion 6 .01 NETWORK SQL*Net message to client 179 .15 SQL*Net more data from client 2 0 SQL*Net more data to client 36 .03 OTHER buffer busy waits 107 .09 MISC MISC library cache pin 2 0

  18. OWI Monitoring and Data Capture Summary – Database logoff trigger

  19. OWI Monitoring and Data Capture • Consider Statspack… • Report has a lot of information that allows you to examine performance from several perspectives. • Instance-level snapshots offer coarse-grain information that roughly indicates there is a problem but not specifically where the problem is • - No different than querying v$system_event, v$sysstat, v$latch, etc. • Session-level snapshots? How are you going to automate it? Even if session-level snapshot automation is not an issue, the data is still too coarse. • - No different than querying v$session_event and v$sesstat. • Difficulty in determining the best sampling frequency.

  20. OWI Monitoring and Data Capture Summary – Statspack

  21. OWI Monitoring and Data Capture Problem: There is no free suitable tool available. Prior to Oracle Database 10g, you have to develop your own tool or purchase very expensive 3rd party tools. Too Expensive Too Coarse Too Coarse

  22. OWI Monitoring and Data Capture BYOT: Build Your Own Tool (using PL/SQL to capture data) • Three major areas to consider: • Sampling frequency • Repository • Events to monitor

  23. OWI Monitoring and Data Capture BYOT: Build Your Own Tool (using PL/SQL to capture data) • Data Source: • V$SESSION_WAIT (X$KSUSECST) • Sampling frequency: • Affects the quantity and granularity of data • Depends on data capture method • Unix Shell script • PL/SQL procedure • Unix Cron • SNP background process

  24. OWI Monitoring and Data Capture BYOT: Build Your Own Tool (using PL/SQL to capture data) Repositories: Minimum two repositories (wait events & SQL code) • SQL statements help set the context and get you closer to the problem. • Event: Buffer busy waits P1 & P2 = FOOBAR table P3 = 220 • Also helps developers to locate the right module.

  25. OWI Monitoring and Data Capture BYOT: Build Your Own Tool (using PL/SQL to capture data) Events to ignore: KXFX: Execution Message Dequeue – Slave PX Deq: Execution Msg KXFQ: kxfqdeq - normal deqeue PX Deq: Table Q Normal Wait for credit - send blocked PX Deq Credit: send blkd Wait for credit - need buffer to send PX Deq Credit: need buffer Wait for credit - free buffer PX Deq Credit: free buffer parallel query dequeue wait PX Deque wait Parallel Query Idle Wait – Slaves PX Idle Wait dispatcher timer virtual circuit status slave wait pipe get rdbms ipc message rdbms ipc reply pmon timer smon timer WMON goes to sleep client message SQL*Net message from client (* debatable) Null event (* debatable) PL/SQL lock timer Events to monitor: db file sequential read db file scattered read latch free direct path read direct path write Enqueue library cache pin buffer busy waits free buffer waits

  26. OWI Monitoring and Data Capture BYOT: Build Your Own Tool (using PL/SQL to capture data) • 24x7 monitoring. • Wait event history. • Immediate answer to why a certain process runs like molasses. • Proactive performance management. • SQL statement and plan repositories. • Jobs elapsed time can be determined from the sampling intervals. • Low disk space requirement. • Extensive PL/SQL coding. • Overhead depends on the quality of code. • Not suitable for short-running jobs.

  27. OWI Monitoring and Data Capture Summary – PL/SQL procedure

  28. OWI Monitoring and Data Capture Chapter 4 contains a detailed discussion of OWI monitoring and data capture

  29. Second agenda item… • Handling Common Oracle Wait Events • Going beyond P1, P2, and P3

  30. Handling Wait Events db file sequential read db file scattered read At what point do these wait events become a problem? What are they a symptom of? • Low cache hit ratio • Slow I/O subsystem • Physical I/O calls • Small block size • Small buffer cache Handling these wait events requires you to know:- • The amount of time the events are costing the process. • The SQL statement that is associated with the events. Solution: SQL tuning

  31. Handling Wait Events Latch Free Latch Free contention is a symptom of? • Low SPIN_COUNT. • Inefficient SQL statements. • Concurrency coupled with high demands for resources. • Insufficient number of latches. • Insufficient or slow CPU. Handling the latch free contention requires you to know:- • The type of latch sessions are competing for (28 individual latch wait events in Oracle10g Release 1). • The amount of time a session spent waiting on latches. • The SQL statement that is associated with the event.

  32. Handling Wait Events Latch Free: Shared Pool & Library Cache • Contention for the Shared Pool & Library Cache latch is a symptom of? • Hard parses – literal SQL statements. • Soft parses. • Oversized shared pool. • High version count. • Bad application Solution: If not (c), the real solution is correcting Application Workarounds: • Set CURSOR_SHARING = FORCE • Set SESSION_CACHED_CURSORS

  33. Handling Wait Events – Latch Free: Cache Buffers Chains A Working Set Hash Latch LRU LRUW Hash Bucket Buffer Header Hash Chain Buffers Memory

  34. Handling Wait Events Latch Free: Cache Buffers Chains Contention for the CBC latch is symptomatic of? • Inefficient SQL statement. • Hot blocks. • Long hash chains. • Insufficient number of latches. Handling the CBC latch contention requires you to know: • If the contention is widespread or localized to a particular latch. • The SQL statements that participate in the competition.

  35. Handling Wait Events Latch Free: Cache Buffers Chains Solutions: • Tune the application and SQL statements. • Reduce the level of concurrency. Workarounds: • Spread the hot blocks across multiple CBC latches. • Consider increasing _SPIN_COUNT (Oracle9i and above, use _LATCH_CLASS and _LATCH_CLASSES). • Consider increasing _DB_BLOCK_HASH_BUCKETS. • Consider increasing _DB_BLOCK_HASH_LATCHES.

  36. Handling Wait Events Buffer Busy Waits BBW contention is a symptom of? • Read/read, read/write, or write/write contention. • Corrupted buffer pin. • Insufficient INITRANS. • Large block size. Handling the BBW contention requires you to know:- • The amount of time a session spent waiting on the event. • The reason code that represents why a process fails to get a buffer pin. • The class of block that the buffer busy waits event is for. • The SQL statements that are associated with the event. • The segment that the buffer belongs to.

  37. Handling Wait Events BBW: Solutions depend on the class of block and reason code: BBW contention for data block class (class #1), reason code 130 • Reduce the level of concurrency or change the way the work is partitioned between the parallel threads. • Optimize the SQL statement to reduce the number of physical and logical reads. • Increase the number of FREELISTS and FREELIST GROUPS. BBW contention for data block class (class #1), reason code 220 • Reduce the level of concurrency or change the partitioning method. • Reduce the number of rows in the block. • Rebuild the object in another tablespace with a smaller block size (Oracle9i and above).

  38. Handling Wait Events BBW : Solutions depend on the class of block and reason code: BBW contention for data segment header (class #4) • Increase the number of FREELISTS and FREELIST GROUPS of the identified object. • Ensure the gap between PCTFREE and PCTUSED is not too small. • Ensure the next extent size is not too small. BBW contention for undo segment header (class #17) • **Applies to rollback segment, not the system-managed undo. • Create additional rollback segments. • Ensure the next extent size is not too small. BBW contention for undo blocks (class #18) • Application tuning.

  39. Handling Wait Events Free Buffer Waits wait is symptomatic of? • Small buffer cache. • Insufficient number of DBWR processes. • Inefficient SQL statement. • Slow I/O subsystem. • Delayed block cleanout. Free Buffer Waits Handling the Free Buffer Waits event requires you to know:- • The amount of time a session spent waiting on the event. • The SQL statements that are associated with the event. • The number of DBWR processes. • The I/O operation and database storage system.

  40. Handling Wait Events Free Buffer Waits Solutions: • Optimize the SQL statements. • Increase the number of DBWR processes. • Use appropriate I/O operation (async or sync). • Lower the FAST_START_MTTR_TARGET value. • Reduce the buffer cache size. • Increase the buffer cache size. • Pre-scan the table after each load.

  41. Handling Wait Events Log File Sync Log File Sync wait is symptomatic of? • Oversized log buffer. • High commit frequency. • Bad application. • Slow LGWR process. Handling the Log File Sync event requires you to know:- • The amount of time a session spent waiting on the event. • The type of job (batch or OLTP) that is associated with the event. Solution: • Reduce the commit frequency. Workarounds: • Reduce the log buffer size or lower the _LOG_IO_SIZE. • Increase LGWR I/O throughput.

  42. Handling Wait Events Enqueue Enqueue contention is symptomatic of? • Concurrent access to the DBMS_AQ package. • Concurrent transactions with incompatible lock requests for a database resource. • Concurrent transactions with incompatible lock requests for a latch. • Poor application design. Handling the Enqueue contention requires you to know:- • The type and mode of enqueue the sessions are competing for (All enqueues have independent wait event names in Oracle Database 10g). • The amount of time a session spent waiting on enqueues. • The SQL statement that is associated with the event.

  43. Handling Wait Events TX enqueue in mode 6 (Exclusive) Contention for the TX enqueue in mode 6 is for row-level locks. In Oracle Database 10g, this is “enq: TX – row lock contention”. Solutions: • Commit or rollback the transaction holding the lock. • Fix the application so that sessions don’t go after the same rows. Workaround: • None

  44. Handling Wait Events Contention for the TX enqueue in mode 4 can be due to: • ITL shortage - In Oracle Database 10g: “enq: TX – allocate ITL entry”) • Unique key enforcement • Bitmap index entry TX enqueue in mode 4 (Share) Solution depends on the object of contention: • Increase the number of INITRANS. • Prevent multiple sessions from inserting the same key value into a table. • Don’t use bitmap indexes.

  45. Handling Wait Events Contention for the TM enqueue in mode 3,4,5 is normally due to non-indexed foreign key columns. TM enqueue in mode 3,4,5 (Row-X, Share, Share Row_X) Solution: • Index the foreign key columns of the object identified by the TM enqueue.

  46. Handling Wait Events Chapters 5, 6, and 7 contain a detailed discussion of how to handle common Oracle wait events.

  47. Handling Wait Events Q Do you think you can use the information presented in this session to identify performance bottlenecks?

  48. Understanding Common Oracle Wait Events Q & A Kirtikumar_Deshpande@yahoo.com

More Related