Presentation Transcript


  1. Achieving Scalability, Performance and Availability on Linux with Oracle 9iR2-RAC
     Grant McAlister, Senior Database Engineer, Amazon.com
     Paper 32110

  2. Agenda • Why Oracle on Linux and RAC • The Tests • Scaling • Performance • Availability • Choice of Interconnect • Conclusion

  3. Why Linux • Lower Total Cost of Ownership • Near commodity hardware and support • Multiple O/S and hardware vendors • Common platform (IA-32) for entire enterprise • Unix look and feel • New enterprise kernel • No database conversions when changing Linux hardware or O/S

  4. Why RAC on Linux • Cost • Ability to use near-commodity systems (2-4 processors) • Lower level of support needed on system units • The need for availability • Young and rapidly evolving O/S • Near-commodity hardware and support • The need to scale the database beyond 8 processors • The need for large amounts of memory (> 32 GBytes)

  5. The Tests • Real-life workloads • Not modified or partitioned to support RAC • Used automatic space management • Workload #1 • Simple workload of small queries with little locking • Workload #2 • Typical nasty workload with many inserts, updates and selects for update, causing a lot of locking and blocking

  6. Workload #1 Single Instance Profile

     Load Profile                      Per Second   Per Transaction
     ------------                   -------------  ---------------
     Redo size:                         77,516.28         1,460.05
     Logical reads:                      4,134.57            77.88
     Block changes:                        462.54             8.71
     Physical reads:                       155.70             2.93
     Physical writes:                       27.14             0.51
     User calls:                        11,012.73           207.43
     Parses:                               432.50             8.15
     Sorts:                                187.32             3.53
     Executes:                             432.89             8.15
     Transactions:                          53.09

     % Blocks changed per Read:  11.19    Recursive Call %:            0.68
     Rollback per transaction %: 0.82     Rows per Sort:             353.26

     Top 5 Wait Events on a single instance
                                                                  Avg
                                                      Total Wait  wait   Waits
     Event                           Waits  Timeouts    Time (s)  (ms)    /txn
     ---------------------------- -------- --------- ---------- ------ -------
     db file sequential read       560,060         0      1,249      2     2.9
     log file sync                 180,813       494        676      4     0.9
     log file parallel write       188,017   181,946        143      1     1.0
     latch free                     87,584     6,309        141      2     0.5
     db file parallel write          5,794     2,895         14      2     0.0

  7. Workload #2 Single Instance Profile

     Load Profile                      Per Second   Per Transaction
     ------------                   -------------  ---------------
     Redo size:                        244,988.60         5,306.31
     Logical reads:                     14,562.36           315.41
     Block changes:                      1,802.47            39.04
     Physical reads:                       319.45             6.92
     Physical writes:                       91.52             1.98
     User calls:                         2,877.06            62.32
     Parses:                               457.06             9.90
     Sorts:                                290.13             6.28
     Executes:                             456.73             9.89
     Transactions:                          46.17

     % Blocks changed per Read:  12.38    Recursive Call %:            4.16
     Rollback per transaction %: 0.96     Rows per Sort:              13.09

     Top 5 wait events on a single instance
                                                                  Avg
                                                      Total Wait  wait   Waits
     Event                           Waits  Timeouts    Time (s)  (ms)    /txn
     ---------------------------- -------- --------- ---------- ------ -------
     db file sequential read       346,048         0      1,412      4     1.7
     enqueue                           177       119        369   2087     0.0
     free buffer waits                 752        32        348    463     0.0
     db file scattered read        141,564         0        325      2     0.7
     log file sync                 207,109        37        306      1     1.0

  8. The Hardware and Software • Software • Oracle 9.2.0.1 • Red Hat Advanced Server 2.1 (2.4.9-e.3) • Hardware • 3 types of clusters that each have 4 nodes • 2 Pentium III Xeon Processors @ 1.126GHz & 5 Gbytes of RAM • 2 Pentium 4 Xeon DP Processors @ 2.4GHz & 4 Gbytes of RAM • 4 Pentium 4 Xeon MP Processors @ 1.6GHz & 10 Gbytes of RAM • Database files were on raw partitions

  9. Scaling • The ability to produce higher transactional volumes when adding additional processors or additional nodes.
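     The definition above can be sketched as a quick calculation. This is an illustrative sketch only; the numbers below are made up, and the measured results are in the charts that follow.

     ```python
     def scaling_efficiency(base_tps, scaled_tps, base_nodes=1, scaled_nodes=4):
         """Fraction of ideal linear scaling achieved when the cluster grows.

         base_tps   -- transactional volume measured on base_nodes
         scaled_tps -- transactional volume measured on scaled_nodes
         """
         ideal = base_tps * (scaled_nodes / base_nodes)  # perfect linear scaling
         return scaled_tps / ideal

     # Hypothetical example: 50 txn/s on 1 node growing to 180 txn/s on 4 nodes
     # yields 180 / 200 = 0.9, i.e. 90% of linear scaling.
     efficiency = scaling_efficiency(50, 180)
     ```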

  10. Scaling of workload #1

  11. Scaling of workload #2

  12. Some workloads scale better

  13. Some of the differences • Top 5 workload #1 timed events vs. Top 5 workload #2 timed events (charts)

  14. Performance • The time taken to perform a query is important • Execution time influences transactional volume • Can cause dramatic changes in the end user response time • Stock Exchange • Internet Retailer • Bank • Only you know what is reasonable for your database and application

  15. Execution times for workload #1

  16. Execution times for workload #2

  17. Some ways to improve • Make sure your database is well tuned for single-instance operation • Consider using different block sizes for hot indexes • Hash-partition hot tables and indexes • Partition the workload

  18. Availability • Minimize failures by building clusters with as few single points of failure as possible. • Setup your RAC cluster to recover from node and instance failure as quickly as possible.

  19. Redundant RAC Configuration

  20. Instance recovery time • fast_start_mttr_target is the key
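     As a sketch, the parameter can be set with a standard ALTER SYSTEM statement; the value of 120 seconds here matches the example used on the next slide, and SCOPE = BOTH assumes the instance runs from an spfile.

     ```
     -- init.ora / spfile: bound crash-recovery time to roughly 2 minutes
     ALTER SYSTEM SET fast_start_mttr_target = 120 SCOPE = BOTH;
     ```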

  21. Node failure recovery time • Recovery Time = Failure detection + Instance recovery • Failure detection = (MissCount * 1 second) • The MissCount parameter is found in cmcfg.ora • With MissCount = 20 and fast_start_mttr_target = 120, all workload #2 processing resumed in less than 1 minute after crashing a node.
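     The formula above can be sketched as a worst-case bound. Note that fast_start_mttr_target is a ceiling on instance recovery, not a typical value, which is why the observed recovery (under 1 minute) beats the bound this sketch computes.

     ```python
     def worst_case_recovery_seconds(miss_count, fast_start_mttr_target):
         """Upper bound on recovery time after a node failure.

         Failure detection = MissCount * 1 second (per the slide above);
         instance recovery is bounded by fast_start_mttr_target.
         """
         detection = miss_count * 1          # seconds until the CM ejects the node
         return detection + fast_start_mttr_target

     # With the slide's settings: 20s detection + 120s recovery ceiling = 140s
     # worst case, while the measured recovery was under 60s.
     bound = worst_case_recovery_seconds(20, 120)
     ```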

  22. Impact of a single node failure • Chart timeline: Node failed → CM ejects node → Cluster reconfigured → Recovery complete

  23. Choice of Interconnect • 1000Mbit (Gigabit) Ethernet • Latency ~ 0.07 ms • Transfer Rate - 30+ MBytes per second • More expensive but becoming common with the advent of gigabit over copper. • 100Mbit Ethernet • Latency ~ 0.20 ms • Transfer Rate - 10 MBytes per second • Common and inexpensive
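     The latency and transfer-rate figures above can be combined into a rough per-block cost. This sketch assumes an 8 KB Oracle block shipped across the interconnect and ignores protocol overhead; it only illustrates why gigabit roughly triples the per-block speed.

     ```python
     def block_transfer_ms(latency_ms, mb_per_s, block_kb=8):
         """Approximate time to ship one database block across the interconnect."""
         transfer_ms = (block_kb / 1024) / mb_per_s * 1000  # serialization time
         return latency_ms + transfer_ms

     gigabit  = block_transfer_ms(0.07, 30)   # ~0.33 ms per 8 KB block
     fast_eth = block_transfer_ms(0.20, 10)   # ~0.98 ms per 8 KB block
     ```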

  24. 100Mbit vs. Gigabit

  25. Conclusions • RAC scaled at 90% on a simple workload • RAC scaled consistently at 55+% on a complex workload • There is an impact to query performance depending on your workload • You can recover from failures in less than 1 minute • When configured correctly a RAC cluster can scale, perform and be highly available.

  26. Questions & Answers
