1 / 45

Partitioning with Oracle 11G

Partitioning with Oracle 11G. Bert Scalzo, Domain Expert, Oracle Solutions Bert.Scalzo@Quest.com. About the Author …. Domain Expert & Product Architect for Quest Software Oracle Background: Worked with Oracle databases for over two decades (starting with version 4)

lee-brennan
Télécharger la présentation

Partitioning with Oracle 11G

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Partitioning with Oracle 11G Bert Scalzo, Domain Expert, Oracle Solutions Bert.Scalzo@Quest.com

  2. About the Author … Domain Expert & Product Architect for Quest Software Oracle Background: • Worked with Oracle databases for over two decades (starting with version 4) • Work history includes time at both “Oracle Education” and “Oracle Consulting” Academic Background: • Several Oracle Masters certifications • BS, MS and PhD in Computer Science • MBA (general business) • Several insurance industry designations Key Interests: • Data Modeling • Database Benchmarking • Database Tuning & Optimization • "Star Schema" Data Warehouses • Oracle on Linux – and specifically: RAC on Linux This presentation draws heavily on these areas • Articles for: • Oracle’s Technology Network (OTN) • Oracle Magazine, • Oracle Informant • PC Week (eWeek) • Articles for: • Dell Power Solutions Magazine • The Linux Journal • www.linux.com • www.orafaq.com

  3. Books by Author … Coming in 2008 …

  4. Agenda • Partitioning Benefits • Partitioning History • Partitioning Options • Partitioning Advisor (if you’re licensed) • Typical Data Warehousing Environment • TPC-H “Data Warehouse” Benchmark • Results TPC-H with Various Partition Strategies • What about OLTP Environments and the TPC-C/E • Lessons Learned (and their relevance/application) • Questions & Answers

  5. Partitioning Benefits: Facts • Manageability • Classic “Divide & Conquer” technique • More granular storage allocation options • Keeps otherwise time consumptive options viable • Availability • More granular online/offline options • More granular rebuild/reorganization options • More granular object level backup/restore options • Capacity Management • Enables a “Tiered Storage Architecture” approach • More granular storage cost management decision points • Performance • Partition Pruning • Partition-Wise Joins

  6. Partitioning Benefits: Opinion (Mine) • Manageability 40% • Availability 20% • Capacity Management 20% • Performance 20% • Don’t over-sell/over-expect the performance aspect • Need to experiment for best approach for a database • Better to take longer at the start to get right, because very often it’s far too expensive to change afterwards • Examples demonstrate very positive performance, but better to be conservative and error on the side of caution – then be very pleasantly surprised… Why to Partition

  7. Partition Pruning (Restriction Based) • From Docs: In partition pruning, the optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle Database to perform operations only on those partitions that are relevant ... • “Divide and Conquer” for performance • Sometimes can yield order of magnitude improvement • But once again, best not to oversell and/or over-expect • Some Potential Issues to be aware of: • SQL*Plus Auto-Trace can sometimes miss partition pruning • “Old Style” Explain Plans via simple SELECT has issues too • Best to always use DBMS_XPLAN and/or SQL_TRACE Note: Trace file analysis much easier these days – SQL Developer + free Hotsos plug-in, metalink trace analysis scripts, Quest Toad DBA

  8. Partition-Wise Join (Multi-Object Based) • From Docs: Partition-wise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time & improves the use of both CPU & memory resources. • Different Flavors: • Full – Single to Single • Full – Composite to Single • Full – Composite to Composite • Partial – Single • Partial – Composite • Indexing Strategy Counts • Local Prefixed/Non-Prefixed • Global All of these affect the explain plan

  9. Picture Worth 1000 Words (from Docs) Simple Mantra: Subdivide the work into equally paired chunks, then perform all that work using many parallel processes Make sure not to over-allocate CPU’s – remember there will also be concurrent workload

  10. Partitioning History (from Oracle 11G training+) Oracle 5 Before Tablespaces – we had partitions  Oracle 7 Partition Views – really more of a cheat 

  11. Partitioning Options – Part 1 IOT’s can be partitioned as well in later versions of Oracle, so the basic choices are even more complex than this…

  12. Partitioning Options – Part 2 Prior to 11G: Oracle White Paper: 2007 Partitioning in Oracle Database 11g

  13. Partitioning Options – Part 3 Post 11G: Oracle White Paper: 2007 Partitioning in Oracle Database 11g Very exciting new options…

  14. Partitioning Advisor (if you’re licensed) Advisor Central -> SQL Advisors -> SQL Access Advisor

  15. Typical Data Warehouse Architecture TPC-H

  16. Typical Environments We’ll come back to this picture

  17. TPC-H Benchmark • Industry Standard “Data Warehouse” Benchmark • URL: www.tpc.org/tpch • Spec: http://tpc.org/tpch/spec/tpch2.7.0.pdf • 8 Tables • 22 Queries (answer complex business questions) • Database scaling: • Factor = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000 • Size GB = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000

  18. TPC-H Data Model Sub-Partitions SF * 6,000,000 SF * 200,000 SF * 800,000 SF * 10,000 Partitions 25 5 SF * 150,000 SF * 1,500,000

  19. TPC-H Permits Partitioning … But what to do, what to do ???

  20. Disclosure Reports http://tpc.org/tpch/results/tpch_perf_results.asp

  21. Disclosure Report – Lots of Info This is where people document exactly what advanced database feature and storage parameters they used – info is invaluable 

  22. Disclosure Report – Appendix B

  23. Sample Expensive Query

  24. Example Explain Plan

  25. Example Explain Plan Explain complete. Plan hash value: 2545634784 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 42533 | 5648K| | 641K (1)| 01:57:41 | | 1 | SORT GROUP BY | | 42533 | 5648K| 105M| 641K (1)| 01:57:41 | |* 2 | HASH JOIN | | 715K| 92M| | 631K (1)| 01:55:51 | | 3 | TABLE ACCESS FULL | H_NATION | 25 | 725 | | 3 (0)| 00:00:01 | |* 4 | HASH JOIN | | 715K| 72M| | 631K (1)| 01:55:51 | | 5 | TABLE ACCESS FULL | H_SUPPLIER | 100K| 781K| | 646 (1)| 00:00:08 | |* 6 | HASH JOIN | | 720K| 68M| 68M| 631K (1)| 01:55:44 | |* 7 | HASH JOIN | | 751K| 59M| 232M| 589K (1)| 01:48:10 | |* 8 | HASH JOIN | | 3004K| 197M| 4984K| 485K (1)| 01:28:57 | |* 9 | TABLE ACCESS FULL| H_PART | 100K| 3808K| | 11805 (1)| 00:02:10 | | 10 | TABLE ACCESS FULL| H_LINEITEM | 60M| 1716M| | 342K (1)| 01:02:52 | | 11 | TABLE ACCESS FULL | H_ORDER | 15M| 200M| | 72122 (1)| 00:13:14 | | 12 | TABLE ACCESS FULL | H_PARTSUPP | 8000K| 122M| | 25945 (1)| 00:04:46 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("S_NATIONKEY"="N_NATIONKEY") 4 - access("S_SUPPKEY"="L_SUPPKEY") 6 - access("PS_SUPPKEY"="L_SUPPKEY" AND "PS_PARTKEY"="L_PARTKEY") 7 - access("O_ORDERKEY"="L_ORDERKEY") 8 - access("P_PARTKEY"="L_PARTKEY") 9 - filter("P_NAME" LIKE :SYS_B_2)

  26. Method of Attack • Since many data warehouses are utilized for data mining, we can’t always know every possible query likely to run – thus aggregate measure for success • Thus we’ll compare the benchmark’s weighted performance scores for the TPC-H using various partitioning schemes (all within spec of course…) • Goal will be to find the best overall partitioning … • Then we’ll examine some specific explain plans …

  27. 10G Sample Test cases • 10G Simple Approach (Just Huge Tables) • Range: ORDER (order date) • Hash: LINEITEM (order key) • 10G Basic Approach (Single Level Partitions) • Range: ORDER (order date) • Hash: LINEITEM (order key) • List: CUSTOMER (nation key) • Hash: PART, SUPPLIER and PARTSUPP (part & supp keys) • 10G Complex Approach (Composite Partitions) • Range-Hash: ORDER (order date & cust key) • Multi-Hash: LINEITEM (part, supp & order keys) • List: CUSTOMER (nation key) • Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)

  28. 11G Sample Test cases • 11G Simple Approach (+Interval) • Interval-Hash: ORDER (order date & cust key) • Multi-Hash: LINEITEM (part, supp & order keys) • List: CUSTOMER (nation key) • Hash: PART, SUPPLIER and PARTSUPP (part & supp keys) • 11G Basic Approach (+Virtual) • Interval-Hash: ORDER (virtualized order date & cust key) • Multi-Hash: LINEITEM (part, supp & order keys) • List: CUSTOMER (nation key) • Hash: PART, SUPPLIER and PARTSUPP (part & supp keys) • 11G Complex Approach (+REF) • Interval-Hash: ORDER (virtualized order date & cust key) • REF: LINEITEM (order key) • List: CUSTOMER (nation key) • Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)

  29. Is that It? • No – just six very obvious high-level scenarios • Your selections and actual mileage will vary • Experimentation usually yields the best results • Always trust “empirical results” over conjecture • So improved response-time beats better explain plan • Remember, DW’s usually have unpredictable queries • So don’t tune for just a few queries, look for the best overall and/or more generic performance solution 

  30. Intermediate Results can be Misleading  TPC-H Power score seemingly implies that every partitioning schema is incrementally better TPC-H Throughput score seems to show that non-partitioned is equal to the best

  31. Final Results tell the Real Truth  TPC-H Query/Hour score shows that some partitioning schemes better, and some not TPC-H $/Query/Hour score confirms the inverse in terms of dollars per unit of work

  32. Why such seemingly Opposite Results ??? • Run times and explain plans apply to single measurable operation • Even aggregate & averaged run times don’t relate the entire truth! • Need answer based upon sound mathematics (reliable & repeatable)

  33. TPC-C Benchmark • Historical Industry Standard “OLTP” Benchmark • URL: www.tpc.org/tpchc • Spec: http://tpc.org/tpcc/spec/tpcc_current.pdf • Probably the most used & widely quoted benchmark • But suffers from overly simplistic design & code logic • Generally considered unreliable with modern RDBMS • But still a decent rough “sounding board” for many …. • Being replaced by the newer TPC-E (later slides)

  34. TPC-C Data Model # Terminals/Warehouse (i.e. concurrent users) Base Scaling Unit Clustered Partitions Sub-Partitions

  35. TPC-E Benchmark • Emerging Industry Standard “OLTP” Benchmark • URL: www.tpc.org/tpche • Spec: http://tpc.org/tpce/spec/TPCE-v1.5.1.pdf • Very new and still evolving – but highly promising • Not too many published TPC-E results as of yet … • Design not compromised by RDBMS features • Much more realistic (i.e. real world) in nature  • Nowhere near as easy as the old TPC-C test 

  36. TPC-E Data Model

  37. TPS is Moot, Average Response Time is King tpmC 584 582 579 578 But wait: adding cluster &partitioning yields negative – why ??? Look to Stats Pack, AWR and ADDM Reports to investigate…

  38. Single block reads 32 ms! DISTRICT Table needs clustered

  39. Suggest fix one major item per test iteration, so made choice to address this 1st Single block read 32 ms  Clustering worked, it made SQL the #1 performance issue – as was expected Single block read 5 ms  Partitioning did not shine through just yet, possibly skewed by the first issue

  40. Switch MEMORY_TARGET to SGA/PGA_TARGETS -13% Notice that manual memory management resulted in 13% gain !!! But wait, there’s more (isn’t that almost always the case) …

  41. Now it’s all just SQL, so time for SQL Tuning Advisor & SQL Tuning Sets

  42. SQL Tuning Sets & 11G Results/Client Cache -9% Going to stop – Have quickly reached the “good enough” point 

  43. Architecture Findings Mostly Partition Elimination Mostly Partition Wise Join Design to eliminate per object Design to parallelize across objects

  44. Other Interesting Findings • 64-bit scales much more reliably than 32-bit, even when on same hardware and using the <= 4GB memory model • If you know the application code’s nature and it’s well definable, manual memory management may be better • Using manual SGA/PGA targets with floors yields much more scalable results and also more predictable patterns • Partitioning is not an automatic bonus, must experiment to identify the optimal partitioning scheme per situation • Don’t forget older technologies like clusters, they still can add a positive to the overall equation in certain cases • Don’t forget SQL Tuning Sets & SQL Advisor, or the “explain plans” may not in fact be the best obtainable • Don’t forget 11G’s Result and Client Caches

  45. Thank you Questions and Answers … Presenter: Bert Scalzo E-mail: Bert.Scalzo@Quest.com Note: these slides should be available on Open World web site, but I’ll also make sure to post them on my company’s web site: www.toadworld.com/Experts/BertScalzosToadFanaticism/tabid/318/Default.aspx

More Related