1 / 51

Oracle Database 11g for Data Warehousing

Presenter’s Name Presenter’s Title. Oracle Database 11g for Data Warehousing. Agenda. Technology Monitoring Information Life-cycle Management (ILM) ‏ Oracle Optimized Warehouse Initiative Market. <Insert Picture Here>. <Insert Picture Here>. Technology. Parallel Execution.

adanna
Télécharger la présentation

Oracle Database 11g for Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Presenter’s Name Presenter’s Title Oracle Database 11g for Data Warehousing

  2. Agenda Technology Monitoring Information Life-cycle Management (ILM)‏ Oracle Optimized Warehouse Initiative Market <Insert Picture Here>

  3. <Insert Picture Here> Technology

  4. Parallel Execution select c.cust_last_name , sum(s.amount_sold)‏ from customers c, sales s where c.cust_id = s.cust_id group by c.cust_last_name ; Data on Disk Parallel Servers Coordinator join aggregate scan join aggregate scan aggregate join scan Scanners Joiners Aggregators

  5. Partitioning – Benefits ORDERS USA ORDERS ORDERS EUROPE FEB FEB JAN JAN Large Table Difficult to Manage Partition Divide and Conquer Easier to Manage Improve Performance Composite Partition Better Performance More flexibility to match business needs Transparent to applications

  6. Partitioning in Oracle Database 11gInterval Partitioning Partitions are created automatically as data arrives INVENTORY ORDERS ORDERS FEB MAR APR JAN FEB MAR JAN FEB JAN

  7. Partitioning in Oracle Database 11gComplete Composite Partitioning Range – range List – list List – hash List – range     ORDERS ORDERS >5000 Gold ORDERS >5000 1000-5000 Silver 1000-5000 EUROPE EUROPE FEB USA USA JAN LIST-RANGERegion byOrder Value RANGE-RANGEOrder Date by Order Value LIST-LISTRegion by Customer Type

  8. Partitioning in Oracle Database 11gReference Partitioning Inherit partitioning strategy LineItems StockHolds LineItems ORDERS PickLists StockHolds BackOrders ORDERS LineItems PickLists BackOrders StockHolds LineItems StockHolds ORDERS LineItems ORDERS PickLists PickLists StockHolds BackOrders ORDERS PickLists BackOrders BackOrders FEB JAN PartitionORDERSby Date APR MAR

  9. REGION AS (SUBSTR(ORDER_ID,6,2))------ US EU EU US US ORDERS USA EUROPE FEB JAN Partitioning in Oracle Database 11gVirtual Column-Based Partitioning • REGION requires no storage • Partition by ORDER_DATE, REGION ORDERS ORDER_ID ORDER_DATE CUSTOMER_ID... ---------- ----------- ----------- -- 9834-US-14 12-JAN-2007 65920 8300-EU-97 14-FEB-2007 39654 3886-EU-02 16-JAN-2007 4529 2566-US-94 19-JAN-2007 15327 3699-US-63 02-FEB-2007 18733

  10. Compression Tables and indexes can be compressed Can be specified on a per-partition basis Typical compression ratio 3:1 Requires more CPU to load data Decompression hardly costs resources Compress for all DML operations Less data on disk Requires less time to read Completely transparent Up To 3X Compression

  11. SQL Query Result Cache Store query results in cache Repetitive executions can use cached result Data Warehouse queries Long-running, IO-intensive Expensive computations Return few rows Excellent opportunity for SQL Query Result Cache ------------------------------------------------------------------ | Id | Operation | Name | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 1 | RESULT CACHE | fz6cm4jbpcwh48wcyk60m7qypu | | 2 | SORT GROUP BY ROLLUP | | |* 3 | HASH JOIN | | etc.

  12. SQL Query Result CacheOpportunity Retail customer data (~50 GB)‏ Concurrent users submitting queries randomly Executive dashboard with 12 heavy analytical queries Cache results only at in-line view level 12 queries run in random, different order – 4 queries cached Measure average, total response time for all users # Users No cache Cache Improvement 2 186 s 141 s 24% 4 267 s 201 s 25% 8 447 s 334 s 25%

  13. Other Performance FeaturesTransparent to Your Application Materialized Views Transparent rewrites of expensive queries Including rewrites on remote objects Incremental automatic refresh Bitmap Indexes Optimal storage Ideal for star or star look-a-like schemas SQL Access Advisor – based on workload Materialized view advice Index advice Partition advice

  14. Bring Algorithms to the DataNot Data to the Algorithms Analytic computations done in the database SQL Analytics OLAP Data Mining Statistics Scalability Security Backup & Recovery Simplicity SQL analytics OLAP Data Mining Statistics

  15. Native Support for Pivot and Unpivot SALESREP QU REVENUE ---------- -- ---------- 100 Q1 230 100 Q2 240 100 Q3 260 100 Q4 300 101 Q1 200 101 Q2 220 101 Q3 250 101 Q4 260 102 Q1 260 102 Q2 280 102 Q3 265 102 Q4 310 SALESREP Q1 Q2 Q3 Q4 ---------- ----- ----- ----- ----- 100 230 240 260 300 101 200 220 250 260 102 260 280 265 310

  16. Native Support for Pivot and Unpivot SALESREP QU REVENUE ---------- -- ---------- 100 Q1 230 100 Q2 240 100 Q3 260 100 Q4 300 101 Q1 200 101 Q2 220 101 Q3 250 101 Q4 260 102 Q1 260 102 Q2 280 102 Q3 265 102 Q4 310 QUARTERLY_SALES SALESREP Q1 Q2 Q3 Q4 ---------- ----- ----- ----- ----- 100 230 240 260 300 101 200 220 250 260 102 260 280 265 310 select * from quarterly_sales unpivot include nulls (revenue for quarter in (q1,q2,q3,q4))‏ order by salesrep, quarter ;

  17. Native Support for Pivot and Unpivot SALES_BY_QUARTER SALESREP QU REVENUE ---------- -- ---------- 100 Q1 230 100 Q2 240 100 Q3 160 100 Q4 90 100 Q3 100 100 Q4 140 100 Q4 70 101 Q1 200 101 Q2 220 101 Q3 250 101 Q4 260 102 Q1 260 SALESREP 'Q1' 'Q2' 'Q3' 'Q4' ---------- ----- ----- ----- ----- 100 230 240 260 300 101 200 220 250 260 102 260 280 265 310 select * from sales_by_quarter pivot (sum(revenue)‏ for quarter in ('Q1','Q2','Q3','Q4'))‏ order by salesrep ;

  18. Transform Data Where Data ResidesIn-database ETL technology Extract Load Transform Insert Data Pump Transportable Tablespaces Partition Exchange Loading Change Data Capture Distributed Queries SQL*Loader External Tables Table Functions Multi-Table Insert MERGE DML error logging

  19. Capture changes from [redo | archive] logs No changes to source applications Minimal performance impact on source applications Store changes in change tables Provide (bulk) SQL interface to change data Asynchronous Change Data Capture Time-based subscription windows Capture PMOPs DW Tables Change Data Read-consistent subscription Log Miner and Streams Transform SQL, PL/SQL,Java Oracle Database 11g Log files OLTP DB

  20. RAC – Scale Incrementally W o r k l o d 300% 200% 100% 3 6 9 12 15 18 21 24 Months

  21. Automatic Storage Management Storage pool for database files Load-balanced across disks Capacity on demand Add/remove storage on-line Automatic IO load balancing Fault tolerant, high performance Automatically mirrors and stripes Low cost No IO tuning required No volume manager or file system needed

  22. Mixed Workloads Concurrent small data loads and queries Looks like... OLTP Oracle's read consistency Readers never block writers Writers never block readers Queries are always consistent and auditable No deadlocks Introduced in Oracle V4 (1982) – major improvements in V6 (1988)‏ report Budget table update Rollback Segment Before Image update accurate report

  23. Database Resource Manager Protect the system pro-actively Maximum number of concurrent operations Priority-dependent maximum Degree Of Parallelism (DOP)‏ 20 users (DOP 10)‏ Sales Analysis High Priority 200 users (DOP 4)‏ Ad Hoc Reports Medium Priority 200 users (DOP 4)‏ ETL Jobs Low Priority

  24. Oracle Database Security       Protect stored data Protect data in transit Access Control Authorize Audit Authenticate Marketing Finance Sales   Identity Management

  25. Feature Usage for Large-Scale Data Warehouses Partitioning, parallelism, and compression are the foundation for large-scale data warehousing Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006

  26. <Insert Picture Here> Monitoring

  27. I/O MonitoringDatabase Control

  28. I/O MonitoringDatabase Control

  29. Parallel Execution MonitoringDatabase Control

  30. Near Real-Time SQL MonitoringComing in Grid Control

  31. Parallel SQL MonitoringComing in Grid Control

  32. <Insert Picture Here> Information Life-cycle Management (ILM)‏

  33. Information Lifecycle Management Active Data Less Active Data Historical Data “The policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition.” Storage Networking Industry Association (SNIA) Data Management Forum

  34. Information Lifecycle Management Orders Q1 Orders Q2 Orders Q3 Orders Q4 Orders Older Orders Active High Performance Storage Tier Less Active Low CostStorage Tier Historical Online ArchiveStorage Tier

  35. Traditional Storage ApproachAll data resides on a single storage tier Active Active High Performance Storage Tier = $72 per Gb All data on active = $972,000!

  36. Partitioning is the Foundation for ILMPartition data onto appropriate storage tier Low cost Storage Tier = $14 per Gb Read only Storage Tier = $7 per Gb High Performance Storage Tier = $72 per Gb Active Historical Less Active

  37. Partitioning is the Foundation for ILMMove data onto appropriate storage tier Low cost Storage Tier = $14 per Gb Read only Storage Tier = $7 per Gb High Performance Storage Tier = $72 per Gb 5% Active 60% Historical 35% Less Active

  38. Partitioning is the Foundation for ILMReduce storage costs accordingly Low cost Storage Tier = $14 per Gb Read only Storage Tier = $7 per Gb High Performance Storage Tier = $72 per Gb $49,800 $67,700 $58,000 5% Active 60% Historical 35% Less Active

  39. Introduce CompressionReduce storage costs across all tiers $49,800 $67,700 $58,000 $16,600 $22,600 $19,400 5% Active 60% Historical 35% Less Active Lets use compression factor of 3

  40. Cost Savings by Storage Tier

  41. <Insert Picture Here> Oracle Optimized Warehouse Initiative

  42. Oracle Optimized Warehouse Initiative Goals for Oracle data warehouse solutions: Provide superior system performance Provide a superior customer experience

  43. Full Range of DW Solution Options ReferenceConfiguration OptimizedWarehouse • Scalable systems pre-installed and pre-configured: ready to run out-of-the-box • Benefits: • High performance • Simple to buy • Fast to implement • Easy to maintain • Competitively priced • Documented best-practice configurations for data warehousing • Benefits: • High performance • Simple to scale; modular building blocks • Industry-leading database and hardware • Available today with HP, IBM, Sun, EMC/Dell • Partitioning • RAC • Database Options • Management Packs Pre-configured, Pre-installed, Validated Flexibility Custom • Flexibility for the most demanding data warehouse • Benefits: • High performance • Unlimited scalability • Completely customizable • Industry-leading database and hardware • Database Options • Management Packs

  44. <Insert Picture Here> Market

  45. Data Warehouse Market Oracle is the Data Warehousing DBMS Market Leader Source: IDC, 2006 - Worldwide Data Warehousing Tools 2005 Vendor Shares

  46. Leading ScalabilityWintercorp VLDB Survey 1998 Survey 2003 Survey 2005 Survey Sears Teradata 4.63 HCIA Informix 4.50 Wal-Mart Teradata 4.42 Tele Danmark DB2 2.84 Citicorp DB2 2.47 MCI Informix 1.88 NDC Health Oracle 1.85 Sprint Teradata 1.30 Ford Oracle 1.20 Acxiom Oracle 1.13 France Telecom Oracle 29.23 AT&T Proprietary 26.27 SBC Teradata 24.81 Anonymous DB2 16.19 Amazon.com Oracle 13.00 Kmart Teradata 12.59 Claria Oracle 12.10 HIRA Sybase IQ 11.94 FedEx Teradata 9.98 Vodafone Gmbh Teradata 9.91 Yahoo! Oracle 100.39 AT&T Daytona 93.88 KT-IT Group DB2 49.40 AT&T Daytona 26.71 LGR - Cingular Oracle 25.20 Amazon.com Oracle 24.77 Anonymous DB2 19.65 UPSS Microsoft 19.47 Amazon.com Oracle 18.56 Nielsen Media Sybase IQ 17.69 Source: http://www.wintercorp.com

  47. Oracle DW 10+TB Customers (3/2006)Various Platforms and Architectures Acxiom 16 TB HP Allstate 15 TB Sun (RAC)‏ Amazon 61 TB HP (RAC)‏ Cellcom 14 TB HP CenturyTel 10 TB HP Chase 30 TB IBM (RAC)‏ Choicepoint 14 TB Sun Claria 38 TB Sun Experian 14 TB Sun KTF 14 TB HP Cingular 25 TB HP • Mastercard 20 TB IBM (RAC)‏ • NASDAQ 35 TB Sun • NexTel 28 TB HP • NYSE Group 15 TB HP (RAC)‏ • Reliance Ltd 13 TB Sun • Starwood 12 TB HP • TIM (Italy) 12 TB HP (RAC)‏ • Turkcell 14 TB Sun (RAC)‏ • UBS AG 15 TB Sun • UPS 10 TB HP • Yahoo! 130 TB Fujitsu Hundreds of Terabyte+ DW Customers!

  48. Summary Technology Monitoring Information Life-cycle Management (ILM)‏ Oracle Optimized Warehouse Initiative Market <Insert Picture Here>

  49. For More Information http://search.oracle.com BI & Data Warehousing or oracle.com

More Related