1 / 17

Data Warehousing

Data Warehousing . Databases support:. Transaction Processing Systems operational level decision recording of transactions Decision Support Systems tactical and strategic decision making analysis of historical records. Can one database support both?. RDBMS. DSS. TPS.

lana
Télécharger la présentation

Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehousing

  2. Databases support: • Transaction Processing Systems • operational level decision • recording of transactions • Decision Support Systems • tactical and strategic decision making • analysis of historical records

  3. Can one database support both? RDBMS DSS TPS

  4. Can one database support both? RDBMS DSS TPS • low concurrency • large reads • significant aggregation • high concurrency • small transactions • limited aggregation Yes… but at a cost in performance.

  5. The Solution… TPS DSS Production Database (OLTP) Data Warehouse Extract, Transport & Transformation Load

  6. OLTP vs DW Characteristics OLTP Database Data Warehouse High Read/Write Concurrency Primarily Read Only Highly Normalized Highly Denormalized Limited Transaction History Massive Transaction History Very Detailed Data Detailed and Summarized Data Limited External Data Significant External Data

  7. Data Marts (3-tier approach) External Data Sources Data Mart A DSS Data Warehouse Production Database (OLTP) Data Mart B DSS ETL Data Mart C Transformation & Limitation DSS

  8. Data Marts (bottom-up approach) External Data Sources Data Mart A DSS ETL Production Database (OLTP) External Data Sources Data Mart B ETL DSS ETL Data Mart C DSS External Data Sources

  9. Multi-dimensional (Sales) Data 80 110 60 25 California 40 90 50 30 Utah 70 55 60 35 March 3 Arizona March 2 March 1 Diet Soda Lime Soda Soda Orange Soda

  10. Cube Operations • Cube (group by option) • Slice (implement in Oracle with where clause) • Dice (implement in Oracle with where clause) • Drill Down (implemented in report writers) • Roll-up (group by option) • Pivot (not implemented by Oracle (but by Access))

  11. Cube Data Example Create table sales ( Item varchar2(20), State varchar2(20), Amount number(6), Day date); Insert into Sales values('Soda','California',80,'01-Mar-2004'); Insert into Sales values('Diet Soda','California',110,'01-Mar-2004'); …

  12. Examine these queries Select * from sales; Select Item, State, sum(amount) from sales group by Item, State; Select Item, State, sum(amount) from sales group by Rollup(Item, State); Select State, Item, sum(amount) from sales group by Rollup(State, Item); Select State, Item, sum(amount) from sales group by Cube(State, Item);

  13. Materialized Views • Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. They are suitable in various computing environments such as data warehousing, decision support, and distributed or mobile computing: • In data warehouses, materialized views are used to precompute and store aggregated data such as sums and averages. Materialized views in these environments are typically referred to as summaries because they store summarized data. • Cost-based optimization can use materialized views to improve query performance by automatically recognizing when a materialized view can and should be used to satisfy a request. The optimizer transparently rewrites the request to use the materialized view. Queries are then directed to the materialized view and not to the underlying detail tables or views. • In distributed environments, materialized views are used to replicate data at distributed sites and synchronize updates done at several sites with conflict resolution methods. The materialized views as replicas provide local access to data that otherwise has to be accessed from remote sites. • In mobile computing environments, materialized views are used to download a subset of data from central servers to mobile clients, with periodic refreshes from the central servers and propagation of updates by clients back to the central servers.

  14. Create Materialized View (partial syntax)

  15. Materialized View refresh_clause

  16. MV Example Create Materialized View MVcustomer REFRESH start with sysdate Next sysdate+(1/24) AS Select customerID,lastname,firstname, phone from customers;

  17. RDBMS Star Schema Item Store ItemID StoreID Name Manager Sales UnitPrice Street SalesNO Brand City SalesUnits Category Zip SalesDollars SalesCost ItemID Customer Day CustID CustID DayID StoreID Name DayOfMonth DayID Phone Month Street Year City DayOfWeek

More Related