Data Warehousing - PowerPoint PPT Presentation

michi
data warehousing n.
Presentation Transcript

  1. Data Warehousing Lecture-4 Introduction and Background

  2. Introduction and Background

  3. How is it Different? • Starts with a 6x12 availability requirement, but 7x24 usually becomes the goal. • Decision makers typically don’t work 24 hours a day, 7 days a week; an ATM system does. • Once decision makers start using the DWH and reaping its benefits, they come to like it. • They use the DWH more and more often, until they want it available 100% of the time.

  4. How is it Different? • Starts with a 6x12 availability requirement, but 7x24 usually becomes the goal. • For a business that operates across the globe, 50% of the world may be asleep at any one time, but the business is up 100% of the time. • 100% availability is not a trivial task; loading strategies, refresh rates, etc. must be taken into account.
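
To put the two availability targets from slides 3 and 4 in numbers, the short calculation below compares them. It reads "6x12" as 6 days a week, 12 hours a day and "7x24" as 7 days a week, 24 hours a day; that reading is an assumption, since the slides do not spell it out, and the function name is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a full week


def weekly_hours(days: int, hours_per_day: int) -> int:
    """Hours per week the system must be available for a 'days x hours' target."""
    return days * hours_per_day


initial = weekly_hours(6, 12)  # the starting 6x12 requirement
goal = weekly_hours(7, 24)     # the eventual 7x24 goal

# Time left each week for loading and refreshing while the system may be offline.
offline_window_initial = HOURS_PER_WEEK - initial
offline_window_goal = HOURS_PER_WEEK - goal

print(f"6x12: {initial} h/week up, {offline_window_initial} h/week free for loads")
print(f"7x24: {goal} h/week up, {offline_window_goal} h/week free for loads")
```

Under this reading, moving to 7x24 removes the offline window entirely, which is why loading strategies and refresh rates have to be planned around a live system.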

  5. How is it Different? (Requirements → Program) • Does not follow the traditional development model • Classical SDLC: • Requirements gathering • Analysis • Design • Programming • Testing • Integration • Implementation

  6. How is it Different? (DWH: Program → Requirements) • Does not follow the traditional development model • DWH SDLC (CLDS): • Implement warehouse • Integrate data • Test for bias • Program w.r.t. data • Design DSS system • Analyze results • Understand requirements

  7. Data Warehouse Vs. OLTP • OLTP (On-Line Transaction Processing): SELECT tx_date, balance FROM tx_table WHERE account_ID = 23876;

  8. Data Warehouse Vs. OLTP • DWH: SELECT balance, age, sal, gender FROM customer_table, tx_table WHERE age BETWEEN 30 AND 40 AND Education = 'graduate' AND customer_table.CustID = tx_table.Customer_ID;
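
The two queries on slides 7 and 8 can be run side by side as a minimal sketch using Python's built-in `sqlite3` module. The table layouts and sample rows below are assumptions made for illustration; the slides give only the column names. An `ORDER BY` is added to each query so the output order is predictable, and the join is written as `table.column`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows; only the column names come from the slides.
conn.executescript("""
CREATE TABLE customer_table (
    CustID INTEGER PRIMARY KEY, age INTEGER, sal REAL,
    gender TEXT, Education TEXT
);
CREATE TABLE tx_table (
    account_ID INTEGER, Customer_ID INTEGER, tx_date TEXT, balance REAL
);
INSERT INTO customer_table VALUES
    (1, 35, 50000, 'F', 'graduate'),
    (2, 52, 70000, 'M', 'graduate'),
    (3, 31, 42000, 'M', 'undergraduate');
INSERT INTO tx_table VALUES
    (23876, 1, '2024-01-05', 1200.0),
    (23876, 1, '2024-01-09', 900.0),
    (11111, 2, '2024-01-07', 300.0),
    (22222, 3, '2024-01-08', 150.0);
""")

# OLTP style: look up the records of one known account.
oltp_rows = conn.execute(
    "SELECT tx_date, balance FROM tx_table "
    "WHERE account_ID = 23876 ORDER BY tx_date"
).fetchall()

# DWH style: scan and join across all customers that match a profile.
dwh_rows = conn.execute(
    "SELECT balance, age, sal, gender "
    "FROM customer_table, tx_table "
    "WHERE age BETWEEN 30 AND 40 "
    "AND Education = 'graduate' "
    "AND customer_table.CustID = tx_table.Customer_ID "
    "ORDER BY balance"
).fetchall()

print(oltp_rows)
print(dwh_rows)
```

The OLTP query touches a handful of rows located by a key, while the DWH query has to examine every customer and join to their transactions, which is the contrast the two slides draw.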

  9. Data Warehouse Vs. OLTP

  10. Data Warehouse Vs. OLTP OLTP: OnLine Transaction Processing (MIS or Database System)

  11. Comparison of Response Times • On-line analytical processing (OLAP) queries must be executed in a small number of seconds. • This often requires denormalization and/or sampling. • Complex query scripts and large list selections can generally be executed in a small number of minutes. • Sophisticated clustering algorithms (e.g., data mining) can generally be executed in a small number of hours (even for hundreds of thousands of customers).
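
As an illustration of the sampling idea mentioned on slide 11, the sketch below estimates an average from a 1% random sample instead of scanning every row. The data is synthetic and the function name is hypothetical; it shows the general trade of accuracy for speed, not any particular OLAP product's method.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for a large fact table: one balance per customer.
balances = [rng.uniform(0, 1000) for _ in range(100_000)]


def sampled_mean(values, fraction, rng):
    """Estimate the mean from a simple random sample of the given fraction."""
    sample_size = max(1, int(len(values) * fraction))
    return statistics.fmean(rng.sample(values, sample_size))


exact = statistics.fmean(balances)            # full scan of all rows
estimate = sampled_mean(balances, 0.01, rng)  # reads only 1% of the rows

print(f"exact mean:    {exact:.2f}")
print(f"sampled mean:  {estimate:.2f}")
print(f"absolute diff: {abs(exact - estimate):.2f}")
```

The sampled answer is approximate, so this approach suits exploratory OLAP questions where a fast, close estimate is more useful than an exact figure that arrives late.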

  12. Putting the pieces together [Architecture diagram] • Data sources (Tier 0): operational databases, semistructured sources, www data, archived data • Extract Transform Load (ETL) • Data Warehouse Server (Tier 1): data warehouse, data marts, meta data • OLAP Servers (Tier 2): MOLAP, ROLAP • Clients (Tier 3): query/reporting, analysis, and data mining tools, used by business users and IT users
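
The Extract Transform Load (ETL) step named on slide 12 can be sketched in a few lines. Everything below is an illustrative assumption: the source rows, the `customer_fact` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse server.

```python
import sqlite3

# Hypothetical rows as they might arrive from an operational source.
source_rows = [
    {"cust_id": "1", "gender": "f", "balance": "1200.50"},
    {"cust_id": "2", "gender": "M", "balance": "300"},
    {"cust_id": "", "gender": "m", "balance": "75"},  # missing key
]


def extract():
    """Extract: read the raw rows from the source."""
    return list(source_rows)


def transform(rows):
    """Transform: drop rows without a key and standardize types and codes."""
    clean = []
    for row in rows:
        if not row["cust_id"]:
            continue  # reject rows that cannot be identified
        clean.append(
            (int(row["cust_id"]), row["gender"].upper(), float(row["balance"]))
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_fact "
        "(cust_id INTEGER, gender TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO customer_fact VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

loaded = warehouse.execute(
    "SELECT cust_id, gender, balance FROM customer_fact ORDER BY cust_id"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions mirrors the diagram: the sources stay untouched, cleaning happens in between, and only consistent rows reach the warehouse.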