1 / 40

How to design a DATA WAREHOUSE

Linh Nguyen (Elly). How to design a DATA WAREHOUSE. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers. Agenda. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers. Agenda.

ilori
Télécharger la présentation

How to design a DATA WAREHOUSE

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Linh Nguyen (Elly) How to design aDATA WAREHOUSE

  2. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  3. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  4. What is DW? Be a relational database organized to hold information in a structure that best supports reporting and analysis • Subject oriented • Time variant • Non-volatile: stable • Integrated: from multi source

  5. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  6. DW architecture

  7. DW architecture (cont)

  8. Data marts DW architecture (cont)

  9. DW architecture (cont) • Slowly changing dimensions (SCD) • Dimension (table) types • Fact (table) types • Scheme types

  10. DW architecture (cont) • Slowly changing dimensions (SCD) • There are 6 types of SCD, they are type 0,1,2,3,4,6 • The most common SCDs are Types 1, 2, and 3

  11. DW architecture (cont) • I.1. SCD Type 0 • Do not manage a slowly changing dimension. • Values remain as they were at the time the dimension record was first entered. • Surrogate key is not required. • Be easy to maintain.

  12. DW architecture (cont) • I.2. SCD Type 1 • Do not track historical data at all. • Overwrite old data with new data. • Surrogate key is not required. • Be easy to maintain.

  13. DW architecture (cont) • I.3. SCD Type 2 • Create multiple records for a given natural key with separate surrogate key • New record is inserted each time a change is made • Should not be used if dimensional model is subject to change

  14. DW architecture (cont) • I.4. SCD Type 3 • It tracks changes using separate columns • Type 3 has limited history preservation depending on the number of columns designated for storing historical data

  15. DW architecture (cont) • I.5. SCD Type 4 • There are two tables storing data of a dimension • One table keeps the current data • Additional table is used to keep a record of changes

  16. DW architecture (cont) • I.6. SCD Type 6 (Hybrid) • Type 6 combines the approaches of types 1, 2 and 3 (1+2+3=6)

  17. DW architecture (cont) • II. Dimensions Types • Conformed dimension • Junk dimension • Degenerate dimension • Role-playing dimension

  18. DW architecture (cont) • II.1. Conformed dimension • Be a dimension table which can be shared by multiple fact tables • Have a single meaning or content through data warehouse

  19. DW architecture (cont) • II.2. Junk dimension • Be a collection of random/miscellaneous attributes that are unrelated to any particular dimension and don't fit into tight star scheme to: • Reduce number of dimension tables • Decrease number of columns in fact tables

  20. DW architecture (cont) • II.3 Degenerate dimension • Any values in the fact table that don’t join to dimensions are either considered degenerate dimensions or measures • Degenerate dimension can be rolled up from Facts within SSAS

  21. DW architecture (cont) • II.4. Role-playing dimension • Is a dimension which can play different roles in a fact table • For example Date dimension which can be used for Order date, Shipping date, Due date...

  22. DW architecture (cont) • III. Fact Types • Consist of the measurements, metrics or facts of business process (the one contains no measures or facts is factless fact table) • Measure types: • Additive type: can be summed up through all of the dimensions • Semi Additive type: can be summed up for some of the dimensions • Non Additive type: can not be summed up for any of the dimensions • Fact data can be at detail level or aggregated • Fact types: • Transactional: the most detailed data held in the table • Periodic snapshots: it takes a 'picture of the moment – period of time' of data from the transactional table • Accumulating snapshots: store the activity of a process from beginning to end

  23. DW architecture (cont) • IV. Scheme Types • Start Scheme • SnowFlake Scheme

  24. DW architecture (cont) • IV.1. Star schema Design

  25. DW architecture (cont) • IV.1. Advantages of Star schema • be very easy to understand, even for non technical business managers • provide better performance and smaller query times • be easily extensible and will handle future changes easily

  26. DW architecture (cont) • IV.2. Snowflake schema Design

  27. DW architecture (cont) • IV.2. Advantages of Snowflake schema • Save storage space • Reduce processing time in some cases

  28. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  29. Analyze requirements*** Identify the Grain Identify the Dimensions Identify the Facts How to design a DW?

  30. How to design a DW ?(cont) • Recommendation • Use surrogate key instead of natural key, PK is required in SSAS • Should not create foreign key with check option • Avoid NULL value of surrogate key on fact table • Should assign a special surrogate key • Pay much consider on many-to-many relationship

  31. How to design a DW ?(cont) • Demo Star Scheme • Demo SnowFlake Scheme

  32. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  33. •Enhances end-user access to a wide variety of data. •Increases data consistency. •Increases productivity and decreases computing costs. Advantage of DW

  34. •Be able to combine data from different sources, in one place. •Can record historical information for data source tables that are not set up to save an update history. Advantage of DW (cont)

  35. What is DW? DW architecture How to design a DW? Advantages of DW Disadvantages of DW Questions and Answers Agenda

  36. •Extracting, cleaning and loading data could be time consuming. •Data warehousing project scope might increase. •Problems with compatibility with systems already in place e.g. transaction processing system. Disadvantage of DW

  37. •Providing training to end-users, who end up not using the data warehouse. • Security could develop into a serious issue, especially if the data warehouse is web accessible. •A data warehouse is a HIGH maintenance system. Disadvantage of DW (cont)

  38. What is DW? DW architecture How to design a DW? Advantage of DW Disadvantage of DW Questions and Answers Agenda

  39. Thank you for your attendant QA

  40. Reference • The Data Warehouse Toolkit book by Ralph Kimball • Building the Data Warehouse book by W. H. Inmon • Data Warehouse Design Solutions by Christopher Adamson and Michael Venerable • http://www.cognosforums.com/forums/ • http://dbms.knowledgehills.com/ • http://www.dwinfocenter.org/index.html • http://en.wikipedia.org/wiki/Slowly_changing_dimension • http://en.wikipedia.org/wiki/Fact_table • http://www.1keydata.com/datawarehousing/fact-table-types.html • http://dbms.knowledgehills.com/Dimensional-Modeling-(DM)-tutorial-with-OLAP-and-data-warehouse-design-concepts/a32p1

More Related