Download
chapter 1 adamson venerable n.
Skip this Video
Loading SlideShow in 5 Seconds..
Chapter 1 Adamson & Venerable PowerPoint Presentation
Download Presentation
Chapter 1 Adamson & Venerable

Chapter 1 Adamson & Venerable

200 Views Download Presentation
Download Presentation

Chapter 1 Adamson & Venerable

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. 0 Chapter 1Adamson & Venerable Spring 2012

  2. 0 Dimensional Modeling • Dimensional Model Basics • Fact & Dimension Tables • Star Schema • Granularity • Facts and Measures • Multiple Processes • Aggregation

  3. Transaction Systems • They capture critical detailed info about business events: • A product is manufactured • An account is credited while another is debited • A seat is reserved • Most businesses have a variety of OLTP systems • An OLTP system looks at the micro-level transactions that drive the process • OLTP’s normalized design ensures transaction consistency

  4. 0 Dimensional Modeling • Dimensional Model Basics • Relational Model vs. Dimensional Model • ERDs vs. Star Schema • Normalization issues • Reports & Queries in Relational Models can be complex: • Especially when seeking historical comparisons • Business Process Measurements (Not in RDBMS) • Gross Margins • Average Aged A/R balances • Inventory levels compared to Sales by Warehouse Location • ROI, etc.

  5. Business Questions • Questions managers ask about a business: • What are the best-selling products last week? • How often did we fulfill an order in more than two shipments? • What is the quarterly trend on orders by region? • Show me gross margin by product category • How does inventory level compare to sales by warehouse? • Answering these questions requires a different design than in OLTP • The dimensional model serves this purpose

  6. Process Metrics • Each question centers around a business process: • Orders, Account Management, Inventory Management, Accounts Receivable, Returns • Each question reveals how a process is measured: • Gross Margin measures the sales process for distributors • Banks monitor balances as part of their account mgt. • Warehouse managers track inventory levels • Accountants quantify receivables • For some processes there is a single metric • Some others have multiple metrics

  7. The Dimensional Model • Reflects the way business people think • Facts or Measures: The measurements that are of importance to a business • Dimensions: The parameters by which a measurement can be viewed or referred to • Any fact (measure) can be combined with any dimension • EXAMPLE: An Order Entry Process

  8. 0 Fact & Dimension Tables

  9. 0 Dimensional Modeling • Fact & Dimension Tables • Use Normalization approach to determine appropriate dimension tables • Functional dependencies • Customer_key  Name, address, city, state, zip • Salesperson_key  Name, code, territory, region • Date_key date, day_week, day_of_month, etc. • Order_key  product_key, order_date_key, salesperson_key, customer_key, order_dollars, extended_cost, margin_dollars, quantity_ordered, order_number, order_line

  10. 0 Star Diagram for Orders

  11. 0 Star Schema • Primary Keys • Foreign Keys • Orders [Order_Date_key, Product_key, salesperson_key, customer_key, ….] • Surrogate Key • Orders [Order_ID, Order_Date_key, Product_key, Salesperson_key, Customer_key, etc….] • Avoid Keys from other Systems • Granularity (the Grain) • What level of detail is recorded on the Fact Table • Individual Transactions • Daily summaries, weekly summaries, monthly summaries, etc. • Must be uniform throughout

  12. Ensuring a chosen Grain • Foreign keys relating a row in the fact table to dimension rows should never be null • Optional relationships to a dimension are usually a sign of a grain problem • The rule against optionality can be relaxed when a dimension is not a part of the grain of the fact table. Then, add “fake” rows such as a product called Not-a-Product or a promotion table item called Not-a-Promotion

  13. 0 The Dimensions: WIDE tables • Produce “Wide” dimension tables: 100 or 200 attributes (dimensions) are common • Relationships among dimensions in a dimension table does not have to be strong: • Customer table may include region, regional data, sales force data, etc. • Resist the urge to normalize dimension tables! • Role of dimensions attributes are to qualify queries/break out measures • Spreading dimensional attributes across several tables would complicate and slow down queries and users would have to wait longer • Dimension hierarchies are flattened out: year 1998 may appear 365 times in the time dimension table, departments may appear multiple times in a salesperson dimension table

  14. 0 Dimensional Modeling • Slowly changing dimensions • Type 1 Change • Correct an error or a value that has no significance in future analysis • E.g., phone number change for customer • Type 2 Change • Add new row in dimension table • Significant change, such as a new location (address) for customer • Type 3 Change • Add new dimension or modify schema (add columns) • New organization structure for firm; want to be able to compare new and old

  15. 0 The Fact table: A DEEP table 3 Kinds of Measures • Fully additive • Facts may be added equally well across any dimension • E.g., gross sales dollars, extended cost, margin dollars, quantity sold • Non additive • Facts that may not be added. Generally ratios. E.g., Margin rate • Need to break components into individual facts: margin dollars, sales dollars • Semi-additive • Fact that can be summarized across some dimensions • E.g., Banking account balances can be totaled for point in time, but not across time periods.

  16. 0 Dimensional Modeling • Multiple Processes • E.g., Orders vs. Shipments • Different process • Different grains • E.g., Time dimension • Hour • Day • Week • Month • Year

  17. 0 Multiple Processes: Orders & Shipments

  18. 0 Dimensional Modeling • Aggregation • As size of Data Warehouse grows we may need to aggregate date in order to improve query performance • E.g., change from Daily to Monthly summaries • Keep same Dimension Tables • Time table may be modified • Fact table keeps same measures as before • Facts now represent monthly summarizations of data.

  19. 0 Aggregate Table De-normalize from Day to Month Adjust primary key accordingly

  20. 0 Summary • Dimensional Model Basics • Fact & Dimension Tables • Star Schema • Granularity • Facts and Measures • Multiple Processes • Aggregation