1 / 20

Chapter 1 Adamson & Venerable

0. Chapter 1 Adamson & Venerable. Spring 2012. 0. Dimensional Modeling. Dimensional Model Basics Fact & Dimension Tables Star Schema Granularity Facts and Measures Multiple Processes Aggregation. Transaction Systems. They capture critical detailed info about business events:

loren
Télécharger la présentation

Chapter 1 Adamson & Venerable

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 0 Chapter 1Adamson & Venerable Spring 2012

  2. 0 Dimensional Modeling • Dimensional Model Basics • Fact & Dimension Tables • Star Schema • Granularity • Facts and Measures • Multiple Processes • Aggregation

  3. Transaction Systems • They capture critical detailed info about business events: • A product is manufactured • An account is credited while another is debited • A seat is reserved • Most businesses have a variety of OLTP systems • An OLTP system looks at the micro-level transactions that drive the process • OLTP’s normalized design ensures transaction consistency

  4. 0 Dimensional Modeling • Dimensional Model Basics • Relational Model vs. Dimensional Model • ERDs vs. Star Schema • Normalization issues • Reports & Queries in Relational Models can be complex: • Especially when seeking historical comparisons • Business Process Measurements (Not in RDBMS) • Gross Margins • Average Aged A/R balances • Inventory levels compared to Sales by Warehouse Location • ROI, etc.

  5. Business Questions • Questions managers ask about a business: • What are the best-selling products last week? • How often did we fulfill an order in more than two shipments? • What is the quarterly trend on orders by region? • Show me gross margin by product category • How does inventory level compare to sales by warehouse? • Answering these questions requires a different design than in OLTP • The dimensional model serves this purpose

  6. Process Metrics • Each question centers around a business process: • Orders, Account Management, Inventory Management, Accounts Receivable, Returns • Each question reveals how a process is measured: • Gross Margin measures the sales process for distributors • Banks monitor balances as part of their account mgt. • Warehouse managers track inventory levels • Accountants quantify receivables • For some processes there is a single metric • Some others have multiple metrics

  7. The Dimensional Model • Reflects the way business people think • Facts or Measures: The measurements that are of importance to a business • Dimensions: The parameters by which a measurement can be viewed or referred to • Any fact (measure) can be combined with any dimension • EXAMPLE: An Order Entry Process

  8. 0 Fact & Dimension Tables

  9. 0 Dimensional Modeling • Fact & Dimension Tables • Use Normalization approach to determine appropriate dimension tables • Functional dependencies • Customer_key  Name, address, city, state, zip • Salesperson_key  Name, code, territory, region • Date_key date, day_week, day_of_month, etc. • Order_key  product_key, order_date_key, salesperson_key, customer_key, order_dollars, extended_cost, margin_dollars, quantity_ordered, order_number, order_line

  10. 0 Star Diagram for Orders

  11. 0 Star Schema • Primary Keys • Foreign Keys • Orders [Order_Date_key, Product_key, salesperson_key, customer_key, ….] • Surrogate Key • Orders [Order_ID, Order_Date_key, Product_key, Salesperson_key, Customer_key, etc….] • Avoid Keys from other Systems • Granularity (the Grain) • What level of detail is recorded on the Fact Table • Individual Transactions • Daily summaries, weekly summaries, monthly summaries, etc. • Must be uniform throughout

  12. Ensuring a chosen Grain • Foreign keys relating a row in the fact table to dimension rows should never be null • Optional relationships to a dimension are usually a sign of a grain problem • The rule against optionality can be relaxed when a dimension is not a part of the grain of the fact table. Then, add “fake” rows such as a product called Not-a-Product or a promotion table item called Not-a-Promotion

  13. 0 The Dimensions: WIDE tables • Produce “Wide” dimension tables: 100 or 200 attributes (dimensions) are common • Relationships among dimensions in a dimension table does not have to be strong: • Customer table may include region, regional data, sales force data, etc. • Resist the urge to normalize dimension tables! • Role of dimensions attributes are to qualify queries/break out measures • Spreading dimensional attributes across several tables would complicate and slow down queries and users would have to wait longer • Dimension hierarchies are flattened out: year 1998 may appear 365 times in the time dimension table, departments may appear multiple times in a salesperson dimension table

  14. 0 Dimensional Modeling • Slowly changing dimensions • Type 1 Change • Correct an error or a value that has no significance in future analysis • E.g., phone number change for customer • Type 2 Change • Add new row in dimension table • Significant change, such as a new location (address) for customer • Type 3 Change • Add new dimension or modify schema (add columns) • New organization structure for firm; want to be able to compare new and old

  15. 0 The Fact table: A DEEP table 3 Kinds of Measures • Fully additive • Facts may be added equally well across any dimension • E.g., gross sales dollars, extended cost, margin dollars, quantity sold • Non additive • Facts that may not be added. Generally ratios. E.g., Margin rate • Need to break components into individual facts: margin dollars, sales dollars • Semi-additive • Fact that can be summarized across some dimensions • E.g., Banking account balances can be totaled for point in time, but not across time periods.

  16. 0 Dimensional Modeling • Multiple Processes • E.g., Orders vs. Shipments • Different process • Different grains • E.g., Time dimension • Hour • Day • Week • Month • Year

  17. 0 Multiple Processes: Orders & Shipments

  18. 0 Dimensional Modeling • Aggregation • As size of Data Warehouse grows we may need to aggregate date in order to improve query performance • E.g., change from Daily to Monthly summaries • Keep same Dimension Tables • Time table may be modified • Fact table keeps same measures as before • Facts now represent monthly summarizations of data.

  19. 0 Aggregate Table De-normalize from Day to Month Adjust primary key accordingly

  20. 0 Summary • Dimensional Model Basics • Fact & Dimension Tables • Star Schema • Granularity • Facts and Measures • Multiple Processes • Aggregation

More Related