1 / 46

Adamson & Venerable Chapter 2 & working out a Homework 5/6 Solution

Adamson & Venerable Chapter 2 & working out a Homework 5/6 Solution. Transforming Relational Databases into Dimensional Diagrams Spring 2012. Dimensional modeling in Sales. In a DW designed to analyze SALES data, important component of a dimensional model is the Product Dimension.

taniel
Télécharger la présentation

Adamson & Venerable Chapter 2 & working out a Homework 5/6 Solution

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Adamson & Venerable Chapter 2 &working out a Homework 5/6 Solution Transforming Relational Databases into Dimensional Diagrams Spring 2012

  2. Dimensional modeling in Sales • In a DW designed to analyze SALES data, important component of a dimensional model is the Product Dimension. • Product dimension includes important characteristics used to differentiate the product in the marketplace, called Discriminators.

  3. Key Business Term: Discriminators • Discriminators: Descriptive characteristics of a product that further describe it and are relevant to purchasing decisions. Tracking discriminators allows the business analyst to monitor performance of various product styles, influencing production and marketing plans. • Discriminators for a men’s suit: Cloth, color, style/cut, weight, size • Discriminators for vehicles: Model name, model styling package, line, category, exterior color, model year, interior color

  4. Other Dimensions in Sales • Time dimension: Time key, month, day, date, year, day of week, quarter. • Customer_Demographic dimension: This does not require a row for each customer, but groups customers by different combinations of age, gender, income, and geography. The degree of demographic segmentation varies by industry. • Dealer dimension: Data on dealer performance are needed, to support decisions on which dealers should be eased out of business. • Method_Of_Paymentdimension (lease, financing options, etc.)

  5. Fact Table: Storing derived facts • A commonly used derived fact should be stored, and not calculated in reports and queries. Cutting such “redundant” key measures from the fact table results in the following: • Development of reports gets more complex • Increased potential for errors in reports • Increased documentation requirements • A Hundred dollars’ worth of disk space is saved (40 MB of space savings for a 10-million row fact table)

  6. Transformation Stages(Key for Homework #5/6) • De-normalization Process • Start with Normalized Tables • Determine Dimensions and Fact Tables • Delete Relationships • Rebuild Tables • Rebuild Relationship Diagram as Star Diagram, a.k.a., Dimension Table • Provides information needed to complete Homework 6, too!

  7. Premiere Products ERD Redrawn to form most likely Star Diagram

  8. Delete Relationships

  9. Rebuild Tables • Using copied operations database • Be sure all ops. data is saved and backed up … multiple times. • Data staging & cleansing • Denormalize extra relationships • Order:OrderLine; • Customer:SalesRep; • Order:OrderLine  OrderDetail • Customer:SalesRep  OrderDetail • Transform data for new tables in Access: Make Table • Export data files, if needed, to rebuild elsewhere (Excel) • Additional Transformations as needed in Excel • Create the Time dimension • Re-Import data files to new tables

  10. Order_OrderLine Query

  11. Restructured Data -- Stage 1

  12. Preparing Rep:Customer

  13. Join to Order_OrderLine

  14. OrderDetail Query Order of Columns Not critical Sort order not critical but Good time to revise

  15. Make-Table OrderDetail Query

  16. OrderDetail Table Set Primary Keys

  17. Data Cleansing Fix dates

  18. Add Indexes

  19. Build Star Diagram Fact Table What’s Missing?

  20. Time Dimension Use SQL to eliminate Redundant dates

  21. Built-in Functions

  22. Built-in Date/Time Functions

  23. Excel Time Table • Create Time table using • Excel Formulas: • Fill • Copy/paste • Compute Quarter • Conversion Formulas • Etc.

  24. Import Time Table to Access

  25. Time Table w/Indexes

  26. Premier Products Star Diagram

  27. PP -- Relation List Fact Table • OrderDetail[OrderNum, PartNum, OrderDate, CustNo, RepNo, NumOrdered, Price] Dimension Tables • Customer[CustNum, CustName, Street, City, State, Zip, Balance, CreditLimit] • Rep[RepNum, LastName, FirstName, Street, City, State, Zip, Commission, Rate] • Part[PartNum, Desc, OnHand, Class, Warehouse, Price] • Time[TimeKey, Day_of_Week, Month, Year, JulianDate, Quarter, etc.]

  28. Replacing OrderDate with Time_key

  29. Tools/Analyze/Tables

  30. Tools/Analyze/Tables

  31. Tools/Analyze/Tables

  32. Tools/Analyze/Documenter

  33. Tools/Analyze/Documenter CUSTOMER table

  34. http://www.webopedia.com GUIDLast modified: Thursday, June 20, 2002 • Short for Globally Unique Identifier, a unique 128-bit number that is produced by the Windows OS or by some Windows applications to identify a particular component, application, file, database entry, and/or user. For instance, a Web site may generate a GUID and assign it to a user's browser to record and track the session. A GUID is also used in a Windows registry to identify COMDLLs. Knowing where to look in the registry and having the correct GUID yields a lot information about a COM object (i.e., information in the type library, its physical location, etc.). Windows also identifies user accounts by a username (computer/domain and username) and assigns it a GUID. Some database administrators even will use GUIDs as primary key values in databases. • GUIDs can be created in a number of ways, but usually they are a combination of a few unique settings based on specific point in time (e.g., an IP address, network MAC address, clock date/time, etc.).

  35. Tools/Analyze/Documenter OrderDetail table

  36. Henry Books ERD (before) BookInventory

  37. Henry Books ERD (before)

  38. Update Query for Inventory Table[Physical Inventory Date]

  39. Inventory Table: Date Updated

  40. BookInventory Fact TableNote: 2 Time Keys [Pick one]

  41. BookInventory Fact TableNote: 2 Time Keys [Pick one]

  42. BookInventory Fact TableNote: 2 Time Keys [Pick one]

  43. BookInventory Fact TableNote: 2 Time Keys [Pick one]

  44. Henry Books Star Diagram

  45. HB -- Relation List Fact Table • BookTrans[BookCode, AuthorNum, BranchNum, Sequence, PubCode, OnHand, InventoryDate] Dimension Tables • Book[BookCode, Title, Type, Price, Paperback] • Publisher[PubCode, PubName, City, …State, Zip] • Author[AuthorNum, LastName, FirstName, …] • Branch[BranchNum, BranchName, Location, Employees] • Time[TimeKey, Day_of_Week, Month, Year, JulianDate, Quarter, etc.]

  46. Summary • Complete Transformations • How normal are the resulting tables? • 1NF, 2NF, 3NF? • Document Transformation maps • Prepare for Appending Tables with new data • Tools/Analyze/Documenter (next time)

More Related