1 / 37

Implementing a Data Warehouse with SQL Server Jump Start

Implementing a Data Warehouse with SQL Server Jump Start. Richard Currey | Senior Technical Trainer–New Horizons United George Sq ui l lace | Senior Technical Trainer–New Horizons Great Lakes. Meet Richard Currey. Senior Technical Trainer – New Horizons United

apria
Télécharger la présentation

Implementing a Data Warehouse with SQL Server Jump Start

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Implementing a Data Warehouse with SQL Server Jump Start Richard Currey | Senior Technical Trainer–New Horizons United George Squillace | Senior Technical Trainer–New Horizons Great Lakes

  2. Meet Richard Currey • Senior Technical Trainer – New Horizons United • Focused on database and development technologies • MCDBA, MCITP Dev / Admin / BI, MCSE Data Platform, BI • MCSD, MCPD Web Developer, ASP .NET Developer, Windows Developer • 28 Years Industry Experience • Designed, developed and managed BI-centric projects at several Fortune 500 organizations • Extensive consulting and project management background

  3. Meet George SQUILLACE • Senior Technical Trainer – New Horizons Great Lakes (20 Years) • “SQL” is in his name! • Focused on database technologies • MCT Since 1997 • 28 Years Industry Experience • SQL Server 2012: certified as MCSA, MCSE: Data Platform, & MSCE: Business Intelligence • Certified in every version of SQL Server since SQL 2000 • Certified in every version of Windows Server from NT 4.0 through Windows Server 2008 • Certified in Exchange Server 5.5 through Exchange Server 2007

  4. Course Modules

  5. Setting Expectations • Target Audience • Data warehousing specialists who want to expand their knowledge of SQL Server Integration Services (SSIS) • Database professionals who want to take exam 70-463 and get certified in data warehouse implementations • Suggested Prerequisites/Supporting Material • SQL Server development experience and exposure to extract, transform, and load (ETL) processes • Course 10777, Implementing a Data Warehouse with Microsoft SQL Server 2012 • MS Press Book: Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft SQL Server 2012

  6. 01 | Design and Implement Dimensions and Fact Tables Richard Currey | Senior Technical Trainer–New Horizons United George Squillace | Senior Technical Trainer–New Horizons Great Lakes

  7. Module 1 Overview • Schema Design: Star vs. Snowflake • Facts and Fact Tables • Fact and Dimension Granularity • Conformed and Non-Conformed Dimensions • Time Dimensions

  8. Topic: Schema Design: Star vs. Snowflake

  9. Topic: Schema Design: Star versus Snowflake • Star Schema • Snowflake Schema • Processing and Performance Considerations

  10. Star Schema DimSalesPerson SalesPersonKey SalesPersonName StoreName StoreCity StoreRegion • A star schema has a single table for each dimension • Each table supports all attributes for that dimension • Typically a de-normalized solution FactOrders CustomerKey SalesPersonKey ProductKey ShippingAgentKey TimeKey OrderNo LineItemNo Quantity Revenue Cost Profit DimCustomer CustomerKey CustomerName City Region DimDate DateKey Year Quarter Month Day DimProduct ProductKey ProductName ProductLine SupplierName DimShippingAgent ShippingAgentKey ShippingAgentName

  11. Implementing a Star Schema

  12. Snowflake Schema DimSalesPerson SalesPersonKey SalesPersonName StoreKey DimStore StoreKey StoreName • GeographyKey • More normalized solution • Typically contains multiple tables per dimension • Each table contains dimension key, value, and the foreign key value for the parent DimDate DateKey Year Quarter Month Day FactOrders CustomerKey SalesPersonKey ProductKey ShippingAgentKey TimeKey OrderNo LineItemNo Quantity Revenue Cost Profit DimGeography GeographyKey City Region DimCustomer CustomerKey CustomerName GeographyKey DimShippingAgent ShippingAgentKey ShippingAgentName DimProductLine ProductLineKey ProductLineName DimProduct ProductKey ProductName ProductLineKey SupplierKey DimSupplier SupplierKey SupplierName

  13. Implementing a Snowflake Schema

  14. Processing and Performance Considerations • Star schema requires de-normalization during the load process • Can impact the ETL times • Snowflake schema can increase dimension complexity • Can impact Analysis Services solutions, negatively affecting cube performance

  15. Topic: Facts and Fact Tables

  16. Topic: Facts and Fact Tables • What Is a Fact? • Grouping Facts • What Is Granularity? • Design Considerations

  17. What Is a Fact? • Facts are the key metrics used to measure business results: • Sales • Production • Inventory • Can be additive, semi-additive, or non-additive

  18. Grouping Facts • Facts are grouped into fact tables • Related facts should be in the same fact table • Facts with different granularity should be in different tables

  19. What Is Granularity? • Granularity refers to the level of detail in which facts are recorded • Facts can be at different levels of granularity

  20. Design Considerations • Fact tables should have all keys relating to dimensions • Primary key should be composite of all dimension keys • Separate additive, semi-additive, and non-additive facts

  21. Topic: Fact and Dimension Granularity

  22. Topic: Fact and Dimension Granularity • How to Determine Fact Granularity • Dimension Granularity

  23. How to Determine Fact Granularity • Granularity is determined based on business needs • Should be the lowest level of detail that needs to be examined • If data from transactional systems has more detail than needed for analysis, ETL should aggregate the details

  24. Dimension Granularity • Dimension granularity needs to be matched with fact granularity • Each dimension has its own granularity • Fact tables are keyed to the granularity of the dimensions

  25. Topic: Conformed and Non-Conformed Dimensions

  26. Topic: Conformed and Non-Conformed Dimensions • What Are Conformed and Non-Conformed Dimensions? • Shared and Degenerate Dimensions • What Is a Slowly Changing Dimension?

  27. What Are Conformed and Non-Conformed Dimensions? • Conformed dimension • Shared by multiple fact tables • Used when all business users have the same definitions for the dimension • Non-conformed dimension • Dimension table targeted to a single fact table • Used when dimensions have different definitions for different business units

  28. Creating Conformed and Non-Conformed Dimensions

  29. Shared and Degenerate Dimensions • Shared dimension • Used by multiple facts • Dimension key is stored in the fact table • Dimension value is stored in the dimension table with other attributes of that dimension • Degenerate dimension • Used by a single fact table • Dimension value is stored directly in the fact table • No corresponding dimension table

  30. What Is a Slowly Changing Dimension? • When the historical attribute values are retained if the attributes are updated • Used when the organization does not want to lose track of what actually happened • Example: customer moves from Connecticut to Seattle • Slowly changing dimension types: • Type 1: Attribute history is not retained • Type 2: Attribute change creates a new record • Type 3: Original attribute value recorded and latest value recorded with an effective date

  31. Implementing a Slowly Changing Dimension

  32. Topic: Time Dimensions

  33. Topic: Time Dimensions • Types of Time Dimensions • Time Dimensions and Hierarchies

  34. Types of Time Dimensions • Based on standard calendar breakdowns • Year => Month => Day • Year => Quarter => Week => Day • Based on fiscal calendar • Year => Fiscal Quarter => Fiscal Month => Fiscal Week => Day • Time dimension needs to contain all hierarchy elements to the lowest granularity for the fact tables

  35. Time Dimensions and Hierarchies • Establishes the “buckets” that the business uses • Typically there are multiple hierarchies in the dimension • Calendar • Business • Created using Microsoft Excel, scripts, or are auto-generated

  36. Creating a Time Dimension

More Related