230 likes | 345 Vues
This article provides an in-depth overview of the data warehousing features in SQL Server 2008, emphasizing data compression, partitioning, and parallelism. Key aspects include row and page compression techniques, estimation of potential savings, and column-based partition alignment. We also discuss optimizations for join operations using Bitmap filters and how these features can improve performance compared to SQL Server 2005. Whether you're assessing compression options or exploring partitioning benefits, this resource serves as an essential guide for data professionals.
E N D
Data Warehousing Features in SQL Server 2008 James Rowland-Jones @jrowlandjones
Data Compression • Enterprise Edition Only • Row and Page Compression • Compression Ratio 2 to 1 or 3 to 1 - 50% to 70% reduction in data • Can be for a table, index or a subset of their partitions • Estimate savings: exec sp_estimate_data_compression_savings • Max row size plus compression overhead must not exceed 8060 bytes
Compression Alert UNCOMPRESSED TEXT Compressed Table
Monitoring Compression • SQL Server, Access Methods Object • Page compression attempts/sec • Pages compressed/sec • Compression Statistics for individual Partitions • Dynamic Management Function • sys.dm_db_index_operational_stats
Resource Governor (Quickly) Data Compression DEMO TIME
P & P Partitioning Parallelism
Partitioning & Parallelism • Partition Table Parallelism • Few Outer Rows Parallelism • Partition-Aligned Indexed Views • SQL 2005 behaviour – needs to be dropped before switch • Switch Partition Pulls across indexed view • Rebuild index partition
What is a Partitioned Table? P1 P2 P3 P4
Partitioning & Parallelism Compared SQL Server 2005 P1 P2 P3 P4 P2 SQL Server 2008 P1 P2 P3 P4 P2
Work Around for SQL Server 2005 Partition 3 • UNION Partition 4
Few Outer Rows Parallelism SELECT d.Date_Desc ,SUM(f.Sale_Amt*f.Sales_Qty) FROM Tbl_Fact_Store_Sales f JOIN Tbl_Dim_Date d ON f.sk_date_id = d.sk_date_id WHERE d.date_value between '10/1/2004' and '10/7/2004' GROUP BY d.Date_Desc • SQL 2005 • One thread given per page of rows on a nested loop join • SQL 2008 • One thread given per row on a nested loop join • Good for Joins to Date Dim • M$ internal DW Scale Benchmark perf increase by 30%
Work-Around’s for SQL Server 2005 • STUFF YOUR ROW • Add a JUNK Col on the Date dimension to force one row per page • CLUSTER ON A GUID • Add a column and populate with GUIDs to encourage Rows onto separate pages
Partition Aligned Indexed Views • The Big Chore was “Sliding” a table with an indexed view on it. • In 2005 this needed to be dropped • In 2008 it does not
Sliding Window with Indexed View in Place Rebuild Partitioned Index Filtered Indexes IT’s DEMO TIME
STAR JOINS “Optimized” Bitmap Filters • What is a Bitmap filter • In memory structure (no index overhead) • Created dynamically • Typically quite small in size • Bitmap Filter SQL 2005 • What it was in 2005... • Hash or Merge JOIN • Optimised Bitmap Filter SQL 2008 • Enterprise Edition • Parallel Query • Hash JOIN only • Fact table must have > 100 pages • Single Column join (No PK FK relationship requirement)(integer needed for optimized) • Dimension input cardinalities are smaller than fact input cardinalities • Look for Bitmap warning event for missed opportunities to use Bitmap
STAR JOINS Minimally Logged INSERTS FOR THE FINAL TIME