380 likes | 474 Vues
Explore SQL performance fundamentals and concepts, hardware advancements, potential pitfalls, and strategies for efficient query optimization in 2011. Learn about parameters, compile options, statistics, and execution plans to enhance performance.
E N D
SQL Performance 2011/12 Joe Chang, SolidQwww.qdpma.com http://sqlblog.com/blogs/joe_chang/default.aspx jchang6@yahoo.com
2011 • Hardware is Powerful & Cheap • CPU (cores), memory, and now IO too! • Quad-core since 2006, 1GHz since 2000 • Is Performance still a concern? • Yes, along with fundamentals • Modern Performance Strategy • Can handle minor inefficiencies • Identify and circumvent the really bad things
Modern Hardware • 4-12 cores per processor socket • 16GB DIMM at less than $1K • SSD • Enterprise grade SSD still moderately expensive • Both SLC and MLC • Consumer SSD – really cheap, $3-4K per TB • Uneven performance characteristics over time • Desired IO performance: 1-2GB/s, 20-50K IOPS
Hardware Baseline 2011 Entry Mid-range 2 Xeon 5600 6-core 48 – 192GB 6 x 8GB to 12 x 16GB SSD options 16+ SATA SSDs 4-5 PCI-E SSDs • 1 Xeon E3 quad-core • 16GB memory • 4 x 4GB unbuffered ECC • SSD options: • 2-4 SATA SSDs • 1-2 PCI-E SSDs
Performance Fundamentals • Network round-trips • Owner qualified, case correct • Log write latencies • Sufficiently low to support transaction volume • Not necessarily separate data and log disks • Normalization – correct data trumps all! • Indexes – a few good ones, and not too many! • SQL – that the optimizer
What can go wrong? • With immense hardware resources • And a great database engine • What can go wrong? • Following a fixed set of rules and procedures • Basic transactions processing should work well • If your process does something unanticipated • Some things can go horribly wrong
Performance Concepts • Query Optimizer • Execution plan operators • Formula for component operation • Data distribution statistics • to estimate rows & pages, automatically updated • Rules when estimate not possible • Stored procedure compile rules • Parameters and variables
Stored Procedure Basics
Parameters and Variables • On compile • Parameter values used to for row estimate • Variables – assume unknown value • Consider effect of skewed distribution
Parameter & Variable 6 rows for value 1 4 rows for value unknown Consider impact for skewed data distributions
Stored Procedure Compile Options • WITH RECOMPILE • OPTIMIZE FOR • Plan Guide • Temp table • KEEP PLAN, KEEPFIXED PLAN
Compile & Execute Time • Plan reuse desired when • Compile cost is high relative to execute cost • Recompile desired when • Execute cost is high relative to compile cost
Statistics Basics
Statistics • No statistics – table variables • Temp table – statistics auto recompute • 6 row modified, 500 rows, every 20% thereafter • Statistics sampling • Random page, how to handle skewed distribution? • Upper and lower bounds • Problems caused by incrementing columns • Propagation errors
Statistics Recompute • Scenario: start with accurate statistics • Update column with new values • That did not previously exist • If fewer than 20% of rows updated • Auto-recompute is not triggered
Sampling • Default sampling percentage is usually good • Caution: not a random row sample! • Random sampling of page • From nonclustered index if available • If there is correlation between pages & values • Then serious over estimation possible
Out of range • Statistics sampling tries to identify lower and upper bound
Bad Execution Plan Examples Not comprehensive
Scenarios Not comprehensive • Or condition • Multiple optional search arguments • Skewed distributions • 1 business logic for Small and large data sets • Reports for 1 day, 1 week, 1 month, 1 year • Statistics related problems • Resulting in horrible execution plan
When the Query Optimizer Does not understand you
UNION and UNION ALL • UNION • Only distinct rows • Sort to eliminate duplicates • Can be expensive for high row counts • UNION ALL • All rows • No sort to eliminate duplicates
Multiple Optional SARGs This was suppose to work, but does not
Table Variable No Statisticsassumes 1 row, 1page Why? No recompiles
Loop Join – Scan Inner Source Estimate 1 row Really Bad News
The Correct Plan Estimate 1 row Hash Join forcedwith hint
Temp Table versus CTE • Consider options • SELECT xxx INTO #Temp • FROM Sql Main Expression • WITH tmp AS (SELECT xxx FROM Sql) Main Expression