1 / 40

SQL Server Columnstore Performance Tuning

DBI409. SQL Server Columnstore Performance Tuning. Eric N Hanson Principal Program Manager Microsoft Corporation. Session Objectives and Takeaways. Session Objective(s ): Learn to make query performance go from good to outstanding Batch mode is crucial to speedups

cyndi
Télécharger la présentation

SQL Server Columnstore Performance Tuning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DBI409 SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

  2. Session Objectives and Takeaways • Session Objective(s): • Learn to make query performance go from good to outstanding • Batch mode is crucial to speedups • Good application design can work around limits

  3. Why pay attention? Your customer with well-tuned xVelocitycolumnstoreapplication Your customer with poorly-tuned xVelocitycolumnstoreapplication

  4. demo Outer Join Performance Limit

  5. Fundamentals of xVelocityColumnstore Performance • Each column stored separately, compressed • Batch mode query execution • Segment elimination

  6. Columnstore physical structure segment 1 • Segment (a.k.a. row group) = 1 million rows • Table partition has 1 or more segments • Each column segment is a BLOB • Dictionaries are BLOBs • Catalog has one row per column segment with min/max values catalog … dictionaries segment N

  7. Batch vs. Row Mode

  8. Batch mode processing • Process ~1000 rows at a time • Batch stored in vector form • Vector operators implemented • Filter, hash join, hash aggregation • Greatly reduced CPU time (7 to 40X) Batch object Column vectors bitmap of qualifying rows

  9. #1 Takeaway! Make sure most of the work of the query happens in batch mode.

  10. Segment Elimination • Segment(rowgroup)= 1 million row chunk • Min, Max kept for each column in a segment • Scans can skip segments based on this select Date, count(*) from dbo.Purchasewhere Date >= 20120201 group by Date skipped

  11. DOs • Use star schema • Put columnstores on large tables only • Include every column of table in columnstore index • Use integer surrogate keys for joins • Structure your queries as star joins with grouping and aggregation as much as possible

  12. DON’T • Join on string columns • Filter on string columns of columnstore table • Join pairs of very large tables if you don’t have to • Use NOT IN <subquery> on columnstore table • Use UNION ALL to combine columnstore tables with other tables

  13. Outer Join Workaround • Outer join prevents batch processing • Rewrite query to do most work in batch mode  DEMO

  14. IN and EXISTs Issue & Workaround • Using IN and EXISTS with subqueries can prevent batch mode execution • IN ( <list of constants> ) typically works fine •  DEMO

  15. Union All Issue & Workaround • UNION ALL often prevents batch mode • Workaround • Push GROUP BY and aggregation over UNION ALL • Do final GROUP BY and aggregation of results • Called “local-global aggregation” •  DEMO

  16. Scalar Aggregates Issue & Workaround • Aggregate without group by doesn’t get batch processing • select count(*) from dbo.Purchase1.2 seconds • Workaround • with CountByDate (Date, c) 0.093 secondsas ( select Date, count(*) from dbo.Purchasegroup by Date ) select sum(c) from CountByDate;

  17. Multiple DISTINCT aggregates issue & workaround • Table Spool if query has 2 or more DISTINCT aggregates • Spool takes time to write • Spool write is single threaded • Reads of spool in plan are single threaded, in row mode • Workaround: • form each DISTINCT agg in separate subquery • Join results on grouping keys •  DEMO SQL Server 2012 runs queries with 1 DISTINCT agg and 1 or more non-distinct aggs in batch mode without any spool!

  18. Physical DB Design, Loading, and Index Management

  19. Adding Data Using Partition Switching • Columnstores must be partition-aligned • Partition switching fully supported • To add data daily • Partition by day • Every day • Split last partition • Create staging table and columnstore index it • Switch it in • Avoids costly drop/rebuild

  20. Trickle Loading with Columnstore Indexes • Master table (columnstore) • Delta table (rowstore) • Query using UNION ALL local-global aggregation workaround • Add Delta to Master nightly •  DEMO

  21. Avoid Nonclustered B-trees • Covering B-trees no longer needed • Extra B-trees can cause optimizer to choose poor plan • Save space • Reduce ETL time

  22. Eliminating Unsupported Data Types • Omit column, or • Modify column type to supported type • Reduce precision of numerics to 18 digits or less • Convert guid’s to ints • Reduce precision of datetimeoffset to 2 or less • Convert hierarchyid to int or string

  23. Achieving Fast Columnstore Index Builds • One thread per segment • Low memory throttles parallelism • Consider • high min server memory setting • Set REQUEST_MAX_MEMORY_GRANT_PERCENT to 50 or more • Add memory • Omit columns • Normalize fact columns to dimensions • Vertically partition • sys.dm_exec_query_memory_grants shows DOP

  24. Not enough memory to build a columnstore index? • Same ideas apply as for achieving fast builds • Run out of memory during execution, not startup? • Errors 701 or 802 indicate this • Workaround: reduce DOP explicitly, e.g. • create columnstore index <name> on <table>(<columns>) with(maxdop= 1);

  25. Maximizing the Benefits of Segment Elimination

  26. Verifying Segment Elimination • You may want to check if segments are eliminated • Ways to get segment elimination confirmation: • Xevent sqlserver.column_store_segment_eliminate • Text output messages • Trace flag 646 outputs segment elimination messages • Trace flag 3605 directs messages to error log file •  DEMO

  27. Ensuring segment elimination by date • Use clustered B-tree on date • Columnstore inherits order • Or, partition by date • Ordering by load date, ship date, order date etc. can all work • Dates are naturally correlated

  28. Partitioning and Segment Elimination -- Example Select … From Fact Where Date = 20120301

  29. Multi-dimensional segment elimination • Pre-sort or partition on a keydrawn from >1 dimension • E.g. YYYYMMRR • Can be persisted computed columnnot used in columnstore select f.region_id, avg(f.duration)from fact_CDR f where f.region_id = 1 and f.date_id between 20120101 and 20120131 group by f.region_id

  30. Additional Tuning Considerations

  31. String performance issues • String filters don’t get pushed to storage engine • more batches to process • defeats segment elimination • Joining on string columns is slow • Factor strings out to dimensions

  32. Forcing use or non-use of Columnstores • Query hint • OPTION(IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) • Index hint • … FROM F WITH(index=MyColumnStore) … • … FROM F WITH(index=MyClusteredBtree) …

  33. Summary • Key’s to fast processing • Batch mode • Segment elimination • Make sure most work is done in batch mode! • Can work around read-only property by • Partitioning • Using Delta-file for trickle load • Future work will reduce need for tuning

  34. Related Content • Columnstore Tuning Guide (your reading homework ) • http://social.technet.microsoft.com/wiki/contents/articles/sql-server-columnstore-performance-tuning.aspx • Columnstore FAQ • http://social.technet.microsoft.com/wiki/contents/articles/sql-server-columnstore-index-faq.aspx • Introductory Columnstores Talk Video • TechEd 2011 session DBI312 • Customer Case Studies • See list in FAQ

  35. Track Resources Hands-On Labs @sqlserver @ms_teched SQL Server 2012 Eval Copy Get Certified! mva Microsoft Virtual Academy

  36. Resources Learning TechNet • Connect. Share. Discuss. • Microsoft Certification & Training Resources http://northamerica.msteched.com www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers • http://microsoft.com/technet http://microsoft.com/msdn

  37. Required Slide Complete an evaluation on CommNet and enter to win!

  38. Please Complete an Evaluation Your feedback is important! Multipleways to Evaluate Sessions Be eligible to win great daily prizes and the grand prize of a $5,000 Travel Voucher! Scan the Tag to evaluate this session now on myTechEdMobile

  39. © 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

More Related