Architecting for Scale in SharePoint 2010. Russ Houberg Senior Technical Architect, MCM KnowledgeLake, Inc. Scaling SP2010 from the Ground Up. Storage Architecture SQL Tuning Tidbits Remote Blob Storage (Demo) Performance and Control Scalable Taxonomy Design (Demo)
Scaling SP2010 from the Ground Up • Storage Architecture • SQL Tuning Tidbits • Remote Blob Storage (Demo) • Performance and Control • Scalable Taxonomy Design (Demo) • Search… A Complete Story • The Big Picture: 10 million, 100 million, A BILLION Documents…
Storage Architecture • Storage Architecture can make or break SharePoint Performance • Poor storage performance can tank the whole SharePoint farm! • Can Be Tough to Estimate • Use an extendable storage platform if possible • Wider is Better • More spindles always better than higher GB • Avoid using a small number of large disks for increasing storage capacity
Storage Architecture • TempDB, Search DBs, Content DBs • Multiple Data Files in Primary File Group • # Files = ½ to ¼ of CPU Cores | <= CPU Cores • Separate to unique spindle sets if possible • Pre-Allocate all Data Files, Including TempDB • Estimate Projected DB Size and Divide by # Files to get the pre-allocation size for each file • Leave “AutoGrow” enabled, but don’t rely on it • Pre-Allocation to prevent AutoGrow • Set AutoGrow to 10% or logical MB/GB value based on projected database Size
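The file-count and pre-allocation guidance above can be worked through with a small calculation. This is only a sketch: the function name `plan_data_files` and the `core_fraction` parameter are illustrative assumptions, not part of any SQL Server tooling, and the fraction is restricted to the ¼-to-1 range implied by the slide.

```python
from dataclasses import dataclass


@dataclass
class FileLayout:
    file_count: int
    size_per_file_gb: float


def plan_data_files(projected_db_gb: float, cpu_cores: int,
                    core_fraction: float = 0.5) -> FileLayout:
    """Split a projected database size across equally sized data files.

    core_fraction follows the slide's guideline (1/2 to 1/4 of the cores,
    never more files than cores). Hypothetical helper for illustration.
    """
    if projected_db_gb <= 0 or cpu_cores < 1:
        raise ValueError("projected size and core count must be positive")
    if not 0.25 <= core_fraction <= 1.0:
        raise ValueError("core_fraction must be between 0.25 and 1.0")
    # At least one file, and never more files than CPU cores.
    file_count = max(1, min(cpu_cores, int(cpu_cores * core_fraction)))
    # Pre-allocation size per file = projected size / number of files.
    return FileLayout(file_count, projected_db_gb / file_count)


if __name__ == "__main__":
    layout = plan_data_files(projected_db_gb=400, cpu_cores=16)
    print(layout.file_count, "files of", layout.size_per_file_gb, "GB each")
```

Under these assumptions, a 400 GB projected database on a 16-core server splits into 8 files pre-allocated at 50 GB each, with AutoGrow left on only as a safety net.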
Storage Architecture • Data / Log File Spindle Priority
SQL Tuning Tidbits • SQL Instant File Initialization • Run SQL As Domain User with either… • Local Admin • Grant “Perform Volume Maintenance Tasks” • TempDB Pre-Allocation to 10% of Largest DB • SAN vs DAS vs NAS (Don’t Overshare!) • Host Bus Adapter (HBA) Configuration • NTFS Allocation Unit Size: 64K • Enable Locked Pages in Memory (SQL Std.) • Don’t skimp on RAM!
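The TempDB rule of thumb above is simple arithmetic: take the largest database and pre-allocate a tenth of its size. A minimal sketch, assuming database sizes are already known in GB (the function name and the sample database names are hypothetical):

```python
from typing import Mapping


def tempdb_preallocation_gb(database_sizes_gb: Mapping[str, float],
                            fraction: float = 0.10) -> float:
    """Return the TempDB pre-allocation size: a fraction of the largest DB."""
    if not database_sizes_gb:
        raise ValueError("at least one database size is required")
    return max(database_sizes_gb.values()) * fraction


if __name__ == "__main__":
    # Hypothetical farm: sizes are projected, not measured.
    sizes = {"Content_Collab": 200.0, "Content_Archive": 1000.0, "Search_Crawl": 150.0}
    print(tempdb_preallocation_gb(sizes), "GB")
```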
RBS Background • Remote BLOB Storage (RBS) • By default SharePoint stores Binary Large Objects (BLOBs) in the content database • When enabled… Intercepts binary content (documents) and sends them to a BLOB store • Microsoft provides the “local” FILESTREAM provider to allow for usage of the SQL Server local NTFS file system as a BLOB store.
Remote BLOB Storage SharePoint 2003 • What’s this ECM thing? • Interesting workarounds • API access was problematic SharePoint 2007 • SP1 Brings us EBS Provider • - BLOBs are orphaned during edit/save • - Orphan cleanup is resource intensive • Externalization happens on the WFE (reduced RPS) • Future support of EBS API is not guaranteed SharePoint 2010 Long Live RBS - Transactional consistency supports “VETO” - Transactional consistency allows for UPDATE - Orphan cleanup uses SQL Indexes - Transparent to the SharePoint API - RBS is the best option for future support
Remote BLOB Storage (save flow diagram) • Components: SharePoint WFE, SharePoint Object Model, Relational Access, RBS Client Library, BLOB Store Provider Library, SQL Server (Content DB, Config DB), BLOB Store • 1. Save Request • 2. Enforce Business Logic • 3. Save Blob • 4. Write Blob • 5. Return BLOB ID • 6. Save Metadata & BLOB ID • 7. Back to User
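The seven numbered steps in the diagram can be traced in a toy model. This sketch is not the RBS client library or any SharePoint API: `BlobStore`, `ContentDb`, and `save_document` are hypothetical in-memory stand-ins that only illustrate the order of operations and the fact that the content database ends up holding metadata plus a BLOB ID rather than the bytes themselves.

```python
import uuid
from typing import Dict


class BlobStore:
    """Stand-in for the BLOB store provider: keeps bytes keyed by BLOB ID."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, data: bytes) -> str:
        blob_id = uuid.uuid4().hex
        self._blobs[blob_id] = data   # step 4: write blob
        return blob_id                # step 5: return BLOB ID

    def read(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


class ContentDb:
    """Stand-in for the content database: metadata and BLOB ID only."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def save_row(self, name: str, metadata: dict, blob_id: str) -> None:
        self.rows[name] = {"metadata": metadata, "blob_id": blob_id}


def save_document(name: str, data: bytes, metadata: dict,
                  store: BlobStore, db: ContentDb) -> str:
    # Step 1: save request arrives. Step 2: enforce business logic
    # (here just a trivial validation as a placeholder).
    if not name or not data:
        raise ValueError("a document needs a name and content")
    # Steps 3-5: hand the bytes to the BLOB store and get an ID back.
    blob_id = store.write(data)
    # Step 6: save metadata and the BLOB ID in the content database.
    db.save_row(name, metadata, blob_id)
    # Step 7: back to the caller.
    return blob_id


if __name__ == "__main__":
    store, db = BlobStore(), ContentDb()
    blob_id = save_document("invoice.pdf", b"%PDF...", {"Title": "Invoice"}, store, db)
    print(db.rows["invoice.pdf"]["blob_id"] == blob_id)
```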
RBS Requirements • SQL Server 2008 R2 • Any Version, even SQL Express R2 • FILESTREAM RBS Provider (Current Version) • http://go.microsoft.com/fwlink/?LinkId=177388
RBS Licensing and Limitations • The FILESTREAM provider is supported by SharePoint Server 2010 only when it is used with SQL Server 2008 R2 or SQL Server 2008 R2 Express. • Only “local commodity storage” (hard drive) is supported. • Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN) are all considered to be “remote commodity storage” and are not supported by SharePoint 2010. • Any other 3rd Party RBS Provider is considered to be a “remote server” provider and SharePoint 2010 licensing requires that SQL Server 2008 R2 Enterprise Edition be implemented.
demo… Remote BLOB Storage
Performance and Control SharePoint 2003 • Column Indexes were not possible • Database Indexes were not supported SharePoint 2007 • Column Indexes (10) could be configured via the UI • End users could impact performance with poorly performing list views SharePoint 2010 • Database optimizations allow far more items in a list • Support for (20) Multi-Column Indexes • Resource intensive operations can be limited or disallowed during production hours • Large query thresholds • Blocking Operations • Can be overridden via the Object Model • Can configure an unblocked “window”
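The SharePoint 2010 throttling behavior listed above (a large-query threshold, an object-model override, and an unblocked window) can be expressed as a small decision function. This is a sketch of the logic only, not SharePoint code: `query_allowed` and its parameters are hypothetical, and the 5000-item default is borrowed from the view/query limit quoted elsewhere in this deck.

```python
from datetime import time
from typing import Optional, Tuple


def in_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end); handles windows crossing midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def query_allowed(items_touched: int, now: time, threshold: int = 5000,
                  window: Optional[Tuple[time, time]] = None,
                  override: bool = False) -> bool:
    """Decide whether a list query should run under the throttling rules."""
    if items_touched <= threshold:
        return True   # small queries always run
    if override:
        return True   # explicit override, e.g. requested through code
    if window is not None and in_window(now, window[0], window[1]):
        return True   # large queries run inside the unblocked window
    return False      # otherwise the operation is blocked


if __name__ == "__main__":
    nightly = (time(22, 0), time(6, 0))
    print(query_allowed(20000, time(12, 0), window=nightly))  # midday
    print(query_allowed(20000, time(23, 0), window=nightly))  # inside window
```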
Scalable Taxonomy Design • SP2010 Boundaries – Now More Stuff!!! • 30 Million Documents/Items in a List • 5000 Item View/Query Result Size (Default for a reason) • 100 Million Items in SharePoint Server 2010 Search • 1 BILLION Items in FAST For SharePoint 2010 Index • 250,000 Site Collections per Web Application • 200GB Content DB Size (SOFT LIMIT) • Recommend for Collaboration content or Fast Backup/Restore SLA • Content DB sizes up to 1TB are SUPPORTED for large single-site repositories and archives of non-collaborative content! • That’s 150 Million items in a single Site Collection in a single Content Database with RBS enabled (avg. 7KB metadata row)
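The 150 million figure above follows from dividing the 1TB content database size by the 7KB average metadata row. A quick check, assuming binary units (1 TB = 1024^4 bytes, 1 KB = 1024 bytes) and that the database holds nothing but metadata rows because the BLOBs live in RBS:

```python
def max_items(db_size_bytes: int, avg_row_bytes: int) -> int:
    """How many metadata rows of the given average size fit in the database."""
    if db_size_bytes <= 0 or avg_row_bytes <= 0:
        raise ValueError("sizes must be positive")
    return db_size_bytes // avg_row_bytes


if __name__ == "__main__":
    one_tb = 1024 ** 4
    seven_kb = 7 * 1024
    print(f"{max_items(one_tb, seven_kb):,} items")
```

The result lands a little above 150 million, so the slide's number is a rounded estimate. Indexes and other overhead, which this sketch ignores, would lower the real ceiling.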
Scalable Taxonomy Design • Enabling 100 Million • Place large Collaboration Site Collections (20GB+) in their own content database • Break Up Archive/Records Site Collections by Year or, if necessary, Content Type and Year • AVOID Item Level ACLs!!! • Release to Metadata Based Folder Structures as a workaround • Use Content Type Syndication to facilitate multiple Site Collections of the same type • Use Content Organizer as a “Drop Zone”
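Breaking archive site collections up by year (or by content type and year) amounts to a routing rule that something like a "Drop Zone" could apply to each incoming document. A minimal sketch of such a rule follows. The `/sites/archive-...` URL scheme and the function name are hypothetical and are not how Content Organizer rules are actually configured.

```python
from datetime import date


def archive_site_collection(content_type: str, doc_date: date,
                            split_by_content_type: bool = False) -> str:
    """Pick the target site collection for an archived document."""
    year = doc_date.year
    if split_by_content_type:
        # Only split further when one site collection per year gets too large.
        slug = content_type.strip().lower().replace(" ", "-")
        return f"/sites/archive-{slug}-{year}"
    return f"/sites/archive-{year}"


if __name__ == "__main__":
    print(archive_site_collection("Invoice", date(2010, 5, 1)))
    print(archive_site_collection("Purchase Order", date(2010, 5, 1),
                                  split_by_content_type=True))
```

Because each target is its own site collection, each can sit in its own content database, which keeps any single database within the size limits discussed earlier.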
demo… Content Organization
Search… A Complete Story SharePoint 2003 • WSS CAML Only • SPS Shared Services yielded decent full text results SharePoint 2007 • WSS 3.0 SiteDataQuery allowed search across lists/sites • MOSS Search added Managed Properties • FAST ESP for SharePoint was a late player SharePoint 2010 • Microsoft SharePoint Foundation Search • Site Collection Scope | No Redundancy | 10 Million • Microsoft Search Server Express 2010 • Extended Features| No Redundancy | 10 Million • Microsoft SharePoint 2010 Search / Search Server • Extended Features | Scale Out | Redundancy | 100 Million • Microsoft FAST Search Server 2010 for SharePoint • Extreme Scale | Redundancy | Doc Processing Pipeline • 1 Billion documents! (per farm)
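The SharePoint 2010 item limits quoted above suggest a simple way to shortlist search products by corpus size. The sketch below uses only the numbers from this slide. The helper name is hypothetical, and features such as redundancy or scope are deliberately left out of the comparison.

```python
from typing import List, Tuple

# (product, approximate item limit) as quoted on the slide.
SEARCH_TIERS: List[Tuple[str, int]] = [
    ("Microsoft SharePoint Foundation Search", 10_000_000),
    ("Microsoft Search Server Express 2010", 10_000_000),
    ("Microsoft SharePoint 2010 Search / Search Server", 100_000_000),
    ("Microsoft FAST Search Server 2010 for SharePoint", 1_000_000_000),
]


def candidate_products(item_count: int) -> List[str]:
    """Return the products whose quoted limit covers the given corpus size."""
    if item_count < 0:
        raise ValueError("item_count must be non-negative")
    return [name for name, limit in SEARCH_TIERS if item_count <= limit]


if __name__ == "__main__":
    for name in candidate_products(50_000_000):
        print(name)
```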
Search… A Complete Story • SharePoint Server 2010 / Search Server • Multiple Crawl Servers (Scale Out/Redundancy) • Crawl Servers comprised of stateless Crawlers • Multiple Crawlers improve crawl performance • Multiple Crawl DBs support more Crawlers • Crawl DB is separated from Property DB • Index is comprised of multiple Index Partitions that can be mirrored on different Query Servers • Multiple Index Partitions improve Query Performance
Search… A Complete Story Cool… What can it do?
Search… A Complete Story • FAST Search Server 2010 for SharePoint • Extreme Scale and Performance • Custom Relevancy and Navigation Tuning • Tune Performance for content volume, query volume, crawl pipeline performance and query speed • Uses SharePoint 2010 Query Servers • Bolt on FAST Servers for additional processing • Add server ROWS for query performance and high availability or COLUMNS for crawl performance • Can scale to support 1 Billion items!
In Review… • Storage is the KEY to Performance • RBS reduces Content DB Size and facilitates large repositories • SharePoint governs end-user operations • Content Type Publishing and Content Organization help balance database loading • Search solutions now handle the entire range of corpus possibilities • 10 million is easy, 100 million can be done, 1 BILLION is possible!
More… http://www.houberg.net @rhouberg http://www.knowledgelake.com/resources