1 / 68

Microsoft SQL Server, Scalability, & Database Research Jim Gray Researcher Microsoft Corporation

Microsoft SQL Server, Scalability, & Database Research Jim Gray Researcher Microsoft Corporation. Outline. Summary of what you heard. (10 min) The database scene in general. (10 min) Scaleability: Farms, Clones,Parts & Packs (15 min) Microsoft DB research focus. (15 min)

gmayer
Télécharger la présentation

Microsoft SQL Server, Scalability, & Database Research Jim Gray Researcher Microsoft Corporation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Microsoft SQL Server, Scalability, & Database Research Jim Gray Researcher Microsoft Corporation

  2. Outline • Summary of what you heard. (10 min) • The database scene in general. (10 min) • Scaleability: Farms, Clones,Parts & Packs (15 min) • Microsoft DB research focus. (15 min) • TerraServer (design and ops). • RAGS. • Data Mining • Q&A (10 min)

  3. Organizations Are Going Online • Building a digital nervous system. • Inexpensive hardware means huge databases are possible. • But, we are drowning in data. • Databases help organize information. • Microsoft’s goal: • Information at your fingertips. • Make it easy to capture, manage, and analyze information.

  4. Microsoft SQL Server 7 Goals Easy • Dynamic self management • Multi-site management • Operation Scripting • Job scheduling and execution • Alert/response management • Scriptable Install+upgrade • DBA profiling/tuning tools • Unicode • English Language Query • Integrated with NT Security • Integrated with NT files Scalability Data Warehousing

  5. Scalability • Win9x/NTW version • Dynamic row-level locking • Improved query optimizer • Intra-query parallelism • VLDB improvements • Replication improvements • Distributed query • High Availability Clusters Easy Scalability Data Warehousing

  6. Scale Down to Windows 95-98 • Full function (same as NTW) • Integration with Access 97 • MSDE in Office2000 and MSDN • WinCE version demonstrated

  7. Distributor ODBC OS 390 DB2 DB2 VSAM CICS Subscriber Subscriber Subscriber Subscriber Replication Publisher 2PC, RPC • Transactional and Merge • Remote update • ODBC and OLE DB subscribers • Wizards • Performance Updating Subscriber (immediate updates)

  8. Query Processor Enhancements Focus on Complex Queries • Parallelism • Improved scan, fetch, & sort • Smart hash & merge join • Large joins & grouping • Better query optimization • Multi-index operations • Automatic statistics maintenance • Distributed Query • Heterogeneous Query

  9. Global Agg. + Result 50 rows 4 x 50 rows + + + + Local Agg. 50,000 rows Disks Parallel QuerySMP & Disk Parallelism • Plus Distributed • Plus Hash Join (fanciest on the planet) • Plus Optimized Partitioned views • # of emp. per group • total inc. per group

  10. DirectoryService Database (DB2, VSAM, Oracle, …) Spreadsheet Photos Mail Maps Documents and the Web Distributed Heterogeneous QueriesData Fusion / Integration SQL 7.0 Query Processor Join spread sheets, databases, directories, Text DBs etc. Any source that exposes OLE DB interfaces SQL Server as gateway, even on the desktop

  11. Backup Fuzzy Parallel Incremental Restartable Recovery Fast File granularity Reorganize shrinks file reclusters file Auto-Repair Index creation ~2x faster than 6.5 DBCC not required, a good practice 5x - 100x faster Recovery time (secs) # of indices UtilitiesThe Key to LARGE Databases

  12. Data Warehousing • Warehousing Framework • Visual data modeler • Microsoft repository • Data transformation services (DTS) • Plato & Dcube - Multi Dimensional Data Cubes • English query 2.0 Easy Scalability Data Warehousing

  13. Extact & Load OLAP Data Warehouse Storage Data Warehouse / Data Analysis • Data Transformation Services to get data into the warehouse • CUBE (OLE/DB OLAP) to analyze data Operational Data

  14. User 1 Dcube Clientapp Dcube Clientapp SQL MD SQL 1990 FORD 1991 CHEVY 1992 1993 By Year By Make By Make & Year RED WHITE BLUE By Color & Year Sum By Color User 2 Plato and Data Cubeand HOLAP Source table Europe Partition 1 “Plato” ROLAP Designer USA Partition 2 “Plato” server Asia Partition 3 ROLAP

  15. English Query

  16. Easy Scalable Data Warehousing

  17. “Shiloh” The Next SQL Server • Shiloh (H1’00) - Strengthen Position • Data Warehousing leadership • Materialized Views • Cascading Referential Integrity(#1 requested user-group feature) • XML support • Scalability • WinCE support • W2K VLM (36 and 64 bit) • Multi-instance support • Yukon – Next Big Step • Scalability (Clusters, Partitions) • Programmability • Ease of Use (Self Tuning, Auto Config)

  18. Outline • Summary of what you heard. (10 min) • The database scene in general. (10 min) • Scaleability: Farms, Clones,Parts & Packs (15 min) • Microsoft DB research focus. (15 min) • TerraServer (design and ops). • RAGS. • Data Mining • Q&A (10 min)

  19. Yotta Zetta Exa Peta Tera Giga Mega Kilo Info Capture Everything! Recorded All BooksMultiMedia • You can record everything you see or hear or read. • What will you do with it? • How will you organize & analyze it? • Most data will never be seen • Analysis an summarizationare key technologies All books (words) .Movie Video 8 PB per lifetime (10GBph) Audio 30 TB (10KBps) Read or write: 8 GB (words) See: http://www.lesk.com/mlesk/ksg97/ksg.html A Photo A Book

  20. Data Tidal Wave • Seagate 47GB drive @ 783$ (= 1.7 ¢/mb) • 100 GB penny per MB drive coming in 2000 • 10 $/GB = 10 k$/ Terabyte! • “Everyone” can afford one • What’s a terror bite? • If you sell ten billion items a year (e.g Wal-Mart) • And you record 100 bytes on each one • Then you get a TeraByte/year • Where will the terror bytes come from? • Multimedia (like the TerraServer) and...

  21. Reducing Data’s Cost-of-OwnershipSelf-Managing data Cost of ownership: One admin/TB (100K$ vs 10K$) Admin cost exceeds storage cost. SQL 7: • Suggests indices • Migrates data away from end of file • Truncates file Someday: • Automatic move files to balance disks • Online defragmentation & restructuring • Online physical redesign

  22. OBJECT RELATIONALThe Next Great DBMS Wave • All DB vendors have added objects to DB • Microsoft is adding DBs to Objects • Integration with COM+ • Gives user-defined types and objects • Plug-ins will be Billion dollar industry • Blades for SQL Server razor

  23. Data stream in a typical interface… “ABC47-Z”, “100”, “STL”, “C”, “3”, “28” Same data stream in XML… <INVENTORY> <PART_NUM>ABC47-Z</PART_NUM> <QUANTITY>100</QUANTITY> <WAREHOUSE>STL</WAREHOUSE> <ZONE>C</ZONE> <AISLE>3</AISLE> <BIN>28</BIN> </INVENTORY> Why Is XML Important?Self-describing data

  24. table.xsl bar.xsl art.xsl

  25. XML Applications • Exposing Software as a “Service” • Websites without UI’s • Exposed services with common scheme • Integration points at the enterprise, value-chain, workgroup, desktop and intelligent gizmo “levels” • B2B value chains • Uses XML to transmit wide range of date to a broad set of stakeholders (regulatory agencies, suppliers, customers, etc.). • Leverage for prior efforts like EDI • BizTalk a key industry effort in this regard

  26. Library MVS CICS SAP R/3 www.biztalk.org XMLschema XML XML Order Processing XML Service Interface XML XML Another Service XML Message XML Message XMLDocument JD Edwards Browser Client Apps New Form Factors XML: BizTalk Framework XMLDocument

  27. Outline • Summary of what you heard. (20 min) • The database scene in general. (10 min) • Scaleability: Farms, Clones,Parts & Packs (10 min) • Microsoft DB research focus. (15 min) • TerraServer (design and ops). • RAGS. • Data Mining • Q&A (15 min)

  28. Terminology for scaleability Farm • Farms of servers: • Clones: identical • Scaleability + availability • Partitions: • Scaleability • Packs • Partition availability via fail-over Clone Partition Pack

  29. Unpredictable Growth • The TerraServer Story: • We expected 5 M hits per day • We got 50 M hits on day 1 • We peak at 15-20 M hpd on a “hot” day • Average 5 M hpd after 1 year • Most of us cannot predict demand • Must be able to deal with NO demand • Must be able to deal with HUGE demand

  30. An Architecture for Internet Services? • Need to be able to add capacity • New processing • New storage • New networking • Need continuous service • Online change of all components (hardware and software) • Multiple service sites • Multiple network providers • Need great development tools • Change the application several times per year. • Add new services several times per year.

  31. Premise: Each Site is a Farm • Buy computing by the slice (brick): • Rack of servers + disks. • Grow by adding slices • Spread data and computation to new slices • Two growth styles: • Clones: anonymous servers • Parts+Packs: Partitions fail over within a pack • In both cases, remote farm for disaster recovery

  32. Scaleable SystemsScale UP and Scale OUT • Everyone does both. • Choice is • Size of a brick • Clones or partitions • Size of a pack

  33. Everyone scales outWhat’s the Brick? • 1M$/slice • IBM S390? • Sun E 10,000? • 100 K$/slice • Wintel 8X • 10 K$/slice • Wintel 4x • 1 K$/slice • Wintel 1x

  34. Clones: Availability+Scalability • Some applications are • Read-mostly • Low consistency requirements • Modest storage requirement (less than 1TB) • Examples: • HTTP web servers (IP sprayer/sieve + replication) • LDAP servers (replication via gossip) • App/compute servers or firewalls • Replicate app at all nodes (clones) • Spray requests across nodes. • Grow by adding clones • Fault tolerance: stop sending to dead clone. • Growth: add a clone.

  35. Facilities Clones Need • Automatic replication • Applications (and system software) • Data • Automatic request routing • Spray or sieve • Management: • Who is up? • Update management & propagation • Application monitoring. • Clones are very easy to manage: • Rule of thumb: 100’s of clones per admin

  36. Partitions for Scalability • Clones are not appropriate for some apps. • Statefull apps do not replicate well • high update rates do not replicate well • Huge DBs (disk to expensive to clone) • Examples • Email / chat / … • Databases • Partition state among servers • Scalability (online): • Partition split/merge • Partitioning must be transparent to client.

  37. Partitioned (aka. Clustered) Apps • Mail servers • Perfectly partitionable • Business Object Servers • Partition by set of objects. • Parallel Databases • Transparent access to partitioned tables • Parallel Query

  38. Packsfor Availability • Each partition may fail (independent of others) • Partitions migrate to new node via fail-over • Fail-over in seconds • Pack: the nodes supporting a partition • VMS Cluster • Tandem Process Pair • SP2 HACMP • Sysplex™ • WinNT MSCS (wolfpack) • Cluster In A Box now commodity • Partitions grow in packs.

  39. What Parts+Packs Need • Automatic partitioning (in dbms, mail, files,…) • Location transparent • Partition split/merge • Grow without limits (100x10TB) • Simple failover model • Partition migration is transparent • MSCS-like model for services • Application-centric request routing • Management: • Who is up? • Automatic partition management (split/merge) • Application monitoring.

  40. Services on Clones & Partitions • Application provides a set of services • If cloned: • Services are on subset of clones • If partitioned: • Services run at each partition • System load balancing routes request to • Any clone • Correct partition. • Routes around failures.

  41. Farm pairs: Always Up • Two farms • Changes from one sent to other • When one farm failsother provides service • Masks • Hardware/Software faults • Operations tasks (reorganize, upgrade move • Environmental faults (power fail)

  42. Clones for availability Packs for availability Load Balance Web Clients Availabilty for a simple web site SQL Database Web File Store SQL Temp State Front End

  43. Packed Partitions: Database Transparency SQL Partition 3 SQL Partition 2 SQL Partition1 SQL Database replication Web File StoreB Cloned Packed file servers Farm Scale Out Scenarios The FARM: Clones and Packs of Partitions Web File StoreA SQL Temp State ClonedFront Ends(firewall, sprayer, web server) Web Clients Load Balance

  44. Reliable, Scalable, Modular Component Load Balancing (COM+) Clones Network Load Balancing Clones Clients Cluster Service Pack 1 1 2 2 3 3 4 … … 8 32 COM+ Components Data Servers SQL, Exchange, File IIS Web Server or other IP based services Application Servers

  45. Talk 2 (if there is time) Farm • Terminology for scaleability • Farms of servers: • Clones: identical • Scaleability + availability • Partitions: • Scaleability • Packs • Partition availability via fail-over Clone Partition Pack

  46. Scalability: COM+ progress serving 1,000-statement ASP’s (servelets) SPS: servelets per second (ASPs served per second by IIS,1,000 statement VBscript) • Poor SMP Scaleability on IIS4 NT4 • Big improvementsfrom standard TransactionProcessing tricks • Out of Proc(safe execution)now much fasterthan In Proc was on IIS4 Shift from 4x200 Mhz to 8 450 Mhz

  47. Scaleability: So, What about the death of NT/Alpha?Two simultaneous Compaq TPC-C numbers • Alpha • Unix/Sybase/Tuexdo • 700 Mhz • 8 Processors • 16 GB memory • 42,437 TPM-C @ 55.45 $/tpmC • $2.35 M$ 5-year cost • Avail: 10/18/99 • Intel Profusion • NT/SQL/COM+ • 550 Mhz • 8 Processors • 4 GB memory • 40,368 TPM-C @ 18.46$/tpmC • 745 K$ 5-year cost • Avail: 12/31/99 200% more expense for 5% more performance?

  48. Outline • Summary of what you heard. (10 min) • The database scene in general. (10 min) • Scaleability: Farms, Clones,Parts & Packs (15 min) • Microsoft DB research focus. (15 min) • TerraServer (design and ops). • RAGS. • Data Mining • Q&A (10 min)

  49. The TerraServerhttp://www.terraserver.microsoft.com/

  50. 200x200 m tile .4 x.4 km browse .8 x .8 km 8m thumbnail 1.6x 1.6 km “city view” Database & application UI • Coverage: Range from 70ºN to 70ºStoday: 35% U.S., 1% outside U.S. • Source Imagery: • 4 TB 1sq meter/pixel Aerial (USGS - 60,000 46Mb B&W- 151Mb Color IR files) • 1 TB 1.56 meter/pixel Satellite (Spin-2 - 2400 300 Mb B&W) • Display Imagery: 200x200 pixel images, subsample to build image pyramid • Store 5x compressed data • Nav Tools: • 1.5 m place names • “Click-on” Coverage map • Expedia & Virtual Globe • Pick of the week • Concept: User navigates an ‘almost seamless’ image of earth

More Related