390 likes | 582 Vues
Database as a Service September 13, 2007 Matt Emmerton DB2 Performance and Solutions Development IBM Toronto Laboratory memmerto@ca.ibm.com. Database as a Service. Definition A hardware/software combination that provides (relational) data store capabilities What about context?
E N D
Database as a ServiceSeptember 13, 2007Matt EmmertonDB2 Performance and Solutions Development IBM Toronto Laboratorymemmerto@ca.ibm.com
Database as a Service • Definition • A hardware/software combination that provides (relational) data store capabilities • What about context? • Is the service for a person, department, or enterprise? • The context defines the level of reliability and robustness that is required
Requirements • Reliable hardware • Redundant components, infrastructure and/or systems • Intelligent software • Must be dynamic • Able to react to changes in environment and workload • Must have features to simplify the DBA’s job • Make the simple tasks automatic • Make the hard tasks easier • Make the impossible tasks possible
How can software help DBAs? • Application Development Environment • Rich set of support languages • Native XML storage and query capabilities • Database failover • HADR • General Robustness • Read Retry • Storage Keys • Ability to react to changing workloads • Autonomic Capabilities • Self-Tuning Memory Manager (STMM) • Workload Management (WLM) • Ease of administration • Automatic Storage • Table Compression and Large RID • Table Partitioning • Single-Image System Backup and Flash Copy • Roles
Application Development • Application developers want to use what is familiar to them: • C/C++, Java • Various “scripting” languages (PHP, Perl) • Having native DB2 interfaces for all of these languages eliminates the need for: • Additional third-party products to interface between applications and DB2 • Extra training, support and testing of these products
XML • Many applications are designed around “rich” data • There is a desire to store and query this “rich” data natively in a data server • DB2 has rich XML support: • Supports X/Query and SQL/XML • Specialized data types, indexes, join operators and access mechanisms tailored to XML • Supports relational and XML data in the same database
How can software help DBAs? • Application Development Environment • Rich set of support languages • Native XML storage and query capabilities • Database failover • HADR • General Robustness • Read Retry • Storage Keys • Ability to react to changing workloads • Autonomic Capabilities • Self-Tuning Memory Manager (STMM) • Workload Management (WLM) • Ease of administration • Automatic Storage • Table Compression and Large RID • Table Partitioning • Single-Image System Backup and Flash Copy • Roles
High Availability (HA) and Disaster Recovery (DR) • HADR is a warm-standby setup for DB2 • Primary and secondary servers • Logs are shipped from the primary to the secondary and replayed on the secondary, which keeps it “warm” • When the primary fails, all remaining logs are shipped to the secondary and replayed and the secondary takes over • Degree of “warm”-ness can be controlled by the DBA • Affects the amount of data to be transferred and replayed • Affects the amount of downtime between primary failure and secondary takeover • Clients will be rerouted automatically when a failure occurs
High Availability (HA) and Disaster Recovery (DR) • Restrictions • OS and DB2 must be same version (including patches) but hardware can be different • Allows smaller hardware to be used for standby • Administration (pre-Viper2) • Required lots of scripting to manage failover properly • Any changes (eg: add/drop tablespace container, add/drop node) had to be replicated on the secondary and scripts updated appropriately • Administration (Viper2) • TSA (Tivoli System Automation) bundled with HADR • Completely integrated with DB2 • Manages all schema and configuration changes automatically • Manages all failover operations automatically
How can software help DBAs? • Application Development Environment • Rich set of support languages • Native XML storage and query capabilities • Database failover • HADR • General Robustness • Read Retry • Storage Keys • Ability to react to changing workloads • Autonomic Capabilities • Self-Tuning Memory Manager (STMM) • Workload Management (WLM) • Ease of administration • Automatic Storage • Table Compression and Large RID • Table Partitioning • Single-Image System Backup and Flash Copy • Roles
Robustness • Read Retry Logic • Any kind of network-mounted (NFS) or network-attached (SAN/NAS) storage can suffer from transient failures due network outages • Prior to v9 (Viper), any read error would be considered fatal and the instance would terminate • In v9 (Viper) and later, we retry failed reads up to 10 times to allow transient network failures to correct themselves before terminating • Greatly improved uptime when using network-backed storage
Robustness • Storage Keys • This is a form of hardware memory protection on POWER6 • First used in Viper2 to protect bufferpool memory against rogue memory corruption • From internal DB2 coding errors • From external UDF coding errors
Automatic RUNSTATS • Statistics / RUNSTATS • Proper statistics are essential for proper query plans • Statistics need to be updated periodically to reflect changes in data (quantity and distribution) • A manual process managed by DBAs • Automatic RUNSTATS • Query engine compares estimated cost with actual cost • Once the difference in costs exceeds some threshold, we will execute RUNSTATS to update statistics • Throttling infrastructure is used to ensure minimal impact
How can software help DBAs? • Application Development Environment • Rich set of support languages • Native XML storage and query capabilities • Database failover • HADR • General Robustness • Read Retry • Storage Keys • Ability to react to changing workloads • Autonomic Capabilities • Self-Tuning Memory Manager (STMM) • Workload Management (WLM) • Ease of administration • Automatic Storage • Table Compression and Large RID • Table Partitioning • Single-Image System Backup and Flash Copy • Roles
Automatic RUNSTATS • Viper2 introduces two new features • Just-In-Time Statistics (JITS) • If a high number of queries can benefit from updated statistics right now, RUNSTATS will be run immediately if it can be done without impacting performance • Statistics Fabrication • We can estimate key statistics (cardinality, index key distributions) just by looking at overall object statistics (number of pages in a table, number of keys in each level of the index btree structure, etc) • These statistics are often better than what is currently being used by the optimizer
Automatic Backup and Reorg • Automatic Backup • Performs a full backup once: • The last backup is > X hours old • More than Y log files have been written • Automatic Reorg • Tables and indexes can be automatically REORGed in the background • Proper statistics are essential here! • Using throttling infrastructure to ensure that performance is not impacted • Reduces the need for large REORG operations during maintenance window
Self-Tuning Memory Manager • DB2 has lots of memory areas to manage • Bufferpools, LOCKLIST, SORTHEAP, Package Cache • How to determine the “optimal” amount of memory to put in each pool, especially with a changing workload? • Trial and error • Workload analysis • Use STMM • performs cost/benefit analysis on moving memory between memory areas • applies beneficial changes at runtime
Workload Management • In the past, Query Patroller was the only way to control various services classes • Now we are introducing workload management (WLM) capabilities directly into DB2 • Allow SQL-based monitoring and reporting of the current system state • Does not require extensive logging and data mining • Can work alongside Query Patroller • Integrates with AIX WLM features
How can software help DBAs? • Application Development Environment • Rich set of support languages • Native XML storage and query capabilities • Database failover • HADR • General Robustness • Read Retry • Storage Keys • Ability to react to changing workloads • Autonomic Capabilities • Self-Tuning Memory Manager (STMM) • Workload Management (WLM) • Ease of administration • Automatic Storage • Table Compression and Large RID • Table Partitioning • Single-Image System Backup and Flash Copy • Roles
Automatic Storage • Automatic Storage automates tablespace administration • Type of tablespaces (DMS, SMS) chosen automatically at CREATE TABLESPACE time depending on tablespace type • Resizes tablespaces as necessary at runtime • Never get a midnight page for “out of space” again! • Threshold for resize and amount of resize can be controlled by DBA
Large RID • What is a RID? • A RID is a row identifier • “Small” RIDs are 4 bytes (3 byte page number and 1 byte slot) • Up to 16 million pages per tablespace, 255 rows per page • “Large” RIDs are 6 bytes (4 byte page number and 2 byte slot) • Up to 512 million pages per tablespace, 65,536 rows per page • What does this mean? • Potential of hitting tablespace size limits greatly reduced • No need to redesign schema to get around these limits • DPF partitioning, UNION ALL views, Table partitioning are all ways to get around tablespace size limits • Allows any table to have more than 255 rows per page, but is essential for compression
Table Compression - Overview • How it works • Dictionary (LZ-based) compression method (similar to WinZip) • Data is compressed at the row level • Data is stored in compressed form on disk and in memory; uncompressed at row access/modify time • What it buys you • Reduced I/O cost at runtime • Reduced storage hardware cost • Improved bufferpool hit ratios (due to higher density of rows/page) • What restrictions exist • XML and LOB data is not compressed • Still limited to 255 rows/page unless you are using Large RIDs • Degree of compression depends on workload: • Typical “customer” data: 68% - 78% • DSS: 47% - 68% • OLTP: 0% - 23%
Table Compression - Dictionary • Viper • Compression dictionary not created by default • Some data must be populated • Run REORG to create dictionary and compress data • All subsequent operations will be compressed • Viper2 • Compression dictionary created by default
Table Partitioning • What is it? • Allows partitioning of tables by key ranges • Each range can be in a different tablespaces • Each range is completely independent • Access to a single range at runtime will not force accesses to other ranges • Easier Roll-In/Roll-Out • New ATTACH operation for roll-in • New DETACH operation for roll-out • SET INTEGRITY is now online • Performance benefits • Some BI-style queries can benefit due to range elimination • smaller joins, fewer rows to process, etc. • Data can be spread across multiple tablespaces • Increased I/O parallelism if tablespace containers are on different storage devices
Single Image Backup/Restore • Backing up a clustered database: • Each node backed up separately (in parallel) • Catalogs backed up separately (serially) • Logs backed up separately (serially) • Keeping track of everything is difficult • Viper2 now supports a single backup command for clustered database that does all three of these things together
Flash Copy • A painful process • Find LUNs to use for flash copy • Suspend IO • <perform flash copy> • Unsuspend IO • Much simpler in Viper2 • BACKUP DB … USE SNAPSHOT
Roles • Before Viper 2: • Privileges and authorities are per-object • Multiple commands required to allow a user access to the objects that they need GRANT SELECT ON TABLE ACCOUNTS TO USER GEORGE GRANT SELECT ON TABLE CLIENT TO USER GEORGE GRANT UPDATE ON TABLE ACCT_BAL TO USER GEORGE • Granting or revoking access is tedious, error-prone and time-consuming • A single missed statement could be a security hole!
Roles • After Viper 2: • Roles can be defined that encompass privileges and authorities on multiple objects CREATE ROLE TELLER GRANT SELECT ON TABLE ACCOUNTS TO ROLE TELLER GRANT SELECT ON TABLE CLIENT TO ROLE TELLER GRANT UPDATE ON TABLE ACCT_BAL TO ROLE TELLER • Allowing or disallowing a user is now as simple as granting or revoking that role from a user REVOKE ROLE TELLER FROM USER GEORGE GRANT ROLE TELLER TO USER BILL
Benefits • Who benefits? • External ISVs • We have a strong relationship with SAP which is driving many of these improvements into our product • Internal ISVs (You!) • These improvements allow you to do more with less, which is beneficial in a (typically) resource-constrained internal IT department
Conclusion • An effective Database Service relies on … • hardware to provide 24/7 uptime • software to assist with workload migration and/or failover • software to mitigate and/or limit failures • software to minimize to amount of operator intervention for routine administration • software to support a diverse application development environment • You should rely on … • DB2 Data Server v9 (Viper) and v9.5 (Viper2)
Definition of a Service • A service is something that is ubiquitous • Present (nearly) everywhere • (Almost) always available • (Usually) reliable and dependable • Typical infrastructure examples: • Municipal water, sewer, gas, electricity service • Telephone / Cable / Internet service • Services are dependent on infrastructure
Requirements • Present everywhere • Requires network connectivity to database or app server • Direct connectivity requires database client • Native DB2 client, CLI/ODBC, JDBC • App server connectivity just requires a web browser • Pushes the database client piece onto the app server, which is a more centralized resource and thus easier to administer • Always available • Requires redundant hardware and network • Reliable and dependable • Requires robust hardware and software
Backup Slides • Robustness Examples • SCADA • Databases
Service Example: SCADA • Telephone network • R1 (US) / R2 (Europe) in 1960s • SS5 (1970s) • SS7 (1980s-) • Infrastructure Control • Utilities: Water, Gas, Electricity • Transport: Bus, Rail • Many use SCADA control systems, technology first developed in the 70s • Hardware and software in these areas has had 30+ years of development and can be considered robust • Data acquisition hardware/software is “special-purpose” • Data monitoring and retention hardware/software is “general-purpose”
Service Failure Example: SCADA • But problems do exist: • North-east electricity blackout of 2003 • Over 40 million people were without power • 30 million in USA (including Michigan, Ohio, Pennsylvania and New York) • 10 million in Canada (entire province of Ontario) • Most power was restored 2 days later • Estimated losses of US$6 billion (for a 2 day outage!) • Sequence of Events • Primary and Secondary control systems get stuck in a race condition; alarms were no longer being processed • Numerous high-voltage power lines in Ohio fail due to fallen trees • Power outage ripples through the state • Failures were not seen in the control center because the primary and secondary servers were not processing alarms • Primary and secondary servers crashed due to backlog of alarms • Chain reaction had spread to neighboring states and could not be stopped
Service Example: Databases • The first relational/SQL databases: • Oracle: 1979 (Oracle V2) • DB2 for Mainframe: 1982 (SQL/DS) • Software in these areas has had 20+ years of development • Hardware and software is considered “general-purpose” • While key concepts (relational data model, SQL) are robust, the implementations may not be. Why? • Rapidly changing hardware means new compiler toolsets, which may introduce bugs • Theoretically impossible ensure correctness in a “general-purpose” product
Service Failure Example: Databases • Airlines rely on computer systems (“databases”) a great deal: • Tickets and boarding passes • DHS, TSA, INS, Customs, etc • Scheduling of ground crew, cleaning, air traffic control, etc • Many outages due to computer failures: • Dec 2004 Computer Failure • Bad weather forced cancellation of many flights, and system could not handle load for rescheduling and crashed, stranding 30,000+ • June 2007 FAA Computer Failure • Caused delays on 40% of flights on east coast • August 2007 US Customs Computer Failure at LAX • All inbound international flights were grounded for 12+ hours