200 likes | 382 Vues
Some Surprising Benefits of Using Storage Areas. - and - Managing A Parallel Dump And Load In A Tight Window. Tom Bascom Greenfield Technologies PUG Challenge 2003 tom@greenfieldtech.com. Agenda. Storage Areas Dumping & Loading Surprising Results. Storage Areas.
E N D
Some Surprising Benefits of Using Storage Areas - and -Managing A Parallel Dump And Load In A Tight Window Tom Bascom Greenfield Technologies PUG Challenge 2003 tom@greenfieldtech.com
Agenda • Storage Areas • Dumping & Loading • Surprising Results
Storage Areas • Structure for Grouping Tables and Indexes • Administrative Benefits • Index Rebuilds • Dump & Load • Dropping Areas • DB Analysis • Drawbacks • More complex structure • No single “high water” mark • More “overhead” (free space in areas)
Dump & Load Choices • “Classic” Dictionary Dump & Load • Bulk Loader • Binary Dump & Load • Parallelization • Many Benefits • Difficult to Effectively Balance • Automating the process • Index Rebuild – inline vs single thread • Exotic techniques – buffer copy…, -RO, Table Filters
Classic Dictionary D&L 1st the dump… Cust Order-line order hist … and then the load. cust Order-line order
Multi-threaded D&L 1st the dump… cust order hist … and then the load. Order-line cust Order-line order
Load Balanced D&L Both the dump… cust cust order hist Order-line Order-line order … and the load – in parallel, multi-threaded and load-balanced.
Drivers • What is the window? • What are the goals? • What resources are available?
The Window • 2 Day Weekend -- Midnight Friday to 6am Monday (54 hours) • Predictability is Vital • Recoverability is Paramount • If it fails we must not impact production on Monday am • Available weekends are rare
The Goals • Reduce Downtime for Maintenance • Index Rebuilds • Dump & Load • Application Maintenance • Reduce Disk Footprint • Improve Reporting Performance
Resources • Production Server (HP rp7410) • 8 x 750mhz • 16GB RAM • HPUX 11i • Failover Server (HP V2250) • 16 x 240mhz • 8GB RAM • HPUX 11 • Old EMC Disk Array Symmetrix 8100 • Database striped (and mirrored) across 16 disks • New EMC Disk Array Symmetrix 8830 • 4 Meta volumes of 4 LUNs
The SA Design • Separate Tables from Indexes • Isolate “large” Tables (80/20 rule) • Any table over 1 million rows • Any table over 1GB • Isolate Transaction Data from Overhead Data • Easily identifiable in dictionary • 256 Records per Block
The SA Design -- continued • 80 Storage Areas • 40 Data, 40 Index • 1 “miscellaneous” transaction data • 1 “miscellaneous” transaction indexes • 1 “miscellaneous” overhead data • 1 “miscellaneous” overhead indexes • 38 large table data areas • 38 index areas for large tables • Plus schema, bi & ai areas
Parallel Binary Dump & Load • 24 Dump Threads • Effect of adding threads • 47 Load Threads • Resource contention issues • Tabanalys • Expected D&L Time is Longest Combined D&L of a single given table • Longest dump may not be longest load • And vice versa • Some variability is possible • Other d&l threads can overlap – reducing contention
Application Load • Graphs show before & after application load (unchanged) • Logical Reads • Commits/sec
Surprise! • Dramatic Reduction in: • OS Reads • Latch Contention
Why? • Homogenous Data Blocks Take Better Advantage of “Locality of Reference” • LOR says that most data access is “close” to previous access • Mixed data blocks scatter data widely • Prior to d&l a block with 64 slots might have 3 records from a table; best case hit ratio for a table scan is about 85% • Post d&l it has 256; best case hit ratio becomes 99.98% • Large number of SAs enables more tables to share in benefit
? Questions http://www.greenfieldtech.com/downloads.shtml