 
                
Turkcell Backup & Recovery Strategy
Hüsnü Şensoy, Turkcell Telecommunication Services
VLDB Expert, Oracle ACE Director, Member of Global DWH Leaders & Oracle CAB, Oracle DBA of 2009
Agenda
• Backup & Recovery Strategies for Oracle Databases
• Motivation behind those strategies
• Revisiting “Incrementally Updated Backup”
• Revisiting “FRA”
• How to bring your database back without a restore?
• A sick backup will not work
• Centralized scheduling & monitoring
• 11g Release 2 Backup & Recovery New Features with real Telco data warehouse data
• Brand new compression algorithms
• Summary
Turkcell Overview
• Turkey's leading GSM operator, established in February 1994.
• Third-largest GSM operator in Europe by number of subscribers (36+ million).
• First and only Turkish company ever to be listed on the New York Stock Exchange.
• Member of the GSMA Board of Directors since 2003.
• Ranked 25th on the INFOTECH 100 list.
Backup & Recovery Strategies for Oracle Databases
Design Considerations
• Define your backup & recovery policies upfront.
• Maintain a well-documented strategy that can be used to bring everything back.
• KISS: Even a junior DBA should be able to bring your database back.
• Standardize, standardize, standardize…
• Be prepared to justify the cost in terms of the business impact of downtime.
Design Considerations
• Proactively validate database and backup integrity:
  • Physical errors
  • Logical inconsistencies
  • Transmission errors
• Do you perform regular full recoveries to a separate host and storage?
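As a minimal sketch of proactive validation, the RMAN command below reads the database files and archived logs and checks them for physical and logical corruption without producing a backup. It assumes a connection to the target database and that the archived logs known to the control file are still on disk.

```rman
# Read every datafile and archived log and check for physical and
# logical corruption; no backup pieces are written.
backup validate check logical database archivelog all;
```

Corrupt blocks found by such a run are recorded in the V$DATABASE_BLOCK_CORRUPTION view. This does not replace a full test recovery to a separate host.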
Design Considerations
• Centralized backup reporting:
  • Is there a single point of access for all my databases’ backup logs?
  • What is the average backup duration for database X?
  • How do brand new tape drives affect backup performance?
What Type of Architecture?
• What’s in there?
  • 7 RAC databases
  • More than 20 services
[Architecture diagram. Labels: 20 Gbit/s; APPDB, VASSE, VASCMT, BSSOSS, VASNIF, VASRES, BSSARCH; 120 Intel cores; 640 GB memory; DATA, FRA, ARCHIVE; 25 TB]
How Do We Back Up?
• Incrementally Updated Backup Strategy
  • Initial image copy backup to FRA
  • Fast incremental backups thereafter
  • The image copy is rolled forward with the incremental backup on a regular basis to create a full on-disk backup
• Full database backup time depends only on the number of blocks changed since the last incremental backup.
• The longest backup takes only ~30 minutes, with ZLIB backup compression and logical block checking turned on.
run {
  backup as compressed backupset check logical
    incremental level 1 for recover of copy with tag DAILY_COPY
    database filesperset 1;
  recover copy of database with tag DAILY_COPY;
}
This is the shortest, cleanest, and most elegant backup script that I have seen in all my years at Turkcell.
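The slide does not say how the incremental backups are made "fast". In Oracle this usually means block change tracking, so the sketch below assumes that is what is meant. The disk group name +DATA is a hypothetical placeholder for wherever the tracking file should live.

```rman
# Assumption: block change tracking is the mechanism behind the
# "fast incremental backups"; '+DATA' is a hypothetical ASM disk group.
sql "alter database enable block change tracking using file ''+DATA''";
```

With tracking enabled, a level 1 backup reads only the blocks recorded as changed instead of scanning every datafile, which fits the claim that backup time depends on the volume of change rather than on database size.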
Setting Up the Flash Recovery Area (Oracle Database 11g Release 1)
• Self-managed and organized logical storage area.
• Set up as part of the Universal Installer wizard.
• Redo log copies, control file copies, archived logs, and Flashback logs are automatically stored there.
• RMAN automatically uses the FRA for all disk backups.
• Or, just enable it by setting two init.ora parameters:
  • db_recovery_file_dest_size
  • db_recovery_file_dest
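A minimal sketch of setting the two parameters from an RMAN session follows. The 500G quota and the +FRA disk group name are hypothetical values chosen for illustration, and the sketch assumes the instance uses an spfile.

```rman
# Hypothetical values: a 500 GB quota and an ASM disk group named +FRA.
# The size must be set before the destination.
sql "alter system set db_recovery_file_dest_size=500G scope=both";
sql "alter system set db_recovery_file_dest=''+FRA'' scope=both";
```

The same two ALTER SYSTEM statements can be issued from SQL*Plus; RMAN's sql command is used here only to stay in one tool.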
Flash Recovery Area
• ASM is the best infrastructure to use as the FRA destination:
  • Raw device performance.
  • No other solution (except the Sun ZFS file system with its online FS check capability) will practically let you implement storage pools as large as ASM does.
  • Ease of management.
  • ASM lets multiple databases use the same disk group as their FRA destination.
[Diagram: the FRAs of DB1, DB2, DB3, and DB4 all reside in one ASM disk group (+FRA)]
What Are the Commands? From hours to minutes
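The commands from this slide did not survive in the transcript. One common way to bring a database back without a restore, given an up-to-date image copy in the FRA, is to switch to that copy and recover it. The sketch below assumes this is the technique meant, that the database is mounted but not open, and that the image copy is intact.

```rman
# Assumptions: the database is mounted (not open) and the image copy
# kept current in the FRA is usable.
run {
  switch database to copy;   # point the control file at the image copies
  recover database;          # apply incrementals and archived redo
  alter database open;
}
```

Because no files are copied back, the elapsed time is dominated by recovery rather than by restore. The database then runs from the FRA copies until the datafiles are moved back to their original storage.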
Backup Validation
• Backups on disk or tape might be damaged due to:
  • Physical problems on media (fabric problems, dust, cosmic rays, etc.)
  • Media library errors (error in checksum computation)
• How can you increase the probability that your backups are healthy?
RMAN Backup Validation
RMAN> backup check logical validate datafilecopy all filesperset 1;
• This will report:
  • Any inconsistent data, index, or other types of blocks.
  • The number of total and empty blocks examined.
  • The highest change number of each datafile copy.
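The command above checks the datafile copies. As a complementary sketch, RMAN can also verify whichever backups it would actually select for a restore, without restoring anything. It assumes a connection to the target database whose backups are to be checked.

```rman
# Read the backups RMAN would choose for a full restore and verify
# that they are usable; no file is actually restored.
restore database validate check logical;
```

Running such a check on a schedule surfaces damaged media before a real recovery depends on it.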
Centralized Scheduling & Monitoring
• Develop standard backup job scheduling and monitoring routines.
• This enables you to:
  • See all backup schedules at once
  • Check details of previously completed backups (duration, logs, etc.)
  • Easily modify backup scripts and bulk deploy them.
Grid Control Backup Jobs
Manage the backups of all databases in the cluster from just one screen.
11g Release 2 RMAN Compression
Backup Compression Summary
In Oracle Database 11g Release 2, RMAN extends its compression capabilities to fit any combination of CPU power and I/O throughput.
The MEDIUM compression level achieves the same compression ratio as BASIC while backing up 3X faster with 50% less CPU utilization.
Even if you do not need to reduce backup sizes, the LOW or MEDIUM compression level might be faster than an uncompressed backup, depending on your I/O throughput, because it significantly reduces the amount of data RMAN writes per second.
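As a minimal sketch, the compression level is selected once with CONFIGURE and then applies to every compressed backup set. MEDIUM is used here only because the summary singles it out; the right level depends on the CPU and I/O balance of the system.

```rman
# 11g Release 2 levels: BASIC, LOW, MEDIUM, HIGH.
configure compression algorithm 'MEDIUM';

# Later compressed backups use the configured level.
backup as compressed backupset database;
```

Note that the levels other than BASIC require the Oracle Advanced Compression option.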
Best Practices Summary
A well-defined, documented, standard, manageable, and fast backup & recovery strategy is a MUST if you manage tens (or even hundreds) of databases.
Whatever solution you pick, the indicator of a good backup & recovery strategy is simple: it shouldn’t depend on the size of the database.
FRA over ASM, combined with RMAN, satisfies these requirements at zero cost.