Fuzzy Checkpointing Alternatives for Main Memory Databases

Fuzzy Checkpointing Alternatives for Main Memory Databases - Margaret H. Dunham, Jun-Lin Lin, and Xi Li - Recovery Mechanism in Database systems, edited by Vijay Kumar, 1997 이인선, 97/8/14

Abstract
• Dynamic Segmented Fuzzy Checkpointing (DSFC)
  • divides the main memory into segments automatically to adapt to changing runtime conditions
  • the checkpointing process checkpoints each segment in a round-robin fashion
• Partition Checkpointing (PC)
  • assumes that main memory is divided into partitions based upon frequency of update
  • partitions are checkpointed independently at rates proportional to their update frequency
  • a global checkpoint across all partitions is performed implicitly when each of the updated partitions has had a local checkpoint

Introduction(1)
• The impact of the primary location of data on the DBMS design
  • recovery techniques
  • data structures
  • query processing algorithms
  • concurrency control
• Checkpointing provides an important way to refresh the backup database and to keep the amount of log data processed at recovery small
• The efficiency of checkpointing has been examined by many researchers.

Introduction(2)
• Dynamic Segmented Fuzzy Checkpointing (DSFC)
  • divides the MMDB into segments based on the transaction access pattern
  • the checkpointing activity proceeds through the segments in round-robin order
  • constructs a new complete checkpoint interval whenever a segment has been completely checkpointed
  • provides the restart operation with more up-to-date information, which yields better recovery performance
  • automatically adjusts the way the database is segmented

Introduction(3)
• Partition Checkpointing
  • based on prior knowledge of data update patterns in databases
  • the checkpoint no longer treats the entire database as a single object, but as a collection of smaller data partitions, each with a different update frequency
  • the partitions with a high frequency of update are checkpointed more often
  • attempts to maximize the recovery value of the checkpointing activity
  • two types of checkpoint: local and global
  • the global checkpoint is required to reach a transaction-consistent state

MMDB Architecture(1)
• One or more CPUs / volatile main memory / log buffer / archived disks / log disks
• assume the entire database is memory-resident
• assume the immediate-update scheme
  • each transaction has redo-log-records and undo-log-records
• Redo rule
  • a transaction cannot commit until all of its redo-log-records have been flushed to the nonvolatile log
• WAL (Write Ahead Log)
  • the undo-log-records of an update operation must be flushed to the nonvolatile log ahead of flushing the updated result to the archived disks
• LAW (Logging After Write)
  • the redo-log-records of an update operation are flushed to the nonvolatile log after updating the main memory database

Dynamic Segmenting Fuzzy Checkpointing(1)
CFC
forever do {
    check all pages in the database, flush dirty ones, and update dirty-page-bitmap accordingly;
    write a chkpt-record into log buffer, and flush log buffer;
    store the address of the chkpt-record in the restart file;
}
SFC
i ← 1; /* segment counter */
forever do {
    check all pages in Segment i, flush dirty ones, and update dirty-page-bitmap accordingly;
    store the address of the chkpt-record in the restart file;
    i ← (i + 1) mod n;
}
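The two loops above can be sketched in Python as one pass each. This is a toy model, not the authors' implementation: `ToyMMDB`, `cfc_pass`, and `sfc_pass` are hypothetical names, segments are numbered from 0 so that `(i + 1) % n` wraps cleanly, and the SFC pass is assumed to write a chkpt-record before storing its address (the slide only mentions the storing step).

```python
from dataclasses import dataclass, field


@dataclass
class ToyMMDB:
    """Hypothetical in-memory model; none of these names come from the slides."""
    num_pages: int
    dirty: set = field(default_factory=set)       # stands in for the dirty-page-bitmap
    log: list = field(default_factory=list)       # stands in for the nonvolatile log
    flushed: list = field(default_factory=list)   # pages written to the backup copy
    restart_file: int = -1                        # address of the latest chkpt-record

    def update(self, page):
        """Apply an update: mark the page dirty and log a redo record."""
        self.dirty.add(page)
        self.log.append(("redo", page))

    def flush_dirty(self, pages):
        """Flush the dirty pages among `pages` and clear their dirty bits."""
        for page in pages:
            if page in self.dirty:
                self.dirty.discard(page)
                self.flushed.append(page)

    def write_chkpt_record(self):
        """Append a chkpt-record and remember its address in the restart file."""
        self.log.append(("chkpt",))
        self.restart_file = len(self.log) - 1


def cfc_pass(db):
    """One iteration of the CFC loop: sweep the whole database."""
    db.flush_dirty(range(db.num_pages))
    db.write_chkpt_record()


def sfc_pass(db, segments, i):
    """One iteration of the SFC loop: sweep segment i only, then advance i."""
    db.flush_dirty(segments[i])
    db.write_chkpt_record()  # assumption: not spelled out on the slide
    return (i + 1) % len(segments)
```

Calling `sfc_pass` repeatedly with the returned index visits the segments in round-robin order, so each call touches only a fraction of the pages that one `cfc_pass` would scan.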

Dynamic Segmenting Fuzzy Checkpointing(2)

Dynamic Segmenting Fuzzy Checkpointing(3)
• Segmenting Main Memory Databases
  • Lemma 1
    • a guideline about how to segment the database
    • for a fixed number of segments, if the size of the log generated within a complete checkpoint interval is independent of the way the database is segmented, and the size of the log generated is the same for each segment, then the average size of the log from the redo-point to the end of the log is minimum
  • Lemma 2
    • provides a simple way to estimate the value of n (the number of segments)
    • when the number of segments equals [expression missing in the transcript], where s_chk is the size of a chkpt-record, the average size of the log from the redo-point to the end of the log is minimum

Dynamic Segmenting Fuzzy Checkpointing(4)
• DSFC
  • some difficulties of SFC
    • DBAs may have problems deciding how to partition the database
    • the access pattern and/or transaction arrival rate may change
    • it is difficult to find a segmenting pattern that strictly follows Lemmas 1 and 2
  • the basic ideas behind DSFC
    • dynamically calculate the size of the log generated within a complete checkpoint interval (S’)
    • use Lemma 2 to calculate a proper value for the number of segments
    • once the value of n has been decided, follow Lemma 1 to partition the database
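The last DSFC step, partitioning so that each segment generates about the same amount of log as Lemma 1 suggests, might look like the sketch below. It assumes that a per-page log volume has already been measured and that n has already been chosen by some other means, since the Lemma 2 expression is not preserved here. The function name, the contiguous-segment restriction, and the greedy split are illustrative choices, not taken from the paper.

```python
def segment_by_log_volume(log_bytes_per_page, n):
    """Split pages 0..len-1 into n contiguous segments of roughly equal log volume.

    log_bytes_per_page[p] is the (assumed, measured) amount of log generated by
    updates to page p during the last complete checkpoint interval.
    """
    num_pages = len(log_bytes_per_page)
    if n < 1 or n > num_pages:
        raise ValueError("n must be between 1 and the number of pages")

    total = sum(log_bytes_per_page)
    segments, current, seen = [], [], 0
    for page, size in enumerate(log_bytes_per_page):
        current.append(page)
        seen += size
        remaining_pages = num_pages - page - 1
        remaining_segments = n - len(segments) - 1
        # Cumulative log volume at which the current segment should end.
        target = total * (len(segments) + 1) / n
        # Close the segment once its share is reached, or when the pages left
        # are only just enough to give every remaining segment one page.
        if remaining_segments > 0 and (seen >= target or remaining_pages == remaining_segments):
            segments.append(current)
            current = []
    segments.append(current)
    return segments
```

For example, `segment_by_log_volume([30, 10, 20, 20, 10, 30], 3)` groups the pages as `[[0, 1], [2, 3], [4, 5]]`, so each segment accounts for 40 units of log. With less convenient inputs the greedy split only approximates equal shares.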

Dynamic Segmenting Fuzzy Checkpointing(5)

Dynamic Segmenting Fuzzy Checkpointing(6)
• Further Improvement of DSFC
  • Eliminating Unsegmented Mode
  • Overlapping Two Consecutive Segmenting Patterns

Partition Checkpointing(1)
• Partition
  • composed of one or more database pages (variable-length data segments)
  • several relations may share one partition, or a relation may span partitions
• local checkpoint
  • a partition is checkpointed independently of the others
  • writes BC(local) to its local log, moves all the dirty pages in its partition out to archive memory, writes EC(local) to its local log, and sets the bit in the checkpoint bit map
  • no locking or quiescing of the system is needed
• global checkpoint
  • the entire database is checkpointed

Partition Checkpointing(2) • MMDB architecture for partition checkpointing

Partition Checkpointing(3)
• The global log
  • holds the BT and ET records
  • preserves the transaction serialization history
  • used to generate the list of committed/uncommitted transactions
• assume physical logging
• local log
  • BT record: needed since transaction access may span partitions
• checkpointing bit map
  • records whether a partition has been checkpointed during the current global checkpoint

Partition Checkpointing(4)
Algorithm LC: Local Checkpoint(Pi)
Input: database pages in this partition
Process:
  1 write the BC(local) record in the local log
  2 save the location of the BC(local) record in the global checkpoint record
  3 While there is a database page unchecked DO
    3.1 if the page is dirty then
          reset the bit in the dirty bit map
          copy data to the IO buffer
          request IO to flush this page to its location in AM
    end of while
  4 write the EC(local) record in the local log
  5 switch the current bits in the global checkpoint record
  6 set this partition's bit in the checkpoint bit map
  7 if all bits in the checkpoint bit map are on then execute algorithm GC
Algorithm GC: Global Checkpointing(P1, ..., Pn)
Input: MM partitions P1, ..., Pn
Process:
  1 reset the checkpointing bit map
  2 write the EC(global) record in the global log
  3 write the BC(global) record in the global log
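To make the interplay between LC and GC concrete, here is a minimal Python sketch of the two algorithms. `PartitionedMMDB` and its fields are hypothetical, and the logs are plain lists. Steps 2 and 5 of LC (bookkeeping inside the global checkpoint record) are left out, and the global log is assumed to start with an open BC(global) record.

```python
class PartitionedMMDB:
    """Toy model of partition checkpointing; all names are illustrative."""

    def __init__(self, partitions):
        # partitions: dict mapping a partition name to its list of page ids
        self.partitions = partitions
        self.dirty = set()                      # dirty bit map
        self.archive = []                       # pages flushed to archive memory (AM)
        self.local_log = {name: [] for name in partitions}
        self.global_log = ["BC(global)"]        # assumption: a global checkpoint is open
        self.chkpt_bitmap = {name: False for name in partitions}

    def update(self, page):
        self.dirty.add(page)

    def local_checkpoint(self, name):
        """Algorithm LC for one partition (steps 2 and 5 omitted)."""
        log = self.local_log[name]
        log.append("BC(local)")                 # step 1
        for page in self.partitions[name]:      # step 3: scan every page
            if page in self.dirty:              # step 3.1
                self.dirty.discard(page)        # reset the dirty bit
                self.archive.append(page)       # stands in for the IO request
        log.append("EC(local)")                 # step 4
        self.chkpt_bitmap[name] = True          # step 6
        if all(self.chkpt_bitmap.values()):     # step 7
            self.global_checkpoint()

    def global_checkpoint(self):
        """Algorithm GC: close the current global checkpoint and open the next."""
        for name in self.chkpt_bitmap:          # step 1
            self.chkpt_bitmap[name] = False
        self.global_log.append("EC(global)")    # step 2
        self.global_log.append("BC(global)")    # step 3
```

In this sketch the global checkpoint does no flushing of its own: it happens implicitly once every partition has completed a local checkpoint since the last BC(global).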

Partition Checkpointing(5) • Recovery Algorithm

Partition Checkpointing(6)

Partition Checkpointing(7)
• Scheduling Algorithms for Local Checkpoints
  • Frequency Scheduling
    • hot spot partitions will be flushed out more frequently
    • the MMDB improves the recovery performance for these partitions
Algorithm FS: Frequency Scheduling
Input: MM partitions P1, ..., Pn; checkpoint frequencies f1, f2, ..., fn
Process:
  1 Loop FOREVER
    1.1 let i = value returned by F;
    1.2 invoke local checkpoint LC(Pi);
  end of Loop
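The slide does not say how the selection function F picks a partition. As one possible reading, the sketch below uses a deterministic weighted round-robin, so that partition i is chosen in proportion to f_i. That choice, the names `make_selector` and `frequency_schedule`, and the bounded `rounds` parameter (the slide loops forever) are all assumptions made for illustration.

```python
def make_selector(frequencies):
    """Return a function F that yields partition indexes in proportion to frequencies.

    Assumed policy: smooth weighted round-robin. Every call adds f_i to each
    partition's credit, picks the partition with the most credit, and charges
    it the sum of all frequencies.
    """
    credit = [0] * len(frequencies)
    total = sum(frequencies)

    def F():
        for i, f in enumerate(frequencies):
            credit[i] += f
        chosen = max(range(len(credit)), key=credit.__getitem__)
        credit[chosen] -= total
        return chosen

    return F


def frequency_schedule(frequencies, local_checkpoint, rounds):
    """Algorithm FS with a bounded loop: pick a partition via F, then checkpoint it."""
    F = make_selector(frequencies)
    for _ in range(rounds):
        local_checkpoint(F())
```

With frequencies `[3, 1]`, four rounds checkpoint partition 0 three times and partition 1 once, which matches the intent that hot spot partitions are flushed more often.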

Partition Checkpointing(8)
• Interleaved Frequency Scheduling
Algorithm IFS: Interleaved Frequency Scheduling
Input: MM partitions P1, ..., Pn; checkpoint frequencies c1, c2, ..., cn
Process:
  Repeat
    i = 1;
    For partitions P1, ..., Pn Do
      if ci > 0 then invoke local checkpoint LC(Pi) and ci--;
      i++;
    end of for;
  until (c1 = c2 = ... = cn = 0)
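A short Python rendering of the IFS loop is given below as a sketch. It assumes that the counter c_i is decremented only when partition i is actually checkpointed, which is the reading under which the termination test c1 = ... = cn = 0 is reached. The function name and the callback parameter are hypothetical.

```python
def interleaved_frequency_schedule(counts, local_checkpoint):
    """Sweep the partitions repeatedly until every counter reaches zero.

    counts[i] is the number of local checkpoints partition i should receive in
    one scheduling cycle; local_checkpoint(i) stands in for LC(Pi).
    """
    remaining = list(counts)  # work on a copy so the caller's list is untouched
    while any(c > 0 for c in remaining):
        for i, c in enumerate(remaining):
            if c > 0:
                local_checkpoint(i)
                remaining[i] -= 1
```

With counts `[3, 1, 2]` the checkpoints are issued in the order 0, 1, 2, 0, 2, 0: the hot partition still receives the most checkpoints, but they are spread between the checkpoints of the other partitions rather than issued back to back.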

Partition Checkpointing(9)
• Dynamic Scheduling
  • based on the actual number of updates to each partition
Algorithm DS: Dynamic Scheduling
Input: MM partitions P1, ..., Pn; number of updates in each partition P1_no, ..., Pn_no
Process:
  IF there is a Pi_no ≥ threshold then
    invoke local checkpoint LC(Pi);
    Pi_no = 0;
  end of if
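Dynamic scheduling can be sketched as a per-partition update counter that triggers a local checkpoint when it reaches a threshold. In the sketch below the comparison is assumed to be "greater than or equal", the check runs on every update, and `DynamicScheduler` and its method names are hypothetical.

```python
class DynamicScheduler:
    """Trigger a local checkpoint once a partition has accumulated enough updates."""

    def __init__(self, num_partitions, threshold, local_checkpoint):
        self.update_counts = [0] * num_partitions   # the Pi_no counters
        self.threshold = threshold
        self.local_checkpoint = local_checkpoint    # stands in for LC(Pi)

    def record_update(self, i):
        """Count one update to partition i and checkpoint it if the threshold is hit."""
        self.update_counts[i] += 1
        if self.update_counts[i] >= self.threshold:  # assumed comparison
            self.local_checkpoint(i)
            self.update_counts[i] = 0                # reset the counter
```

Unlike the two frequency-based schedulers, this one needs no prior knowledge of update frequencies; a partition that stops receiving updates simply stops being checkpointed.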

Comparison and Conclusion
• DSFC can speed up the log processing time by more than 20% compared to CFC
• the partition checkpointing scheme is the first hot-spot-based technique; it can improve processing time at system recovery by reducing the number of log records that need to be processed

Introduction(2)
• The efficiency of checkpointing has been examined by many researchers.
• Hagmann [9]
  • first presented fuzzy checkpointing for MMDB
  • either all the pages or only the dirty pages are copied out during checkpointing
• Lehman [12]
  • checkpoints a page at a time, based on the number of updates to the page or the time since the last checkpoint
  • (disadv.) checkpoints are executed as normal transactions; read locks are held on the database pages when checkpoint transactions are invoked

Introduction(3)
• Levy [13]
  • proposed a technique based on the application of log records to the backup disk
  • a dedicated processor directly applies log information in log tails to a backup database
  • (disadv.) synchronization control is required between updates from the normal page replacement and the log processing of the recovery processor
• Jagadish et al. [11]
  • proposed an action-consistent checkpointing scheme
  • when checkpointing, the undo-log-records of active transactions are first written to the log and then dirty pages are flushed to disks to enforce the WAL protocol
  • during normal transaction processing, the logger only writes the redo-log-records of the committed transaction to the log
  • (disadv.) in order to achieve action consistency, no update actions can be in progress when checkpointing
  • tests conducted with this algorithm led to a restructuring of Dali’s recovery algorithm to include fuzzy checkpointing
