This thesis presentation by Jason Cornwell covers advanced I/O techniques for improving process crash recovery protocols. It addresses the reliability and availability problems of large-scale computing environments, where hardware failures are frequent and systems can become temporarily unresponsive. Building on checkpointing, it proposes a checkpoint caching technique and remote checkpoint storage, and reports experimental results showing significant improvements in I/O performance. It also outlines future work on optimizing recovery for computing-intensive applications and network-centric services.
Advanced I/O Techniques for Efficient and Highly Available Process Crash Recovery Protocols
Thesis Presentation
Jason Cornwell
03/15/2011
Agenda • Introduction • Challenges • Pertinent Background • Proposed Techniques • Implementations • Experimental Setup & Results • Conclusions • Future Work
Motivation & Goals
• Demand for more computing power and high-bandwidth network connections
• Advances in Microprocessors and Networks
• Parallel Computing
• Performance and Scalability
• Reliability and Availability
• Simplicity and Accessibility
Reliability Problems
• Large numbers of CPUs, Memory Modules, Hard Disk Drives, Network Interfaces, Network Switches
• Low Mean-Time-To-Failure (MTTF) and/or High Failure-In-Time (FIT)
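A back-of-the-envelope sketch of why large component counts lead to a low system MTTF. It assumes independent components with constant failure rates, so the rates simply add; the disk count and per-disk MTTF below are illustrative values, not figures from the thesis.

```python
HOURS_PER_BILLION = 1e9  # FIT counts failures per 10^9 device-hours


def mttf_to_fit(mttf_hours: float) -> float:
    """Convert a mean time to failure (in hours) into a FIT rate."""
    return HOURS_PER_BILLION / mttf_hours


def system_mttf(component_mttf_hours: float, n_components: int) -> float:
    """MTTF of a system that fails when any one of n identical components fails.

    Assumes independent, exponentially distributed failures, so rates add.
    """
    return component_mttf_hours / n_components


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 disks, each rated at 1,000,000 hours MTTF.
    per_disk = 1_000_000.0
    print(f"per-disk FIT: {mttf_to_fit(per_disk):.0f}")                      # 1000
    print(f"system MTTF:  {system_mttf(per_disk, 10_000):.0f} hours")        # 100
```

Under these assumptions, a part that fails once in over a century on its own produces a failure somewhere in the system roughly every four days.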
Classification of Failure
• Transient Failure
    • Power glitch
    • System patch and reboot
    • ECC trap
• Partial “Permanent” Failure
    • Disk failure
    • Partial network failure
• Wholesale “Permanent” Failure
    • Total hardware failure
    • Natural disaster
Availability Problems
• Large numbers of Processes, Threads, Software Barriers, Busy Waiting
• Temporarily Unresponsive and/or Unavailable
Possible Solutions
• Transient Failure
    • Restart/replay/resume on the same node
    • Task-migration is possible
• Permanent Partial Failure
    • Rebalance the workload on surviving nodes
    • Partial task-migration is needed
• Permanent Wholesale Failure
    • Reconfigure the applications and services
    • Massive task-migration to new platform
Checkpointing • Common feature in high-performance computing (HPC) platforms • Saves the execution state • Application or system-level • Mechanism for task migration
Application vs System Level
• Application-level Recovery Point
    • Developed specifically for each application
    • Generally smaller footprint
    • Data accessibility restrictions
• Kernel-level Recovery Point
    • Snapshot processes
    • Full resource restoration
    • Flexibility due to system-level preemption
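For contrast with the kernel-level approach, here is a minimal sketch of an application-level recovery point. The application decides which state matters (here only a loop index and a running total, hence the small footprint) and saves it periodically. The function names, the state layout, and the simulated crash are all hypothetical and serve only to illustrate the idea.

```python
import os
import pickle
import tempfile
from typing import Optional


def save_checkpoint(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # readers never see a half-written checkpoint


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"next_i": 0, "total": 0}
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path: str, n: int, interval: int, crash_at: Optional[int] = None) -> int:
    """Sum 0..n-1, checkpointing every `interval` steps; optionally 'crash'."""
    state = load_checkpoint(path)
    for i in range(state["next_i"], n):
        if crash_at is not None and i == crash_at:
            raise RuntimeError("simulated crash")
        state["total"] += i
        state["next_i"] = i + 1
        if state["next_i"] % interval == 0:
            save_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same path after a crash resumes from the last saved recovery point instead of from zero. A kernel-level checkpoint needs no such cooperation from the application, at the cost of saving the whole process image.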
Berkeley Lab Checkpoint/Restart (BLCR) • System-level • Kernel module • Checkpoint creation implemented • Process recovery implemented • Linked to BLCR libraries at execution • Stores checkpoint data locally (stack, heap, registers, signals, etc.)
Contribution • Enhanced BLCR performance through a latency-tolerant technique • Increased BLCR availability through a novel checkpoint creation technique
I/O Optimization • Avoided extreme modification to BLCR • Reduced the disk latency of checkpoint creation • Implemented a caching technique • Improved I/O performance 4-fold or more • System overhead less than 300 KB in experimental test results
Checkpoint Caching
• Buffer used as temporary storage
• Buffered data flushed to storage in large blocks
• Trade-off between resource consumption and improved I/O efficiency

cr_copy(chkptData, count)
    if (chkptBuf is NULL)
        kmalloc count bytes for chkptBuf;
        copy chkptData into chkptBuf;
    else
        kmalloc count + chkptBuf size bytes for tempBuf;
        copy chkptBuf into tempBuf;
        copy chkptData into tempBuf after the buffered data;
        krealloc chkptBuf to its expanded size;
        memmove tempBuf into chkptBuf;
        kfree tempBuf;
    end if
Remote Checkpoint • BLCR is limited to local disk storage • Remote checkpoint offers off-site storage option • Uses sockets to transmit data • Needs predefined destination • Outperforms BLCR in some experimental tests
Remote Checkpoint Server
• Single-threaded daemon
• Compiled with GCC
• Stores the recovery point external to the client node
• Could be ported to a Microsoft platform

while (true)
    create socket;
    bind to address;
    listen for incoming connections;
    wait for client to connect;
    create file descriptor;
    while (buffered data is received)
        write checkpoint data;
    close file descriptor;
    close socket;
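The server loop above can be sketched in user space as follows. This is an illustration under stated assumptions, not the thesis implementation: the function names, the output file naming, and the 64 KB receive size are hypothetical. Unlike the slide's pseudocode, it creates the listening socket once rather than on every iteration.

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create, bind, and listen; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def receive_checkpoint(listener: socket.socket, out_path: str) -> int:
    """Accept one client and stream everything it sends into out_path.

    Returns the number of bytes stored. The client signals completion by
    closing its side of the connection.
    """
    conn, _addr = listener.accept()
    stored = 0
    with conn, open(out_path, "wb") as out:
        while True:
            data = conn.recv(65536)
            if not data:  # an empty read means the peer closed the connection
                break
            out.write(data)
            stored += len(data)
    return stored


def serve_forever(listener: socket.socket, out_dir: str) -> None:
    """Single-threaded daemon loop: one checkpoint file per connection."""
    count = 0
    while True:
        receive_checkpoint(listener, f"{out_dir}/checkpoint.{count}")
        count += 1
```

Because the loop is single-threaded, a second client waits in the listen backlog until the current checkpoint has been fully received.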
Modified Write Operation
• TCP packets
• MTU must be reached before delivery
• Only modification is to the write operation of BLCR

if (remote chkpt)
    if (socket is NULL)
        create socket;
        establish connection;
        if handshake fails, break and perform the original_chkpt;
    end if
    package checkpoint data;
    send data message;
end if
if (original_chkpt)
    original BLCR write operation;
end if
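A user-space sketch of the write path above: connect lazily on the first write, and fall back to the original local write if the connection cannot be established. The class name, the timeout, and the "remote"/"local" return values are hypothetical and exist only to make the chosen path visible.

```python
import socket
from typing import Optional, Tuple


class CheckpointSender:
    """Sends checkpoint chunks to a remote server, else writes them locally."""

    def __init__(self, remote: Optional[Tuple[str, int]], local_path: str,
                 timeout: float = 2.0) -> None:
        self._remote = remote          # predefined destination, or None
        self._local_path = local_path  # fallback file (the original local write)
        self._timeout = timeout        # hypothetical connect timeout in seconds
        self._sock: Optional[socket.socket] = None

    def write(self, chunk: bytes) -> str:
        """Deliver one chunk; return "remote" or "local" to show the path taken."""
        if self._remote is not None and self._sock is None:
            try:
                # Lazily connect on the first write.
                self._sock = socket.create_connection(
                    self._remote, timeout=self._timeout)
            except OSError:
                # Handshake failed: use the local path from now on.
                self._remote = None
        if self._sock is not None:
            self._sock.sendall(chunk)
            return "remote"
        with open(self._local_path, "ab") as f:
            f.write(chunk)
        return "local"

    def close(self) -> None:
        """Close the connection, which tells the server the checkpoint is done."""
        if self._sock is not None:
            self._sock.close()
            self._sock = None
```

As in the pseudocode, only a failed connection attempt triggers the fallback. An error in the middle of a transfer is left to propagate, since silently switching to local storage at that point would split one checkpoint across two places.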
Design: I/O Optimization Write, Remote Checkpoint Write

write(chkptData, count)
    if (chkptBuf has space for the incoming chkptData)
        cr_copy(chkptData, count);
    else
        vfs_write(chkptBuf);
        vfs_write(chkptData);
        kfree(chkptBuf);
        chkptBuf = NULL;
    end if
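The optimized write path can be sketched in user space as a small write-coalescing wrapper. This is a sketch of the idea only, not the BLCR kernel code: the class name, the `capacity` limit, and the `sink_writes` counter are assumptions added for illustration.

```python
import io


class BufferedCheckpointWriter:
    """Coalesces small checkpoint writes into fewer, larger writes to `sink`."""

    def __init__(self, sink, capacity: int) -> None:
        self._sink = sink          # any object with a write(bytes) method
        self._capacity = capacity  # hypothetical buffer limit in bytes
        self._buf = bytearray()
        self.sink_writes = 0       # how many times the sink was actually hit

    def write(self, chunk: bytes) -> None:
        if len(self._buf) + len(chunk) <= self._capacity:
            # Room left: cache the chunk (the cr_copy step in the pseudocode).
            self._buf.extend(chunk)
            return
        # No room: flush what is cached, then write the incoming chunk through.
        self.flush()
        self._sink.write(chunk)
        self.sink_writes += 1

    def flush(self) -> None:
        """Write out any cached bytes and empty the buffer."""
        if self._buf:
            self._sink.write(bytes(self._buf))
            self.sink_writes += 1
            self._buf.clear()


if __name__ == "__main__":
    sink = io.BytesIO()
    writer = BufferedCheckpointWriter(sink, capacity=4096)
    for _ in range(100):
        writer.write(b"x" * 100)  # 100 small writes of 100 bytes each
    writer.flush()                # final flush when the checkpoint completes
    print(len(sink.getvalue()), writer.sink_writes)  # 10000 5
```

In this toy run, 100 small writes reach the sink as 5 larger ones, at the cost of holding up to 4 KB in memory. This is the resource-versus-I/O-efficiency trade-off in miniature.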
Experimental Setup
I/O Optimization
• Dell Workstation, 3.06 GHz Intel Pentium 4, 1 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• BLCR Implementation
• Optimized BLCR (O-BLCR) Implementation
Remote Checkpoint
• Dell PowerEdge 700, 2.80 GHz Dual-processor Intel Pentium 4, 3 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• Dell Workstation, 3.06 GHz Intel Pentium 4, 1 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• BLCR Implementation
• BLCR with NFS (BLCR+NFS)
• BLCR with our Remote Checkpoint Technique (BLCR+R)
Benchmarks Program Resource Utilization • NP-Complete • Data Encryption • Linear Equation Solver • File Compression
Conclusion • Minimal modification to BLCR • I/O optimization technique reduced the write latency of BLCR • Remote checkpoint adds a new feature that increases BLCR availability • These techniques should be integrated into the core BLCR source code
Future Work • Server authentication protocol • Data packet encryption • Automated process load balancing