Boxwood: Abstractions as the Foundation for Storage Infrastructure
Lidong Zhou, Microsoft Research Silicon Valley
Joint work with Chandu Thekkath, John MacCormick, Nick Murphy, and Marc Najork
Distributed Storage Applications are Hard to Build
• Distributed storage: low hardware cost, but high development/deployment cost
• Application logic on low-level storage interface
• Hardware parallelism and concurrency control
• Fault tolerance a necessity
• Incremental expansion and dynamic reconfiguration vs. system consistency
• Our goal: Distributed storage applications made easy to design, build, and deploy
Target Application and Setting
• Enterprise storage applications and back-end storage for data-intensive Internet services
Roadmap
• Boxwood Vision
• Boxwood Architecture
• Building Applications on Boxwood
• Performance
• Related Work and Conclusion
Boxwood Vision
Incorporate rich virtualized abstractions into the low levels of storage
An evolution path for distributed storage (successive builds of one diagram):
• Storage Applications
• Storage Applications on Virtual Disk
• Storage Applications on Tree, Table, List, …
Why High-Level Abstractions
• Reduce the complexity of distributed storage applications
• Natural continuum of storage virtualization
• “High-level programming language” for building distributed storage applications
• Potential built-in performance optimization by exploiting structural information: caching, prefetching
Boxwood Architecture
• Storage stack (top to bottom): Storage Application; B-Tree and Chunk Store (high-level storage abstractions); Replicated Logical Device (reliable “media”); Magnetic Media
• Services alongside the stack: Locking, Logging, Consensus
Chunk Store
• Persistent storage with “malloc”-like interface
• Virtualization layer that hides the distributed nature
• Manage address space or free space for higher layers
• Reliable storage through replicated logical device
• Operations: Allocate, De-allocate, Read, Write
• Diagram: Chunk Store on Replicated Logical Device
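To make the “malloc”-like interface concrete, here is a minimal single-process sketch of the four chunk store operations. The class name, method names, and the in-memory dictionary are assumptions for illustration; the slides do not show Boxwood's actual API, and replication is left out entirely.

```python
import itertools


class ChunkStore:
    """In-memory stand-in for a chunk store (hypothetical names, no replication)."""

    def __init__(self):
        self._chunks = {}                       # handle -> bytes
        self._next_handle = itertools.count(1)  # source of opaque handles

    def allocate(self, size):
        """Reserve a zero-filled chunk of `size` bytes and return its handle."""
        handle = next(self._next_handle)
        self._chunks[handle] = bytes(size)
        return handle

    def write(self, handle, data, offset=0):
        """Overwrite part of a chunk; the chunk never grows."""
        chunk = self._chunks[handle]            # KeyError for a freed handle
        if offset < 0 or offset + len(data) > len(chunk):
            raise ValueError("write outside chunk bounds")
        self._chunks[handle] = chunk[:offset] + data + chunk[offset + len(data):]

    def read(self, handle, offset=0, length=None):
        """Return `length` bytes from `offset`, or the rest of the chunk."""
        chunk = self._chunks[handle]
        end = len(chunk) if length is None else offset + length
        return chunk[offset:end]

    def deallocate(self, handle):
        """Release the chunk; the handle becomes invalid."""
        del self._chunks[handle]


store = ChunkStore()
h = store.allocate(8)
store.write(h, b"box", offset=2)
data = store.read(h)
```

Callers hold only the opaque handle, which is what would let a real implementation place or move the bytes without the higher layers noticing.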
B-Tree Abstraction
• B-Tree: a proven, useful data structure for storage applications
• Distributed/reliable B-Link trees in Boxwood
• B-Link trees: high concurrency with simple locking
• Distributed reliable storage from chunk store
• Caching for performance
• Distributed lock service for consistency
• Logging for recovery
• Operations: Create, Lookup, Insert, Delete, Enumerate
• Diagram: B-Link Tree built on Logging, Locking, and Chunk Store
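The "high concurrency with simple locking" point rests on the B-Link tree's right-sibling links: a reader that lands on a node whose keys have moved during a split follows the link instead of restarting. The sketch below shows only that lookup rule on a hand-built tree. The `Node` layout and `lookup` function are illustrative assumptions in the spirit of Lehman and Yao, not Boxwood's code, and no locking or storage is modeled.

```python
import bisect
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    keys: list                                     # sorted keys (separators if internal)
    values: list = field(default_factory=list)     # leaf payloads, parallel to keys
    children: list = field(default_factory=list)   # internal nodes: len(keys) + 1 children
    high_key: Optional[int] = None                 # largest key this node may hold; None = unbounded
    right: Optional["Node"] = None                 # link to the right sibling

    @property
    def is_leaf(self):
        return not self.children


def lookup(root, key):
    node = root
    while True:
        # If a split moved our key range to a sibling, chase the right link.
        while node.high_key is not None and key > node.high_key:
            node = node.right
        if node.is_leaf:
            break
        # Child i covers keys k with keys[i-1] < k <= keys[i].
        node = node.children[bisect.bisect_left(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# Leaf `a` has just split into `a` and `b`, but the parent has not yet
# been told about `b`: only the right link makes `b` reachable.
c = Node(keys=[60, 70], values=["sixty", "seventy"])
b = Node(keys=[30, 40], values=["thirty", "forty"], high_key=50, right=c)
a = Node(keys=[10, 20], values=["ten", "twenty"], high_key=20, right=b)
root = Node(keys=[50], children=[a, c])

found = lookup(root, 40)
```

Because a half-finished split is still a valid tree for readers, the writer does not need to hold locks on the whole path from the root while it splits a node.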
Boxwood Services
• Distributed lock service for coordinating concurrent access to shared data
• Logging and recovery service for atomicity in the face of transient failures
• Consensus service for system consistency
Clean design of these services is crucial for scalability and for managing complexity
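As a rough illustration of how logging can give atomicity across a transient failure, the sketch below appends every update to a log before it counts, and recovery replays only updates from operations that reached a commit record. The record layout, `RedoLog`, and `recover` are assumptions made for this example; the slides do not describe Boxwood's log format or recovery procedure.

```python
class RedoLog:
    """Append-only list standing in for a durable log (hypothetical layout)."""

    def __init__(self):
        self.records = []

    def write(self, op_id, key, value):
        # Log the intended update before it is applied anywhere else.
        self.records.append(("write", op_id, key, value))

    def commit(self, op_id):
        # The commit record is the point at which the operation takes effect.
        self.records.append(("commit", op_id))


def recover(log):
    """Rebuild state by replaying, in log order, writes of committed operations."""
    committed = {rec[1] for rec in log.records if rec[0] == "commit"}
    state = {}
    for rec in log.records:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, value = rec
            state[key] = value
    return state


log = RedoLog()
log.write(1, "a", 1)
log.write(1, "b", 2)
log.commit(1)
log.write(2, "a", 99)   # crash happens here, before operation 2 commits

state = recover(log)
```

Operation 2 leaves no trace in the recovered state, so a multi-step update is seen either completely or not at all.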
Distributed Storage Applications on Boxwood: A Recipe
• Design applications for local storage
• Map application logic to storage abstractions
• Adapt the design for a distributed storage infrastructure
• Boxwood abstractions are virtualized
• Boxwood offers facilitating distributed services
Separating algorithmic design from distributed system concerns is attractive.
From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
• B-Link trees on a single machine: B-Link Tree Algorithm with Local Locks, Logging, and Local Disks
• Distributed and reliable B-Link trees: B-Link Tree Algorithm with Global Lock Service, Reliable Logging, and Chunk Store on Replicated Logical Device
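One way to read the step from local locks to a global lock service is as a substitution behind a fixed interface: the algorithm is written once against a small locking interface, and the backend is swapped without touching it. The sketch below is a hedged illustration of that idea only. `LocalLocks`, `FakeGlobalLockService`, and `increment` are invented names, and the "global" service is an in-process stand-in that just records which locks it granted.

```python
import threading
from contextlib import contextmanager


class LocalLocks:
    """Per-name mutexes for a single machine."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    @contextmanager
    def hold(self, name):
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        with lock:
            yield


class FakeGlobalLockService:
    """In-process stand-in for a lock service; a real one would use the network."""

    def __init__(self):
        self._local = LocalLocks()
        self.grants = []          # names of locks granted, in order

    @contextmanager
    def hold(self, name):
        with self._local.hold(name):
            self.grants.append(name)
            yield


def increment(locks, store, key):
    """Application logic: identical for either lock backend."""
    with locks.hold(key):
        store[key] = store.get(key, 0) + 1
        return store[key]


local_store = {}
increment(LocalLocks(), local_store, "n")

service = FakeGlobalLockService()
shared_store = {}
increment(service, shared_store, "n")
increment(service, shared_store, "n")
```

The same substitution would apply to storage, with the dictionary replaced by a chunk store client.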
BoxFS: Multi-Node File Server on Boxwood
• Exported via NFS v2
• Directory/File B-Tree
• Directory: maps names to NFS file handle with embedded B-tree handle
• File: maps block number to chunk handle
• File blocks map to chunks
• Locking/caching at file system level
• ~2500 lines of C# code
• Diagram: BoxFS on Services, B-Link Tree, and Chunk Store
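The two mappings on the BoxFS slide (name to file handle, block number to chunk handle) can be sketched with plain dictionaries standing in for the B-trees and the chunk store. Everything here is an assumption for illustration: the `TinyFS` class, the tiny `BLOCK_SIZE`, and the whole-file `write` are not BoxFS's design, and NFS, locking, and caching are omitted.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example spans several blocks


class TinyFS:
    """Toy file system: dicts stand in for the B-trees and the chunk store."""

    def __init__(self):
        self.chunks = {}      # chunk handle -> bytes
        self.directory = {}   # name -> file handle
        self.files = {}       # file handle -> {block number -> chunk handle}
        self._next = 1

    def _new_handle(self):
        handle = self._next
        self._next += 1
        return handle

    def create(self, name):
        file_handle = self._new_handle()
        self.directory[name] = file_handle
        self.files[file_handle] = {}
        return file_handle

    def write(self, name, data):
        """Replace the whole file, one chunk per block."""
        blocks = self.files[self.directory[name]]
        for chunk_handle in blocks.values():   # free the old contents
            del self.chunks[chunk_handle]
        blocks.clear()
        for number, start in enumerate(range(0, len(data), BLOCK_SIZE)):
            chunk_handle = self._new_handle()
            self.chunks[chunk_handle] = data[start:start + BLOCK_SIZE]
            blocks[number] = chunk_handle

    def read(self, name):
        blocks = self.files[self.directory[name]]
        return b"".join(self.chunks[blocks[n]] for n in sorted(blocks))


fs = TinyFS()
fs.create("a.txt")
fs.write("a.txt", b"boxwood!!")
contents = fs.read("a.txt")
```

A file read is two lookups followed by chunk reads, which is why the file system itself can stay small once the trees and chunks are provided underneath.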
Prototype Deployment and Performance Evaluation
• System setup: eight Dell PowerEdge 2650 servers with a single 2.4 GHz Xeon processor and 1 GB of RAM; Gigabit Ethernet switch; Adaptec AIC-7899 dual SCSI adapter and 5 SCSI drives
• Performance evaluation: single-machine non-replicated performance (BoxFS vs. NFS); B-tree operation scalability; BoxFS operation scalability
[Chart: B-Tree Scaling (Private Tree)]
[Chart: BoxFS Scaling (Read)]
[Chart: B-Tree Scaling (Shared Tree)]
[Chart: BoxFS Scaling (Write/MkDirEnt)]
Related Work
• Distributed Storage/Operating Systems
• Virtual/Logical disks
• File systems
• Database systems
• Scalable Distributed Data Structures
• Linear Hash Table (LH) and its variants (Litwin, 1980-present)
• Scalable distributed hash table (Gribble et al., 2000)
• Highly concurrent B-trees (Lehman and Yao, 1981; Sagiv, 1986)
Conclusion and Future Directions
A storage infrastructure offering virtualized high-level abstractions is promising
Future Work:
• Explore more abstractions and applications; expose flexible interfaces (e.g., through hints)
• Leverage high-level abstractions for better load balancing, prefetching, and caching
• Graceful degradation during massive failures