Optimizing Erasure Coding for Reliable Distributed Storage Systems
This research implements a fast erasure coding algorithm that delivers consistent performance and data reliability in distributed storage systems. Key optimizations cover scheduling, the MemXOR function, and uniform coverage LT codes. The work addresses the limitations of initial implementations, improving robustness and guaranteeing decodability, and tunes the coding parameters for a wide range of computing speeds and network bandwidths. Supported in part by the National Science Foundation and other collaborators, it aims to improve fault tolerance and efficiency in high-bandwidth environments.
Presentation Transcript
Objective: Implement a fast erasure coding algorithm with consistent performance and guaranteed data reliability for use in a distributed storage system.

Optimizations:
• Optimized Scheduling
• Optimized MemXOR Function
• Coverage Threshold
• Including the Original Data
• Guaranteed Decodability
• Uniform Coverage

LT Codes for High Performance Distributed Storage
Frank Uyeda, Huaxia Xia, Andrew Chien
{fuyeda, hxia, achien}@cs.ucsd.edu

Poster sections: Background, Results, Overview

OptIPuter Project: Exploring the new opportunities in distributed computing presented by dedicated, high-speed, dynamically configurable fiber-optic light paths.

Method: Luby Transform (LT) codes were chosen as the focus of our research because of their low time cost and relative simplicity. However, initial implementations were slower than expected and could not provide absolute data reliability. Several improvements were devised to remedy these problems as well as improve the robustness of the encodings. Finally, we analyzed the coding parameters to optimize performance for a wide variety of computing speeds and network bandwidths.

RobuSTore: Working within OptIPuter to achieve predictable, high-bandwidth, low-latency, fault-tolerant distributed storage.

Supported in part by the National Science Foundation under awards NSF Cooperative Agreement ANI-0225642 (OptIPuter), NSF CCR-0331645 (VGrADS), NSF ACI-0305390, and NSF Research Infrastructure Grant EIA-0303622. Support from the California Institute for Telecommunications and Information Technology, UCSD Center for Networked Systems, BigBangwidth, and Fujitsu is also gratefully acknowledged.
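
The Method paragraph describes LT codes only in prose, so a small illustration may help. The Python sketch below is not the authors' implementation: every function name (`xor_bytes`, `encode_block`, `lt_encode`, `coverage`, `peel_decode`) is hypothetical, and the degree distribution is a plain uniform draw chosen for brevity rather than the distribution used in the poster's experiments. It shows the three ideas the optimization list touches on: coded blocks are XORs of original blocks (the job a MemXOR routine does), coverage counts how many coded blocks touch each original block, and decodability can be checked by running a peeling decoder over the coded set.

```python
import random

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks (the role a MemXOR routine plays)."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode_block(blocks, idx):
    """Build one coded block as the XOR of the original blocks named in idx."""
    payload = bytes(len(blocks[0]))  # all-zero block of the right size
    for i in idx:
        payload = xor_bytes(payload, blocks[i])
    return frozenset(idx), payload

def lt_encode(blocks, n_coded, max_degree=3, seed=0):
    """Generate n_coded blocks with randomly chosen neighbours.

    Assumption: degrees are drawn uniformly from 1..max_degree,
    which is simpler than the distributions used for real LT codes.
    """
    rng = random.Random(seed)
    k = len(blocks)
    coded = []
    for _ in range(n_coded):
        degree = rng.randint(1, min(max_degree, k))
        coded.append(encode_block(blocks, rng.sample(range(k), degree)))
    return coded

def coverage(coded, k):
    """Count how many coded blocks reference each original block."""
    counts = [0] * k
    for idx, _ in coded:
        for i in idx:
            counts[i] += 1
    return counts

def peel_decode(coded, k):
    """Peeling decoder: return the k original blocks, or None if it stalls."""
    recovered = {}
    pending = [(set(idx), payload) for idx, payload in coded]
    progress = True
    while progress and len(recovered) < k:
        progress = False
        still_pending = []
        for idx, payload in pending:
            # Remove every already-recovered block from this coded block.
            for i in list(idx):
                if i in recovered:
                    payload = xor_bytes(payload, recovered[i])
                    idx.discard(i)
            if len(idx) == 1:
                recovered[idx.pop()] = payload  # degree one: block is known
                progress = True
            elif len(idx) > 1:
                still_pending.append((idx, payload))
        pending = still_pending
    if len(recovered) < k:
        return None
    return [recovered[i] for i in range(k)]

if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    # Hand-built coded set: one degree-1 block and two degree-2 blocks.
    coded = [encode_block(data, s) for s in ({0}, {0, 1}, {1, 2})]
    assert peel_decode(coded, len(data)) == data
    # Without the degree-1 block the peeling decoder cannot start.
    assert peel_decode(coded[1:], len(data)) is None
```

Under these assumptions, an encoder could reject or extend a coded set whose `coverage` has a zero entry or for which `peel_decode` returns `None`. That is one plausible reading of the "Coverage Threshold" and "Guaranteed Decodability" items; the transcript does not say how the authors actually enforce either property.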