1 / 54

Erasure coding

Erasure coding. Group J’ Yetian Huang Yang Liu Yi He. Overview. Get Started with Erasure coding Network coding for Distributed Storage System (DSS) Introduction the Fundamentals of Network Coding Performance Results and Analysis A New Erasure Coding Method in Windows Azure Storage

varden
Télécharger la présentation

Erasure coding

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Erasure coding Group J’ Yetian Huang Yang Liu Yi He

  2. Overview • Get Started with Erasure coding • Network coding for Distributed Storage System (DSS) • Introduction the Fundamentals of Network Coding • Performance Results and Analysis • A New Erasure Coding Method in Windows Azure Storage • Conventional Reed-Solomon (RS) code • Improved Local Reconstruction Coding • Novel Erasure Codes for Big Data • Introduction of Locally Repairable Code(LRC) • Performance evaluation on Amazon EC2 and X's cluster

  3. What is Erasure Coding? • Term from Information Theory • Used for controlling errors • Add redundancy • Complex mathematical function

  4. Erasure Coding for Distributed Storage • Numerous disk failures • Redundancy is necessary • 3-replication • Erasure Coding • Storage saving

  5. Network Coding for Distributed Storage System Introduction of Erasure Code Performance Results and Analysis

  6. Distributed Storage System

  7. BIG DATA

  8. Motivation • Data Security • High Failure Tolerance • High Availability • High Reliable • Prevent Data Loss • Low Redundancy Level • Low Storage Cost • Low Bandwidth Cost • Low Coding Complexity

  9. Replication Code Replication A A File or Data Object A B • Reliable • High Redundancy Level • High Storage Cost B

  10. Erasure Coding • Any k out of the n code word symbols are sufficient to recover the original message. • Optimal erasure codes are maximum distance separablecodes (MDS codes).

  11. Erasure Coding • Low Bandwidth Cost

  12. Example of Erasure Code Each storage node is storing two fragments that are linear binary combinations of the original data fragments𝐀𝟏, 𝐀𝟐, 𝐁𝟏, 𝐁𝟐. In this example, the total stored size is M = 4 fragments. Observe that any k = 2 out of the n = 4 storage nodes contain enough information to recover all the data (adopted from [1])

  13. Example of Erasure Code Assume that the first node in the previous storage system failed. it is obtain exact repair by communicating three fragments.

  14. Erasure Coding: MDS N=4 A N=3 K=2 A B File or Data Object A B A+B A+2B A+B (3,2) MDS Code (Single parity) Raid 5 Mode (4,2) MDS Code Raid 6 Mode

  15. Erasure Coding VS. Replica (4,2) MDS Erasure Code Replication A A File or Data Object A B B A A+B B Low Redundancy Level Low Storage Cost B A+2B

  16. Improvement of MDS • Regenerating Code • Repairing lost encoded fragments from existing encoded fragments. • A new class of erasure code. • Reduce repair bandwidth. • Increase number of surviving node connected.

  17. Example of MDS and Regenerating Code

  18. Example of MDS and Regenerating Code

  19. MDS and Regenerating Code • MDS code: • High complexity. • Uses a random linear network coding. • “repair-by-transfer regenerating code” : • Less complexity. • Its process is addition of two packets using bit-wise exclusive OR (XOR).

  20. Performance Results

  21. Performance Results (Cont.) (4,2) Regenerating Code

  22. Performance Results (Cont.)

  23. Performance Results (Cont.)

  24. A New Erasure Coding Method in Windows Azure Storage Conventional Reed-Solomon (RS) code Improved Local Reconstruction Code

  25. Reed-Solomon Code • used to correct errors • RS Encoder: takes a block of digital data and adds extra "redundant" bits • RS Decoder: processes each block and attempts to correct errors and recover the original data

  26. RS Code Definition • RS(n, k): • k data symbols of s bits each • n symbol codeword. • n-k parity symbols of s bits each • Correct up to t symbols errors, where 2t = n-k. • Example: • RS(255,223) • n=255, k=223, s=8, 2t=32, t=16

  27. Reed-Solomon Code in DSS • RS (6, 3) Code • 6 fragments, and 3 parties • GFS II in Google • RS (10, 4) • Facebook HDFS-RAID

  28. Which Erasure Codes are best? • Consideration • Storage overhead • The less, the better. • Reliability • At least higher than 3-replication • Fault Tolerance • Repair Performance • Lower bandwidth consumption • Fewer the number of disk I/Os • Decoding latency

  29. Overhead Overhead=# of nodes / # of fragments

  30. Further reduce the overhead? • Increase the number of fragments • 6+3 => 12 + 3… • But reliability decreased… • Need more redundancy • Finally • 12+4 • 1.5x -> 1.33x

  31. When failure occurs… • Reconstruction cost the number of fragments required to reconstruct an unavailable data fragment

  32. Goal • Could we achieve • Reconstruction cost = 6 • Overhead = 1.33x • We found • All failures are equally considered in conventional EC. • Actually Probability (1 failure) >> Probability (2 failures) Local Reconstruction Code

  33. Local Reconstruction Code • Group fragments and reconstruct locally 12 fragments, 2 local parties and 2 global parties • Overhead: (12+2+2) / 12 = 1.33x • reconstruction cost = 6 fragments. • Definition: • k – # of data fragments • l – # of groups (or local parities) • r – # of global parities • n - # of nodes, i.e. n = k + l + r Pattern: LRC(k, l, r)

  34. Fault Tolerance • Arbitrary 3 failures • All the information-theoretically decodable 4 failure patterns (86% of all 4 failure patterns) Satisfy Maximally Recoverable (MR) property: • decode any failure pattern which is information-theoretically decodable.

  35. 3 failures

  36. 3 failures (cont.)

  37. 3 failures

  38. 4 failures 3 parities cannot recover 4 failures

  39. 4 failures • Decodable

  40. Code Parameter Selection • Factors • Reliability (MTTF) • Threshold: the MTTF of 3-replication • Reconstruction cost • Storage overhead • Reduce trade-off space to 2D

  41. Better cost and performance trade-off LRC vs. RS code

  42. Novel Erasure Codes for Big Data Introduction of Locally Repairable Code(LRC) Performance evaluation on Amazon EC2 and X's cluster

  43. Definition A randomized an explicit family of codes that have logarithmic locality on all coded blocks and distance that is asymptotically equal to that of an MDS code. We call such codes (k; n-k; r) Locally Repairable Codes (LRCs) LRC

  44. Introduce “locality” to RS X1 X2 X3 X4 X5 A STRIP 5 File blocks C2 C3 C4 C1 C5 A Local Parity Block S1 X1C1+X2C2+X3C3+X4C4+X5C5 = S1 44

  45. How Locally Repairable Codes Work If a single block failure! For example, if block X3 is lost! 45

  46. How Locally Repairable Codes Work A strip (10 file blocks) RS Code (4 parity blocks) X1C1+X2C2+X3C3+X4C4+X5C5=S1 X6C6+X7C7+X8C8+X9C9+X10C10=S2

  47. How to keep the parity blocks satisfy 1. We set C5’=C6’=1 2. Set S1+S2+S3=0 3. Set P1C1’+P2C2’+P3C3’+P4C4’=S3 If p2 failure: 47

  48. Locally Repairable Code vsLocal Reconstruction Code • They have different storage space. • When the parity block fails, Locally Repairable Code could cost less to repair. • Faster for single block failure. Efficiency: Local parity > RS parity block.

  49. Performance evaluation on Amazon EC2 and X's cluster Repair Duration is simply calculated as the time interval between the starting time of the first repair job and the ending time of the last repair job. HDFS Bytes Read corresponds to the total amount of data read by the jobs initiated for repair. Network traffic represents the total amount of data sent from nodes in the cluster

  50. Evaluation on Amazon EC2 Be encoding to one stripe with 14 and 16 size in HDFS-RS and HDFS-Xorbas. HDFS Bytes Read Network Traffic Repair Duration

More Related