This paper presents Tornado Codes as a solution for reliable multicast applications, particularly in the context of software distribution. The authors discuss practical loss-resilient coding techniques, demonstrating the advantages of users being able to initiate and resume downloads seamlessly, even amid moderate packet losses. Comparisons between point-to-point and broadcast solutions highlight the issues of server and network load, elaborating on the efficiency of Tornado Codes in encoding and decoding processes. The findings suggest that while Tornado Codes offer scalability, challenges remain concerning encoding and decoding speeds for large files.
Tornado Codes, with Applications to Reliable Multicast • Michael Mitzenmacher
Based On • Practical Loss-Resilient Codes • Michael Luby, Michael Mitzenmacher, Amin Shokrollahi, Dan Spielman, Volker Stemann • STOC ‘97 • Analysis of Random Processes Using And-Or Tree Evaluation • Michael Luby, Michael Mitzenmacher, Amin Shokrollahi • SODA ‘98 • A Digital Fountain Solution to Reliable Multicast of Bulk Data • John Byers, Michael Luby, Michael Mitzenmacher, Ashu Rege • SIGCOMM ‘
Application: Software Distribution Problem • Millions of users want to download a new version of a software package. • 32 megabyte file, at 56 Kbits/second. • Download takes around 75 minutes at full speed.
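The download-time figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 1 megabyte = 10^6 bytes, 1 Kbit = 1000 bits, and no protocol overhead or loss; the function name is illustrative only.

```python
def download_minutes(file_megabytes: float, link_kbits_per_s: float) -> float:
    """Transfer time in minutes for a loss-free link (assumption: decimal units)."""
    bits = file_megabytes * 1_000_000 * 8        # file size in bits
    seconds = bits / (link_kbits_per_s * 1_000)  # link rate in bits per second
    return seconds / 60


if __name__ == "__main__":
    # 32 MB over a 56 Kbit/s modem: a little over 75 minutes
    print(f"{download_minutes(32, 56):.1f} minutes")
```

Under these assumptions the result lands slightly above the slide's "around 75 minutes"; binary megabytes would push it closer to 80.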
Point-to-Point Solution Features • Good • Users can initiate the download at their discretion. • Users can continue download seamlessly after temporary interruption. • Moderate packet loss is not a problem. • Bad • High server load. • High network load. • Doesn’t scale well (without more resources).
Broadcast Solution Features • Bad • Users cannot initiate the download at their discretion. • Users cannot continue download seamlessly after temporary interruption. • Packet loss is a problem. • Good • Low server load. • Low network load. • Does scale well.
A Coding Solution: Assumptions • We can take a file of n packets, and encode it into cn encoded packets. • From any set of n encoded packets, the original message can be decoded.
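The "any n of cn" assumption can be illustrated with a toy code. The sketch below is a Reed-Solomon-style construction (polynomial evaluation over a small prime field), not a Tornado code; the field size `P = 257` and all function names are assumptions made for illustration. Its quadratic running time is exactly the kind of cost that motivates faster codes.

```python
P = 257  # small prime field (assumption), large enough to hold byte values


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) pairs, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # needs Python 3.8+
    return total


def encode(message, c=2):
    """Stretch n symbols into c*n (position, value) pairs; the first n
    pairs carry the message itself (systematic encoding)."""
    n = len(message)
    assert c * n <= P, "not enough distinct evaluation points"
    base = list(enumerate(message))
    return [(x, _interpolate_at(base, x)) for x in range(c * n)]


def decode(received, n):
    """Recover the n message symbols from any n distinct encoded pairs."""
    points = received[:n]
    return [_interpolate_at(points, x) for x in range(n)]


if __name__ == "__main__":
    msg = [10, 200, 33, 7]
    enc = encode(msg)                                  # 8 encoded symbols
    print(decode([enc[1], enc[3], enc[6], enc[7]], 4))  # any 4 suffice
```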
Coding Solution [Figure: timeline from 0 to 5 hours. The file is encoded, and the transmission cycles through the encoding (Encoding Copy 1, Encoding Copy 2); User 1 Reception and User 2 Reception are shown as separate intervals on the timeline.]
Coding Solution Features • Users can initiate the download at their discretion. • Users can continue download seamlessly after temporary interruption. • Moderate packet loss is not a problem. • Low server load - simple protocol. • Does scale well. • Low network load.
So, Why Aren’t We Using This... • Encoding and decoding are slow for large files -- especially decoding. • So we need fast codes to use a coding scheme. • We may have to give something up for fast codes...
Erasure Codes [Figure: Message (n) → Encoding Algorithm → Encoding (cn) → Transmission → Received (n) → Decoding Algorithm → Message (n)]
Performance Measures • Time Overhead • The time to encode and decode expressed as a multiple of the encoding length. • Reception efficiency • Ratio of packets in message to packets needed to decode. Optimal is 1.
Reception Efficiency • Optimal • Can decode from any n words of encoding. • Reception efficiency is 1. • Relaxation • Decode from any (1+ε)n words of encoding. • Reception efficiency is 1/(1+ε).
Parameters of the Code • Message: n • Encoding: cn • Needed to decode: (1+ε)n • Reception efficiency is 1/(1+ε)
Previous Work • Reception efficiency is 1. • Standard Reed-Solomon • Time overhead is number of redundant packets. • Uses finite field operations. • Fast Fourier-based • Time overhead is ln² n field operations. • Reception efficiency is 1/(1+ε). • Random mixed-length linear equations • Time overhead is ln(1/ε)/ε.
Tornado Code Performance • Reception efficiency is 1/(1+ε). • Time overhead is ln(1/ε). • Simple, fast, and practical.
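To get a feel for the trade-off, the two time-overhead expressions can be tabulated against the reception efficiency for a few values of the overhead parameter (called `eps` in the code). This is only a sketch of the asymptotic expressions quoted on the slides; constant factors are ignored, so the numbers show growth rates rather than measured running times.

```python
import math


def reception_efficiency(eps: float) -> float:
    """Message packets divided by packets needed to decode: 1/(1+eps)."""
    return 1.0 / (1.0 + eps)


def overhead_random_linear(eps: float) -> float:
    """Time overhead quoted for random mixed-length linear equations."""
    return math.log(1.0 / eps) / eps


def overhead_tornado(eps: float) -> float:
    """Time overhead quoted for Tornado codes."""
    return math.log(1.0 / eps)


if __name__ == "__main__":
    print("eps    efficiency  random-linear  tornado")
    for eps in (0.2, 0.1, 0.05, 0.01):
        print(f"{eps:<6} {reception_efficiency(eps):<11.3f} "
              f"{overhead_random_linear(eps):<14.1f} {overhead_tornado(eps):.1f}")
```

As `eps` shrinks, the efficiency approaches 1 for both schemes, but the random-linear overhead grows like 1/eps times faster than the Tornado overhead.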
Encoding Structure [Figure: a cascade of bipartite graphs followed by a standard loss-resilient code; each stage is smaller than the previous one by the shrink factor. Legend: message nodes, redundancy nodes.] Message n to encoding cn, c = 2
Decoding Structure [Figure: the same cascade of bipartite graphs followed by a standard loss-resilient code. Legend: nodes received directly, missing nodes.]
Transmission Model • Which packets received • Can depend on when sent, route taken. • Does not depend on payload. • Algorithm • Compute entire encoding. • Place encoding into packets in random order. • Implication • Random portion of encoding is received. • Same amount from each level.
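The transmission model can be mimicked in a few lines: shuffle the encoding before sending, and drop packets with a probability that ignores their contents. This is a sketch under stated assumptions; independent per-packet loss and the function name `transmit` are illustrative choices, and real loss may be bursty or route-dependent.

```python
import random


def transmit(encoding, loss_rate, rng):
    """Send the encoding in random order over a lossy channel.

    Returns {position in encoding: payload} for the packets that arrive.
    Loss depends only on the channel (assumed independent per packet),
    never on the payload.
    """
    order = list(range(len(encoding)))
    rng.shuffle(order)                  # place encoding into packets in random order
    received = {}
    for position in order:
        if rng.random() >= loss_rate:   # packet survives
            received[position] = encoding[position]
    return received


if __name__ == "__main__":
    rng = random.Random(0)
    got = transmit(list(range(1000)), loss_rate=0.3, rng=rng)
    print(f"received {len(got)} of 1000 packets")
```

Because the order is random, what arrives is a random portion of the encoding, which is the property the analysis relies on.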
Decoding Process: Substitution Recovery [Figure: bipartite graph; a marker indicates each right node that has one edge.]
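Substitution recovery on a single bipartite graph can be sketched as follows, assuming each right node stores the XOR of its left neighbours and that all right-node values are already known. The data layout (`checks` as lists of left indices, payloads as integers) and the function name are hypothetical, chosen only for illustration.

```python
def peel_decode(n, checks, check_values, known):
    """Recover missing left (message) packets by substitution.

    n            -- number of left nodes
    checks       -- checks[j] lists the distinct left nodes adjacent to right node j
    check_values -- check_values[j] is the XOR of those neighbours' payloads
    known        -- {left index: payload} for packets received directly
    Returns the (possibly partial) dict of recovered left packets.
    """
    known = dict(known)
    progress = True
    while progress and len(known) < n:
        progress = False
        for neighbours, value in zip(checks, check_values):
            missing = [i for i in neighbours if i not in known]
            if len(missing) == 1:           # right node has one edge left
                acc = value
                for i in neighbours:
                    if i in known:
                        acc ^= known[i]     # substitute known neighbours
                known[missing[0]] = acc     # what remains is the missing packet
                progress = True
    return known


if __name__ == "__main__":
    message = [5, 9, 12, 7]
    checks = [[0, 1], [1, 2], [2, 3], [0, 3]]
    values = [5 ^ 9, 9 ^ 12, 12 ^ 7, 5 ^ 7]
    print(peel_decode(4, checks, values, {0: 5}))
```

If no right node ever has exactly one missing neighbour, the loop stalls and returns a partial result, which is the failure mode the degree-sequence analysis is designed to avoid.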
Regular Graphs [Figure: bipartite graph with left nodes of degree 3 and right nodes of degree 6, joined through a random permutation of the edges.]
Decoding Process Analysis [Figure: induced graph. Legend: recovered nodes, missing/not yet recovered nodes.]
3-6 Regular Graph Analysis [Figure: one round of the analysis, passing from left nodes to right nodes and back to left nodes.]
3-6 Regular Graph Equation • Want: y < x for all 0 < x < α • Works for α < 0.43
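The equation on this slide did not survive extraction, so the sketch below assumes the standard recursion for a 3-6 regular graph: if x is the fraction of edges attached to still-missing left nodes, then after one round y = loss · (1 − (1 − x)^5)^2, where `loss` is the fraction of left nodes lost. Iterating that map and bisecting on `loss` reproduces a threshold close to 0.43. All names are illustrative.

```python
def decoding_succeeds(loss, left_deg=3, right_deg=6, rounds=10_000, tol=1e-9):
    """Iterate the assumed recursion and report whether x is driven to zero."""
    x = loss
    for _ in range(rounds):
        # A left node stays missing only if it was lost and each of its other
        # (left_deg - 1) right neighbours still has another missing neighbour.
        x = loss * (1 - (1 - x) ** (right_deg - 1)) ** (left_deg - 1)
        if x < tol:
            return True
    return False


def threshold(lo=0.0, hi=1.0, steps=40):
    """Bisect for the largest loss fraction at which decoding still succeeds."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if decoding_succeeds(mid):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print(f"estimated threshold: {threshold():.4f}")
```

With a finite number of rounds the estimate sits marginally below the true threshold, which is fine for checking the slide's figure.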
Regular Graph Performance [Plot: reception efficiency against time overhead (left degree).]
Why regular graphs are bad • Right degree 2d implies Pr[right degree = 1] = d … • Left node has on average … neighbors of degree one.
Irregular Graphs [Figure: bipartite graph joined through a random permutation of the edges; node degrees vary on both sides (labels shown: 1, 2, 3, 4, 4, 5, 6, 10).]
Degree Sequence Functions • Left Side • fraction of edges of degree i on the left in the original graph. • Right Side • fraction of edges of degree i on the right in the original graph.
Irregular Graph Analysis [Figure: one round of the analysis, passing from left nodes to right nodes and back to left nodes.]
Irregular Graph Condition • Want: y < x for all 0 < x < α
Good Left Degree Sequence: Truncated Heavy Tail • D = 9, N = • Fraction of nodes of degree i is • Average node degree is
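The formulas on this slide were lost in extraction. As a hedged illustration, the sketch below assumes the truncated heavy-tail sequence from the Practical Loss-Resilient Codes paper, in which the fraction of edges of left degree i is 1/(H(D)(i − 1)) for i = 2, ..., D + 1, with H(D) the D-th harmonic number. From that assumption it derives the node fractions and the average node degree exactly, using rational arithmetic.

```python
from fractions import Fraction


def harmonic(D):
    """H(D) = 1 + 1/2 + ... + 1/D as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, D + 1))


def heavy_tail_edge_fractions(D):
    """Assumed heavy-tail sequence: edge fraction 1/(H(D)(i-1)), i = 2..D+1."""
    H = harmonic(D)
    return {i: 1 / (H * (i - 1)) for i in range(2, D + 2)}


def node_fractions(edge_fractions):
    """Convert edge-perspective fractions into fractions of nodes per degree."""
    weights = {i: f / i for i, f in edge_fractions.items()}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def average_degree(edge_fractions):
    """Average node degree implied by an edge-degree distribution."""
    return 1 / sum(f / i for i, f in edge_fractions.items())


if __name__ == "__main__":
    lam = heavy_tail_edge_fractions(9)
    for degree, frac in node_fractions(lam).items():
        print(f"degree {degree:2d}: {float(frac):.4f} of nodes")
    print(f"average left degree: {float(average_degree(lam)):.3f}")
```

Under this assumption the node fraction works out to (D + 1)/(D · i · (i − 1)) and the average degree to H(D)(D + 1)/D, which grows like ln D.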
Good Right Degree Sequence: Poisson • Average node degree is
Good Degree Sequence Functions • Want: y < x for all 0 < x < α • Works for
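The condition can be checked numerically for one concrete pair of sequences. The sketch assumes the forms used in the Practical Loss-Resilient Codes paper: a heavy-tail left polynomial λ(x) = (1/H(D)) Σ x^i / i for i = 1..D, a Poisson right polynomial ρ(x) = exp(a(x − 1)), and the recursion y = loss · λ(1 − ρ(1 − x)). The Poisson parameter `a` is chosen so that the average right degree equals the average left degree divided by the shrink factor `beta`. The choices D = 9 and beta = 0.5 and all names are illustrative, and the bound beta/(1 + 1/D) printed at the end is the one the paper's analysis is understood to give, stated here as an assumption.

```python
import math


def harmonic(D):
    return sum(1.0 / i for i in range(1, D + 1))


def lam(x, D):
    """Assumed heavy-tail left edge-degree polynomial."""
    return sum(x ** i / i for i in range(1, D + 1)) / harmonic(D)


def rho(x, a):
    """Assumed Poisson right edge-degree polynomial."""
    return math.exp(a * (x - 1.0))


def poisson_parameter(avg_right_degree):
    """Solve a / (1 - exp(-a)) = avg_right_degree by bisection."""
    lo, hi = 1e-9, avg_right_degree
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid / (1.0 - math.exp(-mid)) < avg_right_degree:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def condition_holds(loss, D=9, beta=0.5, grid=2000):
    """Check y < x on a grid of x values in (0, loss]."""
    avg_left = harmonic(D) * (D + 1) / D
    a = poisson_parameter(avg_left / beta)
    for k in range(1, grid + 1):
        x = loss * k / grid
        y = loss * lam(1.0 - rho(1.0 - x, a), D)
        if y >= x:
            return False
    return True


if __name__ == "__main__":
    D, beta = 9, 0.5
    print("bound beta/(1+1/D):", beta / (1 + 1 / D))
    for loss in (0.40, 0.45, 0.50):
        print(loss, condition_holds(loss, D, beta))
```

A grid check is only evidence, not a proof, but it shows the condition holding up to the assumed bound and failing at a loss fraction of one half, where no rate-1/2 code can succeed.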
Tornado Code Performance [Plot: reception efficiency against time overhead (average left degree).]
Why irregular graphs are good • Average right degree 2ln(D) implies Pr[right degree = 1] ~ 1/(D+1) • A left node of max degree D+1 has on average one neighbor of degree one.
Application: Software Distribution Problem • Millions of users want to download a new version of a software package. • 32 megabyte file, at 56 Kbits/second. • Download takes around 75 minutes at full speed.
Cyclic Tornado Coding Solution [Figure: timeline from 0 to 5 hours. The file is encoded, and the transmission cycles through the encoding (Encoding Copy 1, Encoding Copy 2); User 1 Reception and User 2 Reception are shown as separate intervals on the timeline.]
Cyclic Tornado Coding Solution Features • Users can initiate the download at their discretion. • Users can continue download seamlessly after temporary interruption. • Moderate packet loss is not a problem. • Reception efficiency is near optimal. • Low server load - simple protocol. • Does scale well. • Low network load.
Why Now and Not Before?
Encoding and decoding times in seconds, 1K packets
Size    | Encoding: Reed-Solomon | Encoding: Tornado | Decoding: Reed-Solomon | Decoding: Tornado
250 K   | 4.6                    | 0.06              | 2.06                   | 0.06
500 K   | 19                     | 0.12              | 8.4                    | 0.09
1 MB    | 93                     | 0.26              | 40.5                   | 0.14
2 MB    | 442                    | 0.53              | 199                    | 0.19
4 MB    | 1717                   | 1.06              | 800                    | 0.40
8 MB    | 6994                   | 2.13              | 3166                   | 0.87
16 MB   | 30802                  | 4.33              | 13829                  | 1.75
Model: half redundant packets, half message packets
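The timing table shows more than a constant-factor gap. A short calculation over the encoding and decoding columns, sketched below with the slide's own numbers, shows how each scheme's time grows when the file size doubles and how large the speedup is at 16 MB. The variable and function names are illustrative.

```python
# Times in seconds from the slide, for sizes 250 K, 500 K, 1, 2, 4, 8, 16 MB.
RS_ENCODE = [4.6, 19, 93, 442, 1717, 6994, 30802]
TORNADO_ENCODE = [0.06, 0.12, 0.26, 0.53, 1.06, 2.13, 4.33]
RS_DECODE = [2.06, 8.4, 40.5, 199, 800, 3166, 13829]
TORNADO_DECODE = [0.06, 0.09, 0.14, 0.19, 0.40, 0.87, 1.75]


def growth_per_doubling(times):
    """Ratio of each time to the previous one (file size doubles each step)."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


def speedup(slow, fast):
    """Element-wise ratio of the slower scheme's time to the faster one's."""
    return [s / f for s, f in zip(slow, fast)]


if __name__ == "__main__":
    print("RS encode growth:     ", [round(r, 2) for r in growth_per_doubling(RS_ENCODE)])
    print("Tornado encode growth:", [round(r, 2) for r in growth_per_doubling(TORNADO_ENCODE)])
    print("Encode speedup at 16 MB:", round(speedup(RS_ENCODE, TORNADO_ENCODE)[-1]))
    print("Decode speedup at 16 MB:", round(speedup(RS_DECODE, TORNADO_DECODE)[-1]))
```

Reed-Solomon encoding time roughly quadruples with each doubling, consistent with quadratic cost, while Tornado encoding time roughly doubles, consistent with linear cost.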
Reception Efficiency • 10,000 runs • Extra packets needed • Avg: 5.5% • Max: 8.5%
Previous Work • Local correction, hierarchical methods • Try to get lost packet from nearby neighbor in multicast tree. • Scalability? Satellite? • Erasure codes, interleaving • Break message into blocks, encode over blocks. • Interleave blocks to protect against bursty loss. • We have better performance, scalability. • Layering • Offer channels at different rates. • We use simple layering in prototype.
Cyclic Interleaving [Figure: the file is split into blocks, each block is encoded, and the encoded blocks are interleaved; the transmission cycles through the interleaved encoding (Encoding Copy 1, Encoding Copy 2).]
Cyclic Interleaving: Problems [Figure labels: k, B blocks] • The Coupon Collector’s Problem • Wait for packets for the last blocks. • Need many blocks for fast decoding. • Tornado twice as fast even for blocks of 20 packets.
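The coupon-collector effect can be seen in a small simulation. The sketch assumes B blocks of k message packets each, every block stretched to 2k packets by a code that decodes from any k of them, and a receiver that picks up packets in a uniformly random order without loss. It counts how many packets arrive before the last block becomes decodable. These modelling choices and the function name are assumptions for illustration.

```python
import random


def interleaved_overhead(num_blocks, k, rng, trials=20):
    """Average (packets received) / (message packets) until every block
    has collected k of its 2k encoded packets."""
    total = 0.0
    for _ in range(trials):
        # One entry per encoded packet, labelled with its block number.
        packets = [b for b in range(num_blocks) for _ in range(2 * k)]
        rng.shuffle(packets)
        have = [0] * num_blocks
        finished = 0
        received = 0
        for block in packets:
            received += 1
            if have[block] < k:          # packet is still useful
                have[block] += 1
                if have[block] == k:
                    finished += 1
                    if finished == num_blocks:
                        break
        total += received / (num_blocks * k)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(1)
    for blocks in (1, 10, 50, 200):
        print(blocks, round(interleaved_overhead(blocks, 20, rng), 3))
```

With a single block the ratio is exactly 1; as the number of blocks grows, the receiver spends more and more time waiting for the last few blocks to fill, which is the inefficiency the slide points to.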
Contributions • Tornado codes • Analysis tools • Simple, effective multicast protocol • Being implemented by Digital Fountain, Inc. • Error-correcting codes • STOC ‘98, ISIT ‘98 • Luby, Mitzenmacher, Shokrollahi, Spielman • Recent improvements • Richardson, Shokrollahi, Urbanke