
Re-Configurable Byzantine Quorum System


Presentation Transcript


  1. Re-Configurable Byzantine Quorum System Lei Kong, S. Arun, Mustaque Ahamad, Doug Blough

  2. Related Work • Byzantine Quorum System – D. Malkhi and M. Reiter: increase the read/write quorum intersection size to tolerate arbitrary server failures. • Dynamic Byzantine Quorum System – L. Alvisi, D. Malkhi, E. Pierce, M. Reiter and R. N. Wright: define the fault resilience threshold b as a variable, so it can be adjusted dynamically. • Fault Detection for Byzantine Quorum Systems – Alvisi, Malkhi, Pierce and Reiter, IEEE Trans. on Parallel and Distributed Systems, Sept. 2001.

  3. Our Contribution • Explicitly add a fault detection mechanism to the system, and remove faulty servers to obtain smaller quorum sizes and lighter system load. • A proxy scheme enables servers to monitor each other instead of relying on clients to monitor servers. • A new statistical fault detection technique.

  4. System Model • The server failure model is Byzantine; the number of concurrent failures is assumed to lie in the range [bmin, bmax]. • The data service protocol is based on a threshold masking quorum system, but our re-configurable approach could be applied to other types of quorum systems. • Quorum data operations are done through proxy servers. • Communication channels are assumed to be asynchronous but reliable.

  5. System Architecture (figure: clients send requests through proxy servers P to the distributed store; the servers run fault detection on one another)

  6. Quorum Variables • Four quorum variables N, B, Qmin and S are defined. • Making the system size a variable makes it possible to remove faulty nodes from the system. • Suppose Q(V) stands for the number of servers that are currently in the system and belong to the quorum of the most recently finished write of V; then Qmin is the minimum value of Q(V) over all data objects in the system. • S is a boolean array used to indicate server status. • Quorum variable operations use a static quorum setting.
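A minimal sketch of how the four quorum variables might be held in memory; the field and method names are illustrative assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QuorumVariables:
    n: int                     # N: number of servers currently in the system
    b: int                     # B: current fault-resilience threshold
    q_min: int                 # Qmin: minimum Q(V) over all data objects
    status: List[bool] = field(default_factory=list)  # S: per-server status

    def remove_server(self, i: int) -> None:
        """Mark server i as removed and shrink the system size accordingly."""
        if self.status[i]:
            self.status[i] = False
            self.n -= 1
```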

  7. Read/Write Quorum Variables • Read Protocol • Read from a quorum of 3bmax+1 servers • Return the value that is returned by at least bmax+1 servers and is not countermanded; return an error if no such pair exists. • Write Protocol • Write to a quorum of n-bmax servers • With write quorum size n-bmax and read quorum size 3bmax+1, the intersection size is at least (n-bmax) + (3bmax+1) - n = 2bmax+1 (see the sketch below).
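The static quorum sizes above can be checked with a few lines of arithmetic. A sketch, assuming only the sizes given on this slide:

```python
def quorum_variable_sizes(n: int, b_max: int) -> tuple[int, int]:
    """Static quorum sizes for quorum-variable operations."""
    read_q = 3 * b_max + 1        # read from 3*bmax + 1 servers
    write_q = n - b_max           # write to n - bmax servers
    # Any read quorum and write quorum overlap in at least 2*bmax + 1
    # servers: read_q + write_q - n = 2*bmax + 1.
    assert read_q + write_q - n >= 2 * b_max + 1
    return read_q, write_q
```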

  8. Read Data Object V 1. The client executes a read on the quorum variables; 2. The client randomly chooses a server as the proxy and chooses a read quorum of size n+2b+1-qmin; 3. The client sends the read request and the chosen quorum to the proxy; 4. The proxy first reads from n+b+bmin+1-qmin servers in the chosen quorum and forwards the results to the client;

  9. Read Data Object V - continued 5. Among all pairs returned by at least bmin+1 servers, if the one with the highest timestamp does not have at least b+1 representatives, the proxy reads from the rest of the servers in the read quorum and forwards the results to the client. 6. The client considers the pairs returned by at least bmin+1 servers; if the one with the highest timestamp has at least b+1 representatives, it is returned, otherwise the client restarts from step 2 (see the sketch below).
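A hedged sketch of steps 2-6, assuming each server is a zero-argument callable returning a (value, timestamp) pair; for brevity the proxy's role is folded into the client, and messaging and MACs are elided.

```python
import random
from collections import Counter

def client_read(servers, n, b, b_min, q_min, max_retries=5):
    first_batch = n + b + b_min + 1 - q_min
    for _ in range(max_retries):
        quorum = random.sample(servers, n + 2 * b + 1 - q_min)  # step 2
        replies = Counter(s() for s in quorum[:first_batch])    # step 4
        widened = False
        while True:
            # Pairs vouched for by at least bmin + 1 servers are candidates.
            candidates = [p for p, c in replies.items() if c >= b_min + 1]
            if candidates:
                best = max(candidates, key=lambda p: p[1])      # highest timestamp
                if replies[best] >= b + 1:                      # step 6
                    return best
            if widened:
                break                                           # restart from step 2
            # Step 5: read from the rest of the read quorum and re-test.
            replies.update(s() for s in quorum[first_batch:])
            widened = True
    raise RuntimeError("read did not converge")
```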

  10. Write Data Object V 1. The client executes a read on V to get the current timestamp of V and the current quorum variable values; 2. If the read quorum size has increased according to the quorum variable values received in step 1, the client reads quorum variable values from all server nodes until at least n-bmax servers return the same values; 3. The client generates a new timestamp for V; 4. The client chooses its proxy and a write quorum of size ⌈(n+2b+1)/2⌉;

  11. Write Data Object V - continued 5. The client sends the write request and the write quorum to the proxy. 6. The proxy writes to the servers in the write quorum and forwards the server confirmations back to the client; 7. The client checks the server confirmations and restarts from step 2 if an error is detected (see the sketch below).
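A matching sketch of the write path (steps 1 and 3-7), assuming servers are callables server(value, ts) that return True on confirmation; the step-2 re-read of quorum variables and the timestamp scheme are simplified assumptions.

```python
import math
import random

def client_write(servers, n, b, value, read_current_ts):
    ts = read_current_ts() + 1                       # steps 1 and 3: fresh timestamp
    wq_size = math.ceil((n + 2 * b + 1) / 2)         # step 4: write quorum size
    quorum = random.sample(servers, wq_size)         # proxy choice elided
    acks = [server(value, ts) for server in quorum]  # steps 5-6
    if not all(acks):                                # step 7
        raise RuntimeError("confirmation error; restart from step 2")
```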

  12. Quorum Sizes for Data Objects • Write quorum size for data objects: ⌈(n+2b+1)/2⌉ • Read quorum size for data objects: n+2b+1-qmin. Since qmin ≤ any write quorum size, the intersection size is at least (n+2b+1-qmin) + qmin - n = 2b+1.
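The same arithmetic as a checkable sketch; qmin is assumed to satisfy qmin ≤ every write quorum size, as the slide states.

```python
import math

def data_quorum_sizes(n: int, b: int, q_min: int) -> tuple[int, int]:
    write_q = math.ceil((n + 2 * b + 1) / 2)
    read_q = n + 2 * b + 1 - q_min
    # With q_min <= write_q, any read and write quorum intersect in at
    # least read_q + q_min - n = 2b + 1 servers.
    assert read_q + q_min - n >= 2 * b + 1
    return read_q, write_q
```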

  13. Misc. • A message authentication code (MAC) is used to protect message integrity: faulty proxy servers can only drop messages, they cannot tamper with them. • Proxy nodes do not forward client read-request MACs to server nodes, which makes it feasible for server nodes to explicitly test each other. • To reduce the overhead of reading quorum variables, clients cache them locally.
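A minimal illustration of the MAC property relied on above, using Python's standard hmac module; the slides do not commit to a specific MAC construction, so SHA-256 HMAC is only an assumption for concreteness.

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    # A faulty proxy without the key can drop this message but
    # cannot alter it and still produce a valid tag.
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(make_mac(key, message), tag)
```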

  14. Simulation - Plot of n vs. b

  15. Simulation - Comparison of Read Quorum Size

  16. Simulation - Comparison of Write Quorum Size

  17. Simulation - Comparison of Workload ((%r · |Qr| + %w · |Qw|) / n)

  18. Fault Detection and Diagnosis • Identify faulty servers, thereby enabling their removal • Fault detection is done by monitoring a server's responses to read requests over time • Detection probability close to 1 for a wide range of pic, with very low false alarm probability • To avoid detection, a faulty server must operate as though it were a correct server

  19. Fault Detection Algorithm • Observe the number of correct responses returned by each server over several read operations • Two-tiered diagnosis – proxy-node level and diagnosis-node level • Faulty servers try to avoid being detected • If a faulty server has the correct value for a variable, it returns an incorrect value with probability pic • pic = 1 corresponds to a server blatantly exposing itself as faulty • pic = 0 corresponds to a correct server (modeled in the sketch below)
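A one-function sketch of this faulty-server model: with probability pic the server lies even though it holds the correct value. The function name and arguments are illustrative assumptions.

```python
import random

def faulty_response(correct_value, wrong_value, p_ic: float):
    # p_ic = 0 behaves exactly like a correct server;
    # p_ic = 1 exposes the fault on every read.
    return wrong_value if random.random() < p_ic else correct_value
```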

  20. Analysis based on Hypothesis Testing • The probability that a correct server returns u correct responses in r read operations is C(r,u) · (|Qw|/n)^u · (1 - |Qw|/n)^(r-u), where |Qw| is the size of the write quorum and n is the total number of servers in the system.
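This is the binomial distribution with success probability |Qw|/n: each read finds a given server holding the latest value only if that server was in the corresponding write quorum. A direct transcription:

```python
from math import comb

def p_correct_responses(u: int, r: int, qw: int, n: int) -> float:
    """P(a correct server returns exactly u correct responses in r reads)."""
    p = qw / n
    return comb(r, u) * p**u * (1 - p) ** (r - u)
```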

  21. Plot for n=50, |Qw|=34, r=10000

  22. Plot for n=50, |Qw|=34, r=1000

  23. Plot for n=50, |Qw|=34, r=100

  24. Faulty Server Modeling and Analysis • If 0.05 is the false alarm probability tolerated at the proxy-node level, let uth be the maximum value of u such that Σ_{i=0}^{u} C(r,i) · (|Qw|/n)^i · (1 - |Qw|/n)^(r-i) ≤ 0.05. • A server is diagnosed as faulty if it returns uth or fewer correct responses in r read operations. • Let pic denote the probability with which a faulty server returns an incorrect response when it has the correct value for the data object being read. • The probability that a faulty server will be detected in r read operations is Σ_{i=0}^{uth} C(r,i) · ((1-pic) · |Qw|/n)^i · (1 - |Qw|/n + pic · |Qw|/n)^(r-i).
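A sketch computing the threshold uth and the resulting detection probability from the two formulas above; function and parameter names are assumptions.

```python
from math import comb

def detection_threshold(r: int, qw: int, n: int, alpha: float = 0.05) -> int:
    """Largest u_th with P(correct server returns <= u_th correct
    responses in r reads) <= alpha."""
    p = qw / n
    cdf, u_th = 0.0, -1
    for u in range(r + 1):
        cdf += comb(r, u) * p**u * (1 - p) ** (r - u)
        if cdf > alpha:
            break
        u_th = u
    return u_th

def detection_probability(r: int, qw: int, n: int, p_ic: float, u_th: int) -> float:
    """P(a faulty server with lie probability p_ic returns u_th or
    fewer correct responses in r reads, i.e., is detected)."""
    p_good = (1 - p_ic) * qw / n  # per-read probability of a correct response
    return sum(comb(r, i) * p_good**i * (1 - p_good) ** (r - i)
               for i in range(u_th + 1))
```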

  25. Faulty Server Detection Prob. for n=50,|Qw|=34

  26. Prob. Faulty Server is Undetected, n=50, |Qw|=34

  27. Faulty Server Det. Prob. for different |Qw|, n=50

  28. Fault Detection Algorithm at the Diagnosis Node • Proxy nodes report detected state of servers (faulty or not-faulty) to the diagnosis node every “r” read operations • If at any time a server has been found to be faulty by “m” servers, then that server is diagnosed as faulty and removed from the system • Choice of “m” depends on the desired final false alarm probability

  29. Diagnosis-Node Fault Detection Algorithm (cont’d) • If the final desired false alarm probability is 10^-4, let m2 be the smallest m1 such that C(n-bmax, m1) · 0.05^(m1) · (1-0.05)^(n-bmax-m1) ≤ 10^-4, where 0.05 is the false alarm probability at the proxy-node level. • Then m is given by m2 + bmax. • An assumption: the behavior of faulty servers is independent of the identity of the proxy servers.
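A sketch of this threshold computation, assuming the per-proxy false alarm probability (0.05) and the target (10^-4) from the slide:

```python
from math import comb

def accusation_threshold(n: int, b_max: int,
                         alpha: float = 0.05, target: float = 1e-4) -> int:
    """Smallest m1 with C(n - bmax, m1) * alpha^m1 * (1 - alpha)^(n - bmax - m1)
    <= target, padded by bmax since faulty proxies may accuse arbitrarily."""
    correct = n - b_max
    for m1 in range(correct + 1):
        if comb(correct, m1) * alpha**m1 * (1 - alpha) ** (correct - m1) <= target:
            return m1 + b_max        # m = m2 + bmax
    raise ValueError("no threshold meets the target false alarm probability")
```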

  30. More Algorithm Specifications and Features • Data servers keep track of the |Qw|/n ratio for each object they store and return this value along with the data object during a read. • Several values of r are considered to increase the chances of detecting faulty servers as early as possible. • Statistical analysis over several read operations tolerates low levels of concurrency between reads and writes. • The above analysis holds only if write quorums are chosen randomly; it can easily be modified to suit other quorum-picking strategies.

  31. Future Work • Make use of background dissemination; explore possible approaches to adjust Qmin: termination determination, object groups. • Allow new nodes to be added to the system, including replacements for removed faulty servers.
