The Chubby Lock Service for Loosely-coupled Distributed Systems
Mike Burrows @OSDI’06
Mosharaf Chowdhury
Problem
• (Un)Locking-as-a-service
• Leader election
• Synchronization
• …
• Uses Paxos to solve the distributed consensus problem in an asynchronous environment
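To make the "locking-as-a-service drives leader election" point concrete, here is a minimal sketch of the pattern: whichever replica acquires an advisory lock becomes the primary and advertises itself by writing its identity into the lock file. The ToyLockService class, its method names, and the /ls/... path are illustrative stand-ins, not Chubby's actual client library; the real service keeps this state in a Paxos-replicated cell.

    import threading

    # Toy, single-process stand-in for a Chubby-like lock service (illustrative
    # only); real Chubby replicates this state across a Paxos-backed cell.
    class ToyLockService:
        def __init__(self):
            self._mu = threading.Lock()
            self._files = {}     # path -> contents
            self._holders = {}   # path -> id of the client holding the lock

        def try_acquire(self, path, client_id):
            """Try to take the advisory exclusive lock on `path`."""
            with self._mu:
                if self._holders.get(path) is None:
                    self._holders[path] = client_id
                    return True
                return False

        def set_contents(self, path, data):
            with self._mu:
                self._files[path] = data

        def get_contents(self, path):
            with self._mu:
                return self._files.get(path)

    def elect_leader(svc, client_id, path="/ls/cell/service/leader"):
        """Whoever holds the lock is the leader; it writes its identity into
        the lock file so everyone else can discover the current primary."""
        if svc.try_acquire(path, client_id):
            svc.set_contents(path, client_id)
            return True      # we are the primary
        return False         # someone else is; read the file to find them

    if __name__ == "__main__":
        svc = ToyLockService()
        print(elect_leader(svc, "replica-A"))   # True  -> became leader
        print(elect_leader(svc, "replica-B"))   # False -> follows the leader
        print("leader:", svc.get_contents("/ls/cell/service/leader"))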
Overview
• Goals/Non-goals
  • Primary: Availability, Reliability, Usability & deployability
  • Secondary: Performance
  • Non-goals: Storage capacity
• Use Cases
  • Planned usage by GFS, BigTable, and MegaStore
  • Also heavily used as
    • Internal name service
    • MapReduce rendezvous point
System Structure
[Figure: a Chubby cell of five replicas (R), one of which is elected master (M); clients reach the cell directly or through a proxy server]
• Reads are satisfied by the master alone
• Writes are acknowledged after a majority of the replicas have been updated
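A minimal sketch of the read/write rule on this slide, assuming a five-node cell: the master serves reads from its own copy, and a write is acknowledged to the client only once a majority of the cell (master included) has applied it. The Replica and Master classes are illustrative assumptions; the real system runs Paxos underneath and handles failures, which this toy version ignores.

    # Toy illustration of "reads from the master, writes after a majority";
    # it ignores failures, leases, and Paxos itself.
    class Replica:
        def __init__(self, name):
            self.name = name
            self.store = {}

        def apply(self, key, value):
            self.store[key] = value
            return True                      # ack back to the master

    class Master(Replica):
        def __init__(self, name, others):
            super().__init__(name)
            self.others = others             # the non-master replicas
            self.cell_size = len(others) + 1

        def read(self, key):
            # Reads are satisfied by the master alone.
            return self.store.get(key)

        def write(self, key, value):
            # Apply locally, then gather replica acks; acknowledge the client
            # as soon as a majority of the cell has the update.
            acks = 1 if self.apply(key, value) else 0
            for r in self.others:
                if r.apply(key, value):
                    acks += 1
                if acks > self.cell_size // 2:
                    return True              # safe to acknowledge the write
            return acks > self.cell_size // 2

    if __name__ == "__main__":
        cell = Master("M", [Replica(f"R{i}") for i in range(4)])
        assert cell.write("/ls/cell/foo", b"bar")
        print(cell.read("/ls/cell/foo"))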
How to Use it?
• UNIX file-system-like interface
  • API modeled in a similar way
• Read/write locks on each file/directory
  • Advisory locks
  • Coarse-grained
• Event notification
  • After the corresponding action
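The usage pattern behind this file-system-like interface looks roughly like the sketch below: open a node by path, take a coarse-grained advisory lock on it, read or write its contents, and register for events that are delivered after the corresponding action. The paper's actual calls are named Open, Acquire, GetContentsAndStat, SetContents, and so on; every class, method, and path in this snippet is a hypothetical stand-in, not the real client library.

    # Hypothetical, in-memory sketch of the file-system-like API usage pattern;
    # none of these names belong to the real Chubby client library.
    class ChubbyHandle:
        """Stand-in for a handle on a node such as /ls/foo-cell/service/leader."""
        def __init__(self, path):
            self.path = path
            self._contents = b""
            self._callbacks = []

        def acquire(self, exclusive=True):
            # Advisory, coarse-grained lock: cooperating clients hold it for
            # long periods rather than taking it per operation.
            mode = "writer" if exclusive else "reader"
            print(f"acquired {mode} lock on {self.path}")

        def set_contents(self, data):
            self._contents = data
            for cb in self._callbacks:       # notification after the action
                cb("contents-modified", self.path)

        def get_contents(self):
            return self._contents

        def on_event(self, callback):
            # Subscribe to events such as "file contents modified" or
            # "child node added/removed".
            self._callbacks.append(callback)

    if __name__ == "__main__":
        h = ChubbyHandle("/ls/foo-cell/service/leader")
        h.on_event(lambda event, path: print(f"event: {event} on {path}"))
        h.acquire(exclusive=True)
        h.set_contents(b"primary = replica-A")
        print(h.get_contents())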
Cogs
• KeepAlives
  • Piggyback event notifications, cache invalidations, etc.
• Leases/Timeouts
  • Master and client-side local leases
• Failover handling
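Here is a rough sketch of the client side of the lease/KeepAlive machinery listed above: the client keeps a conservative local lease, renews it from each KeepAlive reply (which also piggybacks events and cache invalidations), and when the local lease expires it waits out a grace period in the hope that a new master takes over and lets the session survive the failover. The function names, timing constants, and the wait_for_new_master placeholder are assumptions for illustration, not the real protocol code.

    import time

    GRACE_PERIOD_S = 45.0   # illustrative value, not the real constant

    def send_keepalive(master):
        # Stand-in for the blocking KeepAlive RPC: the real master holds the
        # reply until the client's previous lease is nearly due, and the reply
        # carries the new lease timeout plus piggybacked events/invalidations.
        time.sleep(10.0)
        return {"lease_timeout_s": 12.0, "events": []}

    def wait_for_new_master(master, timeout_s):
        return False        # placeholder: would poll for a new master address

    def keepalive_loop(master, deliver_event, session_failed):
        local_lease_expiry = time.time() + 12.0
        while True:
            try:
                reply = send_keepalive(master)
            except TimeoutError:
                if time.time() < local_lease_expiry:
                    continue                     # local lease still valid
                # Local lease expired ("jeopardy"): wait out the grace period
                # in case a new master takes over and the session survives.
                if not wait_for_new_master(master, GRACE_PERIOD_S):
                    session_failed()             # report failure to the app
                    return
                continue
            local_lease_expiry = time.time() + reply["lease_timeout_s"]
            for ev in reply["events"]:           # piggybacked notifications
                deliver_event(ev)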
Developers are … Human
Despite attempts at education, our developers regularly write loops that retry indefinitely when a file is not present, or poll a file by opening it and closing it repeatedly when one might expect they would open the file just once.
Our developers sometimes do not plan for high availability in the way one would wish. However, mistakes, misunderstandings, and the differing expectations of our developers lead to effects that are similar to attacks.
Our developers are confused by non-intuitive caching semantics, so we prefer consistent caching.
Developers also fail to appreciate the difference between a service being up, and that service being available to their applications.
A lock-based interface is more familiar to our programmers.
Hit or Miss?
• Scalable
  • Scales to 90,000 clients
  • Can be scaled further using proxies and partitioning
• Available
  • 61 outages in total
  • 52 under 30 s (almost no impact)
  • 1 outage due to overload
• Reliable
  • Data loss on 6 occasions
  • 4 due to software errors (fixed)
  • 2 due to operators
More Numbers
• Naming-related events
  • 60% of file opens
  • 46% of stored files
• 10 clients use each cached file, out of 230k cached files
• 14% of the cache consists of negative caches for names
• KeepAlives account for 93% of traffic
  • Uses UDP instead of TCP