
Composable Consistency for Wide Area Replication


Presentation Transcript


  1. Composable Consistency for Wide Area Replication Sai Susarla Advisor: Prof. John Carter

  2. Overview • Goal: middleware support for wide area caching in diverse distributed applications • Key Hurdle: flexible consistency management • Our Solution: a novel consistency interface/model - Composable Consistency • Benefit: supports a broader set of sharing needs than any existing consistency model • Examples: file systems, databases, directories, collaborative apps • Demo Platform: a novel P2P middleware data store - Swarm

  3. Caching: Overview • The Idea: cache frequently used items locally for quick retrieval • Benefits • Within cluster: load-balancing, scalability • Across WAN: lower latency, improved throughput & availability • Applications • Data stored in one place, accessed from multiple locations • Examples: • File system: personal files, calendars, log files, software, … • Database: online shopping, inventory, auctions, … • Directory: DNS, LDAP, Active Directory, KaZaa, … • Collaboration: chat, multi-player games, meetings, …

  4. Centralized Service [diagram: clients on user LANs access a primary server cluster over the Internet]

  5. Proxy-based Caching [diagram: caching proxy server clusters, kept coherent with the primary server cluster by a consistency protocol, serve clients over the Internet]

  6. Caching: The Challenge Applications have diverse consistency needs

  7. Caching: The Problem • Consistency requirements are diverse • Caching is difficult over WANs • Variable delays, node failures, network partitions, admin domains, … • Thus, most WAN applications either: • Roll their own caching solution, or • Do not cache and live with the latency Can we do better?

  8. Thesis "A consistency management system that provides a small set of customizable consistency mechanisms can efficiently satisfy the data sharing needs of a wide variety of distributed applications."

  9. Outline • Further Motivation • Application study → new taxonomy to classify application sharing needs • Composable Consistency (CC) model • Novel interface to express consistency semantics for each access • Small option set can express more diverse semantics • Evaluation

  10. Existing Models are Inadequate • Provide a few packaged consistency semantics for specific needs: • e.g., optimistic/eventual, close-to-open, strong • Or, lack enough flexibility to support diverse needs • TACT (cannot express weak consistency or session semantics) • Bayou (cannot support strong consistency) • Or, leave consistency management burden on applications • e.g., Oceanstore, Globe

  11. Existing Middleware is Inadequate • Existing middleware supports only specific sharing needs • Read-only data: PAST, BitTorrent • Rare write-sharing: file systems (NFS, Coda, Ficus …) • Master-slave (read-only) replication: storage vendors, MySQL • Scheduled (nightly) replication: storage and DB services • Read-write replication in a cluster: commercial DB vendors, Petal

  12. Application Survey 40+ applications with diverse consistency needs

  13. Survey Results Found common issues, overlapping choices • Are parallel reads and writes OK? • How often should replicas synchronize? • Does update order matter? • What if some copies are inaccessible? • … Can we exploit this commonality?

  14. Composable Consistency: a novel interface to express consistency semantics along five dimensions (sketched in code after slide 16 below): • Concurrency control • Replica synchronization • Failure handling • View isolation • Update visibility

  15. Example: Close-to-open (AFS) • Allow parallel reads and writes • Latest data guaranteed at open() • Fail access when partitioned • Accept remote updates only at open() • Reveal local updates to others only on close()

  16. Example: Eventual Consistency (Bayou) • Allow parallel reads and writes • Sync copies at most once every 10 minutes • Syncing should not block or fail operations • Accept remote updates as they arrive • Reveal local updates to others as they happen
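The slides name the five option dimensions but not a concrete encoding, so the following is a minimal C sketch under assumed names: a hypothetical cc_options record with one field per dimension, and the two compositions from slides 15 and 16 written as initializers. None of these identifiers (cc_options, the CC_* constants, the field names) come from the Swarm API; they are illustrative only.

    /* Hypothetical encoding of a Composable Consistency option vector;
     * one field per dimension from slide 14. All names are assumptions. */
    enum cc_concurrency { CC_READS_AND_WRITES_PARALLEL, CC_EXCLUSIVE_WRITE, CC_EXCLUSIVE_READ };
    enum cc_sync        { CC_SYNC_ON_OPEN,         /* pull the latest copy when a session opens  */
                          CC_SYNC_PERIODIC };      /* sync at most once per staleness_secs       */
    enum cc_failure     { CC_FAIL_IF_PARTITIONED,  /* refuse access when the latest copy is unreachable */
                          CC_PROCEED_OPTIMISTIC }; /* never block or fail on unreachable replicas */
    enum cc_isolation   { CC_ISOLATE_SESSION,      /* take in remote updates only at open()       */
                          CC_ISOLATE_NONE };       /* take in remote updates as they arrive       */
    enum cc_visibility  { CC_VISIBLE_ON_CLOSE,     /* expose local updates only at close()        */
                          CC_VISIBLE_IMMEDIATE };  /* expose local updates as they happen         */

    struct cc_options {
        enum cc_concurrency concurrency;    /* concurrency control     */
        enum cc_sync        sync;           /* replica synchronization */
        int                 staleness_secs; /* bound for periodic sync */
        enum cc_failure     failure;        /* failure handling        */
        enum cc_isolation   isolation;      /* view isolation          */
        enum cc_visibility  visibility;     /* update visibility       */
    };

    /* Slide 15: close-to-open (AFS-style) semantics. */
    static const struct cc_options cc_close_to_open = {
        .concurrency    = CC_READS_AND_WRITES_PARALLEL,
        .sync           = CC_SYNC_ON_OPEN,        /* latest data guaranteed at open()      */
        .staleness_secs = 0,
        .failure        = CC_FAIL_IF_PARTITIONED, /* fail access when partitioned          */
        .isolation      = CC_ISOLATE_SESSION,     /* accept remote updates only at open()  */
        .visibility     = CC_VISIBLE_ON_CLOSE,    /* reveal local updates only on close()  */
    };

    /* Slide 16: eventual consistency (Bayou-style) semantics. */
    static const struct cc_options cc_eventual = {
        .concurrency    = CC_READS_AND_WRITES_PARALLEL,
        .sync           = CC_SYNC_PERIODIC,
        .staleness_secs = 600,                    /* sync copies at most once every 10 minutes */
        .failure        = CC_PROCEED_OPTIMISTIC,  /* syncing never blocks or fails operations  */
        .isolation      = CC_ISOLATE_NONE,        /* accept remote updates as they arrive      */
        .visibility     = CC_VISIBLE_IMMEDIATE,   /* reveal local updates as they happen       */
    };

Two very different packaged models thus reduce to different settings of the same small option set, which is the point of the composable interface.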

  17. Handling Conflicting Semantics • What if two sessions have different semantics? • If conflicting, block a session until conflict goes away (serialize) • Otherwise, allow them in parallel • Simple rules for checking conflicts (conflict matrix) • Examples: • Exclusive write vs. exclusive read vs. eventual write: serialize • Write-immediate vs. session-grain isolation: serialize • Write-immediate vs. eventual read: no conflict
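A small runnable sketch of the conflict-matrix rule from slide 17. Only a handful of made-up session modes are shown; the pairings called out on the slide fix a few entries, and the remaining matrix entries are illustrative guesses rather than Swarm's actual matrix.

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical session modes; Swarm's real matrix covers all CC option combinations. */
    enum cc_mode { EXCL_WRITE, EXCL_READ, EVENTUAL_WRITE, EVENTUAL_READ, CC_NUM_MODES };

    /* conflict[a][b] == true means sessions with modes a and b must be serialized. */
    static const bool conflict[CC_NUM_MODES][CC_NUM_MODES] = {
        /*                 EXCL_WRITE EXCL_READ EVENTUAL_WRITE EVENTUAL_READ */
        [EXCL_WRITE]     = { true,     true,     true,          false },
        [EXCL_READ]      = { true,     false,    true,          false },
        [EVENTUAL_WRITE] = { true,     true,     false,         false },
        [EVENTUAL_READ]  = { false,    false,    false,         false },
    };

    /* If the modes conflict, block the incoming session until the active one ends;
     * otherwise admit them in parallel. */
    static bool must_serialize(enum cc_mode incoming, enum cc_mode active)
    {
        return conflict[incoming][active];
    }

    int main(void)
    {
        printf("exclusive write vs. eventual write: %s\n",
               must_serialize(EXCL_WRITE, EVENTUAL_WRITE) ? "serialize" : "run in parallel");
        printf("eventual read  vs. eventual write:  %s\n",
               must_serialize(EVENTUAL_READ, EVENTUAL_WRITE) ? "serialize" : "run in parallel");
        return 0;
    }

The rule itself is just a table lookup; the simplicity of the check is what makes per-session semantics cheap to enforce.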

  18. Using Composable Consistency • Perform data access within a session, e.g.: • session_id = open(object, CC_option_vector); • read(session_id, buf); • write(session_id, buf); or update(session_id, incr_counter(value)); • close(session_id); • Specify consistency semantics per session at open() via the CC option vector: concurrency control, replica synchronization, failure handling, view isolation, and update visibility • The system enforces the semantics by mediating each access
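A sketch of the per-session access pattern in C, reusing the hypothetical cc_options type from the earlier sketch. The slide writes open/read/write/close; the cc_* prefixes here avoid clashing with the POSIX calls of the same names, and the exact signatures, the update_calendar helper, and the "/cal/sai" object name are assumptions.

    /* Assumed API, mirroring the pseudocode on slide 18; real Swarm signatures may differ. */
    typedef int cc_session_t;
    struct cc_options;                          /* the option vector sketched after slide 16 */
    cc_session_t cc_open(const char *object, const struct cc_options *opts);
    int          cc_read(cc_session_t sid, void *buf, int len);
    int          cc_write(cc_session_t sid, const void *buf, int len);
    int          cc_close(cc_session_t sid);

    /* Read a shared calendar with close-to-open semantics, then append an entry. */
    void update_calendar(const struct cc_options *close_to_open)
    {
        char buf[4096];

        /* Consistency is chosen per session, at open(), via the option vector. */
        cc_session_t sid = cc_open("/cal/sai", close_to_open);

        int n = cc_read(sid, buf, sizeof buf);  /* latest data guaranteed at open() */
        (void)n;

        const char entry[] = "defense: 2pm Friday\n";
        cc_write(sid, entry, sizeof entry - 1); /* buffered until close() under     */
                                                /* visibility-on-close semantics    */
        cc_close(sid);                          /* local updates now revealed to others */
    }

The slide's update(session_id, incr_counter(value)) form, which ships an operation rather than raw bytes, would slot in where cc_write appears.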

  19. Composable Consistency Benefits • Powerful: Small option set can express diverse semantics • Customizable: allows different semantics for each access • Effective: amenable to efficient WAN implementation • Benefit to middleware • Can provide read-write caching to a broader set of apps. • Benefit for an application • Can customize consistency to diverse and varying sharing needs • Can simultaneously enforce different semantics on the same data for different users

  20. Evaluation

  21. Swarm: A Middleware Providing CC • Swarm: • Shared file interface with CC options • Location-transparent page-grained file access • Aggressive P2P caching • Dynamic cycle-free replica hierarchy per file • Prototype implements CC (except causality & atomicity) • Per-file, per-replica and per-session consistency • Network economy (exploit nearby replicas) • Contention-aware replication (RPC vs caching) • Multi-level leases for failure resilience

  22. Client-server BerkeleyDB Application [diagram: app users on two LANs reach a primary app server over the Internet; the app logic runs against a local BerkeleyDB on the kernel file system]

  23. BerkeleyDB Application using Swarm [diagram: same setup, but the app logic links against an RDB wrapper, and a co-located Swarm server with an RDB plugin manages the DB on the kernel file system]

  24. Caching Proxy App Server using Swarm [diagram: a proxy app server, with its own app logic, RDB wrapper, Swarm server, RDB plugin, and DB replica, caches data from the primary app server over the Internet and serves nearby app users]

  25. Swarm-based Applications • SwarmDB: Transparent BerkeleyDB database replication across WAN • SwarmFS: wide area P2P read-write file system • SwarmProxy: Caching WAN proxies for an auction service with strong consistency • SwarmChat: Efficient message/event dissemination No single model can support the sharing needs of all these applications

  26. SwarmDB: Replicated BerkeleyDB • Replication support built as a wrapper library • Uses the unmodified BerkeleyDB binary • Evaluated with five consistency flavors: • Lock-based updates, eventual reads • Master-slave writes, eventual reads • Close-to-open reads, writes • Staleness-bounded reads, writes • Eventual reads, writes • Compared against the BerkeleyDB-provided RPC version • Order-of-magnitude throughput gains over RPC by relaxing consistency
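The slides do not show the wrapper's code, so the following is only a sketch of the interposition idea: bracket each call to the unmodified BerkeleyDB binary with a Swarm update so it can reach other replicas. The sw_* calls are patterned after the chat example at the end of the deck (with explicit lengths added, which that pseudocode omits); swarmdb_put and the way the operation is serialized are assumptions, not SwarmDB's actual plugin interface.

    #include <db.h>        /* unmodified BerkeleyDB */
    #include <string.h>

    /* Assumed Swarm client calls, following the chat example later in the deck. */
    int sw_open(int swarm_id, const char *mode);
    int sw_write(int handle, const void *buf, unsigned int len);
    int sw_close(int handle);

    /* Hypothetical wrapper entry point: apply the put locally through the
     * unmodified BerkeleyDB library, then hand the same operation to Swarm so
     * it propagates to other replicas under the session's CC options. */
    int swarmdb_put(DB *db, int swarm_id,
                    const void *kbuf, unsigned int klen,
                    const void *dbuf, unsigned int dlen)
    {
        DBT key, data;
        memset(&key, 0, sizeof key);
        memset(&data, 0, sizeof data);
        key.data  = (void *)kbuf;   key.size  = klen;
        data.data = (void *)dbuf;   data.size = dlen;

        int ret = db->put(db, NULL, &key, &data, 0);   /* local BerkeleyDB update */
        if (ret != 0)
            return ret;

        /* Ship the operation to Swarm; a real wrapper would serialize the
         * operation type together with key and data into one update record. */
        int handle = sw_open(swarm_id, "a+");
        sw_write(handle, kbuf, klen);
        sw_write(handle, dbuf, dlen);
        return sw_close(handle);
    }

Because the replication logic lives entirely in the wrapper, the five consistency flavors above can be swapped by changing only the CC options the wrapper opens its sessions with.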

  27. SwarmDB Evaluation • BerkeleyDB B-tree index replicated across N nodes • Nodes linked via 1 Mbps links to a common router, 40 ms RTT to each other • Full-speed workload • 30% writes: inserts, deletes, updates • 70% reads: lookups, cursor scans • Varied # replicas from 1 to 48

  28. SwarmDB Write Throughput/replica [graph: per-replica write throughput vs. number of replicas for locking writes + eventual reads, master-slave writes + eventual reads, close-to-open, 10 ms and 20 ms staleness bounds, optimistic, a local SwarmDB server, and RPC over WAN]

  29. SwarmDB Query Throughput/replica [graph: per-replica query throughput vs. number of replicas for close-to-open, 10 ms staleness bound, optimistic, a local SwarmDB server, and RPC over WAN]

  30. SwarmDB Results • Customizing consistency can improve WAN caching performance dramatically • App can enforce diverse semantics by simply modifying CC options • Updates & queries with different semantics possible

  31. SwarmFS Distributed File System • Sample SwarmFS path • /swarmfs/swid:0x1234.2/home/sai/thesis.pdf • Performance Summary • Achieves >80% of local FS performance on Andrew Benchmark • More network-efficient than Coda for wide area access • Correctly supports fine-grain collaboration across WANs • Correctly supports file locking for RCS repository sharing

  32. SwarmFS: Distributed Development

  33. Replica Topology

  34. SwarmFS vs. Coda: Roaming File Access (network economy) • Coda-s always fetches files from the distant server U1 • SwarmFS fetches files from the nearest copy

  35. SwarmFS vs. Coda: Roaming File Access (P2P protocol more efficient) • Coda-s writes files through to U1 to provide close-to-open semantics • Swarm's P2P pull-based protocol avoids this, so SwarmFS performs better for temporary files

  36. SwarmFS vs. Coda: Roaming File Access (eventual consistency inadequate) • Coda-w behaves incorrectly, causing compile errors: `make' skipped files and the linker found corrupt object files • Trickle reintegration pushed huge object files to U1, clogging the network link

  37. Evaluation Summary • SwarmDB: gains from customizable consistency • SwarmFS: network economy under write-sharing • SwarmProxy: strong consistency over WANs under varying contention • SwarmChat: real-time update dissemination • By employing CC, the Swarm middleware data store supports diverse application needs effectively

  38. Related Work • Flexible consistency models/interfaces • Munin, WebFS, Fluid Replication, TACT • Wide area caching solutions/middleware • File systems and data stores: AFS, Coda, Ficus, Pangaea, Bayou, Thor, … • Peer-to-peer systems: Napster, PAST, Farsite, Freenet, Oceanstore, BitTorrent, …

  39. Future Work • Security and authentication • Fault-tolerance via first-class replication

  40. Thesis Contributions • Survey of sharing needs of numerous applications • New taxonomy to classify application sharing needs • Composable consistency model based on taxonomy • Demonstrated CC model is practical and supports diverse applications across WANs effectively

  41. Conclusion • Can a storage service provide effective WAN caching support for diverse distributed applications? YES • Key enabler: a novel, flexible consistency interface called Composable Consistency • Allows an application to customize consistency to diverse and varying sharing needs • Allows middleware to serve a broader set of apps effectively

  42. SwarmDB Control Flow

  43. Composing Master-slave • Master-slave replication = serialized updates + eventual consistency for queries • Serialized updates: concurrent-mode writes (WR) with serial update ordering (all updates applied at a central master) • Eventual consistency for queries: the options mentioned earlier • Use: MySQL read-only DB replication across WANs
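Restated as an option vector using the hypothetical cc_options sketch shown after slide 16 (illustrative names only, and not self-contained on its own). The serial-ordering-at-a-central-master knob belongs to the full model but is not captured by that minimal struct, so it appears here only as a comment.

    /* Master-slave replication (this slide), in terms of the earlier cc_options sketch. */
    static const struct cc_options cc_master_slave = {
        .concurrency    = CC_READS_AND_WRITES_PARALLEL, /* concurrent-mode writes (WR)      */
        .sync           = CC_SYNC_PERIODIC,             /* eventual consistency for queries */
        .staleness_secs = 600,                          /* illustrative refresh bound only  */
        .failure        = CC_PROCEED_OPTIMISTIC,
        .isolation      = CC_ISOLATE_NONE,
        .visibility     = CC_VISIBLE_IMMEDIATE,
        /* Plus serial update ordering: all writes are applied at a central master,
         * an ordering option not modeled by the minimal struct above. */
    };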

  44. Clustered BerkeleyDB

  45. BerkeleyDB Proxy using Swarm

  46. A Swarm-based Chat Room

Sample chat client code. The shared chat transcript is opened in WR (concurrent write) mode with 0-second soft staleness, immediate update visibility, and no isolation:

    /* Invoked by Swarm whenever a remote update to the transcript arrives. */
    callback(handle, newdata) {
        display(newdata);
    }

    main() {
        handle = sw_open(kid, "a+");      /* open the shared chat transcript object    */
        sw_snoop(handle, callback);       /* register for remote-update notifications  */
        while (!done) {
            read(&newdata);               /* read a message typed by the local user    */
            display(newdata);
            sw_write(handle, newdata);    /* append it; Swarm propagates it to peers   */
        }
        sw_close(handle);
    }

[Accompanying diagram: the update propagation path among the chat replicas.]
