
Building Distributed, Wide-Area Applications with WheelFS


Presentation Transcript


1. Building Distributed, Wide-Area Applications with WheelFS
Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris
MIT CSAIL and NYU

2. Grid Computations Share Data
Nodes in a distributed computation share:
• Program binaries
• Initial input data
• Processed output from one node as intermediate input to another node

3. So Do Users and Distributed Apps
• Shared home directory for testbeds (e.g., PlanetLab, RON)
• Distributed apps reinvent the wheel:
  • Distributed digital research library
  • Wide-area measurement experiments
  • Cooperative web cache
• Can we invent a shared data layer once?

4. Our Goal
• A distributed file system for testbeds/Grids:
  • Apps can share data between nodes
  • Users can easily access data
  • Distributed apps become simple to build
[Diagram: a testbed/Grid in which copies of file foo are stored on several nodes]

5. Current Solutions
[Diagram: a central file server copies file foo out to the testbed/Grid nodes]
Usual drawbacks:
• All data flows through one node
• File systems are too transparent:
  • They mask failures
  • They incur long delays

6. Our Proposal: WheelFS
• A decentralized, wide-area FS
• Main contributions:
  1) Good performance, by following "Read Globally, Write Locally"
  2) Direct application control, via semantic cues

7. Talk Outline
• How to decentralize your file system
• How to control your files

8. What Does a File System Buy You?
• A familiar interface
• A language-independent usage model
• A hierarchical namespace useful for apps
• Quick prototyping of apps

9. File Systems 101
• File system (FS) API:
  • Open <filename> → <file_id>
  • {Close/Read/Write} <file_id>
• Directories translate file names to IDs
[Diagram: two apps on a node make API calls into the file system, which the operating system serves from the local hard disk]
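This open-returns-an-ID model is visible even from the shell. A minimal illustrative sketch (not from the talk), using bash file descriptors to play the role of file IDs:

# Sketch only: bash redirections mimic the FS API's use of IDs.
exec 3< /etc/hostname    # Open <filename> -> <file_id> (descriptor 3)
read -u 3 first_line     # Read <file_id>: operates on the ID, not the name
echo "$first_line"
exec 3<&-                # Close <file_id>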

10. Distributed File Systems
[Diagram: the same per-node stack (apps → FS API → operating system → local hard disk), but file 135 and directory 500 (mapping "foo" → 135) now live on other nodes in the testbed/Grid]

11. Basic Design of WheelFS
[Diagram: a client asks the consistency servers "File 135?" and is directed to the node responsible for it; nodes 076, 150, 257, 402, 554, and 653 hold the file's versions (v1–v3) and cached copies]

12. Read Globally, Write Locally
• Perform writes at local disk speeds
• Efficient bulk data transfer
• Avoid overloading nodes with popular files

13. Write Locally
To create foo/bar:
• Choose an ID
• Create a dir entry
• Write the local file
[Diagram: the creating node picks ID 550 for bar, adds "bar = 550" to directory 209 (foo), and writes file 550 to its own disk, where a later "Read foo/bar" finds it]
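As a usage sketch (assuming a WheelFS mount at /wfs, as in the later slides; the path and program names are illustrative):

# Sketch: the created file and its data land on this node's own disk,
# so the write runs at local disk speed; only the new entry in the
# shared directory involves other nodes.
./compute_step < input > /wfs/run1/out.$(hostname)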

14. Read Globally
To read file 135:
• Contact the responsible node
• Receive a list of nodes caching the file
• Get chunks from those cached copies
[Diagram: the reader asks for file 135, receives the list of caching nodes (076, 554, 653, ...), and fetches chunks of file 135 from several of them in parallel]
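The matching read-side sketch (same assumed /wfs mount; chunk fetching and caching happen underneath the ordinary read):

# Sketch: a plain read. WheelFS locates each file, then pulls chunks
# from whichever nodes already cache them, caching locally as it goes.
cat /wfs/run1/out.* > all_results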

15. Example: BLAST
• A DNA alignment tool run on Grids
• Copy separate DB portions and queries to many nodes
• Run separate computations
• Later fetch and combine results

16. Example: BLAST
With WheelFS, however:
• No explicit DB copying necessary
• Efficient initial DB transfers
• Automatic caching for reused DBs and queries
• Could do even better, since the data is never updated (see slide 26)
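A hypothetical sketch of the difference (paths and the blastall invocation are illustrative, not taken from the talk):

# Before: explicitly scp the DB slice and queries to every node.
# With WheelFS (sketch): write them once into the shared namespace...
cp nr.slice3 queries.fasta /wfs/blast/
# ...and every node just names them by path; DB transfers and caching
# happen inside the file system.
ssh node23 'blastall -p blastn -d /wfs/blast/nr.slice3 \
    -i /wfs/blast/queries.fasta -o /wfs/blast/results/node23.out'
cat /wfs/blast/results/*.out > combined.out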

17. Example: Cooperative Web Cache
A collection of nodes that:
• Serve redirected web requests
• Fetch web content from the original web servers
• Cache web content and serve it directly
• Find cached content on other CWC nodes

18. Example: Cooperative Web Cache
• Avoid hotspots

# Serve from the shared cache if a fresh copy exists...
if [ -f /wfs/cwc/$URL ]; then
  if notexpired /wfs/cwc/$URL; then
    cat /wfs/cwc/$URL
    exit
  fi
fi
# ...otherwise fetch it, serving and caching it in one step.
wget $URL -O - | tee /wfs/cwc/$URL

19. Example: Cooperative Web Cache
[Diagram: the script in action — a client's request for $URL is looked up in directory 070 (/wfs/cwc), matched to file 135, and served from chunks of that file cached across the nodes (script from slide 18 shown alongside)]

20. Talk Outline
• How to decentralize your file system
• How to control your files

21. Example: Cooperative Web Cache
• Would rather fail and refetch than wait
• Perfect consistency isn't crucial
(script from slide 18 shown alongside)

22. Explicit Semantic Cues
• Allow direct control over system behavior
• Meta-data that attaches to files, dirs, or references
• Apply recursively down the dir tree
• Possible implementation: an intra-path component, e.g. /wfs/cwc/.cue/foo/bar
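Since a cue is just a path component, placing it above a directory applies it to the whole subtree. A sketch using the cue spellings that appear on slide 27 (their placement here is illustrative):

# Cue on one reference:
cat /wfs/cwc/.bestversion/images/logo.png
# Cue above a directory: applies recursively to everything beneath it.
ls /wfs/.bestversion/cwc/images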

23. Semantic Cues: Writability
• Applies to files
• WriteMany (default)
• WriteOnce
[Diagram: under WriteMany, file 135 moves through versions v1–v3 and cached copies can go stale; under WriteOnce there is a single immutable version, so every cached copy stays valid]
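A hedged usage sketch (the .writeonce spelling is assumed, patterned on slide 27's .lax,writemany; the path is invented):

# WriteOnce: declare at creation that the file will never change,
# so every cached copy remains valid forever, with no re-validation.
cp run1.log /wfs/traces/.writeonce/run1.log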

24. Semantic Cues: Freshness
• Applies to file references
• LatestVersion (default)
• AnyVersion
• BestVersion
[Diagram: LatestVersion must reach the node holding file 135's newest version; AnyVersion and BestVersion may be satisfied by a nearby cached copy]
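A sketch of the non-default freshness cues (the .anyversion spelling is assumed; .maxtime=250,bestversion is taken verbatim from slide 27):

# AnyVersion: any cached copy will do; never wait on the primary node.
cat /wfs/cwc/.anyversion/front-page.html
# BestVersion: the newest version reachable within the given time bound.
cat /wfs/cwc/.maxtime=250,bestversion/front-page.html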

25. Semantic Cues: Write Consistency
• Applies to files or directories
• Strict (default)
• Lax
[Diagram: two nodes both write file 135; under Strict the writes serialize into versions v1 and v2, while under Lax conflicting versions may be created concurrently]
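A sketch of a Lax write, using the spelling from slide 27:

# Lax: concurrent writers may each produce a version instead of
# serializing through one node; acceptable when any copy is usable.
wget "$URL" -O - | tee /wfs/cwc/.lax,writemany/"$URL"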

26. Example: BLAST
• WriteOnce for all:
  • DB files
  • Query files
  • Result files
• Improves cachability of these files
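Revisiting the slide 16 sketch with the cue added (spelling assumed, following slide 27's syntax):

# Same flow, but every file is declared immutable at creation, so
# nodes can reuse cached DB and query chunks without re-checking.
cp nr.slice3 queries.fasta /wfs/blast/.writeonce/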

27. Example: Cooperative Web Cache
• Reading an older version is ok:
  • cat /wfs/cwc/.maxtime=250,bestversion/foo
• Writing conflicting versions is ok:
  • wget http://foo > /wfs/cwc/.lax,writemany/foo

if [ -f /wfs/cwc/.maxtime=250,bestversion/$URL ]; then
  if notexpired /wfs/cwc/.maxtime=250,bestversion/$URL; then
    cat /wfs/cwc/.maxtime=250,bestversion/$URL
    exit
  fi
fi
wget $URL -O - | tee /wfs/cwc/.lax,writemany/$URL

28. Discussion
• Must break data up into files small enough to fit on one disk
• Stuff we swept under the rug:
  • Security
  • Atomic renames across dirs
  • Unreferenced files

29. Related Work
• Every FS paper ever written
• Specifically:
  • Cluster FS: Farsite, GFS, xFS, Ceph
  • Wide-area FS: JetFile, CFS, Shark
  • Grid: LegionFS, GridFTP, IBP
  • POSIX I/O High Performance Computing Extensions

30. Conclusion
• WheelFS: a distributed storage layer for newly written applications
• Performance, by reading globally and writing locally
• Control, through explicit semantic cues
