
P2P Apps


Presentation Transcript


  1. P2P Apps CS525 Advanced Topics in Distributed Systems Spring 07 Presented By: Imranul Hoque, Sonia Jahid

  2. P2P Apps • Applications that use P2P techniques for better performance OR • Applications that are built on P2P protocols • We’ll consider both • PAST: Built on top of Pastry • OverCite: Uses DHT instead of DB for storage • Colyseus: Uses range-queriable DHT for pub-sub

  3. Storage Management & Caching in PAST (Antony Rowstron, Peter Druschel)

  4. Background • P2P Storage Utility • Built on top of Pastry • Aim • Strong Persistence • k number of replicas • High Availability • Caching • Scalability • High utilization • Security • Quota, Key, Encrypted Routing Table

  5. Operations • Insert • Parameters: name, owner-credentials, k, file • Returns: 160-bit identifier (fileId) • Lookup • Parameter: fileId • Returns: the file from one of the k nodes storing it (usually a nearby one) • Reclaim • Parameters: fileId, owner-credentials
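For orientation, a minimal Python sketch of this client-facing interface; the class, method signatures, and byte types are illustrative assumptions, not the actual PAST API.

```python
from typing import Protocol

class PastClient(Protocol):
    """Hypothetical client-side view of PAST's three operations."""

    def insert(self, name: str, owner_credentials: bytes, k: int, data: bytes) -> bytes:
        """Store the file on the k nodes closest to its fileId; returns the 160-bit fileId."""
        ...

    def lookup(self, file_id: bytes) -> bytes:
        """Return the file content from one of the k nodes storing it (usually a nearby one)."""
        ...

    def reclaim(self, file_id: bytes, owner_credentials: bytes) -> None:
        """Release the owner's storage quota for the file (weaker semantics than delete)."""
        ...
```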

  6. Operations: Insert • fileId = SHA-1(name, public-key_owner, salt) • storage = storage − k * fileSize • fileCert = (fileId, SHA-1(file), k, salt, date, owner) • [file, fileCert, {fileCert}_private-key] routed via Pastry • Before storing, each node verifies the certificate • The insert request is forwarded to the other (k−1) replica nodes • Once done, each of the k replicas issues a store receipt • The client receives the acks and verifies them
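A sketch of the fileId and file-certificate construction in the slide above, using Python's hashlib; the exact byte encoding, certificate field layout, and signing step are assumptions, not the paper's wire format.

```python
import hashlib
import time

def make_file_id(name: str, owner_pubkey: bytes, salt: bytes) -> bytes:
    """fileId = SHA-1 over the file name, the owner's public key, and a salt (160 bits)."""
    return hashlib.sha1(name.encode() + owner_pubkey + salt).digest()

def make_file_cert(file_id: bytes, data: bytes, k: int, salt: bytes, owner: str) -> dict:
    """File certificate binding the fileId to the content hash, replication factor, and owner."""
    return {
        "fileId": file_id,
        "contentHash": hashlib.sha1(data).digest(),
        "k": k,
        "salt": salt,
        "date": int(time.time()),
        "owner": owner,
    }

# The client signs the certificate with its private key and routes
# [file, fileCert, signature] toward the fileId via Pastry.
```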

  7. Operations: Lookup • The client node issues a lookup request message • Pastry routes the request to a node storing the file • That node replies with [file, fileCert, {fileCert}_private-key] • The client verifies the file against the certificate

  8. Operations: Reclaim • The client issues a reclaim certificate • The reclaim certificate is routed via Pastry • Replicas verify the reclaim certificate • Each storing node issues a reclaim receipt • A client receiving a reclaim receipt: • Verifies the receipt • storage = storage + fileSize • Reclaim vs. Delete: reclaim only frees the owner's quota; unlike delete, it does not guarantee that the file becomes unreachable

  9. Storage Management • Goal • High global storage utilization • Graceful degradation at maximal utilization • Responsibility • Balancing free storage space • File copies are maintained by the k nodes with nodeIds closest to fileId • Conflicting Responsibility! • Solution: • Replica Diversion & File Diversion

  10. Replica Diversion • Purpose: balance the remaining free storage among the nodes in a leaf set • Policy (a Python sketch follows this slide): • if (fileSize / freeSpace_pri > t_pri): • Choose a node div from the leaf set such that: • fileSize / freeSpace_div < t_div • div is not among the k closest nodes to fileId • div does not already hold a replica of the same file • freeSpace_div is the maximum among all such nodes • If no such node exists: • Nodes that already stored the replica discard it • Send a NACK to the client, causing File Diversion
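A minimal Python sketch of the replica-diversion policy above, assuming each leaf-set node is a dict with a "free_space" field; the function names and the way candidates are represented are illustrative, not the paper's implementation.

```python
# Thresholds t_pri and t_div, using the values from the experiments slide; both are configurable.
T_PRI = 0.1
T_DIV = 0.05

def accept_locally(file_size: int, free_space: int) -> bool:
    """A primary node stores the replica itself only if it is small relative to its free space."""
    return file_size / free_space <= T_PRI

def choose_diverted_node(file_size, leaf_set, k_closest, holders):
    """Pick the leaf-set node with the most free space that satisfies the diversion rules,
    or None (the caller then NACKs the client, triggering file diversion)."""
    candidates = [
        n for n in leaf_set
        if n not in k_closest                      # not among the k closest nodes to fileId
        and n not in holders                       # does not already hold a replica of this file
        and file_size / n["free_space"] < T_DIV    # file is small relative to its free space
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n["free_space"])
```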

  11. Replica Diversion (2) [Diagram: a file is inserted at node A, whose leaf set contains nodes 1, 2, 3, and 4]

  12. Replica Diversion (3) [Diagram: node A diverts the replica to node 4 in its leaf set and stores a pointer to 4]

  13. Replica Diversion (4) [Diagram: A also inserts a pointer at C, the (k+1)th closest node to the fileId, so the diverted replica stays reachable if A fails]

  14. File Diversion • Purpose: balance the remaining free storage among different portions of the nodeId space • Method (retry loop sketched below): • Generate a new fileId with a different salt • Retry the insert operation • Repeat the process up to three times • If the third attempt fails, report an insert failure
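A small sketch of the client-side retry that file diversion implies; try_insert is a hypothetical stand-in for one Pastry-routed insert attempt, and the salt length is an assumption.

```python
import os

MAX_ATTEMPTS = 3  # PAST retries an insert with a fresh salt up to three times

def insert_with_file_diversion(name, owner_pubkey, data, try_insert):
    """try_insert(name, owner_pubkey, salt, data) -> bool performs one insert attempt,
    computing fileId = SHA-1(name, owner_pubkey, salt) as in the earlier sketch."""
    for _ in range(MAX_ATTEMPTS):
        salt = os.urandom(20)        # a new salt maps the file into a different nodeId region
        if try_insert(name, owner_pubkey, salt, data):
            return salt              # the client must keep the salt to recompute the fileId
    raise RuntimeError("insert failed after three file-diversion attempts")
```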

  15. Maintaining Replicas • Nodes exchange keep-alive messages to keep track of failed nodes • Unresponsive nodes are replaced by new entries in the leaf set • Newly joining nodes also change the leaf set • If a joining node becomes one of the k closest nodes for files stored at a node N: • The joining node initially keeps a pointer to N's file table • Affected files are gradually migrated in the background • A node that discovers the holder of one of its diverted replicas is no longer in its leaf set: • Gradually migrates those files to a node within its leaf set

  16. Maintaining Replicas (2) • A node failure may cause a storage shortage • The remaining nodes in the leaf set may be unable to store the displaced files • The node contacts the two most distant members of its leaf set to locate space • Otherwise, maintaining k replicas fails • Open Pastry and the proprietary Pastry differ in their replica management schemes

  17. Optimization • File Encoding • Idea 1: Don’t store k replicas, add m checksum blocks (Pros and Cons?) • Idea 2: Store fragments of file at separate nodes (Pros and Cons?) • File Caching • Idea: Use unused portion of advertised disk space to cache files • Cache copies can be discarded or evicted at any time
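To make the trade-off in Idea 1 concrete, a back-of-the-envelope comparison under the assumption of a Reed-Solomon-style scheme that splits a file into n data fragments plus m checksum fragments, any n of which reconstruct it; the numbers are illustrative, not from the paper.

```python
def replication_overhead(k: int) -> float:
    """Whole-file replication stores k full copies of the file."""
    return float(k)

def erasure_overhead(n: int, m: int) -> float:
    """n data fragments + m checksum fragments; any n of the n + m rebuild the file."""
    return (n + m) / n

# Tolerating the loss of 4 nodes: 5 replicas cost 5x the file size, while an (8, 4)
# fragment scheme costs only 1.5x, at the price of contacting several nodes per read.
print(replication_overhead(5), erasure_overhead(8, 4))
```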

  18. Experimental Setup • 2250 PAST Nodes • Number of replicas, k = 5 • For Pastry, b = 4 • Two different workloads: • Proxy logs from NLANR • File name and size from several file systems at Microsoft

  19. Experimental Results • t_div = 0.05 • Lower t_pri: higher success rate • Lower t_pri: lower utilization • Why? • (second plot: t_pri = 0.1, t_div = 0.05)

  20. Experimental Results (2) • File diversions are negligible as long as storage utilization stays below 83%

  21. Experimental Results (3) • Even at 80% utilization, fewer than 10% of stored replicas are diverted

  22. Experimental Results (4) • At low storage utilization the cache hit rate is high • The number of routing hops increases as utilization increases

  23. Discussion • CFS: Built on top of Chord • CFS stores blocks rather than whole files • CFS relies on caching for small files only • Ivy: Read/write P2P file system • Based on a set of logs and DHash • Provides an NFS-like file system view to the user • Uses version vectors for synchronization • OceanStore: Uses untrusted servers • Data is encrypted • Uses ACLs for write access, keys for read access • Data is migrated based on access patterns

  24. OverCite: A Distributed, Cooperative CiteSeer (Jeremy Stribling, Jinyang Li, Isaac G. Councill, M. Frans Kaashoek, Robert Morris)

  25. Motivation • CiteSeer allows users to search and browse a large archive of research papers • Centralized design: crawler, database, indexer • OverCite aggregates donated resources at multiple sites • Challenges: • Load-balancing storage • Query processing with automatic data management

  26. Contribution • A 3-tier, DHT-backed design • OverCite • Experimental evaluation with 27 nodes, the full CiteSeer document set, and a trace of CiteSeer user queries

  27. Architecture • Tier 1: Web servers • Tier 2: Keyword search and crawler nodes, each with a local index file • Tier 3: DHT storage for documents & metadata

  28. Life of a Query (sketched below) • The Front End (FE) is chosen via round-robin DNS • The FE sends the query to k Index Servers • The Index Servers contact the DHT for metadata • Results are forwarded to the FE, which aggregates them
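A minimal sketch of this query path, assuming a hypothetical srv.search(query) call on each index server and a dht_get(doc_id) metadata lookup; this is not the OverCite RPC interface.

```python
import concurrent.futures

def handle_query(query, index_servers, dht_get, top_n=10):
    """Fan the query out to one index server per partition, then join hits with DHT metadata."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # Each index server searches its local partition of the inverted index in parallel.
        partial_results = list(pool.map(lambda srv: srv.search(query), index_servers))
    hits = [hit for part in partial_results for hit in part]   # (doc_id, score) pairs
    hits.sort(key=lambda h: h[1], reverse=True)                # aggregate at the front end
    # Fetch per-document metadata (title, authors, ...) from the DHT for the top hits only.
    return [dht_get(doc_id) for doc_id, _ in hits[:top_n]]
```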

  29. Global Data Structure • Document ID (DID) for each document for which a PDF/PS file is found • Citation ID (CID) to every bib entry in a document • Group ID (GID) for use in contexts where a file is not required

  30. Global Data Structure (2) [Diagram: example ID mappings, e.g., DID 110, GID 150, and FID 425 for a document whose file exists, and CID 231 mapping to GID 118 for a cited paper]

  31. Local Data Structures • Required for keyword searches • Local data includes Inverted Index Table • OverCite uses k index partitions • Total number of nodes = n • n > k • n/k copies of each index partition • Large k vs. Small k

  32. Local Data Structures (2) [Diagram: with k = 3 index partitions, the crawler assigns a document with file ID 124 to partition 124 mod 3 = 1; documents and metadata are stored in the DHT]
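A tiny sketch of the partition assignment illustrated above; the function name and the choice of k = 3 mirror the slide's example only.

```python
K_PARTITIONS = 3  # k index partitions, as in the slide's example

def partition_for(file_id: int, k: int = K_PARTITIONS) -> int:
    """Assign a crawled document to an index partition by its file ID."""
    return file_id % k

assert partition_for(124) == 1  # the slide's example: 124 mod 3 = 1
```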

  33. Web Crawler • Builds on several existing proposals for distributed crawling • The crawler performs a lookup for each new PDF/PS link in the URLs table • Download, parse, extract metadata • Check for duplicates (sketched below): • FID in Files • Title in Titles • Shingle in Shins • Update Files, Docs, Cites, Groups, Titles
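A hedged sketch of the duplicate check, where dht_lookup(table, key) stands in for OverCite's DHT table lookups and a single matching shingle stands in for the real near-duplicate test; all helper names are hypothetical.

```python
def is_duplicate(fid, title, doc_shingles, dht_lookup):
    """Return True if the document was already crawled: same file hash, title, or shingles."""
    if dht_lookup("Files", fid):              # exact duplicate: content hash (FID) already stored
        return True
    if dht_lookup("Titles", title.lower()):   # a document with the same title is already indexed
        return True
    for sh in doc_shingles:                   # near-duplicate: a shingle fingerprint already seen
        if dht_lookup("Shins", sh):
            return True
    return False
```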

  34. Implementation • Current implementation does not include Crawler module • Populated with existing CiteSeer docs • Indexes the first 5000 words of each doc • OK Web Server • DHash DHT

  35. Performance • 27 nodes: 16 at MIT, 11 across North America • 47 physical disks (each 35 to 400 GB) • Inserted 674,720 original copies of documents from the CiteSeer repository • k = 2, m = 20 • Each node has a complete on-disk cache of the text files for all documents in its index partition

  36. Performance (2) • Query throughput: the total number of nodes in each configuration is twice the number of front ends; a client at MIT keeps 128 queries active • File download: a client requests 128 files concurrently

  37. Performance (3) • Adding n nodes would decrease per-node storage costs by a factor of roughly n/4 • [Table: resource requirements of a centralized CiteSeer server vs. the OverCite deployment]

  38. Discussion • Future work: • Detecting plagiarism • Automatic alert system • Shallow papers • Comparison with Google Scholar (http://scholar.google.com): • Search query "Impossibility of Consensus": Google Scholar: top result; CiteSeer: not even in top 20; OverCite: not in top 20 • Search query "Chord": Google Scholar: top result; CiteSeer: not in top 20; OverCite: top result

  39. Colyseus: A Distributed Architecture for Online Multiplayer Games (Ashwin Bharambe, Jeffrey Pang, Srinivasan Seshan)

  40. Background & Motivation • Contemporary Game Design: • Client-Server architecture • e.g., Quake, World of Warcraft, Final Fantasy, etc. • Problems? • Single server: a computation & communication bottleneck • Optimization? • Area of interest filtering • Delta encoding

  41. Background & Motivation (2) • Quake II server running on a P-III 1 GHz machine with 512 MB RAM • Each player simulated with a server-side AI bot • Area-of-interest (AOI) filtering & delta encoding implemented • Each game run for 10 minutes at 10 frames/sec

  42. Colyseus Architecture • Challenge: • Arrive at a scalable & efficient state & logic partitioning that enables reasonably consistent, low-latency game play • Objects: • Immutable (map geometry, graphics, etc.): updated infrequently, so globally replicated • Mutable (players' avatars, doors, etc.): updated frequently

  43. Colyseus Architecture (2) [Diagram: Colyseus components on each node: the game application sits on top of an object placer, a replica manager, a local object store holding primary (P) and replica (R) objects, and an object locator]

  44. Colyseus Architecture: Object Location • Subscription: range queries describing areas of interest are sent to and stored in the DHT • Publication: other objects periodically publish their metadata (e.g., x, y, z coordinates) in the DHT • Challenge: overcoming the delay between submission of a subscription and reception of a matching publication (a pub-sub sketch follows this slide)
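A minimal in-memory sketch of this publish/subscribe matching over coordinate ranges; a real deployment routes both operations through a range-queriable DHT, and every name here is illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Subscription:
    node: str
    lo: tuple          # (x, y, z) lower corner of the area of interest
    hi: tuple          # (x, y, z) upper corner

@dataclass
class RangePubSub:
    subs: list = field(default_factory=list)

    def subscribe(self, sub: Subscription):
        """Store the range query (in Colyseus this is kept as soft state in the DHT)."""
        self.subs.append(sub)

    def publish(self, obj_id: str, pos: tuple):
        """Return the subscribers whose area of interest contains this object's position."""
        return [s.node for s in self.subs
                if all(s.lo[i] <= pos[i] <= s.hi[i] for i in range(3))]

# Usage: player A subscribes to its area of interest; another object's update matches it.
ps = RangePubSub()
ps.subscribe(Subscription(node="playerA", lo=(0, 0, 0), hi=(100, 100, 50)))
assert ps.publish("monster-7", (40, 60, 10)) == ["playerA"]
```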

  45. Range-Queriable DHT: Overview [Diagram: in a traditional DHT, nearby keys such as 6, 7, and 8 hash to unrelated nodes, while in a range-queriable DHT they are stored on contiguous nodes, so a range query contacts few nodes]

  46. Optimization • Pre-fetching: • Primary objects predict the set of objects they may need in the near future • Colyseus pre-fetches them (area-of-interest prediction) • Pro-active replication: • Allow short-lived objects to attach themselves to other objects • Soft-state storage (sketched below): • The object locator stores both publications and subscriptions • TTLs are attached to publications and subscriptions • Objects are published at different rates
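A small sketch of the soft-state idea: entries simply expire after their TTL unless the publisher refreshes them. The class and method names are made up for illustration.

```python
import time

class SoftStateStore:
    """Keep pub/sub entries only until their TTL expires; publishers must refresh them."""
    def __init__(self):
        self.entries = {}           # key -> (value, expiry_time)

    def put(self, key, value, ttl_secs):
        self.entries[key] = (value, time.time() + ttl_secs)

    def get(self, key):
        value, expiry = self.entries.get(key, (None, 0))
        if time.time() > expiry:    # stale entry: drop it instead of returning it
            self.entries.pop(key, None)
            return None
        return value

# A publication refreshed every few seconds stays visible; an abandoned one ages out.
store = SoftStateStore()
store.put("obj:monster-7:pos", (40, 60, 10), ttl_secs=5)
```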

  47. Experimental Results: Communication Cost • p2p-rect configuration • The workload keeps the mean player density constant by increasing the map size • At very small scale, the object-location overhead is very high • Per-node bandwidth rises very slowly in Colyseus

  48. Discussion • Colyseus enables low-latency game play through these optimizations • A range-queriable DHT achieves better scalability & load balance than a traditional DHT (consistency penalty?) • Security?
