Enhancing P2P File Storage and Distribution: Pastry Architecture Overview
Our project focuses on developing a robust peer-to-peer (P2P) file storage and distribution system built on the Pastry architecture. The primary goal is data storage that is persistent, reliable, and secure while keeping costs low. We explore several aspects of the design, including state storage, routing mechanisms, and scalability challenges within the network. Future improvements will address security concerns, enhance the Manager service, and improve the user experience through a GUI and better encryption. Join us for a demonstration of our system and a discussion of potential advancements.
Presentation Transcript
Breadbox: P2P file storage and distribution • Team: Brian Smith, Daniel Suskin, Dylan Nunley, Forrest Vines • Mentor: Brendan Burns
Overview • Objective • Background • Design • Pastry • Data Manager • Client • Future Improvements • Demo • Questions
Project Objective • Storage that is: • Persistent • Available • Reliable • Inexpensive • Secure
P2P Systems (Past/Present) • Napster • Gnutella • Freenet • OceanStore • BitTorrent
Architectural Design • [Diagram: Node A and Node X connected through the Pastry Overlay Network]
Pastry • Storing state • Routing • Maintaining state
Storing state • Leaf set • Closest numerically • Final routing • Neighborhood set • Locality, upkeep
Storing state • Routing table • Structure based on IDs
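The two "Storing state" slides can be illustrated with a minimal sketch. It assumes node IDs are equal-length hexadecimal strings, ignores wraparound of the ID space, and uses hypothetical helper names (`build_routing_table`, `build_leaf_set`) that are not taken from the Breadbox code. The routing table is keyed by (length of shared prefix, next digit), and the leaf set holds the numerically closest nodes on either side of the local ID.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_routing_table(local_id: str, known_ids: list[str]) -> dict:
    """Map (row, column) -> node ID.

    row    = number of leading digits shared with the local ID
    column = the first digit of the other node's ID that differs
    """
    table = {}
    for nid in known_ids:
        if nid == local_id:
            continue
        row = shared_prefix_len(local_id, nid)
        col = nid[row]
        # Keep the first node seen for each slot (a real system might
        # prefer the closest node by network locality).
        table.setdefault((row, col), nid)
    return table


def build_leaf_set(local_id: str, known_ids: list[str], size: int = 4) -> list[str]:
    """Pick the size/2 numerically closest smaller and larger IDs."""
    local = int(local_id, 16)
    smaller = sorted(
        (n for n in known_ids if int(n, 16) < local),
        key=lambda n: local - int(n, 16),
    )[: size // 2]
    larger = sorted(
        (n for n in known_ids if int(n, 16) > local),
        key=lambda n: int(n, 16) - local,
    )[: size // 2]
    return smaller + larger


known = ["a3e0", "a400", "1b22", "a2c0", "a3f8", "ffff"]
table = build_routing_table("a3f0", known)
leaves = build_leaf_set("a3f0", known, size=4)
```

In this example, node `a2c0` shares one digit with `a3f0` and then continues with `2`, so it lands in slot `(1, "2")`.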
Routing • Option 1: Leaf set • Fastest • Option 2: Routing table • Prefix-based • Option 3: Other • Randomized • Option 4: Deliver
State maintenance: new nodes • Two phases • Request • Build state for new node • Announce • Tell others to add to state
State maintenance: old nodes • Lazy • Routing fails • Ask other nodes for new state. Either: • Leaf set • Neighborhood set • A single routing table entry
State maintenance: heartbeat • Neighbors
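Scalability • Untested • But works in theory • State scales well • Per-node state is bounded by (maximum ID digits × ID representation base) + M + L entries, where M is the neighborhood set size and L is the leaf set size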
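As a rough worked example of the state bound on this slide, the sketch below plugs in illustrative parameters. The values (128-bit IDs, base 16, leaf set of 16, neighborhood set of 32) are assumptions chosen for the arithmetic, not figures reported by the project, and both function names are hypothetical.

```python
def max_state_entries(id_bits=128, b=4, leaf_size=16, neighborhood_size=32):
    """Upper bound from the slide: digits x base + M + L."""
    digits = id_bits // b   # maximum number of digits in an ID
    base = 2 ** b           # ID representation base
    return digits * base + neighborhood_size + leaf_size


def populated_rows(n_nodes, b=4):
    """Smallest r with base**r >= n_nodes.

    In Pastry, only about this many routing table rows are expected
    to be filled in a network of n_nodes.
    """
    base = 2 ** b
    rows, capacity = 0, 1
    while capacity < n_nodes:
        capacity *= base
        rows += 1
    return rows


bound = max_state_entries()            # 32 * 16 + 32 + 16
rows_for_million = populated_rows(1_000_000)
```

Under these assumed parameters the bound stays at a few hundred entries regardless of network size, while the number of rows actually in use grows only logarithmically with the number of nodes.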
Reliability • Largely untested, but: • Lazy repair • Low chance of entire leaf set going down
Server/Data Manager • Service that is run on each node of the network • Connection between client and Pastry substrate • [Diagram: each Client connects to a Data Manager Service, which sits on top of Pastry]
Data Storage • Service manages local file chunks • [Diagram: Data Manager Service holding several Data Chunks]
Scalability and Reliability: Replication • Each node is responsible for replicating chunks whose IDs are closest to its node ID • Each chunk is replicated to the n closest nodes • Chunks held by a node that is no longer among the closest n will time out
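The replication rule can be sketched as follows. This is a simplified illustration that assumes integer IDs, a node that knows every other node's ID, and no wraparound of the ID space; the names `replica_nodes` and `should_keep` are hypothetical.

```python
def replica_nodes(chunk_id: int, node_ids: list[int], n: int) -> list[int]:
    """Return the n nodes whose IDs are numerically closest to the chunk ID.

    Ties are broken by the smaller node ID so the result is deterministic.
    """
    return sorted(node_ids, key=lambda nid: (abs(nid - chunk_id), nid))[:n]


def should_keep(local_id: int, chunk_id: int, node_ids: list[int], n: int) -> bool:
    """True if this node is still among the n closest to the chunk.

    A node for which this returns False would stop refreshing the chunk
    and let it time out.
    """
    return local_id in replica_nodes(chunk_id, node_ids, n)


nodes = [10, 40, 55, 90]
holders = replica_nodes(50, nodes, n=2)
```

When a new node joins with an ID nearer to the chunk, it displaces the farthest of the current holders, whose copy then expires.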
Client: Put • FILE → compress → DATA → chunk → CHUNK → send → MANAGER • log → LOG
Client: Get • LOG → check → GET → request → MANAGER → receive → CHUNK → unchunk → DATA → decompress → FILE
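The Put and Get flows can be sketched end to end. This is a minimal illustration under several assumptions: the manager is stood in for by a plain dictionary, the log maps a file name to an ordered list of chunk IDs, compression uses `zlib`, and chunk IDs are SHA-1 digests of the chunk contents. None of these choices is confirmed by the slides.

```python
import hashlib
import zlib


def put(name: str, data: bytes, manager: dict, log: dict, chunk_size: int = 1024) -> None:
    """Compress the file, split it into chunks, send each chunk, log the IDs."""
    compressed = zlib.compress(data)
    chunk_ids = []
    for i in range(0, len(compressed), chunk_size):
        chunk = compressed[i:i + chunk_size]
        chunk_id = hashlib.sha1(chunk).hexdigest()  # assumed ID scheme
        manager[chunk_id] = chunk                   # stand-in for "send"
        chunk_ids.append(chunk_id)
    log[name] = chunk_ids                           # remembered for Get


def get(name: str, manager: dict, log: dict) -> bytes:
    """Check the log, request each chunk, reassemble, then decompress."""
    chunk_ids = log[name]
    compressed = b"".join(manager[cid] for cid in chunk_ids)
    return zlib.decompress(compressed)


manager, log = {}, {}
original = b"breadbox " * 500
put("notes.txt", original, manager, log, chunk_size=16)
restored = get("notes.txt", manager, log)
```

Keeping the ordered chunk IDs in the log is what lets Get reassemble the compressed stream before decompressing it.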
Future Improvements • Pastry • Security (Malicious Nodes) • Manager • Caching • Message aggregation • Client • Hash Check • GUI • Encryption