Peer-to-Peer Databases

Peer-to-Peer Databases David Andersen Advanced Databases

What is Peer-to-Peer?
• Shared Resources: Each peer shares its resources with others, acting as both a client and a server.
• Decentralization and Self-organization: Peers coordinate their activities with other peers rather than with a centralized server.
• Autonomy: Peers are free to come and go at will.

Napster
• Hybrid P2P
• Data was stored on peers, but a central server maintained an index of file locations.
• File sharing, not a DBMS.
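The hybrid design above can be sketched as a central lookup table that stores only locations, never the files themselves. This is a minimal illustration, not Napster's actual protocol; the class `CentralIndex`, its methods, and the peer and file names are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps a file name to the peers holding it."""

    def __init__(self):
        self._locations = defaultdict(set)

    def register(self, peer, filenames):
        # A joining peer announces which files it is willing to share.
        for name in filenames:
            self._locations[name].add(peer)

    def unregister(self, peer):
        # A departing peer is removed from every entry.
        for peers in self._locations.values():
            peers.discard(peer)

    def lookup(self, filename):
        # The server answers with locations; the transfer is peer to peer.
        return sorted(self._locations.get(filename, set()))


index = CentralIndex()
index.register("peer-a", ["song.mp3", "talk.mp3"])
index.register("peer-b", ["song.mp3"])
holders = index.lookup("song.mp3")
index.unregister("peer-a")
holders_after = index.lookup("song.mp3")
```

The index is a single point of failure and control, which is the property the fully decentralized systems on the following slides try to remove.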

Gnutella
• True P2P: a peer need only know one other peer to join.
[Figure: The Gnutella Protocol]

Gnutella
• Uses flooding: queries hop from peer to peer. A TTL (time-to-live) sent with the query prevents eternal searching.
• Very high bandwidth usage.
• File sharing, not a DBMS.
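A minimal sketch of TTL-limited flooding, to make the bandwidth cost concrete. It simulates the idea on an in-memory graph rather than implementing the Gnutella wire protocol; the function `flood_query`, the topology, and the file names are hypothetical. The `seen` set stands in for duplicate suppression: a peer that receives the same query twice does not forward it again, but the duplicate message is still counted as traffic.

```python
from collections import deque


def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbor in graph.get(peer, ()):
            messages += 1  # every forward costs bandwidth, even a duplicate
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"D": {"x.dat"}, "E": {"x.dat"}}

near = flood_query(graph, files, "A", "x.dat", ttl=2)
far = flood_query(graph, files, "A", "x.dat", ttl=3)
```

With `ttl=2` the query reaches peer D but stops before E; raising the TTL to 3 finds E as well, at the cost of more messages.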

P2P and Databases: Advantages
• No bottlenecks
• Vast resources available
• Improved scalability
• Improved robustness
• Less management
• Access to a tremendous amount of data

P2P and Databases: Challenges
• Coordinating semantics
• Query processing efficiency
• Topology/bandwidth considerations
• Indexing
• Replication
• Performing updates and avoiding stale data
• Security: access control and peer reputation

Case Study – Hyperion Project
• Each peer has its own local DBMS.
• A PeerDBMS layer augments the local DBMS to support peer-to-peer functionality.
• Peers can form acquaintances.
• Metadata is exchanged, and the semantics of the acquainted peer are mapped onto the local system.
• Uses pair-wise mappings to resolve queries.

The Hyperion PDBMS
• Query Service
  • Handles local queries
  • Uses mapping tables to rewrite or translate queries destined for remote databases
• Peer Coordination Service
  • Manages and executes updates
  • Uses Event-Condition-Action rules
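The Event-Condition-Action idea behind the Peer Coordination Service can be sketched as follows. This is a generic illustration of the ECA pattern, not Hyperion's rule language; `EcaRule`, `Coordinator`, the event name, and the row fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EcaRule:
    event: str                          # which event the rule listens for
    condition: Callable[[dict], bool]   # must hold for the action to run
    action: Callable[[dict], None]      # update to carry out locally


class Coordinator:
    """Hypothetical rule engine: dispatches events from acquainted peers."""

    def __init__(self):
        self.rules = []

    def add_rule(self, rule):
        self.rules.append(rule)

    def signal(self, event, payload):
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(payload):
                rule.action(payload)
                fired += 1
        return fired


local_bookings = []
coordinator = Coordinator()
coordinator.add_rule(EcaRule(
    event="insert_ticket",
    condition=lambda row: row["carrier"] == "B",
    action=lambda row: local_bookings.append(row["ticket_id"]),
))

fired_a = coordinator.signal("insert_ticket", {"ticket_id": 1, "carrier": "A"})
fired_b = coordinator.signal("insert_ticket", {"ticket_id": 2, "carrier": "B"})
```

Only the second event satisfies the condition, so only ticket 2 is propagated into the local list.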

The Hyperion PDBMS
• P2P User Interface
  • Local and peer queries are posed through the interface
  • The user is unaware of differing semantics at the peer
• Peer Manager: messaging system used to communicate with peers
• Acquaintance Manager: manages the exchange of schemas, mapping tables, and rules for updating data

Hyperion Mapping Tables
[Figure: Table from Airline ‘A’, Table from Airline ‘B’, Mapping Tables]
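A rough sketch of how a mapping table might drive query translation between two airline peers. The flight numbers, the table and column names, and the helpers `translate` and `rewrite_selection` are invented for illustration; they are not the contents of the original figure or Hyperion's API.

```python
# Hypothetical mapping table: each pair links a value used by Airline A
# to a value used by Airline B for the same flight. One local value may
# map to several remote values.
FLIGHT_MAPPING = [("AA100", "BB7"), ("AA200", "BB9"), ("AA200", "BB10")]


def translate(values, mapping):
    """Return the remote values associated with any of the local values."""
    wanted = set(values)
    return sorted({remote for local, remote in mapping if local in wanted})


def rewrite_selection(table, column, values, mapping):
    """Rewrite a selection on local values into a query over the remote peer."""
    remote_values = translate(values, mapping)
    if not remote_values:
        return None  # the mapping table covers none of the requested values
    placeholders = ", ".join("?" for _ in remote_values)
    sql = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return sql, remote_values


rewritten = rewrite_selection("b_flights", "fno", ["AA200"], FLIGHT_MAPPING)
```

The user asks for flight AA200 in Airline A's vocabulary; the rewritten query asks Airline B for the flights that the mapping table associates with it.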

Case Study – The Piazza Project
Project Goals
• Focus on developing query reformulation algorithms
• Assist in defining mappings
• Indexing
• Enforcing access control

Piazza Schema Mappings
• Two types of mappings
• Peer Description: relates two or more peer schemas.
  Example: DBProjects:Member(pName, member) = UW:Member(mid, pid, member), UW:Project(pid, pName)
• Storage Description: relates the data stored at a peer to the peer’s view of the world.
  Example: UPenn:student(sid, name, advisor) UPenn:Student(sid, name), UPenn:Advisor(sid, fid), UPenn:Faculty(fid, advisor)
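The peer description example above defines DBProjects:Member as a join of two UW relations on pid. A small sketch of evaluating that definition over in-memory tuples follows; the sample rows and the function name `dbprojects_member` are made up for illustration.

```python
# Hypothetical UW data, using the attribute order from the mapping.
uw_member = [(1, 10, "Ana"), (2, 10, "Raj"), (3, 20, "Li")]   # (mid, pid, member)
uw_project = [(10, "Indexing"), (20, "Mappings")]             # (pid, pName)


def dbprojects_member(member_rows, project_rows):
    """DBProjects:Member(pName, member) as the join of Member and Project on pid."""
    return sorted({
        (p_name, member)
        for (_mid, pid, member) in member_rows
        for (project_pid, p_name) in project_rows
        if pid == project_pid
    })


view = dbprojects_member(uw_member, uw_project)
```

A query posed against DBProjects:Member can be answered by running the right-hand side at UW, which is the intuition behind query reformulation.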

Piazza Query Reformulation Example

Piazza Indexing
• Challenge: how to send a query to the peer most likely to have the answer and avoid flooding the entire network.
• Piazza attempts to index schema and value mappings.
• The current implementation is centralized.
• Peers upload summaries, at differing granularities, of the data they possess.
• Peers periodically refresh their data summaries at the index.

Piazza Indexing
• Peers upload attribute-value pairs.
• The index maintains a table of these pairs together with the object ID of their origin.
• Users query the index and are returned the objects that contain at least a partial match.
• An example of an object that is indexed: s2 = [name = "Por%", age IN [50, 70], disease = "tuberculosis", type = "%"]
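The table of attribute-value pairs can be sketched as an inverted index from a pair to the IDs of the objects that uploaded it. This simplified version matches exact values only; the wildcard patterns and ranges shown in the s2 example are not handled. The class `SummaryIndex`, its methods, and the sample objects are hypothetical, not Piazza's implementation.

```python
from collections import defaultdict


class SummaryIndex:
    """Hypothetical centralized index of (attribute, value) -> object ids."""

    def __init__(self):
        self._table = defaultdict(set)

    def upload(self, object_id, pairs):
        # A peer publishes a summary of one object as attribute-value pairs.
        for attribute, value in pairs.items():
            self._table[(attribute, value)].add(object_id)

    def query(self, pairs):
        """Return ids matching at least one pair, best matches first."""
        counts = defaultdict(int)
        for item in pairs.items():
            for object_id in self._table.get(item, ()):
                counts[object_id] += 1
        return sorted(counts, key=lambda oid: (-counts[oid], oid))


summaries = SummaryIndex()
summaries.upload("s1", {"disease": "tuberculosis", "type": "patient"})
summaries.upload("s2", {"disease": "tuberculosis", "type": "study"})
summaries.upload("s3", {"disease": "influenza", "type": "patient"})

ranked = summaries.query({"disease": "tuberculosis", "type": "patient"})
```

Object s1 matches both pairs and is ranked first; s2 and s3 each match one pair and are returned as partial matches.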

Update Management
• Data is often replicated in traditional distributed databases.
• The problem is to avoid reading stale data.
• Technique: use read consensus and write consensus.
• Example: write to a majority of replicas when performing an update, and/or read from a majority and accept the newest version.
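A minimal sketch of the majority technique above, assuming versioned replicas held in memory and a known replica count. The names `Replica`, `quorum_write`, and `quorum_read` are hypothetical, and `reachable` simply lists the replica positions that respond.

```python
class Replica:
    """One copy of a data item, tagged with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None


def quorum_write(replicas, reachable, value, write_quorum):
    """Apply the write only if at least `write_quorum` replicas respond."""
    targets = [replicas[i] for i in reachable]
    if len(targets) < write_quorum:
        return False
    # A majority write quorum always includes the latest version,
    # so max + 1 is newer than anything written before.
    new_version = max(r.version for r in targets) + 1
    for replica in targets:
        replica.version, replica.value = new_version, value
    return True


def quorum_read(replicas, reachable, read_quorum):
    """Read from at least `read_quorum` replicas and keep the newest version."""
    targets = [replicas[i] for i in reachable]
    if len(targets) < read_quorum:
        raise RuntimeError("read quorum not reached")
    return max(targets, key=lambda r: r.version).value


replicas = [Replica() for _ in range(5)]
first = quorum_write(replicas, [0, 1, 2], "v1", write_quorum=3)
second = quorum_write(replicas, [2, 3, 4], "v2", write_quorum=3)
result = quorum_read(replicas, [0, 1, 2], read_quorum=3)
rejected = quorum_write(replicas, [0, 1], "v3", write_quorum=3)
```

Replicas 0 and 1 still hold the stale "v1", but the read majority includes replica 2, so the newest version "v2" is returned.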

Update Management
• Quorum consensus can work with P2P too, but without a 100% guarantee: the actual number of replicas is not known, so setting a quorum is very difficult.
• Allow users to set quorum thresholds and accept the consequences of their decisions.
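Why an unknown replica count weakens the guarantee can be seen from the standard quorum conditions. The symbols are introduced here for illustration and do not appear in the slides: N is the true number of replicas, \hat{N} is a peer's estimate of it, and R and W are the read and write quorum sizes.

```latex
% Standard quorum conditions for N replicas:
\[
  R + W > N \quad \text{(every read quorum overlaps every write quorum)},
  \qquad
  2W > N \quad \text{(any two write quorums overlap)}.
\]
% Worked example: quorums chosen from an estimate that is too low.
\[
  \hat{N} = 5,\; R = W = 3
  \;\Rightarrow\; R + W = 6 > \hat{N},
  \qquad\text{but if } N = 7:\quad R + W = 6 \le N .
\]
```

In the worked case a read of 3 replicas can miss all 3 replicas that took the latest write, so stale data may be returned even though the quorums looked safe under the estimate.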

Update Management
• Trade-offs

Questions?

References
• Flexible Update Management in Peer-to-Peer Database Systems, David Del Vecchio and Sang H. Son, Department of Computer Science, University of Virginia
• An Overview on Peer-to-Peer Information Systems, Karl Aberer and Manfred Hauswirth, Swiss Federal Institute of Technology (EPFL), Switzerland
• Data Sharing in the Hyperion Peer Database System, Patricia Rodríguez-Gianolli et al., Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005
• The Hyperion Project: From Data Integration to Data Coordination, Marcelo Arenas et al., SIGMOD Record, Vol. 32, No. 3, September 2003
• The Piazza Peer Data Management Project, Igor Tatarinov et al., SIGMOD Record, Vol. 32, No. 3, September 2003
• Distributed Query Processing in P2P Systems with Incomplete Schema Information, Marcel Karnstedt, Katja Hose, and Kai-Uwe Sattler, Department of Computer Science and Automation, TU Ilmenau, P.O. Box 100565, D-98684 Ilmenau, Germany
