This lecture by Professor Dr. Robert Tolksdorf dives into the intricacies of peer-to-peer networks, focusing on innovative search methodologies. The talk covers two primary approaches: content-agnostic and content-based search strategies. Key topics include the advertisement/query model, the role of centralized and decentralized mediators, and the implications of power law networks on search efficiency. Attendees will gain insights into constructing dynamic distributed directories and the advantages of using decentralized mediators for optimized resource discovery.
Search in Distributed Networks • Lecture: Peer-to-peer networks • Professor: Dr. Robert Tolksdorf • Elena Antonenko elena.Antonenko@web.de • Malte Münchert muencher@inf.fu-berlin.de • Jing Zhao zhao@inf.fu-berlin.de • Shunfeng Zhang zhang@inf.fu-berlin.de
Language of the talk: • English instead of German! • Comment: German is also a very beautiful language! • Questions can be asked in German!
Structure of our talk: • Introduction • Content-Agnostic Search (Shunfeng); • Content-Based Search (Elena); • Pastry (Malte); • JXTA Search (Jing)
Introduction • Most applications (file sharing, instant messaging, chat) involve • finding objects and resources of interest • exchanging resources with other peers. • Both are accomplished by a system of advertisements and queries
Introduction • Advertisement/query model: • Resource providers publish resources and resource consumers send search queries; • or resource seekers advertise their needs on the network and resource providers query the network for those needs;
Introduction • The problem reduces to: • querying a dynamic and distributed directory of advertisements by advertisement consumers • The distributed directory is built using a subset of all the peers in the network
Content-Agnostic Search >>> basic concept • The organization of the peers does not depend on the resources they index or point to;
Content-Agnostic Search >>> central mediator • Register content with the central server; • Query the central server for Information; • Roles of central server: • Matchmaker • Broker;
Content-Agnostic Search >>> central mediator as Matchmaker • [Figure: message flow between Requester, Matchmaker and Peer. Labels: Advertise, Unadvertise; ASK-ALL: "who can help?"; Reply: name1 + info1…; STREAM-ALL: "request"; REPLY…]
Content-Agnostic Search >>> central mediator as Matchmaker • Requester: an agent with an objective that it wants to be achieved by some other agent. • Matchmaker: an agent that • knows the names of many agents • and their corresponding capabilities. • Server: an agent that has committed itself to fulfilling objectives on behalf of other agents.
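The matchmaker role described above can be sketched as a small in-memory directory. This is only an illustration under stated assumptions: the class name `Matchmaker`, the method names (modelled on the Advertise, Unadvertise and ASK-ALL labels of the slide) and the peer and capability strings are hypothetical, and no network communication is modelled.

```python
from collections import defaultdict


class Matchmaker:
    """Central mediator that only answers the question 'who can help?'."""

    def __init__(self):
        # capability -> set of peer names that advertised it
        self._directory = defaultdict(set)

    def advertise(self, peer, capability):
        self._directory[capability].add(peer)

    def unadvertise(self, peer, capability):
        self._directory[capability].discard(peer)

    def ask_all(self, capability):
        """Return the names of all peers currently advertising the capability."""
        return sorted(self._directory[capability])


# Hypothetical peers and capability, for illustration only.
mm = Matchmaker()
mm.advertise("peer-a", "mp3-search")
mm.advertise("peer-b", "mp3-search")
mm.unadvertise("peer-a", "mp3-search")
providers = mm.ask_all("mp3-search")
```

The matchmaker returns names only; the requester would then contact the listed peers itself.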
Content-Agnostic Search >>> central mediator as Broker • [Figure: message flow between Requester, Broker and Peer. Labels: Advertise, Unadvertise; STREAM-ALL: "Request"; REPLY]
Content-Agnostic Search >>> central mediator as Broker • Requester: an agent that has an objective that it wants to have achieved by another agent. • Broker: • an agent that knows the names of some other agents and their corresponding capabilities, • and advertises its own capabilities as some function of the capabilities of these other agents. • Brokered Server: an agent that has committed to the broker to take on a predetermined class of objectives.
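In contrast to a matchmaker, a broker forwards the request itself and returns the replies. A minimal sketch, assuming brokered servers can be modelled as plain Python callables; the class `Broker`, its method names and the two example servers are hypothetical.

```python
class Broker:
    """Central mediator that forwards requests on behalf of the requester."""

    def __init__(self):
        # capability -> list of handlers (the brokered servers)
        self._servers = {}

    def advertise(self, capability, handler):
        self._servers.setdefault(capability, []).append(handler)

    def unadvertise(self, capability, handler):
        handlers = self._servers.get(capability, [])
        if handler in handlers:
            handlers.remove(handler)

    def stream_all(self, capability, request):
        """Send the request to every brokered server and collect the replies."""
        return [handler(request) for handler in self._servers.get(capability, [])]


# Hypothetical brokered servers, for illustration only.
def upper_server(text):
    return text.upper()


def reverse_server(text):
    return text[::-1]


broker = Broker()
broker.advertise("transform", upper_server)
broker.advertise("transform", reverse_server)
replies = broker.stream_all("transform", "abc")
```

The requester never learns which servers answered; it only sees the broker.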
Content-Agnostic Search >>> central mediator • Advantages: • Comprehensive • Fast updates • Minimized message exchange • Disadvantages: • Central point of failure • Not scalable • Needs a central authority • Comment: these can be solved with decentralized mediators
Content-Agnostic Search >>> Networks forming randomly connected graphs • Nodes are connected to a few random neighbors • Example: Gnutella network • Already covered in the 2nd talk of the lecture • Power Law Networks • The search takes advantage of the power-law link distribution of naturally occurring networks
Content-Agnostic Search >>> Power Law Networks • Power law distribution: • few nodes have very high connectivity • many nodes have very low connectivity
Content-Agnostic Search >>> Power Law Networks • Rule: at each step, one new node joins with two edges, which connect preferentially to nodes with higher degree
Content-Agnostic Search >>> Power Law Networks • Power law graphs are dynamically constructed • Nodes are not wired randomly but attach preferentially to the most connected nodes.
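The growth rule (one new node, two edges, preference for well-connected nodes) can be sketched as follows. The function name, the starting triangle and the fixed seed are assumptions made for this illustration; the slides do not specify how the graph is initialized.

```python
import random
from collections import Counter


def grow_power_law_graph(num_nodes, edges_per_node=2, seed=0):
    """Grow a graph by preferential attachment and return its edge list."""
    rng = random.Random(seed)
    # Assumed starting point: a triangle of three nodes.
    edges = [(0, 1), (1, 2), (0, 2)]
    # Each node appears once per incident edge, so a uniform draw from this
    # list picks a node with probability proportional to its degree.
    endpoints = [0, 1, 1, 2, 0, 2]
    for new_node in range(3, num_nodes):
        targets = set()
        while len(targets) < edges_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
    return edges


edges = grow_power_law_graph(200)
degree = Counter(node for edge in edges for node in edge)
```

Inspecting `degree.most_common(5)` against the minimum degree of 2 gives a feel for how uneven the connectivity becomes.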
Content-Agnostic Search >>> Power Law Networks • Power law search algorithm • requires a modification of the basic Gnutella approach;
Content-Agnostic Search >>> Power Law Networks • The Gnutella approach: • Broadcasting to all neighbors • Can exchange with every neighbor • Modified Gnutella: • Forwarding to the neighbor with the highest number of connections • Exchange with the first- and second-degree neighbors
Content-Agnostic Search >>>Power Law Networks • Advantages of PLN • Networks of decentralized mediators • Broadcasting queries to all neighbors avoided • Search cost reduced
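A simplified sketch of the modified search: instead of broadcasting, each node passes the query to its best-connected unvisited neighbor. It assumes that a node can answer for itself and for its direct neighbors only (the second-degree exchange is left out), and the graph, the `holdings` table and the tie-breaking by name are hypothetical.

```python
def high_degree_search(graph, holdings, start, resource, max_hops=50):
    """graph: node -> set of neighbors; holdings: node -> set of resources.

    Returns (node holding the resource or None, number of hops used).
    """
    current, visited = start, {start}
    for hops in range(max_hops + 1):
        # Assumption: a node knows its own and its direct neighbors' resources.
        for node in [current, *sorted(graph[current])]:
            if resource in holdings.get(node, ()):
                return node, hops
        candidates = [n for n in graph[current] if n not in visited]
        if not candidates:
            return None, hops
        # Forward to the unvisited neighbor with the most connections
        # (ties are broken by node name to keep the walk deterministic).
        current = max(candidates, key=lambda n: (len(graph[n]), n))
        visited.add(current)
    return None, max_hops


# Hypothetical network with one well-connected hub.
graph = {
    "a": {"b"},
    "b": {"a", "hub"},
    "hub": {"b", "c", "d", "e"},
    "c": {"hub"},
    "d": {"hub"},
    "e": {"hub"},
}
holdings = {"e": {"song.mp3"}}
found_at, hops = high_degree_search(graph, holdings, "a", "song.mp3")
```

Only one message is sent per hop, which is where the reduced search cost comes from.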
Content-Based Search: Introduction • Content of queries is used to efficiently route the messages to the most relevant peers • Search techniques include: • Content-mapping networks; • Some variations of publish/subscribe networks; Content-Based Search
Content-Mapping Search Networks • Every peer in the network indexes a "zone" of the advertisement space • The zone is dynamic • The size of the zone depends on the number of peers • Peers map advertisement content to the space • Mapping is performed using hash functions • Examples include: CAN, Chord, Tapestry, Pastry
Distributed Hash Table (DHT) • A DHT provides the same functionality as a traditional hash table • A DHT stores key-value pairs • The data structure is distributed over different nodes • Provides the functions: • insert(id, item); • item = query(id); • An item can be anything: a data object, document, file, pointer to a file
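The insert/query interface can be sketched with a toy, single-process DHT. Placing a key by hash modulo the number of nodes is a deliberate simplification chosen for this illustration; it is not how CAN, Chord or Pastry assign keys. The class name `ToyDHT` and the node names are hypothetical.

```python
import hashlib


class ToyDHT:
    """Key-value pairs spread over several node-local dictionaries."""

    def __init__(self, node_names):
        self._names = sorted(node_names)
        self.nodes = {name: {} for name in self._names}

    def _responsible_node(self, key):
        # Simplified placement: hash the key and take it modulo the node count.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self._names[int.from_bytes(digest, "big") % len(self._names)]

    def insert(self, key, item):
        self.nodes[self._responsible_node(key)][key] = item

    def query(self, key):
        return self.nodes[self._responsible_node(key)].get(key)


dht = ToyDHT(["n1", "n2", "n3"])
dht.insert("jingle-bells.mp3", "pointer-to-file")
item = dht.query("jingle-bells.mp3")
```

Both operations hash the key the same way, so a query always reaches the node that stored the item.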
Content Addressable Network (CAN) • CAN is based on a virtual d-dimensional coordinate space • Each node and item is associated with a unique id in the d-dimensional space • Goals • Scale to hundreds of thousands of nodes • Handle rapid arrival and failure of nodes
CAN Example: Two-Dimensional Space • The space is divided between the nodes • Together, all nodes cover the entire space • Each node covers either a square or a rectangular area • Example: node n1:(1, 2), the first node that joins, covers the entire space
CAN Example: Two-Dimensional Space • Node n2:(4, 2) joins; the space is divided between n1 and n2
CAN Example: Two-Dimensional Space • Node n3:(3, 5) joins too
CAN Example: Two-Dimensional Space • Nodes n4:(5, 5) and n5:(6, 6) join
CAN Example: Two-Dimensional Space • Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6) • Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)
CAN Example: Two-Dimensional Space • Each item is stored by the node that owns its mapping in the space
CAN: Query Example • Each node knows its neighbors in the d-dimensional space • Forward the query to the neighbor that is closest to the query id • Example: assume n1 queries f4
CAN Routing • For d dimensions with n equal zones, each node has 2d neighbors • Routing table size: O(d) • Guarantees that a file is found in at most d · n^(1/d) steps, where n is the total number of nodes • Algorithm: choose the neighbor nearest to the destination
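The greedy rule can be sketched on a grid of equal zones. The sketch assumes that zones are addressed by integer grid coordinates and that the space does not wrap around at the edges; the function name `can_route` and the 4 × 4 example are hypothetical.

```python
def can_route(source, target):
    """Greedy routing between zones given as integer grid coordinates.

    Returns the list of zones visited, including source and target.
    """
    path = [tuple(source)]
    current = list(source)
    while tuple(current) != tuple(target):
        # Stepping along the dimension with the largest remaining gap reaches
        # the neighboring zone that is nearest (Euclidean) to the target.
        dim = max(range(len(current)), key=lambda i: abs(target[i] - current[i]))
        current[dim] += 1 if target[dim] > current[dim] else -1
        path.append(tuple(current))
    return path


# Assumed 4 x 4 grid: d = 2 dimensions, n = 16 zones.
path = can_route((0, 0), (3, 2))
hops = len(path) - 1
bound = 2 * 16 ** (1 / 2)  # d * n^(1/d)
```

Here the query needs 5 hops, which stays below the bound d · n^(1/d) = 8.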
CAN: Multi-Dimension • Increasing the number of dimensions reduces the path length
Chord: Introduction • Chord is a distributed lookup protocol • Given a key (data item), it maps the key onto a node (peer). • A hash function assigns each node and key an m-bit identifier. • A node's identifier is defined by hashing the node's IP address. • A key identifier is produced by hashing the key • ID(node) = hash(196.178.0.1) • ID(key) = hash("jingle-bells.mp3")
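Deriving m-bit identifiers can be sketched as below. SHA-1 reduced modulo 2^m is an assumption made for this illustration, as are the function name `chord_id` and the choice m = 8.

```python
import hashlib


def chord_id(value, m=8):
    """Map a string (IP address or key) to an m-bit identifier."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = chord_id("196.178.0.1")
key_id = chord_id("jingle-bells.mp3")
```

Nodes and keys share one identifier space, which is what allows a key to be assigned to a node.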
Chord: Data Structure • Identifiers are ordered in a virtual ring of size 2^m • Each node maintains • a finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i, i.e. successor(n + 2^i) • its predecessor node • An item identified by id is stored on the successor node of id
Chord: Example • Assume an identifier space 0..7 • Node n1:(1) joins; all entries in its finger table are initialized to itself
Chord: Example • Nodes n2:(2), n0:(0), n6:(6) join
Chord: Example • Nodes: n0:(0), n1:(1), n2:(2), n6:(6) • Items: f1:(1), f7:(7)
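The finger tables and item placement for this example can be computed with a short sketch. It uses the identifier space 0..7 (m = 3) and the node and item identifiers from the example; the function names are hypothetical, and the ring is a plain in-memory list rather than a network.

```python
def successor(ident, nodes, m):
    """First node whose identifier is >= ident on the ring of size 2**m."""
    ident %= 2 ** m
    ring = sorted(nodes)
    for node in ring:
        if node >= ident:
            return node
    return ring[0]  # wrap around past the highest identifier


def finger_table(n, nodes, m):
    """Entry i is successor(n + 2**i) for i = 0 .. m-1."""
    return [successor(n + 2 ** i, nodes, m) for i in range(m)]


nodes = [0, 1, 2, 6]
m = 3
tables = {n: finger_table(n, nodes, m) for n in nodes}
# Items are stored on the successor node of their identifier.
placement = {item: successor(item, nodes, m) for item in (1, 7)}
```

Under these assumptions node 1 gets the finger table [2, 6, 6], item f1 stays on node 1, and item f7 wraps around to node 0.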
Chord: Example • Upon receiving a query for item id, a node • checks whether it stores the item locally • if not, forwards the query to the largest node in its successor table that does not exceed id
Chord: Properties • Routing table size O(log(N)), where N is the total number of nodes • Guarantees that a file is found in O(log(N)) steps
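A lookup that follows finger tables can be sketched as below, again on the example ring with nodes 0, 1, 2, 6 and m = 3. The forwarding step uses the common "closest preceding finger" formulation, which is an assumption about how the rule on the slide is meant; all function names are hypothetical.

```python
def in_half_open(x, a, b, size):
    """True if x lies in the ring interval (a, b], walking clockwise."""
    x, a, b = x % size, a % size, b % size
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero


def chord_lookup(start, key, nodes, m):
    """Return (responsible node, hop count) for key, starting at node start."""
    size = 2 ** m
    ring = sorted(nodes)

    def successor(ident):
        ident %= size
        return next((n for n in ring if n >= ident), ring[0])

    current, hops = start, 0
    while True:
        if key % size == current:
            return current, hops
        nxt = successor(current + 1)
        if in_half_open(key, current, nxt, size):
            return nxt, hops + 1
        fingers = [successor(current + 2 ** i) for i in range(m)]
        # Closest preceding finger: the farthest finger still before the key.
        current = next(f for f in reversed(fingers)
                       if in_half_open(f, current, key - 1, size))
        hops += 1


nodes = [0, 1, 2, 6]
owner, hops = chord_lookup(1, 7, nodes, 3)
```

Starting at node 1, the query for item 7 is forwarded to node 6 and then resolved at node 0, i.e. two hops.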
Pastry - Introduction • Decentralized and scalable DHT-network • Designed for efficient message routing between nodes
What does DHT mean? • Distributed Hash Table • Hash value for every peer • Every peer has knowledge of some other peers (stored in a hash table) • All hash tables from all peers represent a complete map for all peers
The Pastry namespace • Peers reside on a virtual circle made up of all possible addresses • [Figure: circle with the labels 2^128 and 2^0; blue points represent peers]
Pastry routing • The message is sent to the (known) node that is numerically closest to the target node • The procedure is repeated until the target node is reached • [Figure labels: Origin, Closest to target, Distance, Destination]
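The forwarding rule on this slide can be sketched as a loop over "known node" lists. This illustrates only the numerically-closest rule as stated here; the table structures a real Pastry node maintains are not modelled. The identifier space of 64 and the `known` table are hypothetical.

```python
def ring_distance(a, b, size):
    """Shortest distance between two identifiers on a circle of given size."""
    d = abs(a - b) % size
    return min(d, size - d)


def pastry_route(origin, target, known, size):
    """known: node id -> list of node ids that node knows about.

    Returns the list of nodes the message passes through.
    """
    path = [origin]
    current = origin
    while current != target:
        best = min(known[current],
                   key=lambda n: (ring_distance(n, target, size), n),
                   default=current)
        if ring_distance(best, target, size) >= ring_distance(current, target, size):
            break  # no known node is closer to the target
        current = best
        path.append(current)
    return path


# Hypothetical network on a circle with 64 addresses.
known = {5: [12, 20], 12: [5, 20], 20: [12, 33], 33: [20, 38], 38: [33]}
path = pastry_route(5, 38, known, 64)
```

Each hop strictly reduces the distance to the target, so the loop always terminates.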