Routing Indices For P-to-P Systems, ICDCS 2002
Introduction • Search in a P2P system • Mechanisms without an index • Mechanisms with specialized index nodes (centralized search) • Mechanisms with indices at each node • Structured P2P networks • Unstructured P2P networks • Parallel vs. sequential search • Response time • Network traffic
Routing indices (RI) • Query • Documents are on zero or more “topics”, and queries request documents on particular topics. • Document topics are independent • Local index • RI • Each node has a local routing index that contains the following information • The number of documents along each path • The number of documents on each topic of interest • Allows a node to select the “best” neighbors to send a query to
The RI may be “coarser” than the local indices • Overcounts • Undercounts
Goodness measure • Number of results in a path • Using Routing indices
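The goodness measure can be sketched as follows. This is a minimal illustration, assuming the topic independence stated earlier: the estimated number of results along a path is the path's document count multiplied, for each query topic, by the fraction of documents on that topic. The names `RIEntry` and `goodness` and all sample counts are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RIEntry:
    """One routing-index row: what is reachable through one neighbor."""
    total_docs: int   # number of documents along this path
    per_topic: dict   # topic -> number of documents on that topic


def goodness(entry: RIEntry, query_topics: list) -> float:
    """Estimate the number of results along a path (independent topics assumed)."""
    if entry.total_docs == 0:
        return 0.0
    estimate = float(entry.total_docs)
    for topic in query_topics:
        # Fraction of the path's documents that are on this topic.
        estimate *= entry.per_topic.get(topic, 0) / entry.total_docs
    return estimate


# Hypothetical routing index of one node with two neighbors, B and C.
routing_index = {
    "B": RIEntry(100, {"databases": 20, "languages": 30}),
    "C": RIEntry(200, {"databases": 100, "languages": 150}),
}

query = ["databases", "languages"]
# Neighbors ordered from most to least promising for this query.
ranked = sorted(routing_index,
                key=lambda n: goodness(routing_index[n], query),
                reverse=True)
```

With these made-up counts, the query would be forwarded to C first, since its estimate is higher than B's.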
Storage space • N: number of nodes in the P2P network • b: branching factor • c: number of categories • s: counter size in bytes • Centralized index: s × (c+1) × N • Distributed system: s × (c+1) × b (at each node)
Maintaining Routing Indices • Trade-off between RI freshness and update cost • Does not require the participation of a disconnecting node • Discussion • What if the search topics are dependent? • Can the number of “hops” necessary to reach a document be estimated?
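A small worked example of the two storage formulas, as a sketch. The parameter values (10,000 nodes, branching factor 4, 100 categories, 4-byte counters) are hypothetical and only show the difference in scale; the function names are also illustrative.

```python
def centralized_index_bytes(s: int, c: int, n: int) -> int:
    """Centralized index: one row of (c + 1) counters for each of the N nodes."""
    return s * (c + 1) * n


def distributed_index_bytes(s: int, c: int, b: int) -> int:
    """Routing index at one node: one row of (c + 1) counters per neighbor."""
    return s * (c + 1) * b


# Hypothetical parameters, not taken from the slides.
N, b, c, s = 10_000, 4, 100, 4

central = centralized_index_bytes(s, c, N)   # bytes held by the index node
per_node = distributed_index_bytes(s, c, b)  # bytes held by each node
```

The extra counter in (c + 1) holds the total number of documents along the path.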
Alternative Routing Indices • Hop-count RI • Aggregated RIs for each “hop” up to a maximum number of hops are stored
Search cost • Number of messages • The goodness of a neighbor • The ratio between the number of documents available through that neighbor and the number of messages required to get those documents • Regular tree with fanout F • It takes F^h messages to find all documents at hop h • Storage cost?
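The ratio described above can be sketched for a hop-count RI as follows. This is an illustration under stated assumptions: the network is a regular tree with fanout F, hops are indexed from 0 for the neighbor itself (so reaching hop h costs F^h messages), and the per-hop result estimates are made-up numbers. The function name `hop_count_goodness` is hypothetical.

```python
def hop_count_goodness(results_per_hop: list, fanout: int) -> float:
    """Sum, over hops, of expected results divided by the messages needed.

    results_per_hop[h] is the estimated number of results at hop h,
    with h = 0 meaning the neighbor itself (an indexing assumption).
    """
    return sum(r / fanout ** h for h, r in enumerate(results_per_hop))


F = 3  # hypothetical fanout

# Hypothetical estimates: X has results close by, Y has more results but farther away.
goodness_x = hop_count_goodness([12, 9], F)   # 12/1 + 9/3
goodness_y = hop_count_goodness([0, 30], F)   # 0/1 + 30/3

best = "X" if goodness_x >= goodness_y else "Y"
```

Although Y leads to more documents in total, X ranks higher here because its documents need fewer messages to reach.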
Exponentially aggregated RI • Store the result of applying the regular-tree cost formula to a hop-count RI • How to compute the goodness of a path for the query containing several topics?
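A sketch of how a hop-count row might be collapsed into an exponentially aggregated row, so that a node stores one value per topic instead of one value per topic per hop. It assumes the same regular-tree cost as the search-cost slide (F^h messages for hop h, hops indexed from 0); the function name and the sample counts are hypothetical.

```python
def exponential_ri_row(hop_count_row: list, fanout: int) -> dict:
    """Collapse per-hop topic counts into one cost-weighted value per topic.

    hop_count_row[h] maps topic -> number of documents at hop h.
    """
    aggregated = {}
    for h, counts in enumerate(hop_count_row):
        for topic, count in counts.items():
            # Documents farther away are discounted by the messages needed.
            aggregated[topic] = aggregated.get(topic, 0.0) + count / fanout ** h
    return aggregated


# Hypothetical hop-count row for one neighbor, two hops deep.
row = [
    {"databases": 12, "networks": 3},   # hop 0
    {"databases": 9, "networks": 27},   # hop 1
]
compact = exponential_ri_row(row, fanout=3)
```

The trade-off in this sketch is that the per-hop breakdown is lost: the compact row no longer says how far away the documents are.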
Introduction • Structured overlays • Only support search with a single keyword • Similarity between two documents • Keyword sets • Vector space • Measure • Problems • Search problem • New keyword?
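The slide does not name the similarity measure. As an assumption, a common choice in the vector space model is the cosine of the angle between two keyword vectors, sketched below with sparse vectors stored as dictionaries. The function name and the sample documents are hypothetical.

```python
import math


def cosine_similarity(u: dict, v: dict) -> float:
    """Cosine of the angle between two sparse keyword vectors (keyword -> weight)."""
    dot = sum(weight * v.get(keyword, 0.0) for keyword, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical documents described by weighted keywords.
doc_a = {"p2p": 1.0, "search": 1.0}
doc_b = {"p2p": 1.0, "routing": 1.0}
doc_c = {"weather": 1.0}

sim_ab = cosine_similarity(doc_a, doc_b)  # share one of two keywords
sim_ac = cosine_similarity(doc_a, doc_c)  # share no keywords
```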
Meteorograph • Absolute angle
Publishing and Searching • Publish • Hash • Publish the item to a node np with the hash key closest to hash value
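The publish step can be sketched as a closest-key lookup. This is a minimal illustration, not the system's actual hashing scheme: SHA-1 truncated to 32 bits stands in for the hash function, distance is plain absolute difference on a line (no wrap-around), and the names `hash_key`, `closest_node`, and `publish` are hypothetical.

```python
import bisect
import hashlib


def hash_key(name: str) -> int:
    """Stand-in hash: first 4 bytes of SHA-1, as an integer in [0, 2**32)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def closest_node(node_keys: list, key: int) -> int:
    """Return the node key closest to `key`; node_keys must be sorted and non-empty."""
    i = bisect.bisect_left(node_keys, key)
    # Only the keys just below and just above `key` can be the closest.
    candidates = node_keys[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda k: abs(k - key))


def publish(store: dict, node_keys: list, item: str) -> int:
    """Place `item` on the node whose key is closest to the item's hash value."""
    target = closest_node(node_keys, hash_key(item))
    store.setdefault(target, []).append(item)
    return target


# Hypothetical overlay with three nodes.
node_keys = [10, 50, 90]
store = {}
np_key = publish(store, node_keys, "some document")
```

A search for the same item would hash it the same way and contact the same node.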
Search problem • Nearest answers • K-nearest answers • e • Partial • Comprehensive • Search strategy • Discussions • What happens when the keyword vector is represented by q?
Other issues • Load balancing (HW) • Changes of vector space • Republished? • Comprehensive set of keywords