Leveraging Caching for Internet-Scale Content-Based Networks

Leveraging caching for Internet-scale content-based publish/subscribe networks
Mohamed Diallo, Serge Fdida (UPMC Sorbonne Universités, France)
Vasilis Sourlas, Paris Flegkas and Leandros Tassiulas (University of Thessaly, CERTH-ITI, Greece)
6/7/2011, ICC 2011, Kyoto

Outline
• Challenges for Internet-scale content-based publish/subscribe networks
• Proposal for enhanced scalability
• Service model capturing heterogeneous consumer requirements
• Caching to efficiently implement the service model
• Performance comparison of several caching policies
• Ongoing and future work

Content-based publish/subscribe networks, a.k.a. content-based networks (CBNs)
(Figure: Subscribe() and Publish() calls exchanged with a broker.)
Example subscriptions:
S1: [type, =, music/mp3], [artist, =, yusuf], [album, =, *], [year, >, 1970]
S2: [type, =, article/english], [conference, =, infocom], [year, =, *]
Characteristics: declarative, decoupling, decentralized, generic, asynchronous, receiver-driven, content-based forwarding, huge complexity, communication-efficient.
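To make the example subscriptions concrete, here is a minimal sketch of content-based matching. It assumes a subscription is a conjunction of [attribute, operator, value] constraints and that `*` accepts any value as long as the attribute is present; neither rule is spelled out on the slide. The `matches` function and the `song` publication are hypothetical names introduced for illustration.

```python
import operator

# Only the two operators that appear in the example subscriptions.
OPS = {"=": operator.eq, ">": operator.gt}


def matches(subscription, publication):
    """Return True if the publication satisfies every constraint."""
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if value == "*":
            # Assumed wildcard semantics: any value is accepted.
            continue
        if not OPS[op](publication[attribute], value):
            return False
    return True


S1 = [("type", "=", "music/mp3"), ("artist", "=", "yusuf"),
      ("album", "=", "*"), ("year", ">", 1970)]
S2 = [("type", "=", "article/english"), ("conference", "=", "infocom"),
      ("year", "=", "*")]

# Hypothetical publication, invented for illustration.
song = {"type": "music/mp3", "artist": "yusuf", "album": "demo", "year": 2006}

print(matches(S1, song))  # True
print(matches(S2, song))  # False
```

A broker that runs this check against every stored subscription for every publication is doing the exhaustive filtering that the following slides try to relax.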

Challenges for Internet-scale CBN
• Addressing heterogeneous consumer requirements
• Communication efficiency
• Huge volume of information available
• Limited end-to-end bandwidth
• Limited attention span for information consumers
• Locality of popularity patterns
Exhaustive filtering is not enough.

Proposal for enhanced scalability
• A framework for Internet-scale content-based publish/subscribe (CBPS) services, defining a generic service model that captures heterogeneous consumer requirements and supports content-based retrieval and dissemination.
• Caching increases content availability for future requests, which improves the quality of service offered to consumers and reduces communication costs.

Generic service model
Interest forms: I(predicate) and I(predicate, max, lifetime, freshness)
(Figure: interests placed in a plane with axes lifetime (L, up to Lmax) and freshness (F, up to Fmax); regions labeled subscriptions, non-persistent interests and loose subscriptions.)
Examples:
• Subscription with the exhaustive semantic (predicate:=(P); max:=all; lifetime:=1 week; freshness:=0)
• Search (predicate:=(P); max:=1; lifetime:=0; freshness:=24h)
• News alerting services (predicate:=(P); max:=10; lifetime:=24h; freshness:=24h)
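The four-parameter interest can be written down as a small record. The sketch below encodes the three examples from the slide; the `Interest` class, the use of seconds as the unit, `None` standing for max:=all, and the `kind` classification rule are all assumptions made for illustration. In particular, the rule mapping (lifetime, freshness) to the three regions is one reading of the figure, not something the slide states.

```python
from dataclasses import dataclass
from typing import Optional

HOUR = 3600
DAY = 24 * HOUR
WEEK = 7 * DAY


@dataclass(frozen=True)
class Interest:
    predicate: str
    max_items: Optional[int]  # None stands for "all"
    lifetime: int             # seconds the interest stays installed
    freshness: int            # seconds; assumed maximum age of acceptable publications

    def kind(self) -> str:
        # Assumed reading of the lifetime/freshness figure.
        if self.lifetime == 0:
            return "non-persistent interest"
        if self.freshness == 0:
            return "subscription"
        return "loose subscription"


exhaustive = Interest("P", None, WEEK, 0)
search = Interest("P", 1, 0, DAY)
news_alert = Interest("P", 10, DAY, DAY)

for interest in (exhaustive, search, news_alert):
    print(interest.kind())
```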

Service model: Loose subscriptions
• Loose subscriptions can be refreshed and satisfied with cached publications that have not been delivered to them in previous lifetimes.
• Brokers should not keep an exhaustive history of all subscriptions satisfied with a publication or of all publications used to satisfy a subscription (REQ).
Refreshed interests consume only publications that have never been used to satisfy any previously advertised interest.
(Figure labels: pending publications, dispatched publications.)

Broker Simplified Model
PPT: Pending publication table
DPT: Dispatched publication table
PIT: Pending interest table
(Figure: incoming publication flow inside a broker.)
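A minimal sketch of how the pending and dispatched tables can enforce the loose-subscription rule without any per-subscription history. The `BrokerCache` class and its methods are hypothetical. Predicate matching, freshness checks and the PIT are left out, and allowing a first-time (non-refreshed) interest to reuse dispatched publications is an assumption, not something the slides confirm.

```python
class BrokerCache:
    def __init__(self):
        self.ppt = []  # pending publications: never used to satisfy an interest
        self.dpt = []  # dispatched publications: already used at least once

    def publish(self, publication):
        """Store an incoming publication as pending."""
        self.ppt.append(publication)

    def serve(self, max_items, refresh):
        """Return up to max_items cached publications for an interest."""
        from_pending = self.ppt[:max_items]
        del self.ppt[:len(from_pending)]
        from_dispatched = []
        if not refresh:
            # Assumption: a first-time interest may also reuse dispatched items.
            remaining = max_items - len(from_pending)
            from_dispatched = self.dpt[:remaining]
        # Whatever was pending has now been used once.
        self.dpt.extend(from_pending)
        return from_pending + from_dispatched


cache = BrokerCache()
cache.publish("c1")
cache.publish("c2")
first = cache.serve(1, refresh=False)     # takes c1 from the pending table
refreshed = cache.serve(2, refresh=True)  # only c2 is still pending
starved = cache.serve(2, refresh=True)    # nothing pending is left
newcomer = cache.serve(2, refresh=False)  # may reuse dispatched c1 and c2
```

The broker stores only two tables and a per-interest refresh flag, so it never needs to remember which subscription received which publication.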

Protocols Overview
(Figure: message exchange between Consumer, BrokerC, BrokerR, BrokerW, BrokerP and Provider.)
Mechanisms: selection policy, duplicate dropping, replacement policy, starvation, overload notification.
Messages: I(ps, 1, l1, f1), I(ps, 2, l1, f1), Upload(c2), Forward(c2), Notify(c1, c2), Consume.
Cached content sets shown: {c1}, {c2}, {c1, c2}.
Interest format with refresh flag: I(ps, max, lifetime, freshness, refresh)

Evaluation: Caching Policies
DPF: Dispatched publication first
PPF: Pending publication first
MF: Most fresh
LF: Least fresh
MFU: Most frequently used
LFU: Least frequently used
MRU: Most recently used
LRU: Least recently used
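The six freshness- and usage-based policies can be read as ranking criteria over cached publications. The sketch below picks the entry each policy name designates. The `Entry` fields and the `pick` function are hypothetical, and the slides do not say whether the picked entry is the one served (selection) or the one evicted (replacement), so that decision is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    published_at: float  # publication time; larger means fresher
    last_used: float     # time of the most recent use
    uses: int            # number of times the entry was used


# Policy name -> (field to compare, whether to take the max or the min).
CRITERIA = {
    "MF": ("published_at", max),
    "LF": ("published_at", min),
    "MFU": ("uses", max),
    "LFU": ("uses", min),
    "MRU": ("last_used", max),
    "LRU": ("last_used", min),
}


def pick(entries, policy):
    """Return the cached entry designated by the given policy name."""
    field, choose = CRITERIA[policy]
    return choose(entries, key=lambda entry: getattr(entry, field))


# Hypothetical cache content, invented for illustration.
cached = [
    Entry("a", published_at=10.0, last_used=50.0, uses=3),
    Entry("b", published_at=30.0, last_used=35.0, uses=9),
    Entry("c", published_at=20.0, last_used=60.0, uses=1),
]

for policy in CRITERIA:
    print(policy, pick(cached, policy).name)
```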

Evaluation Setup
• PeerSim-based simulations.
• We focus on loose subscriptions and assume that the traffic is driven by 20 events.
• We consider a hierarchical topology of 100 nodes with an average degree of 2.
• We assume that event popularity follows a Zipf distribution (α = 0.7) and that more popular events generate more publications and have larger and more widely spread audiences.
• We simulate 10k basic interests, each refreshed with probability Pr.
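A minimal sketch of the workload assumptions, using the usual rank-based Zipf form in which the event of rank k has popularity proportional to 1/k^α. The slides do not give the exact parameterization, the value of Pr, or the generator used, so `zipf_weights`, `P_REFRESH = 0.5` and the fixed seed are assumptions for illustration only.

```python
import random


def zipf_weights(n_events, alpha):
    """Normalized popularity of events ranked 1..n_events."""
    raw = [1.0 / rank ** alpha for rank in range(1, n_events + 1)]
    total = sum(raw)
    return [weight / total for weight in raw]


N_EVENTS = 20
ALPHA = 0.7
N_INTERESTS = 10_000
P_REFRESH = 0.5  # hypothetical value; the slides only name it Pr

weights = zipf_weights(N_EVENTS, ALPHA)
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Attach each basic interest to an event according to its popularity.
interest_events = rng.choices(range(1, N_EVENTS + 1), weights=weights, k=N_INTERESTS)

# Decide independently, with probability P_REFRESH, whether each one is refreshed.
refreshed = [rng.random() < P_REFRESH for _ in range(N_INTERESTS)]
```

With α = 0.7 the distribution is fairly flat: the most popular event is only 2^0.7 (about 1.6) times as popular as the second one.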

Evaluation: Cache size (slots) impact
(Figure; surviving annotation: slots, cache size - ε.)

Evaluation: Impact of refreshed interests ratio

Ongoing and future work
• Extension of the broker model to increase content availability [1]
• Better characterize the framework.
• Realistic setting (from Google News)
• Evaluate more caching policies and measure their fairness.
• Consider the impact of locality and event popularity distributions on performance.
• Quantify communication gains over a baseline CBN under various workload assumptions.
• Compute from traces the optimal achievable performance.
• Design high-throughput data structures (PIT) supporting heterogeneous application schemas.

Conclusions
• We introduced a generic service model capturing heterogeneous consumer requirements.
• We described how to leverage caching to implement the service model effectively.
• Preliminary results indicate that performance is very sensitive to the caching policy.

Related work
[1] M. Diallo et al., "Towards extreme scale content-based networking," to appear in IEICE Transactions on Communications, special section on New Paradigms for Content Distribution and Sharing.
[2] M. Diallo and S. Fdida, "IOA-CBR: information overload-aware content-based routing," in Proc. 4th ACM International Conference on Distributed Event-Based Systems, Cambridge, UK, 2010.
[3] M. Diallo and S. Fdida, "Avalanche: towards a scalable content-based Pub/Sub network service," in Proc. 3rd ACM International Conference on Distributed Event-Based Systems, Nashville, TN, 2009.
[5] CBPSgen: A workload generator for content-based publish/subscribe (under development).

Questions 設問
