
A Framework for Highly-Available Real-Time Internet Services


Presentation Transcript


  1. A Framework for Highly-Available Real-Time Internet Services Bhaskaran Raman, EECS, U.C. Berkeley

  2. Problem Statement • Real-time services with long-lived sessions • Need to provide continued service in the face of failures [Figure: content server and clients across the Internet, with real-time services such as a video smoothing proxy and a transcoding proxy in the path]

  3. Problem Statement • Goals: • Quick recovery • Scalability [Figure: video-on-demand server and client, showing a service failure, a network path failure, a service replica, and monitoring]

  4. Our Approach • Service infrastructure • Computer clusters deployed at several points on the Internet • For service replication & path monitoring • (path = path of the data stream in a client session)

  5. Summary of Related Work • Fail-over within a single cluster • Active Services (e.g., video transcoding) • TACC (web-proxy) • Only service failure is handled • Web mirror selection • E.g., SPAND • Does not handle failure during a session • Fail-over for network paths • Internet route recovery • Not quick enough • ATM, Telephone networks, MPLS • No mechanism for wide-area Internet No architecture to address network path failure during a real-time session

  6. Research Challenges • Wide-area monitoring • Feasibility: how quickly and reliably can failures be detected? • Efficiency: per-session monitoring would impose significant overhead • Architecture • Who monitors whom? • How are services replicated? • What is the mechanism for fail-over?

  7. Is Wide-Area Monitoring Feasible? • Monitor path liveness using a keep-alive heartbeat • Failure: detected by timeout (no heartbeat within the timeout period) • False positive: a failure detected incorrectly (heartbeats resume after the timeout period) • There’s a trade-off between time-to-detection and rate of false-positives [Figure: heartbeat timelines illustrating a detected failure and a false positive]
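To make the timeout trade-off concrete, here is a minimal sketch (not the talk's implementation) that classifies gaps in a heartbeat arrival trace as detected failures or false positives; the function names, thresholds, and the synthetic trace are illustrative assumptions.

    # Classify gaps in a heartbeat arrival trace.  A gap longer than the
    # detection timeout counts as a detection; if heartbeats resume before a
    # longer "real failure" window, that detection was a false positive.
    def classify_gaps(arrival_times, timeout=2.0, failure_window=30.0):
        detections, false_positives = 0, 0
        for prev, cur in zip(arrival_times, arrival_times[1:]):
            gap = cur - prev
            if gap > timeout:
                detections += 1
                if gap < failure_window:      # heartbeats resumed: path was alive
                    false_positives += 1
        return detections, false_positives

    # Synthetic trace: heartbeats every 0.3 s with a 1.8 s and a 5.8 s interruption.
    def hb_run(start, n, period=0.3):
        return [start + period * i for i in range(n)]

    trace = hb_run(0.0, 30) + hb_run(10.5, 30) + hb_run(25.0, 30)
    for t in (1.0, 2.0, 4.0, 8.0):                  # larger timeout -> fewer false
        print(t, classify_gaps(trace, timeout=t))   # positives, but slower detection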

  8. Is Wide-Area Monitoring Feasible? • False-positives due to: • Simultaneous losses • Sudden increase in RTT • Related studies: • Internet RTT study, Acharya & Saltz, UMD 1996 • RTT spikes are isolated • TCP RTO study, Allman & Paxson, SIGCOMM 1999 • Significant RTT increase is quite transient • Our experiments: • Ping data from ping servers • UDP heartbeats between Internet hosts

  9. Ping measurements • Ping servers: 12 geographically distributed servers chosen • Approximation of a keep-alive stream • Count the number of loss runs with > 4 simultaneous losses • Could be an actual failure or just intermittent losses • If we have 1-second heartbeats and time out after losing 4 heartbeats, this count gives an upper bound on the number of false positives [Figure: Berkeley host → HTTP → ping server → ICMP → Internet host]
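A sketch of the counting described on this slide, under the assumption that the ping trace is available as a per-second list of reply/no-reply flags (the data layout and example trace are invented):

    # Count loss runs of more than `threshold` consecutive lost pings.  With
    # 1-second heartbeats and a 4-heartbeat timeout, each such run would have
    # triggered a detection, so the count upper-bounds the number of false
    # positives (some runs may of course be real failures).
    def count_long_loss_runs(replies, threshold=4):
        runs, current = 0, 0
        for got_reply in replies:          # replies: True if the ping was answered
            if got_reply:
                current = 0
            else:
                current += 1
                if current == threshold + 1:   # run just exceeded the threshold
                    runs += 1
        return runs

    # Example trace: one run of 6 losses and one run of 2 losses -> 1 long run.
    trace = [True] * 10 + [False] * 6 + [True] * 20 + [False] * 2 + [True] * 5
    print(count_long_loss_runs(trace))   # 1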

  10. Ping measurements

  11. UDP-based keep-alive stream • Geographically distributed hosts: Berkeley, Stanford, UIUC, TU-Berlin, UNSW • UDP heartbeat every 300 ms • Measure gaps between receipt of successive heartbeats • False positive: no heartbeat received for > 2 seconds, but received again before 30 seconds • Failure: no heartbeat for > 30 seconds
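For concreteness, a rough sketch of a sender and receiver for this kind of experiment; the port number, message format, and blocking structure are placeholders rather than the original measurement code, and gaps are classified only when heartbeats resume, which is enough for offline gap statistics.

    import socket
    import time

    HB_INTERVAL = 0.3          # one heartbeat every 300 ms
    PORT = 9999                # placeholder port

    def send_heartbeats(peer_host):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq = 0
        while True:
            sock.sendto(str(seq).encode(), (peer_host, PORT))
            seq += 1
            time.sleep(HB_INTERVAL)

    def receive_and_measure_gaps():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", PORT))
        last = None
        while True:
            sock.recvfrom(64)
            now = time.time()
            if last is not None:
                gap = now - last
                if gap > 30.0:        # gap longer than 30 s: treat as a failure
                    print("failure: %.1f s without heartbeats" % gap)
                elif gap > 2.0:       # resumed within 30 s: a false positive
                    print("false positive: %.1f s gap" % gap)
            last = now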

  12. UDP-based keep-alive stream

  13. What does this mean? • With a failure detection scheme using a timeout of 2 sec, false positives can be as low as once a day for many pairs of Internet hosts • In comparison, BGP route recovery: • Takes > 30 seconds • Can take up to tens of minutes (Labovitz & Ahuja, SIGCOMM 2000) • Worse with multi-homing (an increasing trend)

  14. Architectural Requirements • Efficiency: • Per-session monitoring → too much overhead • End-to-end monitoring → more latency → monitoring less effective • Need aggregation • Client-side aggregation: using a SPAND-like server • Server-side aggregation: clusters • But not all clients have the same server & vice-versa • Service infrastructure to address this • Several service clusters on the Internet

  15. Architecture • Service cluster: compute cluster capable of running services • Overlay topology: nodes = service clusters, links = monitoring channels between clusters • Source → Client: routed via monitored paths • Could go through an intermediate service • Local recovery using a backup path [Figure: source and client connected across the Internet through service clusters exchanging keep-alive streams]

  16. Architecture • Monitoring within a cluster for process/machine failure • Monitoring across clusters for network path failure • Peering of service clusters – to serve as backups for one another – or to monitor the path between them

  17. Architecture: Advantages • 2-level monitoring → process/machine failures still handled within the cluster; common failure cases do not require wide-area mechanisms • Aggregation of monitoring across clusters → efficiency • Model works for cascaded services as well [Figure: Source → S1 → S2 → Client]

  18. Architecture: Potential Criticism • Potential criticism: • Does not handle resource reservation • Response: • Related issue, but could be orthogonal • Aggregated/hierarchical reservation schemes (e.g., the Clearing House) • Even if reservation is solved (or is not needed), we still need to address failures

  19. Architecture: Issues (1) Routing • Given the overlay topology, need a routing algorithm to go from source to destination via service(s) • Also need: (a) wide-area service discovery (WA-SDS), (b) the closest service cluster to a given host [Figure: source and client connected across the Internet]
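One way to make the routing problem concrete: treat the overlay as a graph whose links are the monitored paths, and pick the lowest-latency path from the cluster near the source to the cluster near the client that passes through some cluster running the required service. The graph, latencies, and service placement below are invented for illustration; this is not the talk's algorithm.

    import heapq

    def dijkstra(graph, src):
        dist, heap = {src: 0.0}, [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in graph[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return dist

    def route_via_service(graph, src_cluster, dst_cluster, service_clusters):
        d_src = dijkstra(graph, src_cluster)
        d_dst = dijkstra(graph, dst_cluster)   # monitored links assumed symmetric
        # choose the service cluster minimizing total overlay latency
        return min(service_clusters,
                   key=lambda s: d_src.get(s, float("inf")) + d_dst.get(s, float("inf")))

    # nodes = service clusters, links = monitored paths with measured latency (ms)
    overlay = {
        "berkeley":  [("stanford", 5), ("uiuc", 40)],
        "stanford":  [("berkeley", 5), ("uiuc", 45), ("tu-berlin", 170)],
        "uiuc":      [("berkeley", 40), ("stanford", 45), ("tu-berlin", 120)],
        "tu-berlin": [("stanford", 170), ("uiuc", 120)],
    }
    print(route_via_service(overlay, "berkeley", "tu-berlin", {"uiuc", "stanford"}))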

  20. Architecture: Issues (2) Topology • How many service clusters? (nodes) • How many monitored paths? (links)

  21. Ideas on Routing: Similarities with BGP • BGP: border routers, peering sessions, heartbeats; IP routes are destination-based • Service infrastructure: service clusters, peering sessions, heartbeats; routes across clusters are based on the destination and the intermediate service(s)

  22. Ideas on Routing • How is routing in the overlay topology different from BGP? • Overlay topology: more freedom than the physical topology • Constraints on the graph can be imposed more easily • For example, can have local recovery [Figure: cluster S (Berkeley) with a nearby backup S’ (San Francisco)]

  23. Ideas on Routing • BGP exchanges a lot of information; it increases with the number of APs – O(100,000) • Service clusters need to exchange very little information • The problems of (a) service discovery and (b) knowing the nearest service cluster are decoupled from routing [Figure: Source, AP1, AP2, AP3, Client]

  24. Ideas on Routing • BGP routers do not have per-session state • But service clusters can maintain per-session state • Can have local recovery pre-determined for each session • Finally, we probably don’t need as many nodes as there are BGP routers • Can have a more aggressive routing algorithm It is feasible to make fail-over in the overlay topology quicker than Internet route recovery with BGP
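A hypothetical sketch of what per-session state with pre-determined local recovery could look like at a single cluster; the session and cluster names are made up, and the table layout is only one plausible realization of the idea.

    # Each session records a primary next-hop cluster and a pre-computed backup,
    # so local recovery needs no wide-area route computation at failure time.
    sessions = {
        "client42-stream": {"primary": "stanford", "backup": "uiuc", "active": "stanford"},
    }

    def on_peer_failure(failed_cluster):
        """Called when keep-alive monitoring declares a peer cluster dead."""
        for sid, state in sessions.items():
            if state["active"] == failed_cluster:
                state["active"] = state["backup"]     # local, pre-determined fail-over
                print("session %s switched to backup %s" % (sid, state["backup"]))

    on_peer_failure("stanford")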

  25. Ideas on topology • Decision criteria: • Additional latency for the client session • Sparse topology → more latency • Monitoring overhead • Many monitored paths → more overhead, but additional flexibility → might reduce end-to-end latency

  26. Ideas on topology • Number of nodes: • Claim: the upper bound is one per AS • This will ensure that there is a service cluster “close” to every client • Topology will be close to the Internet backbone topology • Number of monitoring channels: • # ASes: ~7000 as of 1999 • Monitoring overhead: ~100 bytes/sec • Can easily have ~1000 peering sessions per service cluster • Dense overlay topology possible (to minimize additional latency)
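A back-of-envelope check of the numbers on this slide (the figures are taken from the slide itself):

    num_clusters      = 7000    # upper bound: one per AS (~7000 ASes as of 1999)
    hb_bytes_per_sec  = 100     # monitoring overhead per peering session
    peers_per_cluster = 1000    # peering sessions a single cluster can sustain

    per_cluster_bw = peers_per_cluster * hb_bytes_per_sec      # bytes/sec
    total_links    = num_clusters * peers_per_cluster // 2     # each link shared by 2 clusters
    print("per-cluster monitoring traffic: %d KB/s" % (per_cluster_bw // 1000))  # 100 KB/s
    print("monitored links in a dense overlay: %d" % total_links)                # 3,500,000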

  27. Implementation + Performance [Figure: a service cluster with a manager node, exchanging session information and monitoring heartbeats with a peer cluster]

  28. Implementation + Performance • PCM → GSM codec service • Overhead of a “hot” backup service: 1 process (idle) • Service re-instantiation time: ~100 ms • End-to-end recovery over the wide area has three components: • Detection time – O(2 sec) • Communication with the replica – one RTT – O(100 ms) • Replica activation – ~100 ms
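Adding up the three components gives the ballpark end-to-end recovery time:

    detection          = 2.0     # O(2 s) timeout-based failure detection
    notify_replica_rtt = 0.1     # O(100 ms) wide-area round trip to the replica
    replica_activation = 0.1     # ~100 ms to (re)activate the backup service
    total = detection + notify_replica_rtt + replica_activation
    print("end-to-end recovery ~ %.1f s" % total)
    # ~2.2 s, dominated by failure detection -- well under BGP's 30 s and up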

  29. Implementation + Performance • Overhead of monitoring: • Keep-alive heartbeat: one every 300 ms in our implementation, O(100 bytes/sec) • Overhead of a false positive in failure detection: • Session transfer: one message exchange across adjacent clusters, a few hundred bytes

  30. Summary • Real-time applications with long-lived sessions • No support exists for path-failure recovery (unlike, say, the PSTN) • Service infrastructure to provide this support • Wide-area monitoring for path liveness: O(2 sec) failure detection with a low rate of false positives • Peering and replication model for quick fail-over • Interesting issues: • Nature of the overlay topology • Algorithms for routing and fail-over

  31. Questions • What are the factors in deciding the topology? • Strategies for handling failure near the client/source • Currently deployed mechanisms for RIP/OSPF and ATM/MPLS failure recovery: what is the time to recover? • Applications: video/audio streaming • More? Games? Proxies for games on hand-held devices? http://www.cs.berkeley.edu/~bhaskar (Presentation running under VMware under Linux)
